Academic Journal

CLPred: a sequence-based protein crystallization predictor using BLSTM neural network.

Bibliographic Details
Title: CLPred: a sequence-based protein crystallization predictor using BLSTM neural network.
Authors: Xuan, Wenjing; Liu, Ning; Huang, Neng; Li, Yaohang; Wang, Jianxin
Source: Bioinformatics; 2020 Supplement, Vol. 36, p. i709-i717, 9p
Subject Terms: SOX transcription factors, DEEP learning, RECURRENT neural networks, VIRAL proteins, PROTEIN structure
Abstract: Motivation: Determining the structures of proteins is a critical step to understand their biological functions. Crystallography-based X-ray diffraction technique is the main method for experimental protein structure determination. However, the underlying crystallization process, which needs multiple time-consuming and costly experimental steps, has a high attrition rate. To overcome this issue, a series of in silico methods have been developed with the primary aim of selecting the protein sequences that are promising to be crystallized. However, the predictive performance of the current methods is modest. Results: We propose a deep learning model, so-called CLPred, which uses a bidirectional recurrent neural network with long short-term memory (BLSTM) to capture the long-range interaction patterns between amino acid k-mers to predict protein crystallizability. Using sequence-only information, CLPred outperforms the existing deep-learning predictors and a vast majority of sequence-based diffraction-quality crystals predictors on three independent test sets. The results highlight the effectiveness of BLSTM in capturing non-local, long-range inter-peptide interaction patterns to distinguish proteins that can result in diffraction-quality crystals from those that cannot. CLPred is a steady improvement over the previous window-based neural networks and is able to predict crystallization propensity with high accuracy. CLPred can also be improved significantly if it incorporates additional features from pre-extracted evolutional, structural and physicochemical characteristics. The correctness of CLPred predictions is further validated by the case studies of Sox transcription factor family member proteins and Zika virus non-structural proteins. Availability and implementation: https://github.com/xuanwenjing/CLPredTest. [ABSTRACT FROM AUTHOR]
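Note: The abstract refers to amino acid k-mers as the units whose long-range interactions the BLSTM models. As a minimal sketch of what k-mer tokenization of a sequence can look like, the snippet below splits a protein sequence into overlapping k-mers and maps them to integer ids. The value k=3, the stride of 1, the function names `kmers` and `build_vocab`, and the example sequence are all assumptions made for illustration; the abstract does not state them, and this is not the authors' implementation.

```python
from typing import Dict, List


def kmers(sequence: str, k: int = 3) -> List[str]:
    """Split a protein sequence into overlapping k-mers (stride 1, assumed)."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence shorter than k yields an empty list.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def build_vocab(tokens: List[str]) -> Dict[str, int]:
    """Assign each distinct k-mer an integer id in order of first appearance.

    Id 0 is left unused so it can serve as a padding index.
    """
    vocab: Dict[str, int] = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab) + 1
    return vocab


tokens = kmers("MKVLAAGIV", k=3)  # hypothetical example sequence
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]  # integer ids a sequence model could embed
```

Integer ids of this kind are the usual input to an embedding layer placed in front of a recurrent network; whether CLPred encodes its input this way is not stated in the abstract.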
Copyright of Bioinformatics is the property of Oxford University Press / USA and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Complementary Index
Description
ISSN: 1367-4803
DOI: 10.1093/bioinformatics/btaa791