Academic Journal

A Novel Method for Protein Fold Recognition Using Sequential Pattern Mining and Optimization Algorithms

Bibliographic Details
Title: A Novel Method for Protein Fold Recognition Using Sequential Pattern Mining and Optimization Algorithms
Authors: Themis P. Exarchos, Costas Papaloukas, Markos G. Tsipouras, Christos Lampros, Dimitrios I. Fotiadis (Member, IEEE)
Contributors: The Pennsylvania State University CiteSeerX Archives
Source: http://medlab.cs.uoi.gr/itab2006/proceedings/ComputationalTest Biology and Bioinformatics/16.pdf.
Collection: CiteSeerX
Description: Protein classification in terms of fold recognition can be used to determine the structural and functional properties of newly discovered proteins. In this work we propose a method for sequence-based fold recognition that uses sequential pattern mining and is implemented as a three-stage scheme. In the first stage, the training set is divided into subsets, each containing proteins from a single fold. Sequential pattern mining is then applied to each subset, generating a set of sequential patterns for every fold under consideration. In the second stage, a scoring function evaluates the extracted sequential patterns in order to classify the proteins of the training set. A modification of the Simplex local optimization technique, which takes into account the confusion matrix produced by the training set, is employed to assign a weight factor to each fold so as to maximize the accuracy on the training set. Finally, in the third stage, the test proteins are classified using the sequential patterns extracted from the training set and the scoring function with the optimal fold weights calculated from the training set. To validate the proposed method, an appropriate group of primary protein sequences was taken from the Protein Data Bank. When the method was applied without the optimization step, the overall accuracy was 35.9%. With the full three-stage methodology, the overall accuracy increased to 41.3%.
Document Type: text
File Description: application/pdf
Language: English
Relation: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.107.3817Test; http://medlab.cs.uoi.gr/itab2006/proceedings/ComputationalTest Biology and Bioinformatics/16.pdf
Availability: http://medlab.cs.uoi.gr/itab2006/proceedings/ComputationalTest Biology and Bioinformatics/16.pdf
Rights: Metadata may be used without restrictions as long as the oai identifier remains attached to it.
Accession Number: edsbas.E5328274
Database: BASE
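
Illustrative sketch: the three-stage scheme summarized in the description can be outlined in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation. The record does not give the pattern miner, the scoring function, or the modified Simplex procedure, so the sketch substitutes contiguous k-mers for mined sequential patterns, a "fraction of patterns matched" score, and a plain greedy coordinate search over the fold weights. All function names and the example sequences are hypothetical.

```python
from collections import defaultdict


def mine_patterns(sequences, length=2, min_support=1.0):
    """Stage 1 (toy): contiguous k-mers found in at least min_support of the sequences.

    Assumption: a stand-in for a real sequential pattern miner.
    """
    counts = defaultdict(int)
    for seq in sequences:
        kmers = {seq[i:i + length] for i in range(len(seq) - length + 1)}
        for kmer in kmers:
            counts[kmer] += 1
    return {p for p, c in counts.items() if c / len(sequences) >= min_support}


def score(seq, patterns):
    """Assumed scoring function: fraction of a fold's patterns that occur in seq."""
    if not patterns:
        return 0.0
    return sum(p in seq for p in patterns) / len(patterns)


def classify(seq, fold_patterns, weights):
    """Assign the fold with the highest weighted score."""
    return max(fold_patterns, key=lambda f: weights[f] * score(seq, fold_patterns[f]))


def accuracy(data, fold_patterns, weights):
    """Share of (sequence, fold) pairs classified correctly."""
    hits = sum(classify(s, fold_patterns, weights) == f for s, f in data)
    return hits / len(data)


def tune_weights(data, fold_patterns, factors=(0.5, 2.0), rounds=3):
    """Stage 2 (toy): greedy coordinate search over fold weights.

    Assumption: replaces the modified Simplex technique named in the abstract;
    a trial weight is kept only if training accuracy strictly improves.
    """
    weights = {f: 1.0 for f in fold_patterns}
    best = accuracy(data, fold_patterns, weights)
    for _ in range(rounds):
        for fold in fold_patterns:
            for factor in factors:
                trial = dict(weights)
                trial[fold] = weights[fold] * factor
                acc = accuracy(data, fold_patterns, trial)
                if acc > best:
                    weights, best = trial, acc
    return weights, best


# Hypothetical training data: (sequence, fold label) pairs.
train = [("ACDACD", "alpha"), ("ACDKAC", "alpha"),
         ("GHGHKL", "beta"), ("GHKLGH", "beta")]

# Stage 1: split the training set by fold and mine patterns per fold.
by_fold = defaultdict(list)
for seq, fold in train:
    by_fold[fold].append(seq)
fold_patterns = {fold: mine_patterns(seqs) for fold, seqs in by_fold.items()}

# Stage 2: tune one weight per fold on the training set.
weights, train_acc = tune_weights(train, fold_patterns)

# Stage 3: classify an unseen sequence with the tuned weights.
predicted = classify("ACDGGG", fold_patterns, weights)
```

Because a trial weight is accepted only when it raises training accuracy, the tuned accuracy can never fall below the unweighted baseline. That mirrors the direction of the reported result (35.9% without the optimization step, 41.3% with it), though the toy data say nothing about its size.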