An automatic tool to analyze and cluster macromolecular conformations based on self-organizing maps
Title: | An automatic tool to analyze and cluster macromolecular conformations based on self-organizing maps |
---|---|
Authors: | Arnaud Blondel, Guillaume Bouvier, Michael Nilges, Mathias Ferber, Nathan Desdouits |
Contributors: | Bioinformatique structurale - Structural Bioinformatics, Institut Pasteur [Paris] (IP)-Centre National de la Recherche Scientifique (CNRS). This work was funded by the European Union (FP7-IDEAS-ERC 294809 to M.N.). N.D. is supported by an AXA Research Fund doctoral fellowship. European Project: 294809,EC:FP7:ERC,ERC-2011-ADG_20110310,BAYCELLS(2012) |
Source: | Bioinformatics, Oxford University Press (OUP), 2015, 31 (9), pp. 1490-1492. ⟨10.1093/bioinformatics/btu849⟩ |
Publication Information: | Oxford University Press (OUP), 2014. |
Publication Year: | 2014 |
Subject Terms: | Statistics and Probability, Self-organizing map, Protein Conformation, Computer science, Protein dynamics, Molecular Dynamics Simulation, computer.software_genre, 01 natural sciences, Biochemistry, Machine Learning, 03 medical and health sciences, 0103 physical sciences, Cluster Analysis, Cluster analysis, Molecular Biology, 030304 developmental biology, computer.programming_language, 0303 health sciences, [SDV.BBM.BS]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Structural Biology [q-bio.BM], 010304 chemical physics, Flooding algorithm, Python (programming language), [SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM], Self-Organizing Map (SOM), Computer Science Applications, Visualization, Data set, Computational Mathematics, Computational Theory and Mathematics, Data mining, computer, Algorithms, Software, Macromolecule |
Description: | Motivation: Sampling the conformational space of biological macromolecules generates large sets of data with considerable complexity. Data-mining techniques, such as clustering, can extract meaningful information. Among them, the self-organizing maps (SOMs) algorithm has shown great promise, in particular since its computation time rises only linearly with the size of the data set. Whereas SOMs are generally used with few neurons, we investigate here their behavior with large numbers of neurons. Results: We present here a Python library implementing the full SOM analysis workflow. Large SOMs can readily be applied on heavy data sets. Coupled with visualization tools they have very interesting properties. Descriptors for each conformation of a trajectory are calculated and mapped onto a 3D landscape, the U-matrix, reporting the distance between neighboring neurons. To delineate clusters, we developed the flooding algorithm, which hierarchically identifies local basins of the U-matrix from the global minimum to the maximum. Availability and implementation: The Python implementation of the SOM library is freely available on GitHub: https://github.com/bougui505/SOM. Contact: michael.nilges@pasteur.fr or guillaume.bouvier@pasteur.fr Supplementary information: Supplementary data are available at Bioinformatics online. |
ISSN: | 1367-4811 1367-4803 |
Open Access: | https://explore.openaire.eu/search/publication?articleId=doi_dedup___::0afff4703578f3d467ad48ff5cbc0516 https://doi.org/10.1093/bioinformatics/btu849 |
Rights: | OPEN |
Accession Number: | edsair.doi.dedup.....0afff4703578f3d467ad48ff5cbc0516 |
Database: | OpenAIRE |
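
The description above characterizes the U-matrix as a landscape reporting the distance between neighboring neurons. A minimal sketch of that idea follows. It is not taken from the authors' library: the function name `u_matrix`, the non-periodic 4-neighbour grid, the Euclidean metric, and the toy weights are all assumptions made for illustration.

```python
import math


def u_matrix(weights):
    """Mean distance from each neuron to its direct grid neighbours.

    `weights` is a rows x cols grid (list of lists) of equal-length
    weight vectors. Assumptions for this sketch: Euclidean distance,
    4-neighbourhood, and no wrap-around at the map borders.
    """
    rows, cols = len(weights), len(weights[0])
    umat = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            dists = []
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:
                    dists.append(math.dist(weights[i][j], weights[ni][nj]))
            # A 1 x 1 map has no neighbours; report 0.0 in that case.
            umat[i][j] = sum(dists) / len(dists) if dists else 0.0
    return umat


# Hypothetical 2 x 3 map: two similar columns and one distant column.
toy = [
    [(0.0, 0.0), (0.0, 1.0), (0.0, 10.0)],
    [(0.0, 0.0), (0.0, 1.0), (0.0, 10.0)],
]
umat = u_matrix(toy)
```

In this toy map the left column gets the lowest values (a basin) and the values rise toward the distant right column. Low regions of this kind are what a basin-finding step such as the flooding algorithm named in the description would delineate as clusters.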