Data integration by fuzzy similarity-based hierarchical clustering

Bibliographic Details
Title: Data integration by fuzzy similarity-based hierarchical clustering
Authors: Antonino Staiano, Davide Nardone, Angelo Ciaramella
Contributors: Ciaramella, A., Nardone, D., Staiano, A.
Source: BMC Bioinformatics
BMC Bioinformatics, Vol 21, Iss S10, Pp 1-15 (2020)
Publication Year: 2020
Subject Terms: Data Analysis, Computer science, Fuzzy similarity, Data integration, Fuzzy aggregation, Hierarchical clustering, Multi-omics data, Algorithms, Cluster Analysis, Databases, Genetic, Humans, Neoplasms, Workflow, Fuzzy Logic, lcsh:Computer applications to medicine. Medical informatics, Biochemistry, 03 medical and health sciences, 0302 clinical medicine, Structural Biology, lcsh:QH301-705.5, Molecular Biology, Throughput (business), 030304 developmental biology, 0303 health sciences, Transitive relation, Measure (data warehouse), Applied Mathematics, Research, Dendrogram, Computer Science Applications, lcsh:Biology (General), 030220 oncology & carcinogenesis, Metric (mathematics), lcsh:R858-859.7, Data mining
Description: Background: High-throughput methods in biological and biomedical fields acquire a large number of molecular parameters, or omics data, in a single experiment. Combining these omics data can significantly increase the capability to recover fine-tuned structures and to reduce the effects of experimental and biological noise in the data. Results: In this work we propose a multi-view integration methodology (named FH-Clust) for identifying patient subgroups from different omics information (e.g., gene expression, miRNA expression, methylation). In particular, hierarchical structures of patient data are obtained in each omic (or view), and their topologies are finally merged by a consensus matrix. One of the main aspects of this methodology is the use of a measure of dissimilarity between sets of observations, based on an appropriate metric. For each view, a dendrogram is obtained by hierarchical clustering based on a fuzzy equivalence relation with Łukasiewicz-valued fuzzy similarity. Finally, a consensus matrix, which is representative of the information in all dendrograms, is formed by combining the multiple hierarchical agglomerations through an approach based on transitive consensus matrix construction. Several experiments and comparisons are made on real data (e.g., glioblastoma, prostate cancer) to assess the proposed approach. Conclusions: Fuzzy logic allows us to introduce more flexible data agglomeration techniques. From the analysis of the scientific literature, it appears to be the first time that a model based on fuzzy logic is used for the agglomeration of multi-omic data. The results suggest that FH-Clust provides better prognostic value and clinical significance than the analysis of single-omic data alone, and that it is very competitive with other techniques from the literature.
ISSN: 1471-2105
Open Access: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d2be2da6d4e997d058eaefa063848f80
https://pubmed.ncbi.nlm.nih.gov/32838739
Rights: OPEN
Accession Number: edsair.doi.dedup.....d2be2da6d4e997d058eaefa063848f80
Database: OpenAIRE
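
Illustrative note: the Description field mentions, for each view, a hierarchical clustering built on a fuzzy equivalence relation with Łukasiewicz-valued fuzzy similarity. The following is a minimal single-view sketch of that general idea, not the authors' FH-Clust implementation. It assumes features scaled to [0, 1], takes the mean Łukasiewicz biresiduum 1 - |a - b| as the pairwise similarity, and uses a max-min transitive closure to obtain a fuzzy equivalence relation whose alpha-cuts form nested partitions (the levels of a dendrogram). The function names, the toy data, and the choice of max-min closure are assumptions made here for illustration; the consensus step across views is not covered.

```python
import numpy as np

def lukasiewicz_similarity(X):
    """Pairwise similarity: mean over features of 1 - |x_i - y_i| (X scaled to [0, 1])."""
    X = np.asarray(X, dtype=float)
    diff = np.abs(X[:, None, :] - X[None, :, :])  # shape (n, n, features)
    return 1.0 - diff.mean(axis=2)

def max_min_closure(S):
    """Max-min transitive closure of a reflexive, symmetric fuzzy relation."""
    R = np.asarray(S, dtype=float).copy()
    while True:
        # (R o R)[i, j] = max over k of min(R[i, k], R[k, j])
        comp = np.minimum(R[:, :, None], R[None, :, :]).max(axis=1)
        new = np.maximum(R, comp)
        if np.array_equal(new, R):
            return R
        R = new

def alpha_cut_labels(R, alpha):
    """Label each object by the first index j with R[i, j] >= alpha.

    For a fuzzy equivalence relation every alpha-cut is a crisp
    equivalence relation, so these labels identify its classes.
    """
    return np.argmax(R >= alpha, axis=1)

# Hypothetical toy data: four "patients", two features in [0, 1].
X = [[0.0, 0.0], [0.1, 0.0], [0.9, 1.0], [1.0, 1.0]]
S = lukasiewicz_similarity(X)
R = max_min_closure(S)
for alpha in (0.05, 0.5):
    print(alpha, alpha_cut_labels(R, alpha))
```

Lowering alpha merges clusters and raising it splits them, so sweeping alpha over the distinct values in R traces out a hierarchy for one view.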