Structural variation calling and genotyping by moment-based deep convolutional neural networks

Bibliographic details
Title: Structural variation calling and genotyping by moment-based deep convolutional neural networks
Authors: Timothy James Becker, Dong-Guk Shin
Source: International Journal of Data Mining and Bioinformatics. 25:37
Publication info: Inderscience Publishers, 2021.
Publication year: 2021
Subject terms: Computer science; Sample (statistics); Feature selection; Library and Information Sciences; Machine learning; External Data Representation; Convolutional neural network; General Biochemistry, Genetics and Molecular Biology; Moment (mathematics); Structural variation; Fraction (mathematics); Artificial intelligence; Merge (linguistics); Information Systems
Description: Structural variation (SV) calling and genotyping remain an ongoing challenge when using next-generation sequencing technologies. The gold-standard approach for genome consortia has been to utilise multiple SV calling algorithms and then merge the results based on SV type and coordinates, and, more recently, to make use of multiple sequencing technologies for each sample cell line. This ensemble strategy provides more comprehensive SV calling but comes at the cost of high compute run time. We make use of popular open-source machine learning libraries to formulate a new data representation suitable for mining whole genome sequences in a fraction of the ensemble time. We then compare the results to several well-established methods and ensembles. Our pure machine learning method demonstrates a new direction in technique, where feature selection and region filtering are no longer required to achieve desirable false positive rates.
ISSN: 1748-5681
1748-5673
Open access: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::401da7910cf97dc5a46c3525e0c66c8d
https://doi.org/10.1504/ijdmb.2021.116880
Accession number: edsair.doi.dedup.....401da7910cf97dc5a46c3525e0c66c8d
Database: OpenAIRE