Academic Journal

Procrustes is a machine-learning approach that removes cross-platform batch effects from clinical RNA sequencing data

Bibliographic Details
Title: Procrustes is a machine-learning approach that removes cross-platform batch effects from clinical RNA sequencing data
Authors: Nikita Kotlov, Kirill Shaposhnikov, Cagdas Tazearslan, Madison Chasse, Artur Baisangurov, Svetlana Podsvirova, Dawn Fernandez, Mary Abdou, Leznath Kaneunyenye, Kelley Morgan, Ilya Cheremushkin, Pavel Zemskiy, Maxim Chelushkin, Maria Sorokina, Ekaterina Belova, Svetlana Khorkova, Yaroslav Lozinsky, Katerina Nuzhdina, Elena Vasileva, Dmitry Kravchenko, Kushal Suryamohan, Krystle Nomie, John Curran, Nathan Fowler, Alexander Bagaev
Source: Communications Biology, Vol 7, Iss 1, Pp 1-14 (2024)
Publication Information: Nature Portfolio, 2024.
Publication Year: 2024
Collection: LCC:Biology (General)
Subject Terms: Biology (General), QH301-705.5
Description: Abstract: With the increased use of gene expression profiling for personalized oncology, optimized RNA sequencing (RNA-seq) protocols and algorithms are necessary to provide comparable expression measurements between exome capture (EC)-based and poly-A RNA-seq. Here, we developed and optimized an EC-based protocol for processing formalin-fixed, paraffin-embedded samples and a machine-learning algorithm, Procrustes, to overcome batch effects across RNA-seq data obtained using different sample preparation protocols like EC-based or poly-A RNA-seq protocols. Applying Procrustes to samples processed using EC and poly-A RNA-seq protocols showed the expression of 61% of genes (N = 20,062) to correlate across both protocols (concordance correlation coefficient > 0.8, versus 26% before transformation by Procrustes), including 84% of cancer-specific and cancer microenvironment-related genes (versus 36% before applying Procrustes; N = 1,438). Benchmarking analyses also showed Procrustes to outperform other batch correction methods. Finally, we showed that Procrustes can project RNA-seq data for a single sample to a larger cohort of RNA-seq data. Future application of Procrustes will enable direct gene expression analysis for single tumor samples to support gene expression-based treatment decisions.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2399-3642
Relation: https://doaj.org/toc/2399-3642
DOI: 10.1038/s42003-024-06020-z
Open Access: https://doaj.org/article/1cf786b5cf964d7491ad4bc80679a979
Accession Number: edsdoj.1cf786b5cf964d7491ad4bc80679a979
Database: Directory of Open Access Journals
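Note on the agreement metric: the abstract reports per-gene agreement between protocols as a concordance correlation coefficient (CCC) above 0.8. The sketch below shows one common definition of that statistic (Lin's CCC, using population variances). It is an illustration only: the function name and the toy values are assumptions, and the record does not state which CCC estimator or implementation the authors used.

```python
from statistics import fmean


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for two paired samples.

    Illustrative helper (hypothetical name, not taken from the article).
    Uses population (1/n) variance and covariance.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")

    mean_x, mean_y = fmean(x), fmean(y)
    var_x = fmean([(a - mean_x) ** 2 for a in x])
    var_y = fmean([(b - mean_y) ** 2 for b in y])
    cov_xy = fmean([(a - mean_x) * (b - mean_y) for a, b in zip(x, y)])

    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0:
        raise ValueError("CCC is undefined when both samples are the same constant")

    # Unlike Pearson's r, the squared mean difference in the denominator
    # penalizes a systematic shift between the two measurements.
    return 2 * cov_xy / denominator


# Toy values (made up): the second vector is the first shifted by +1.
# Pearson's r would be 1 here, but the CCC is lower because of the offset.
shifted = concordance_correlation([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
identical = concordance_correlation([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

Because the CCC penalizes offsets and scale differences as well as scatter, it is a stricter criterion than plain correlation for deciding whether a gene's expression values are directly comparable across two protocols.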