Academic Journal

Small patient datasets reveal genetic drivers of non-small cell lung cancer subtypes using machine learning for hypothesis generation

Bibliographic Details
Title: Small patient datasets reveal genetic drivers of non-small cell lung cancer subtypes using machine learning for hypothesis generation
Authors: Moses Cook, Bessi Qorri, Amruth Baskar, Jalal Ziauddin, Luca Pani, Shashibushan Yenkanchi, Joseph Geraci
Source: Exploration of Medicine, Vol 4, Iss 4, Pp 428-440 (2023)
Publisher Information: Open Exploration Publishing Inc., 2023.
Publication Year: 2023
Collection: LCC:Other systems of medicine
Subject Terms: artificial intelligence, small datasets, genetic subtypes, disease heterogeneity, squamous cell carcinoma, adenocarcinoma, Other systems of medicine, RZ201-999
Description: Aim: Many small datasets of significant value exist in the medical space that are being underutilized. Due to the heterogeneity of complex disorders found in oncology, systems capable of discovering patient subpopulations while elucidating etiologies are of great value as they can indicate leads for innovative drug discovery and development. Methods: Two small non-small cell lung cancer (NSCLC) datasets (GSE18842 and GSE10245) consisting of 58 samples of adenocarcinoma (ADC) and 45 samples of squamous cell carcinoma (SCC) were used in a machine intelligence framework to identify genetic biomarkers differentiating these two subtypes. Utilizing a set of standard machine learning (ML) methods, subpopulations of ADC and SCC were uncovered while simultaneously extracting which genes, in combination, were significantly involved in defining the subpopulations. A previously described interactive hypothesis-generating method designed to work with ML methods was employed to provide an alternative way of extracting the most important combination of variables to construct a new dataset. Results: Several genes were uncovered that were previously implicated by other methods. This framework accurately discovered known subpopulations, such as genetic drivers associated with differing levels of aggressiveness within the SCC and ADC subtypes. Furthermore, phosphatidylinositol glycan anchor biosynthesis, class X (PIGX) was a novel gene implicated in this study that warrants further investigation due to its role in breast cancer proliferation. Conclusions: The ability to learn from small datasets was highlighted and revealed well-established properties of NSCLC. This showcases the utility of ML techniques to reveal potential genes of interest, even from small datasets, shedding light on novel driving factors behind subpopulations of patients.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2692-3106
Relation: https://www.explorationpub.com/Journals/em/Article/1001153; https://doaj.org/toc/2692-3106
DOI: 10.37349/emed.2023.00153
Open Access: https://doaj.org/article/dce635580c684506a07fbd6b5964fc05
Accession Number: edsdoj.635580c684506a07fbd6b5964fc05
Database: Directory of Open Access Journals