Academic Journal

Can we infer tumor presence of single cell transcriptomes and their tumor of origin from bulk transcriptomes by machine learning?

Bibliographic Details
Title: Can we infer tumor presence of single cell transcriptomes and their tumor of origin from bulk transcriptomes by machine learning?
Authors: Hua-Ping Liu, Dongwen Wang, Hung-Ming Lai
Source: Computational and Structural Biotechnology Journal, Vol 20, Iss , Pp 2672-2679 (2022)
Publisher Information: Elsevier, 2022.
Publication Year: 2022
Collection: LCC:Biotechnology
Subject Terms: Single cell transcriptomes, Circulating tumor cells, RNA-seq, Translational bioinformatics, Digit medicine, Biotechnology, TP248.13-248.65
Description: There is a growing need to build a model that uses single cell RNA-seq (scRNA-seq) to separate malignant cells from nonmalignant cells and to identify tumor of origin of single cells and/or circulating tumor cells (CTCs). Currently, it is infeasible to build a tumor of origin model learnt from scRNA-seq by machine learning (ML). We then wondered if an ML model learnt from bulk transcriptomes is applicable to scRNA-seq to infer single cells’ tumor presence and further indicate their tumor of origin. We used k-nearest neighbors, one-versus-all support vector machine, one-versus-one support vector machine, random forest and introduced scTumorTrace to conduct a pioneering experiment containing leukocytes and seven major cancer types where bulk RNA-seq and scRNA-seq data were available. 13 ML models learnt from bulk RNA-seq were all reliable to use (F-score > 96%) shown by a validation set of bulk transcriptomes, but none of them was applicable to scRNA-seq except scTumorTrace. Making inferences from bulk RNA-seq to scRNA-seq was impaired by feature selection and improved by log2-transformed TPM units. scTumorTrace with transcriptome-wide 2-tuples showed F-score beyond 98.74 and 94.29% in inferring tumor presence and tumor of origin at single-cell resolution and correctly identified 45 single candidate prostate CTCs but lineage-confirmed non-CTCs as leukocytes. We concluded that modern ML techniques are quantitative and could hardly address the raised questions. scTumorTrace with transcriptome-wide 2-tuples is qualitative, standardization-free and not subject to log2-transformed quantities, enabling us to infer tumor presence of single cell transcriptomes and their tumor of origin from bulk transcriptomes.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2001-0370
Relation: http://www.sciencedirect.com/science/article/pii/S200103702200191X; https://doaj.org/toc/2001-0370
DOI: 10.1016/j.csbj.2022.05.035
Open Access: https://doaj.org/article/e9d0dca5838f4c37bf87d4985ec8c43c
Accession Number: edsdoj.9d0dca5838f4c37bf87d4985ec8c43c
Database: Directory of Open Access Journals