Machine Learning Analysis of Naïve B-Cell Receptor Repertoires Stratifies Celiac Disease Patients and Controls

Bibliographic details
Title: Machine Learning Analysis of Naïve B-Cell Receptor Repertoires Stratifies Celiac Disease Patients and Controls
Authors: Or Shemesh, Pazit Polak, Knut E. A. Lundin, Ludvig M. Sollid, Gur Yaari
Source: Frontiers in Immunology, Vol 12 (2021)
Publisher: Frontiers Media SA, 2021.
Publication year: 2021
Subject terms: lcsh:Immunologic diseases. Allergy, Tissue transglutaminase, Genes, Immunoglobulin Heavy Chain, Immunology, Naive B cell, Receptors, Antigen, B-Cell, Human leukocyte antigen, Adaptive Immunity, Immunoglobulin light chain, immune response, Epitope, Machine Learning, 03 medical and health sciences, 0302 clinical medicine, Immune system, Databases, Genetic, Cluster Analysis, Data Mining, Humans, Immunology and Allergy, Original Research, 030304 developmental biology, B-Lymphocytes, 0303 health sciences, biology, breakpoint cluster region, naïve B-cells, 3. Good health, Celiac Disease, Case-Control Studies, biology.protein, Genes, Immunoglobulin Light Chain, BCR repertoire, Antibody, lcsh:RC581-607, 030215 immunology
Description: Celiac disease (CeD) is a common autoimmune disorder caused by an abnormal immune response to dietary gluten proteins. The disease has high heritability. HLA is the major susceptibility factor, and the HLA effect is mediated via presentation of deamidated gluten peptides by disease-associated HLA-DQ variants to CD4+ T cells. In addition to gluten-specific CD4+ T cells the patients have antibodies to transglutaminase 2 (autoantigen) and deamidated gluten peptides. These disease-specific antibodies recognize defined epitopes and they display common usage of specific heavy and light chains across patients. Interactions between T cells and B cells are likely central in the pathogenesis, but how the repertoires of naïve T and B cells relate to the pathogenic effector cells is unexplored. To this end, we applied machine learning classification models to naïve B cell receptor (BCR) repertoires from CeD patients and healthy controls. Strikingly, we obtained a promising classification performance with an F1 score of 85%. Clusters of heavy and light chain sequences were inferred and used as features for the model, and signatures associated with the disease were then characterized. These signatures included amino acid (AA) 3-mers with distinct bio-physiochemical characteristics and enriched V and J genes. We found that CeD-associated clusters can be identified and that common motifs can be characterized from naïve BCR repertoires. The results may indicate a genetic influence by BCR encoding genes in CeD. Analysis of naïve BCRs as presented here may become an important part of assessing the risk of individuals to develop CeD. Our model demonstrates the potential of using BCR repertoires and in particular, naïve BCR repertoires, as disease susceptibility markers.
ISSN: 1664-3224
Open access: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::ed601ea0f57508d7e7a4ca9f778fc915
https://doi.org/10.3389/fimmu.2021.627813
Rights: OPEN
Accession number: edsair.doi.dedup.....ed601ea0f57508d7e7a4ca9f778fc915
Database: OpenAIRE