DISCo-microbe: Design of an identifiable synthetic community of microbes

Bibliographic details
Title: DISCo-microbe: Design of an identifiable synthetic community of microbes
Authors: Dana L. Carper, Alyssa A. Carrell, David J. Weston, Dale A. Pelletier, Travis J. Lawrence
Source: PeerJ
PeerJ, Vol 8, p e8534 (2020)
Publication info: PeerJ, 2019.
Publication year: 2019
Subject terms: Bioinformatics, Computer science, Taxonomic profiling, lcsh:Medicine, Sequence alignment, Computational biology, Microbiology, General Biochemistry, Genetics and Molecular Biology, 03 medical and health sciences, 0302 clinical medicine, Microbiome, 16S rRNA, Illumina dye sequencing, 030304 developmental biology, computer.programming_language, 0303 health sciences, General Neuroscience, lcsh:R, Constructed community, General Medicine, Python (programming language), Synthetic community, Workflow, Amplicon sequencing, 16s rrna gene sequencing, Community member, General Agricultural and Biological Sciences, computer, 030217 neurology & neurosurgery, In vivo experimentation
Description:
Background: Microbiomes are extremely important for their host organisms, providing many vital functions and extending their hosts' phenotypes. Natural studies of host-associated microbiomes can be difficult to interpret due to the high complexity of microbial communities, which hinders our ability to track and identify individual members along with the many factors that structure or perturb those communities. For this reason, researchers have turned to synthetic or constructed communities in which the identities of all members are known. However, due to the lack of tracking methods and the difficulty of creating a more diverse and identifiable community that can be distinguished through next-generation sequencing, most such in vivo studies have used only a few strains.
Results: To address this issue, we developed DISCo-microbe, a program for the design of an identifiable synthetic community of microbes for use in in vivo experimentation. The program is composed of two modules: (1) create, which allows the user to generate a highly diverse community list from an input DNA sequence alignment using a custom nucleotide distance algorithm, and (2) subsample, which subsamples the community list either to represent a number of grouping variables, including taxonomic proportions, or to reach a user-specified maximum number of community members. As an example, we demonstrate the generation of a synthetic microbial community that can be distinguished through amplicon sequencing. The synthetic microbial community in this example consisted of 2,122 members from a starting DNA sequence alignment of 10,000 16S rRNA sequences from the Ribosomal Database Project. We generated simulated Illumina sequencing data from the constructed community and demonstrate that DISCo-microbe is capable of designing diverse communities with members distinguishable by amplicon sequencing. Using the simulated data, we were able to recover sequences from 97–100% of community members using two different post-processing workflows. Furthermore, 97–99% of sequences were assigned to a community member, with zero sequences being misidentified. We then subsampled the community list using taxonomic proportions to mimic a natural plant host-associated microbiome, ultimately yielding a diverse community of 784 members.
Conclusions: DISCo-microbe can create a highly diverse community list of microbes that can be distinguished through 16S rRNA gene sequencing, and it can subsample (i.e., design) the community for the desired number of members and taxonomic proportions. Although developed for bacteria, the program accepts an alignment from any taxonomic group, making it broadly applicable. The software and data are freely available from GitHub (https://github.com/dlcarper/DISCo-microbe) and the Python Package Index (PyPI).
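Illustrative note: the two-module idea in the description (build a list of mutually distinguishable sequences, then subsample it by group proportions) can be sketched roughly as follows. This is only a minimal sketch under stated assumptions and is not DISCo-microbe's actual code, command-line interface, or distance algorithm. The function names, the greedy selection strategy, the choice to skip gap columns, and the toy sequences are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def aligned_distance(a: str, b: str) -> int:
    """Count aligned columns where two sequences differ.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    The real tool's "custom nucleotide distance" may behave differently.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y and x != "-" and y != "-")


def create_community(alignment: Dict[str, str], min_dist: int) -> List[str]:
    """Greedily keep a sequence only if it differs from every
    already-kept member by at least min_dist positions."""
    kept: List[str] = []
    for name, seq in alignment.items():
        if all(aligned_distance(seq, alignment[k]) >= min_dist for k in kept):
            kept.append(name)
    return kept


def subsample(members: List[str], groups: Dict[str, str],
              proportions: Dict[str, float], size: int) -> List[str]:
    """Pick up to int(size * proportion) members per group, in list order."""
    by_group = defaultdict(list)
    for member in members:
        by_group[groups[member]].append(member)
    chosen: List[str] = []
    for group, share in proportions.items():
        chosen.extend(by_group[group][: int(size * share)])
    return chosen


# Toy alignment (hypothetical data, not from the Ribosomal Database Project).
toy_alignment = {
    "s1": "ACGTACGT",
    "s2": "ACGTACGA",
    "s3": "TTGTACCA",
    "s4": "AC-TAGGT",
    "s5": "GGGTTTGT",
}
toy_groups = {"s1": "Proteobacteria", "s3": "Firmicutes", "s5": "Proteobacteria"}

community = create_community(toy_alignment, min_dist=3)
designed = subsample(community, toy_groups,
                     {"Proteobacteria": 0.5, "Firmicutes": 0.5}, size=2)
print(community, designed)
```

A greedy pass is order-dependent and does not guarantee the largest possible community; it is used here only because it is the simplest way to show the "every member distinguishable from every other" constraint.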
Language: English
DOI: 10.7287/peerj.preprints.27898v1
Open access: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::26bcd31fbdfbf4ed8ad451e1250b4e80
Rights: OPEN
Accession number: edsair.doi.dedup.....26bcd31fbdfbf4ed8ad451e1250b4e80
Database: OpenAIRE