Academic Journal

A New Data Mining Approach for the Detection of Bacterial Promoters Combining Stochastic and Combinatorial Methods.

Bibliographic Details
Title: A New Data Mining Approach for the Detection of Bacterial Promoters Combining Stochastic and Combinatorial Methods.
Authors: Catherine Eng, Charu Asthana, Bertrand Aigle, Sébastien Hergalant, Jean-François Mari, Pierre Leblond
Source: Journal of Computational Biology. Sep 2009, Vol. 16 Issue 9, p1211-1225. 15p.
Subject Terms: *PROMOTERS (Genetics), *BACTERIAL genomes, *BINDING sites, *DATA mining, *TRANSCRIPTION factors, *STREPTOMYCES coelicolor, *NUCLEOTIDE sequence, *BACILLUS subtilis
Abstract: We present a new data mining method based on stochastic analysis (Hidden Markov Model [HMM]) and combinatorial methods for discovering new transcriptional factors in bacterial genome sequences. Sigma factor binding sites (SFBSs) were described as patterns of box1–spacer–box2 corresponding to the −35 and −10 DNA motifs of bacterial promoters. We used a high-order HMM in which the hidden process is a second-order Markov chain. Applied on the genome of the model bacterium Streptomyces coelicolor A3(2), the a posteriori state probabilities revealed local maxima or peaks whose distribution was enriched in the intergenic sequences (“iPeaks” for intergenic peaks). Short DNA sequences underlying the iPeaks were extracted and clustered by a hierarchical classification algorithm based on the Smith-Waterman local similarity. Some selected motif consensuses were used as box1 (−35 motif) in the search of a potential neighbouring box2 (−10 motif) using a word enumeration algorithm. This new SFBS mining methodology applied on Streptomyces coelicolor was successful to retrieve already known SFBSs and to suggest new potential transcriptional factor binding sites (TFBSs). The well-defined SigR regulon (oxidative stress response) was also used as a test quorum to compare first- and second-order HMM. Our approach also allowed the preliminary detection of known SFBSs in Bacillus subtilis. [ABSTRACT FROM AUTHOR]
Database: Academic Search Index
Description
ISSN: 1066-5277
DOI: 10.1089/cmb.2008.0122