Bibliographic Details
Title: |
A New Data Mining Approach for the Detection of Bacterial Promoters Combining Stochastic and Combinatorial Methods. |
Authors: |
Catherine Eng, Charu Asthana, Bertrand Aigle, Sébastien Hergalant, Jean-François Mari, Pierre Leblond |
Source: |
Journal of Computational Biology. Sep2009, Vol. 16 Issue 9, p1211-1225. 15p. |
Subject Terms: |
*PROMOTERS (Genetics), *BACTERIAL genomes, *BINDING sites, *DATA mining, *TRANSCRIPTION factors, *STREPTOMYCES coelicolor, *NUCLEOTIDE sequence, *BACILLUS subtilis |
Abstract: |
We present a new data mining method based on stochastic analysis (Hidden Markov Model [HMM]) and combinatorial methods for discovering new transcriptional factors in bacterial genome sequences. Sigma factor binding sites (SFBSs) were described as patterns of box1–spacer–box2 corresponding to the −35 and −10 DNA motifs of bacterial promoters. We used a high-order HMM in which the hidden process is a second-order HMM chain. Applied on the genome of the model bacterium Streptomyces coelicolor A3(2), the a posteriori state probabilities revealed local maxima or peaks whose distribution was enriched in the intergenic sequences (“iPeaks” for intergenic peaks). Short DNA sequences underlying the iPeaks were extracted and clustered by a hierarchical classification algorithm based on the Smith-Waterman local similarity. Some selected motif consensuses were used as box1 (−35 motif) in the search of a potential neighbouring box2 (−10 motif) using a word enumeration algorithm. This new SFBS mining methodology applied on Streptomyces coelicolor was successful to retrieve already known SFBSs and to suggest new potential transcriptional factor binding sites (TFBSs). The well-defined SigR regulon (oxidative stress response) was also used as a test quorum to compare first- and second-order HMM. Our approach also allowed the preliminary detection of known SFBSs in Bacillus subtilis. [ABSTRACT FROM AUTHOR] |
Database: |
Academic Search Index |