Journal Article

A pattern-based nearest neighbor search approach for promoter prediction using DNA structural profiles

Yanglan Gan, Jihong Guan and Shuigeng Zhou

in Bioinformatics

Volume 25, issue 16, pages 2006-2012
Published in print August 2009 | ISSN: 1367-4803
Published online June 2009 | e-ISSN: 1460-2059 | DOI: http://dx.doi.org/10.1093/bioinformatics/btp359
A pattern-based nearest neighbor search approach for promoter prediction using DNA structural profiles

Show Summary Details

Preview

Motivation: Identification of core promoters is a key clue in understanding gene regulations. However, due to the diverse nature of promoter sequences, the accuracy of existing prediction approaches for non-CpG island (simply CGI)-related promoters is not as high as that for CGI-related promoters. This consequently leads to a low genome-wide promoter prediction accuracy.

Results: In this article, we first systematically analyze the similarities and differences between the two types of promoters (CGI- and non-CGI-related) from a novel structural perspective, and then devise a unified framework, called PNNP (Pattern-based Nearest Neighbor search for Promoter), to predict both CGI- and non-CGI-related promoters based on their structural features. Our comparative analysis on the structural characteristics of promoters reveals two interesting facts: (i) the structural values of CGI- and non-CGI-related promoters are quite different, but they exhibit nearly similar structural patterns; (ii) the structural patterns of promoters are obviously different from that of non-promoter sequences though the sequences have almost similar structural values. Extensive experiments demonstrate that the proposed PNNP approach is effective in capturing the structural patterns of promoters, and can significantly improve genome-wide performance of promoters prediction, especially non-CGI-related promoters prediction.

Availability: The implementation of the program PNNP is available at http://admis.tongji.edu.cn/Projects/pnnp.aspx.

Contact: jhguan@tongji.edu.cn; sgzhou@fudan.edu.cn

Supplementary information: Supplementary data are available at Bioinformatics online.

Journal Article.  5516 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology

Full text: subscription required

How to subscribe Recommend to my Librarian

Users without a subscription are not able to see the full content. Please, subscribe or login to access all content.