Journal Article

SPICi: a fast clustering algorithm for large biological networks

Peng Jiang and Mona Singh

in Bioinformatics

Volume 26, issue 8, pages 1105-1111
Published in print April 2010 | ISSN: 1367-4803
Published online February 2010 | e-ISSN: 1460-2059 | DOI: http://dx.doi.org/10.1093/bioinformatics/btq078

More Like This

Show all results sharing this subject:

  • Bioinformatics and Computational Biology

GO

Show Summary Details

Preview

Motivation: Clustering algorithms play an important role in the analysis of biological networks, and can be used to uncover functional modules and obtain hints about cellular organization. While most available clustering algorithms work well on biological networks of moderate size, such as the yeast protein physical interaction network, they either fail or are too slow in practice for larger networks, such as functional networks for higher eukaryotes. Since an increasing number of larger biological networks are being determined, the limitations of current clustering approaches curtail the types of biological network analyses that can be performed.

Results: We present a fast local network clustering algorithm SPICi. SPICi runs in time O(V log V+E) and space O(E), where V and E are the number of vertices and edges in the network, respectively. We evaluate SPICi's performance on several existing protein interaction networks of varying size, and compare SPICi to nine previous approaches for clustering biological networks. We show that SPICi is typically several orders of magnitude faster than previous approaches and is the only one that can successfully cluster all test networks within very short time. We demonstrate that SPICi has state-of-the-art performance with respect to the quality of the clusters it uncovers, as judged by its ability to recapitulate protein complexes and functional modules. Finally, we demonstrate the power of our fast network clustering algorithm by applying SPICi across hundreds of large context-specific human networks, and identifying modules specific for single conditions.

Availability: Source code is available under the GNU Public License at http://compbio.cs.princeton.edu/spici

Contact: mona@cs.princeton.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

Journal Article.  5964 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology

Users without a subscription are not able to see the full content. Please, subscribe or login to access all content.