Journal Article

Predicting citation count of <i>Bioinformatics</i> papers within four years of publication

Alfonso Ibáñez, Pedro Larrañaga and Concha Bielza

in Bioinformatics

Volume 25, issue 24, pages 3303-3309
Published in print December 2009 | ISSN: 1367-4803
Published online October 2009 | e-ISSN: 1460-2059 | DOI:
Predicting citation count of Bioinformatics papers within four years of publication

More Like This

Show all results sharing this subject:

  • Bioinformatics and Computational Biology


Show Summary Details


Motivation: Nowadays, publishers of scientific journals face the tough task of selecting high-quality articles that will attract as many readers as possible from a pool of articles. This is due to the growth of scientific output and literature. The possibility of a journal having a tool capable of predicting the citation count of an article within the first few years after publication would pave the way for new assessment systems.

Results: This article presents a new approach based on building several prediction models for the Bioinformatics journal. These models predict the citation count of an article within 4 years after publication (global models). To build these models, tokens found in the abstracts of Bioinformatics papers have been used as predictive features, along with other features like the journal sections and 2-week post-publication periods. To improve the accuracy of the global models, specific models have been built for each Bioinformatics journal section (Data and Text Mining, Databases and Ontologies, Gene Expression, Genetics and Population Analysis, Genome Analysis, Phylogenetics, Sequence Analysis, Structural Bioinformatics and Systems Biology). In these new models, the average success rate for predictions using the naive Bayes and logistic regression supervised classification methods was 89.4% and 91.5%, respectively, within the nine sections and for 4-year time horizon.

Availability: Supplementary material on this experimental survey is available at


Journal Article.  6248 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology

Full text: subscription required

How to subscribe Recommend to my Librarian

Users without a subscription are not able to see the full content. Please, subscribe or login to access all content.