Journal Article

High-performance gene name normalization with GeNo

Joachim Wermter, Katrin Tomanek and Udo Hahn

in Bioinformatics

Volume 25, issue 6, pages 815-821
Published in print March 2009 | ISSN: 1367-4803
Published online February 2009 | e-ISSN: 1460-2059 | DOI:


Motivation: The recognition and normalization of textual mentions of gene and protein names are both particularly important and challenging. They are important because genes and proteins constitute the crucial conceptual entities in biomedicine. They remain challenging because of widespread gene name ambiguities within species, across species, with common English words and with medical sublanguage terms.

Results: We present GeNo, a highly competitive system for gene name normalization, which obtains an F-measure of 86.4% (precision: 87.8%, recall: 85.0%) on the BioCreAtIvE-II test set, putting it on a par with the best system on that task. Our system tackles the complex gene normalization problem by employing a carefully crafted suite of symbolic and statistical methods, and by relying entirely on publicly available software and data resources, including extensive background knowledge based on semantic profiling. A major goal of our work is to present GeNo's architecture clearly enough to allow full reproducibility of our results.
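
As a quick check on the reported figures, the F-measure can be recomputed from the stated precision and recall. The sketch below assumes the balanced F-measure (F1, the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_measure` and its `beta` parameter are illustrative and not part of GeNo.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1).

    Assumption: the abstract's "F-measure" is the balanced F1 score.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero; define the score as 0.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Precision and recall as reported in the abstract.
geno_f1 = f_measure(0.878, 0.850)
print(f"{geno_f1:.1%}")  # prints "86.4%"
```

Under this assumption the recomputed value agrees with the reported 86.4%.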

Availability: GeNo, including its underlying resources, will be available from It is also currently deployed in the Semedico search engine at


Journal Article.  6828 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology
