Journal Article

The annotation-enriched non-redundant patent sequence databases

Weizhong Li, Bartosz Kondratowicz, Hamish McWilliam, Stephane Nauche and Rodrigo Lopez

in Database

Volume 2013
Published online February 2013 | e-ISSN: 1758-0463 | DOI: https://dx.doi.org/10.1093/database/bat005


The EMBL-European Bioinformatics Institute (EMBL-EBI) offers public access to patent sequence data, providing a valuable service to the intellectual property and scientific communities. The non-redundant (NR) patent sequence databases comprise two-level nucleotide and protein sequence clusters (NRNL1, NRNL2, NRPL1 and NRPL2) based on sequence identity (level-1) and patent family (level-2). Annotation from the source entries in these databases is merged and enhanced with additional information from the patent literature and biological context. Corrections to patent publication numbers, kind-codes and patent equivalents significantly improve data quality. Data are available through various user interfaces, including a web browser, FTP downloads, SRS, Dbfetch and EBI-Search. Sequence similarity/homology searches against the databases are available using BLAST, FASTA and PSI-Search. In this article, we describe the data collection and annotation, and outline the major changes and improvements introduced since 2009. Apart from data growth, these changes include additional annotation for singleton clusters, identifier versioning for tracking changes to entries, and entry mappings between the two-level databases.

Database URL: http://www.ebi.ac.uk/patentdata/nr/
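
The two-level clustering mentioned in the abstract can be illustrated with a minimal sketch. It is not the EMBL-EBI pipeline: the record layout (accession, sequence, patent family), the function name `cluster_two_levels`, and the sample data are all hypothetical. It also assumes that level-1 groups entries with exactly identical sequences and that level-2 further groups them by patent family, which is one reading of "sequence identity (level-1) and patent family (level-2)".

```python
from collections import defaultdict


def cluster_two_levels(records):
    """Group records into two levels of clusters.

    records: iterable of (accession, sequence, patent_family) tuples.
    This tuple layout is an assumption made for illustration only.
    """
    level1 = defaultdict(list)  # sequence -> accessions
    level2 = defaultdict(list)  # (sequence, patent family) -> accessions
    for accession, sequence, family in records:
        key = sequence.upper()  # treat sequences case-insensitively
        level1[key].append(accession)
        level2[(key, family)].append(accession)
    return dict(level1), dict(level2)


# Hypothetical sample entries, not real patent sequence data.
records = [
    ("A1", "ACGT", "FAM1"),
    ("A2", "acgt", "FAM1"),
    ("A3", "ACGT", "FAM2"),
    ("A4", "GGCC", "FAM2"),
]
level1, level2 = cluster_two_levels(records)
```

With this sample, the three entries carrying the sequence `ACGT` share one level-1 cluster but fall into two level-2 clusters, because they belong to two different patent families. Keeping both dictionaries keyed on the same sequence also gives a simple way to map entries between the two levels.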

Journal Article.  2370 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology ; Ecology and Conservation ; Evolutionary Biology
