Term Extraction and Automatic Indexing

Christian Jacquemin and Didier Bourigault

in The Oxford Handbook of Computational Linguistics

Published in print January 2005 | ISBN: 9780199276349
Published online September 2012 | DOI: http://dx.doi.org/10.1093/oxfordhb/9780199276349.013.0033

Series: Oxford Handbooks in Linguistics

Preview

Terms are pervasive in scientific and technical documents, and identifying them is crucial for any application dealing with the analysis, understanding, generation, or translation of such documents. In particular, the ever-growing mass of specialized documentation available online, in industrial and governmental archives or in digital libraries, calls for advances in terminology processing for tasks such as information retrieval, cross-language querying, indexing of multimedia documents, translation aids, and document routing and summarization. This article presents a new domain of research and development in natural language processing (NLP) concerned with the representation, acquisition, and recognition of terms. It begins by presenting the basic notions surrounding the concept of ‘terms’, ranging from the classical view to more recent conceptions. Two main areas of research involve terminology in NLP: term acquisition and term recognition. Finally, the article presents recent advances and prospects in term acquisition and automatic indexing.

Keywords: terms; documentation; information retrieval; natural language processing; term acquisition; term recognition

Article. 5542 words.

Subjects: Linguistics; Computational Linguistics
