Information Extraction

Ralph Grishman

in The Oxford Handbook of Computational Linguistics

Published in print January 2005 | ISBN: 9780199276349
Published online September 2012

Series: Oxford Handbooks in Linguistics

Information extraction (IE) is the automatic identification of selected types of entities, relations, or events in free text. This article surveys two specific strands of IE: name identification and classification, and event extraction. Conventional treatments of language pay little attention to proper names, addresses, and similar expressions. Presentations of language analysis generally look words up in a dictionary and identify them as nouns and so on. Names are so pervasive in text, however, that linguistic analysis is difficult unless they are identified as linguistic units and classified by type. Name tagging involves creating several finite-state patterns, each corresponding to some subset of nouns. Elements of the patterns match specific tokens or classes of tokens with particular features. Event extraction typically works by creating a series of regular expressions customized to capture the relevant events. Each enhancement of these expressions is matched by a corresponding enhancement in the event patterns.
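
As a rough illustration of the two techniques summarized above, the following Python sketch maps each token to a one-letter class, runs finite-state patterns over the resulting class string to tag names, and then applies a regular expression over the tagged text to capture one event type. It is a minimal sketch, not the article's own implementation: the word lists, the class codes, the two name patterns, and the "hire" event pattern are all assumptions invented for this example.

```python
import re

# Hypothetical word lists; a real tagger would use much larger resources.
TITLES = {"Mr.", "Ms.", "Dr.", "Prof."}
COMPANY_SUFFIXES = {"Inc.", "Corp.", "Ltd."}


def token_class(token):
    """Map a token to a one-letter class code (an assumed feature set)."""
    if token in TITLES:
        return "T"  # personal title
    if token in COMPANY_SUFFIXES:
        return "S"  # company suffix
    if token[:1].isupper():
        return "C"  # other capitalized token
    return "o"      # anything else


# Finite-state patterns written over class codes, one per name type.
NAME_PATTERNS = [
    ("PERSON", re.compile(r"TC+")),        # title + capitalized tokens
    ("ORGANIZATION", re.compile(r"C+S")),  # capitalized tokens + suffix
]


def tag_names(tokens):
    """Return sorted (start, end, label) token spans for matched names."""
    classes = "".join(token_class(t) for t in tokens)
    spans = []
    for label, pattern in NAME_PATTERNS:
        for m in pattern.finditer(classes):
            spans.append((m.start(), m.end(), label))
    return sorted(spans)


# Event pattern over text in which names are replaced by typed placeholders.
HIRE_EVENT = re.compile(
    r"<PERSON:(\d+)> (?:joined|was hired by) <ORGANIZATION:(\d+)>"
)


def extract_hire_events(tokens):
    """Find hypothetical 'hire' events relating a person to an organization."""
    pieces, names, i = [], [], 0
    for start, end, label in tag_names(tokens):
        if start < i:
            continue  # skip a span that overlaps one already accepted
        pieces.extend(tokens[i:start])
        pieces.append(f"<{label}:{len(names)}>")
        names.append(" ".join(tokens[start:end]))
        i = end
    pieces.extend(tokens[i:])
    text = " ".join(pieces)
    return [
        {"person": names[int(m.group(1))],
         "organization": names[int(m.group(2))]}
        for m in HIRE_EVENT.finditer(text)
    ]


tokens = "Dr. Jane Smith joined Acme Widgets Inc. yesterday".split()
print(tag_names(tokens))
print(extract_hire_events(tokens))
```

Tagging names first lets the event pattern refer to name types rather than to particular strings, which is why unidentified names make later analysis difficult. The sketch treats every capitalized token as a possible name part, so a capitalized sentence-initial word could be absorbed into a name; practical systems need richer features to avoid this.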

Keywords: automatic; name; event; linguistic analysis; tagging; patterns

Article. 5267 words.

Subjects: Linguistics; Computational Linguistics
