Statistical Methods

Christer Samuelsson

in The Oxford Handbook of Computational Linguistics

Published in print January 2005 | ISBN: 9780199276349
Published online September 2012

Series: Oxford Handbooks in Linguistics


Statistical methods now belong to mainstream natural language processing. They have been successfully applied to virtually all tasks within language processing and neighbouring fields, including part-of-speech tagging, syntactic parsing, semantic interpretation, lexical acquisition, machine translation, information retrieval, information extraction, and language learning. This article reviews mathematical statistics and applies it to language modelling problems, leading up to the hidden Markov model and the maximum-entropy model. The real strength of maximum-entropy modelling lies in combining evidence from several rules, each of which might be inconclusive on its own, but which taken together can dramatically affect the probability. It allows heterogeneous information sources to be combined into a uniform probabilistic model in which each piece of information is formulated as a feature. The key ideas of mathematical statistics are simple and intuitive, but tend to be buried in a sea of mathematical technicalities. Finally, the article provides mathematical detail related to the topics discussed.
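
The idea of combining individually weak pieces of evidence as features can be illustrated with a minimal sketch of a conditional maximum-entropy (log-linear) model. This is not taken from the article: the feature functions, the tag set, and the weights below are hypothetical and chosen by hand, whereas in practice the weights would be estimated from data. The sketch only shows how two features that each favour the same tag yield a sharper distribution together than either gives alone.

```python
import math

def maxent_probs(features, weights, x, labels):
    """Return p(y | x) proportional to exp(sum_i w_i * f_i(x, y))."""
    scores = {
        y: sum(w * f(x, y) for f, w in zip(features, weights))
        for y in labels
    }
    # Normalising constant Z(x): sum of exponentiated scores over all labels.
    z = sum(math.exp(s) for s in scores.values())
    return {y: math.exp(s) / z for y, s in scores.items()}

# Hypothetical binary features for choosing between NOUN and VERB.
def f_suffix(x, y):
    # Fires when the word ends in "s" and the candidate tag is NOUN.
    return 1.0 if x["word"].endswith("s") and y == "NOUN" else 0.0

def f_prev_det(x, y):
    # Fires when the previous tag is a determiner and the candidate tag is NOUN.
    return 1.0 if x["prev_tag"] == "DET" and y == "NOUN" else 0.0

LABELS = ["NOUN", "VERB"]
FEATURES = [f_suffix, f_prev_det]
WEIGHTS = [0.8, 1.5]  # illustrative values, not estimated from any corpus

x = {"word": "runs", "prev_tag": "DET"}

# Distribution using only the suffix feature, then using both features.
p_suffix_only = maxent_probs([f_suffix], [0.8], x, LABELS)
p_both = maxent_probs(FEATURES, WEIGHTS, x, LABELS)

print(p_suffix_only)
print(p_both)
```

With these assumed weights, the probability of NOUN is higher when both features fire than when only the suffix feature is used, because the feature weights add in the exponent before normalisation.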

Keywords: statistical methods; natural language processing; language modelling; hidden Markov model; maximum entropy modelling; probabilistic model; mathematical statistics

Article. 5509 words.

Subjects: Linguistics; Computational Linguistics
