Text-to-Speech Synthesis

Thierry Dutoit and Yannis Stylianou

in The Oxford Handbook of Computational Linguistics

Published in print January 2005 | ISBN: 9780199276349
Published online September 2012 | DOI:

Series: Oxford Handbooks in Linguistics

This article introduces state-of-the-art text-to-speech (TTS) synthesis, the art of designing talking machines, and presents both the natural language processing and the digital signal processing problems involved. It begins with a brief user-oriented description of a general TTS system and comments on its commercial applications. It then gives a functional diagram of a modern TTS system, highlighting its components, and describes the morphosyntactic module. It goes on to examine why sentence-level phonetization cannot be achieved by a sequence of dictionary look-ups, and describes possible implementations of the phonetizer. The article then turns to prosody generation, outlining how intonation and duration can be approximately computed from text; prosody refers to properties of the speech signal related to audible changes in pitch, loudness, and syllable length. Finally, it introduces the two main categories of techniques for waveform generation: synthesis by rule and concatenative synthesis.
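The point about dictionary look-ups can be illustrated with a heteronym such as "record", whose pronunciation depends on its part of speech. The following is a minimal sketch, not the article's method: the toy lexicon, the ARPAbet-style transcriptions, and the one-word-of-context rule in `guess_pos` are all assumptions made for illustration, standing in for a real morphosyntactic module.

```python
# Hypothetical toy lexicon: word -> list of (part of speech, phonemes).
# Transcriptions are ARPAbet-style and purely illustrative.
LEXICON = {
    "they": [("PRON", "DH EY1")],
    "will": [("AUX", "W IH1 L")],
    "the": [("DET", "DH AH0")],
    "record": [("NOUN", "R EH1 K ER0 D"), ("VERB", "R IH0 K AO1 R D")],
}


def naive_phonetize(words):
    """Word-by-word look-up: always take the first dictionary entry."""
    return [LEXICON[w][0][1] for w in words]


def guess_pos(prev_word):
    """Toy stand-in for morphosyntactic analysis (an assumption, not a real
    tagger): after a determiner expect a noun, after a pronoun or auxiliary
    expect a verb."""
    if prev_word is None:
        return None
    prev_tags = {tag for tag, _ in LEXICON[prev_word]}
    if "DET" in prev_tags:
        return "NOUN"
    if prev_tags & {"PRON", "AUX"}:
        return "VERB"
    return None


def context_phonetize(words):
    """Look-up that uses the guessed part of speech to pick among entries."""
    phonemes = []
    prev = None
    for w in words:
        entries = LEXICON[w]
        wanted = guess_pos(prev) if len(entries) > 1 else None
        # Fall back to the first entry when no entry matches the guess.
        chosen = next((p for tag, p in entries if tag == wanted), entries[0][1])
        phonemes.append(chosen)
        prev = w
    return phonemes


SENTENCE = "they will record the record".split()
naive = naive_phonetize(SENTENCE)
contextual = context_phonetize(SENTENCE)
```

With the plain look-up, both occurrences of "record" receive the same transcription; with the context-aware version, the first is transcribed as a verb and the second as a noun. A real phonetizer would also need to handle unknown words, proper names, and effects across word boundaries, none of which this sketch attempts.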

Keywords: text-to-speech synthesis; natural language processing; digital signal processing; morphosyntactic module; phonetization; prosody generation; waveform generation

Article.  5163 words. 

Subjects: Linguistics ; Computational Linguistics ; Phonetics and Phonology
