Chapter

Comparison of Component Models in Analysing the Distribution of Dialectal Features

Antti Leino and Saara Hyvönen

in Computing and Language Variation

Published by Edinburgh University Press

Published in print December 2009 | ISBN: 9780748640300
Published online September 2012 | e-ISBN: 9780748671380 | DOI: http://dx.doi.org/10.3366/edinburgh/9780748640300.003.0010

Preview

Languages are traditionally subdivided into geographically distinct dialects, although any such division is just a coarse approximation of a more fine-grained variation. This underlying variation is usually visualised in the form of maps, where the distribution of various features is shown as isoglosses. Component models such as factor analysis can be used to analyse spatial distributions of a large number of different features — such as the isogloss data in a dialect atlas or the distributions of ethnological or archaeological phenomena — with the goal of finding dialects or similar cultural aggregates. However, there are several such methods, and it is not obvious how their differences affect their usability for computational dialectology. This chapter addresses this question by comparing five such methods (factor analysis, non-negative matrix factorisation, aspect Bernoulli, independent component analysis, and principal components analysis) with two data sets describing Finnish dialectal variation. There are some fundamental differences between these methods, and some of these have implications that affect the dialectological interpretation of the results.
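To make the idea of a component model concrete, the following minimal sketch applies two of the five methods named above, principal components analysis and non-negative matrix factorisation, to a small location-by-feature matrix. The data are an invented toy example, not the Finnish data sets used in the chapter, and the function names (pca_components, nmf) and parameter values are illustrative assumptions. The NMF routine uses the standard multiplicative updates for squared error, which may differ from the variant the authors used.

```python
import numpy as np

# Hypothetical toy data: rows are locations, columns are dialect features
# (1 = feature attested at that location). Invented for illustration only.
X = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)


def pca_components(X, k):
    """Return the first k principal component scores and loadings.

    Computed from the SVD of the column-centred matrix.
    """
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :k] * s[:k]   # one row per location
    loadings = Vt[:k]           # one row per component
    return scores, loadings


def nmf(X, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate X by W @ H with W, H >= 0.

    Uses multiplicative updates that minimise squared reconstruction error.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps
    H = rng.random((k, X.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


scores, loadings = pca_components(X, 2)
W, H = nmf(X, 2)

print("PCA scores per location:\n", np.round(scores, 2))
print("NMF weights per location:\n", np.round(W, 2))
```

One difference of the kind the abstract alludes to is visible even here: PCA scores are signed, so a location can load negatively on a component, whereas NMF weights are non-negative and can be read more directly as the degree to which a location belongs to each component.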

Keywords: dialects; component models; Finnish; isoglosses; dialectology; factor analysis; non-negative matrix factorisation; aspect Bernoulli; independent component analysis; principal components analysis

Chapter.  4646 words.  Illustrated.

Subjects: Language Teaching and Learning
