## Quick Reference

A test suitable for the simultaneous testing of hypotheses concerning the equality of three or more population means. When samples have been taken from several populations, a question of interest is whether the populations all have the same mean. In the case of $m$ populations, with the mean of population $j$ denoted by $\mu_j$, the null hypothesis is

$$H_0: \mu_1 = \mu_2 = \dots = \mu_m,$$

with the alternative being that $H_0$ is false.

In the case $m=2$, an appropriate test statistic (assuming the populations have the same variance) is $T$ given by

$$T = \frac{\bar{y}_1 - \bar{y}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},$$

where $\bar{y}_j$ is the mean of the $n_j$ values sampled from population $j$, and $s^2$ is the pooled estimate of the common variance (see pooled estimate of common mean). The statistic $T$ has an approximate $t$-distribution with $\nu = n_1 + n_2 - 2$ degrees of freedom (the distribution is exact for samples from normal distributions). Denoting the upper $100\alpha$% point of a $t$-distribution with $\nu$ degrees of freedom by $t(\alpha, \nu)$, $H_0$ is rejected at the $200\alpha$% level if $|T| > t(\alpha, \nu)$.
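As a rough illustration of the two-sample case, the following sketch computes the pooled statistic and its degrees of freedom using only the Python standard library. The function name `pooled_t_statistic` and the sample values are hypothetical, and the sketch stops short of the rejection step because the critical value $t(\alpha, \nu)$ would have to come from tables or a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(sample1, sample2):
    """Return (T, nu) for the equal-variance two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    nu = n1 + n2 - 2  # degrees of freedom
    # Pooled estimate s^2 of the common variance: the two sample
    # variances weighted by their own degrees of freedom.
    s2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / nu
    t = (mean(sample1) - mean(sample2)) / sqrt(s2 * (1 / n1 + 1 / n2))
    return t, nu


# Hypothetical data, for illustration only.
t_stat, nu = pooled_t_statistic([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(t_stat, nu)
```

The null hypothesis would then be rejected when the absolute value of `t_stat` exceeds the tabulated upper point for `nu` degrees of freedom.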

In the case of $m$ populations, the null hypothesis can be rewritten in the form:

$$H_0: \mu_1 = \mu_2,\ \mu_1 = \mu_3,\ \dots,\ \mu_{m-1} = \mu_m,$$

which demonstrates that there are $c = \tfrac{1}{2}m(m-1)$ pairs of populations that could be compared. However, if $c$ independent $t$-tests are performed, each at the $\alpha$ level, then the overall significance level is $1-(1-\alpha)^c$ and is not $\alpha$.
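To see how quickly the overall level grows, the short sketch below evaluates the number of pairs and the resulting overall significance level for a few values of $m$. The function names and the choice of 0.05 for the per-test level are illustrative assumptions, and the calculation relies on the independence assumption stated above.

```python
def number_of_pairs(m):
    """Number c of distinct pairs among m populations."""
    return m * (m - 1) // 2


def overall_level(alpha, c):
    """Overall significance level of c independent tests, each at level alpha."""
    return 1 - (1 - alpha) ** c


# Illustrative per-test level of 0.05.
for m in (3, 5, 10):
    c = number_of_pairs(m)
    print(m, c, round(overall_level(0.05, c), 3))
```

With three populations there are already three comparisons, and the overall level is roughly 0.14 rather than 0.05.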

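In the case of equal sample sizes (all $n$), the quantity $t\,s\sqrt{2/n}$, where $t$ is the critical value of the $t$-distribution used for a single pairwise comparison and $s^2$ is the pooled estimate of the common variance, is called the **least significant difference** (**LSD**). If no difference between sample means is greater than this, then $H_0$ may be accepted at the $\alpha$ level.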

One way of reducing the overall significance level is to reduce the value of $\alpha$ for the individual tests. The Bonferroni inequality leads to the replacement of $\alpha$ by $\alpha/c$: the resulting test is variously known as the Dunn test or as the Bonferroni $t$-test. A preferable alternative uses the Sidak correction, in which $\alpha$ is replaced by $1-(1-\alpha)^{1/c}$. However, both tests have rather low power when $m$ is large.
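The two adjustments can be compared numerically. The sketch below is a minimal illustration with hypothetical function names: it computes the per-test level under each correction and checks that the Sidak level, fed back into the overall-level formula for independent tests, recovers the intended overall level.

```python
def bonferroni_level(alpha, c):
    """Per-test level under the Bonferroni correction."""
    return alpha / c


def sidak_level(alpha, c):
    """Per-test level under the Sidak correction."""
    return 1 - (1 - alpha) ** (1 / c)


alpha, c = 0.05, 10  # illustrative values: 10 pairs arise when m = 5
b = bonferroni_level(alpha, c)
s = sidak_level(alpha, c)
print(b, s)

# Sidak inverts the overall-level formula for independent tests,
# so this recovers alpha up to rounding error.
print(1 - (1 - s) ** c)
```

The Sidak level is slightly larger than the Bonferroni level, so it gives up a little less power for the same overall level.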

Tukey suggested using the Studentized range distribution in place of the $t$-distribution. The resulting test is familiarly called either the Tukey test, the honestly significant difference test, or the **HSD** test. This test assumes equal sample sizes; modifications for unequal sizes are the Tukey–Kramer test, which uses $1/n_i + 1/n_j$ when comparing populations $i$ and $j$, and the Spjotvoll–Stoline test, which uses $2/n^*$, where $n^*$ is the smallest of the $m$ sample sizes. The Tukey tests are probably the best choices of all the multiple comparison tests. Similar in spirit to the Tukey tests are the Hochberg test and the Gabriel test; their test statistics are compared with the distribution of the maximum absolute value rather than with that of the Studentized range. The Waller–Duncan test is a test based on the $F$-test for overall differences between treatments.

An alternative to comparing all pairs simultaneously is to use a multistage test. Suppose that the samples are labelled in order of their means, so that sample 1 has the least mean and sample $m$ the greatest mean. Initially all $m$ samples are compared. If $H_0$ is accepted, then testing ceases. However, if it is rejected, then the hypotheses $\mu_1 = \mu_2 = \dots = \mu_{m-1}$ and $\mu_2 = \mu_3 = \dots = \mu_m$ are considered, using the Studentized range values for the comparison of $m-1$ populations. If a hypothesis is rejected, then comparisons of $m-2$ populations are made. Successive reductions are made until acceptable hypotheses are found. Examples of this type are Duncan's test (which uses the significance level $1-(1-\alpha)^{l-1}$ when $l$ means are compared), the Newman–Keuls test (which uses $\alpha$ throughout), and the Ryan–Einot–Gabriel–Welsch (R–E–G–W) test, which uses $1-(1-\alpha)^{l/m}$ for $l < m-1$ and $\alpha$ otherwise. A compromise between the Newman–Keuls test and the HSD test is the Tukey wholly significant difference test, which is also called the WSD test or Tukey's b-test.
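The stage-wise significance levels of the three multistage procedures can be tabulated side by side. The sketch below uses hypothetical function names; the Duncan and Newman–Keuls levels follow the description above, while the R–E–G–W level for fewer than $m-1$ means is taken to be $1-(1-\alpha)^{l/m}$, the form usually quoted for that procedure, which should be treated as an assumption here.

```python
def duncan_level(alpha, l):
    """Duncan's test: level used when l means are compared."""
    return 1 - (1 - alpha) ** (l - 1)


def newman_keuls_level(alpha, l):
    """Newman-Keuls test: the same level alpha at every stage."""
    return alpha


def regw_level(alpha, l, m):
    """R-E-G-W test: level used when l of the m means are compared.

    Assumption: 1 - (1 - alpha)**(l / m) for l < m - 1, alpha otherwise.
    """
    if l < m - 1:
        return 1 - (1 - alpha) ** (l / m)
    return alpha


alpha, m = 0.05, 6  # illustrative values
for l in range(m, 1, -1):
    print(l,
          round(duncan_level(alpha, l), 4),
          newman_keuls_level(alpha, l),
          round(regw_level(alpha, l, m), 4))
```

Under these formulas Duncan's test is the most liberal when many means are compared, while the R–E–G–W test becomes stricter than Newman–Keuls as the number of means under comparison falls.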

[...]

*Subjects:*
Probability and Statistics.
