Quantities that express the amount of variation in a random variable (compare measures of location). Variation is sometimes described as spread or dispersion to distinguish it from systematic trends or differences. Measures of variation are either properties of a probability distribution or sample estimates of them.
The range of a sample is the difference between the largest and smallest values. The interquartile range is often more useful because it is less affected by extreme values. If the sample is ranked in ascending order of magnitude, two values of x may be found, the first of which is exceeded by 75% of the sample and the second by 25%; their difference is the interquartile range. An analogous definition applies to a probability distribution.
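As a minimal sketch of these two definitions, the following Python computes the range and interquartile range of a small made-up sample. The data and variable names are illustrative assumptions, and the "inclusive" quartile rule used here is only one of several conventions in common use; other rules can give slightly different quartiles.

```python
import statistics

# Illustrative sample (made up for this sketch).
sample = [7, 1, 15, 4, 10, 3, 13, 6, 8]

# Range: largest value minus smallest value.
sample_range = max(sample) - min(sample)

# quantiles() sorts the data and returns the three cut points that split
# it into four parts. About 75% of the sample exceeds q1 and about 25%
# exceeds q3. The "inclusive" method is an assumed convention.
q1, _median, q3 = statistics.quantiles(sample, n=4, method="inclusive")

# Interquartile range: difference between the upper and lower quartiles.
iqr = q3 - q1

print(sample_range, q1, q3, iqr)
```

Replacing the largest value with a much larger one changes the range but leaves the interquartile range of this sample unchanged.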
The variance is the expectation (or mean) of the square of the difference between a random variable and its mean; it is of fundamental importance in statistical analysis. The variance of a continuous distribution with mean μ and probability density function f(x) is ∫ (x − μ)² f(x) dx and is denoted by σ². The variance of a discrete distribution with probabilities p(x) is ∑ (x − μ)² p(x) and is also denoted by σ². The sample variance of a sample of n observations xᵢ with mean x̄ is ∑ (xᵢ − x̄)² / (n − 1) and is denoted by s². Dividing by (n − 1) rather than n corrects for the bias that arises because deviations are measured from the sample mean x̄ rather than from μ.
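The discrete and sample formulas for the variance can be sketched in Python as follows. The function names, the fair-die distribution, and the sample values are assumptions chosen for illustration and are not part of the definitions.

```python
import statistics

def discrete_variance(pmf):
    """Variance of a discrete distribution given as {value: probability}."""
    # Mean of the distribution: sum of x * p(x).
    mu = sum(x * p for x, p in pmf.items())
    # Variance: sum of (x - mu)^2 * p(x).
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def sample_variance(xs):
    """Sample variance s^2 with the (n - 1) divisor."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

# Hypothetical example: a fair six-sided die.
die = {face: 1 / 6 for face in range(1, 7)}
sigma2 = discrete_variance(die)

# Hypothetical sample of n = 8 observations.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
s2 = sample_variance(sample)

# statistics.variance also uses the (n - 1) divisor, so it should agree.
print(sigma2, s2, statistics.variance(sample))
```

For comparison, `statistics.pvariance` divides by n instead, which is appropriate when the data are the whole population rather than a sample.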
The standard deviation is the square root of the variance, denoted by σ (for a distribution) or s (for a sample). The standard deviation has the same units of measurement as the mean, and for a normal distribution about 5% of the distribution lies more than about two standard deviations from the mean, counting both tails together. The standard deviation of the distribution of an estimated quantity is termed the standard error.
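A short Python sketch of these quantities follows. The sample is made up, and the standard error shown is specifically that of the sample mean, estimated as s / √n; that formula is a standard result assumed here, not one stated in this entry.

```python
import math
import statistics

# Hypothetical sample of n = 8 observations.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

# Sample standard deviation s: square root of the sample variance.
s = statistics.stdev(sample)

# Estimated standard error of the sample mean (assumed formula: s / sqrt(n)).
standard_error = s / math.sqrt(len(sample))

# Proportion of a normal distribution lying more than two standard
# deviations from the mean, both tails combined.
standard_normal = statistics.NormalDist(mu=0.0, sigma=1.0)
two_sd_tails = 2 * (1 - standard_normal.cdf(2.0))

print(s, standard_error, two_sd_tails)
```

The two-tailed proportion comes out a little under 5%; the figure of exactly 5% corresponds to roughly 1.96 standard deviations, which is why the statement is hedged with "about".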
The mean deviation is the mean of the absolute deviations of the random variable from the mean.
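For a sample, the mean deviation can be sketched as below. The function name and data are illustrative assumptions, and the sample mean stands in for the mean of the random variable.

```python
def mean_deviation(xs):
    """Mean of the absolute deviations of the observations from their mean."""
    xbar = sum(xs) / len(xs)
    return sum(abs(x - xbar) for x in xs) / len(xs)

# Hypothetical sample: mean is 5, absolute deviations sum to 12.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
md = mean_deviation(sample)
print(md)
```

Because deviations are not squared, a single extreme observation influences the mean deviation less than it influences the variance.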