If a set of numerical data has n elements and is arranged in increasing order as x_1, x_2, …, x_n, the lower quartile (Q1) may be taken to be the median of the lower half of the data, i.e. of x_1, x_2, …, x_{½(n−1)} if n is odd, and of x_1, x_2, …, x_{½n} if n is even. The upper quartile (Q3) may be taken to be the median of the upper half of the data, i.e. of x_{½(n+3)}, x_{½(n+5)}, …, x_n if n is odd, and of x_{½(n+2)}, x_{½(n+4)}, …, x_n if n is even. When n is odd, the middle value x_{½(n+1)}, which is the median of the whole data set, belongs to neither half. The difference Q3−Q1 is the interquartile range, a term introduced by Galton in 1882. An alternative term is the midspread.
As an example, consider the ordered data:
101, 103, 104, 105, 106, 107, 108, 109, 111, 111, 111, 115, 118, 121, 124, 127, 130, 156, 199.
There are nineteen observations. The tenth value, 111, is the median. Within the lower nine values, the fifth is 106 (=Q1). Within the upper nine values, the fifth is 124 (=Q3). The interquartile range is 124−106=18.
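The median-of-halves calculation can be sketched in Python as follows. This is a minimal illustration, not a standard library routine: the function name quartiles_by_halves is hypothetical, and it assumes at least two observations and that the middle value is left out of both halves when n is odd.

```python
from statistics import median

def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: when n is odd the middle observation is excluded from
    both halves. Requires at least two observations.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2            # number of observations in each half
    lower = xs[:half]        # smallest `half` observations
    upper = xs[n - half:]    # largest `half` observations
    return median(lower), median(upper)

data = [101, 103, 104, 105, 106, 107, 108, 109, 111, 111,
        111, 115, 118, 121, 124, 127, 130, 156, 199]

q1, q3 = quartiles_by_halves(data)
print(q1, q3, q3 - q1)  # 106 124 18
```

Other quartile conventions exist (for example, interpolation between order statistics), so library functions may return slightly different values for the same data.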
When there are many observations it may be easier to read approximate values for the lower and upper quartiles from a cumulative frequency graph. These will be the values of the variable corresponding to cumulative relative frequencies of 25% and 75%, respectively.
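Reading a quartile from cumulative relative frequencies can be imitated numerically. The sketch below is an assumption-laden simplification: the hypothetical helper value_at_cumulative_fraction returns the smallest observation whose cumulative relative frequency reaches the requested level, whereas a reading from a drawn graph would usually involve interpolation by eye.

```python
def value_at_cumulative_fraction(values, fraction):
    """Smallest observation whose cumulative relative frequency
    is at least `fraction` (no interpolation; illustrative only)."""
    xs = sorted(values)
    n = len(xs)
    for count, x in enumerate(xs, start=1):
        if count / n >= fraction:
            return x
    return xs[-1]

data = [101, 103, 104, 105, 106, 107, 108, 109, 111, 111,
        111, 115, 118, 121, 124, 127, 130, 156, 199]

print(value_at_cumulative_fraction(data, 0.25))  # 106
print(value_at_cumulative_fraction(data, 0.75))  # 124
```

For these nineteen observations the result happens to agree with the median-of-halves values; in general the two approaches can differ slightly.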
For a continuous random variable X, the lower quartile of the distribution is such that P(X<Q1)=¼ and the upper quartile is such that P(X<Q3)=¾.
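As an illustration of the continuous case, the quartiles of a standard normal distribution can be found by inverting its cumulative distribution function. The choice of the standard normal distribution is an assumption made here purely for the example; the sketch uses NormalDist from Python's standard statistics module.

```python
from statistics import NormalDist

# Standard normal distribution, chosen only as an example.
std_normal = NormalDist(mu=0.0, sigma=1.0)

q1 = std_normal.inv_cdf(0.25)  # value with P(X < Q1) = 1/4
q3 = std_normal.inv_cdf(0.75)  # value with P(X < Q3) = 3/4

print(round(q1, 4), round(q3, 4))  # -0.6745 0.6745
```

For this distribution the interquartile range is therefore about 1.349 standard deviations.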
In his 1970 book on exploratory data analysis, Tukey referred (in the context of data) to the quartiles as hinges and he called the interquartile range the H-spread. Tukey defined a step as 1.5 × H-spread, and proposed that values one step beyond a hinge should be called inner fences and values two steps beyond a hinge should be called outer fences. Any data item beyond an outer fence would be called far out.
In the previous data the hinges are 106 and 124, thus the H-spread is 124−106=18 and the step is 1.5 × 18=27. The inner fences are at 106−27=79 and 124+27=151. The outer fences are at 79−27=52 and 151+27=178. The observation 199 is greater than 178 and is therefore far out.
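The fence arithmetic can be checked with a short Python sketch. The function name tukey_fences is hypothetical, and the hinges 106 and 124 are taken directly from the example rather than recomputed.

```python
def tukey_fences(lower_hinge, upper_hinge):
    """Return (inner_fences, outer_fences), each as a (low, high) pair."""
    h_spread = upper_hinge - lower_hinge
    step = 1.5 * h_spread
    inner = (lower_hinge - step, upper_hinge + step)          # one step out
    outer = (lower_hinge - 2 * step, upper_hinge + 2 * step)  # two steps out
    return inner, outer

data = [101, 103, 104, 105, 106, 107, 108, 109, 111, 111,
        111, 115, 118, 121, 124, 127, 130, 156, 199]

inner, outer = tukey_fences(106, 124)
far_out = [x for x in data if x < outer[0] or x > outer[1]]

print(inner, outer, far_out)  # (79.0, 151.0) (52.0, 178.0) [199]
```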
See also boxplot; outlier; quantile; skewness; trimean.
Subjects: Probability and Statistics.