A statistic, introduced by Mallows in 1964, that is used as an aid in choosing between competing multiple regression models. With $n$ observations and $k$ explanatory variables (see regression), define $s^2$ as the estimate of the experimental error variance. Then, for a model using just $p$ of the $k$ variables,

$$C_p = \frac{1}{s^2}\sum_{j=1}^{n}\left(y_j - \hat{y}_j\right)^2 - n + 2p,$$

where $y_1, y_2, \ldots, y_n$ are the observed values and $\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n$ are the corresponding fitted values. A model that fits well should have a $C_p$ value close to $p$. An acceptable fit is provided by a model whose $C_p$ value lies below a critical value computed from $F_{a,b}(\alpha)$, the value exceeded by chance on $100\alpha\%$ of occasions by a random variable having an F-distribution with $a$ and $b$ degrees of freedom. Typically, $\alpha = 0.05$ or $0.01$. For alternative approaches to model selection, see AIC, stepwise procedures.
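As an illustration, the statistic can be computed for every subset of the explanatory variables. The following is a minimal sketch under stated assumptions, not a definitive implementation: it assumes that $s^2$ is the residual mean square of the full model (with $n - k - 1$ degrees of freedom) and that $p$ counts every fitted coefficient, including the intercept, neither of which the entry spells out. The helper names (`solve`, `rss`, `mallows_cp`) and the data are invented for the example.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def rss(rows, y, cols):
    """Residual sum of squares of a least-squares fit (with intercept) on the chosen columns."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    q = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(q)] for i in range(q)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(q)]
    beta = solve(xtx, xty)  # normal equations; adequate for a small illustration
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def mallows_cp(rows, y):
    """Return {subset of column indices: Cp} for every subset of the k variables."""
    n, k = len(rows), len(rows[0])
    # Assumption: s^2 is the residual mean square of the full k-variable model.
    s2 = rss(rows, y, range(k)) / (n - k - 1)
    table = {}
    for size in range(k + 1):
        for cols in combinations(range(k), size):
            p = size + 1  # assumption: p includes the intercept
            table[cols] = rss(rows, y, cols) / s2 - n + 2 * p
    return table


# Invented data: 8 observations on 3 explanatory variables.
rows = [[1, 2, 1], [2, 1, 3], [3, 4, 2], [4, 3, 5],
        [5, 6, 4], [6, 5, 7], [7, 8, 5], [8, 7, 9]]
y = [3.1, 4.9, 7.2, 9.1, 10.8, 13.2, 14.9, 17.0]

table = mallows_cp(rows, y)
for cols, cp in sorted(table.items(), key=lambda item: item[1]):
    print(cols, round(cp, 2))
```

Under these assumptions the full model always has $C_p$ equal to its own number of coefficients, $k + 1$, so the statistic is informative only for proper subsets. Among those, the candidates of interest are the small subsets whose $C_p$ is close to $p$.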