L.I.N.E. assumptions for regressions
We use various charts to interrogate our assumptions about linearity, independence, normality, and equal variance in a regression model. These assumptions are abbreviated as LINE.
Assumption | Supported | Violated |
---|---|---|
Linearity (or zero mean) of the residuals | In a scatterplot of residuals vs fitted values, the residuals bounce randomly around the zero line. | In a scatterplot of residuals vs fitted values, the residuals are mostly below zero for some fitted values and mostly above zero for others. A common pattern is residuals above zero for low and high fitted values and below zero for medium fitted values, so that reading from left to right the residuals run high, then low, then high again. |
Independence | In a scatterplot of residuals vs order, the residuals bounce randomly around the zero line. | In a scatterplot of residuals vs order, consecutive residuals tend to stay close to one another, producing runs above or below the zero line instead of random scatter. |
Normality | In a normal probability plot (QQ plot) of the residuals, the points lie close to the diagonal line. Statistical software often reports a Ryan-Joiner test statistic and a corresponding P-value alongside the QQ plot, for the null hypothesis that the error terms are normal against the alternative hypothesis that the error terms are not normal. The P-value will exceed the alpha-value (for example, | In a normal probability plot (QQ plot) of the residuals, the points do not lie close to the diagonal line. The P-value will be less than the alpha-value (for example, |
Equal variance | In a scatterplot of residuals vs fitted values, the residuals roughly form a horizontal band around the zero line. | In a scatterplot of residuals vs fitted values, the residuals roughly form a widening megaphone shape around the zero line. Reading the chart from left to right, the residuals fan out and increase in magnitude. |
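
The charts in the table are judged by eye. As a rough numeric companion, the sketch below fits a simple one-predictor regression and computes one number per assumption. It is an illustration under stated assumptions, not a replacement for the plots. The names `fit_line`, `corr`, and `line_diagnostics` are hypothetical helpers written for this example. The Durbin-Watson statistic is a technique swapped in for the residuals vs order chart, and the QQ correlation only follows the spirit of the Ryan-Joiner statistic (it assumes Blom-style plotting positions and produces no P-value).

```python
import random
from statistics import NormalDist, mean


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def corr(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    num = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    den = (sum((ai - ma) ** 2 for ai in a)
           * sum((bi - mb) ** 2 for bi in b)) ** 0.5
    return num / den


def line_diagnostics(x, y):
    """Return one rough numeric indicator per LINE assumption.

    Assumes the rows of x and y are already in collection order.
    """
    b0, b1 = fit_line(x, y)
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    n = len(resid)

    # L: a high-low-high pattern makes the residuals track the squared
    # distance of the fitted values from their mean; near 0 is good.
    mf = mean(fitted)
    curvature = corr(resid, [(f - mf) ** 2 for f in fitted])

    # I: Durbin-Watson statistic; near 2 when consecutive residuals are
    # unrelated, near 0 when they follow one another closely.
    dw = (sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, n))
          / sum(e * e for e in resid))

    # N: correlation between ordered residuals and normal scores, i.e.
    # how straight the QQ plot is; near 1 is good.
    normal = NormalDist()
    scores = [normal.inv_cdf((i - 0.375) / (n + 0.25))
              for i in range(1, n + 1)]
    qq = corr(sorted(resid), scores)

    # E: a widening megaphone makes |residual| grow with the fitted
    # value; near 0 is good.
    spread = corr([abs(e) for e in resid], fitted)

    return {"curvature": curvature, "durbin_watson": dw,
            "qq_correlation": qq, "spread_trend": spread}


if __name__ == "__main__":
    rng = random.Random(0)  # synthetic data, for illustration only
    x = [i / 10 for i in range(1, 201)]
    y = [2 + 3 * xi + rng.gauss(0, 1) for xi in x]
    for name, value in line_diagnostics(x, y).items():
        print(f"{name}: {value:.3f}")
```

None of these numbers comes with a formal cutoff here. They are meant to be read alongside the corresponding chart, for example to compare two candidate models fitted to the same data.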