Intervals for linear regressions
A confidence interval is a range within which, at a stated level of confidence, we expect the true value of a quantity to lie. A typical level is 95%, so it is common to say, "The 95% confidence interval is from ___ to ___." When we have a sample, we can calculate the sample mean as a single value. From there, we estimate the population mean using a confidence interval. Similarly, for a particular predictor value $x_h$, we can give an interval for the mean response at that value.
The formula for an interval is essentially the estimate plus or minus a multiplier times the standard error: $\text{estimate} \pm \text{multiplier} \times \text{standard error}$.
The multipliers are gathered from tables with rows for degrees of freedom, and columns for the percentage of the confidence interval. The most common multiplier is the "one at a time" multiplier, a.k.a. the t-multiplier.
Multiplier | Description | Spreadsheet |
---|---|---|
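"One at a time" multiplier | The "one at a time" multiplier is a t-multiplier that is written as $t_{(1-\alpha/2,\ \text{df})}$ for a two-tailed interval, where $\alpha$ is one minus the confidence level. | =T.INV(probability, degrees of freedom) or =T.INV.2T(probability, degrees of freedom). Use one or the other depending on whether we are calculating a one-tailed or a two-tailed confidence interval. Also, be careful to include the period, as Excel will return different results for TINV versus T.INV: TINV is the older name for the two-tailed function, while T.INV is the left-tailed inverse and returns a negative value for a small probability. The two agree in magnitude if you double the probability for T.INV.2T: for probabilities below 0.5, T.INV.2T(2 × probability, df) = -T.INV(probability, df). |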
Bonferroni multiplier | The Bonferroni multiplier is a t-multiplier based on slightly different parameters. It is the same as the regular multiplier, except that the probability is divided by the number of variables being considered. | =TINV(probability / number of variables, degrees of freedom) |
Simultaneous confidence region multiplier | The multiplier is calculated using the F critical value, which is then worked into this formula, $\sqrt{p \, F_{(1-\alpha;\ p,\ n-p)}}$, where p is the number of variables, n is the sample size, and F is a value which must be collected either from a table or spreadsheet software. | F critical value = F.INV.RT(probability, number of variables, sample size - number of variables). From there, input the F critical value into the formula described. For a two-tailed test, you do not need to divide the probability in half. |
Working-Hotelling multiplier |
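
As a rough cross-check on the spreadsheet formulas above, the two t-based multipliers can be sketched in Python. This is a minimal sketch using only the standard library: the function names (`t_pdf`, `t_cdf`, `t_multiplier`) and the `n_intervals` parameter are made up for illustration, and the quantile is found by numerical integration and bisection rather than by a statistics library. In practice a library routine such as SciPy's `scipy.stats.t.ppf` would be the usual choice.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_multiplier(alpha, df, n_intervals=1):
    """Two-tailed t-multiplier t(1 - alpha / (2 * n_intervals), df).

    n_intervals=1 gives the "one at a time" multiplier;
    n_intervals>1 gives the Bonferroni multiplier.
    """
    target = 1 - alpha / (2 * n_intervals)
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:   # widen the bracket until it contains the quantile
        hi *= 2
    for _ in range(60):             # bisection on the CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Example: 95% confidence, 10 degrees of freedom (hypothetical values)
one_at_a_time = t_multiplier(0.05, 10)
bonferroni = t_multiplier(0.05, 10, n_intervals=2)
```

The Bonferroni multiplier is always at least as large as the "one at a time" multiplier, because the same overall probability is split across more intervals.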
Confidence intervals regarding a particular predictor value
Type | Description |
---|---|
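Confidence interval for a mean | The standard error for the mean of the responses at a particular predictor value $x_h$ is $se(\hat{y}_h) = \sqrt{MSE \left( \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum (x_i - \bar{x})^2} \right)}$, where, as usual, $MSE = \frac{\sum (y_i - \hat{y}_i)^2}{n - 2}$. The standard error is a combination of the mean square error, sample size, and two types of distance: the distance between the predictor value being considered, $x_h$, and the mean of the predictor values, $\bar{x}$; and the distances between each observed $x_i$ and $\bar{x}$. The degrees of freedom for the multiplier are $n - 2$ for a simple linear regression. When we have the covariance available, another formula for the standard error is $se(\hat{y}_h) = \sqrt{Var(b_0) + x_h^2 \, Var(b_1) + 2 x_h \, Cov(b_0, b_1)}$. Generally, we just use the "SE Fit" result given by Minitab through Stat > Regression > Regression > Predict. The standard error used in the confidence interval for a mean is slightly different from the standard error used in the prediction interval of a response. They use the same estimate, just different standard errors to establish an interval around that estimate. |
Confidence interval of the y-intercept | The standard error for the y-intercept $b_0$ is the standard error of the mean response at a predictor value of zero: $se(b_0) = \sqrt{MSE \left( \frac{1}{n} + \frac{\bar{x}^2}{\sum (x_i - \bar{x})^2} \right)}$. |
Prediction interval for a response | The standard error for a prediction interval at the predictor value $x_h$ is almost the same, except that it has an added MSE term compared to the confidence interval, as below, $se(\text{pred}) = \sqrt{MSE \left( 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum (x_i - \bar{x})^2} \right)}$. Or alternatively, $se(\text{pred}) = \sqrt{MSE + se(\hat{y}_h)^2}$, where $se(\hat{y}_h)$ is the standard error of the mean response at $x_h$ (Minitab's "SE Fit"). |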
Boundary values |
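
To make the difference between the two standard errors concrete, here is a minimal Python sketch for a simple linear regression with one predictor. The function name `regression_intervals`, the data, and the multiplier value are all hypothetical; the multiplier is passed in as a number (here 3.182, the two-tailed 95% t value for 3 degrees of freedom) rather than computed.

```python
import math

def regression_intervals(x, y, x_h, multiplier):
    """Confidence interval for the mean response and prediction interval at x_h."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # slope
    b0 = y_bar - b1 * x_bar             # y-intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)                 # mean square error, n - 2 degrees of freedom
    y_hat = b0 + b1 * x_h               # same estimate for both intervals
    se_fit = math.sqrt(mse * (1 / n + (x_h - x_bar) ** 2 / sxx))
    se_pred = math.sqrt(mse + se_fit ** 2)   # added MSE term
    return {
        "estimate": y_hat,
        "se_fit": se_fit,
        "se_pred": se_pred,
        "confidence_interval": (y_hat - multiplier * se_fit, y_hat + multiplier * se_fit),
        "prediction_interval": (y_hat - multiplier * se_pred, y_hat + multiplier * se_pred),
    }

# Hypothetical data: n = 5, so the multiplier uses 5 - 2 = 3 degrees of freedom
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
result = regression_intervals(x, y, x_h=3, multiplier=3.182)
```

Both intervals are centred on the same estimate; the prediction interval is wider because its standard error includes the extra MSE term.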
Comparison with the population
The above calculations are related to describing the width of a normal population distribution: in a normally distributed population, about 95% of values lie within two standard deviations of the mean. However, the mean
Confidence interval of the slope
The standard error for the slope $b_1$ is $se(b_1) = \sqrt{\frac{MSE}{\sum (x_i - \bar{x})^2}}$.
Alternatively, we can get it from the Coefficients table, as the Standard Error of the Coefficient for that particular X.
And the
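
The coefficient intervals can be sketched the same way. This is a minimal Python example, assuming a simple linear regression with one predictor and the usual formulas $\sqrt{MSE / \sum (x_i - \bar{x})^2}$ for the slope and $\sqrt{MSE (1/n + \bar{x}^2 / \sum (x_i - \bar{x})^2)}$ for the y-intercept. The function name `coefficient_intervals`, the data, and the multiplier (3.182, the two-tailed 95% t value for 3 degrees of freedom) are hypothetical.

```python
import math

def coefficient_intervals(x, y, multiplier):
    """Estimates, standard errors, and intervals for the slope and y-intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)
    se_b1 = math.sqrt(mse / sxx)                          # slope standard error
    se_b0 = math.sqrt(mse * (1 / n + x_bar ** 2 / sxx))   # intercept standard error
    return {
        "slope": {"estimate": b1, "se": se_b1,
                  "interval": (b1 - multiplier * se_b1, b1 + multiplier * se_b1)},
        "intercept": {"estimate": b0, "se": se_b0,
                      "interval": (b0 - multiplier * se_b0, b0 + multiplier * se_b0)},
    }

# Hypothetical data with n = 5, so 3 degrees of freedom for the multiplier
coefs = coefficient_intervals([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], multiplier=3.182)
```

The `se` values here should correspond to the Standard Error of the Coefficient column that regression software reports for each term.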