Intervals for linear regressions

Published: 2021 February 05. Modified: 2021 March 08, 01:26:56.

A confidence interval is a range within which, at a stated level of confidence, we expect an unknown value to lie. A typical confidence level is 95%, so it is common to say, "The 95% confidence interval is from ___ to ___." When we have a sample, we can calculate the sample mean as a single value. From there, we estimate the population mean using a confidence interval. Similarly, in a regression we can say that, for a particular predictor value, the mean response falls within a confidence interval, and that an individual response falls within a wider prediction interval.

The formula for an interval is essentially the estimate plus or minus the product of the multiplier and the standard error: estimate ± (multiplier × standard error).
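
As a minimal sketch of this formula, the helper below builds the lower and upper bounds from an estimate, a multiplier, and a standard error. The function name `interval` and the numbers are illustrative only and do not come from any particular data set.

```python
def interval(estimate, multiplier, standard_error):
    """Return (lower, upper) = estimate -/+ multiplier * standard_error."""
    margin = multiplier * standard_error
    return estimate - margin, estimate + margin


# Illustrative values only: an estimate of 10 with a standard error of 1.5
# and a multiplier of 2 (roughly the 95% multiplier for large samples).
lower, upper = interval(10.0, 2.0, 1.5)
print(lower, upper)
```

The same helper applies to every interval in these notes; only the estimate, the multiplier, and the standard error change.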

The multipliers are gathered from tables with rows for degrees of freedom, and columns for the percentage of the confidence interval. The most common multiplier is the "one at a time" multiplier, a.k.a. the t-multiplier.

Multiplier	Description	Spreadsheet
"One at a time" multiplier	The "one at a time" multiplier is a t-multiplier, written in shorthand as $t_{(\alpha/2,\ \text{df})}$, meaning it is the t value for a tail probability of $\alpha/2$ with the given degrees of freedom. For a 95% interval, $\alpha = 0.05$, and the interval has tails at both ends, each of which is $\alpha/2 = 0.025$. The degrees of freedom are usually something like $n - 2$ or $n - p$, depending on what we are calculating the interval for.	=T.INV.2T(probability, degrees of freedom) or =T.INV(probability, degrees of freedom). Use one or the other depending on whether we are calculating a two-tailed or a one-tailed interval. Also, be careful to include the period: the legacy TINV is two-tailed, like T.INV.2T, whereas T.INV is left-tailed, so TINV and T.INV return different results for the same probability. The two agree in magnitude when the T.INV.2T probability is double the T.INV probability: T.INV.2T(0.05, df) equals -T.INV(0.025, df), which is the same as T.INV(0.975, df).
Bonferroni multiplier	The Bonferroni multiplier is a t-multiplier based on slightly different parameters. It is the same as the regular multiplier, except that the probability is divided by the number of variables being considered, so that all of the intervals hold simultaneously with at least the stated confidence.	=TINV(probability / number of variables, degrees of freedom)
Simultaneous confidence region multiplier	The multiplier is calculated using the F critical value, which is then worked into this formula: $\sqrt{p \, F}$, where p is the number of variables, n is the sample size, and F is a critical value which must be collected either from a table or from spreadsheet software.	F critical value = F.INV.RT(probability, number of variables, sample size - number of variables). From there, input the F critical value into the formula described. Because F.INV.RT is right-tailed, you do not need to divide the probability in half for a two-tailed interval.
Working-Hotelling multiplier
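
The t-based multipliers in the table can also be computed outside a spreadsheet. The sketch below is one way to do it using only the Python standard library: it integrates the t density numerically and inverts the result by bisection. The function names (`t_quantile`, `one_at_a_time`, `bonferroni`) are made up for this example. The approach assumes the multiplier is smaller than 100, which holds for typical confidence levels unless the degrees of freedom are very small. A statistics library would normally be the more practical choice.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def t_quantile(p, df):
    """Value t with P(T <= t) = p, found by bisection on [-100, 100]."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def one_at_a_time(alpha, df):
    """Two-tailed t-multiplier, the counterpart of =T.INV.2T(alpha, df)."""
    return t_quantile(1 - alpha / 2, df)


def bonferroni(alpha, count, df):
    """t-multiplier with alpha split across `count` simultaneous intervals."""
    return t_quantile(1 - alpha / (2 * count), df)


print(round(one_at_a_time(0.05, 10), 3))
print(round(bonferroni(0.05, 2, 10), 3))
```

The Bonferroni multiplier is always at least as large as the one-at-a-time multiplier for the same degrees of freedom, which is what makes the simultaneous intervals wider.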

Confidence intervals regarding a particular predictor value

Type	Description
Confidence interval for a mean	The standard error for the mean of the responses for a particular predictor value $x_h$ is $se(\hat{y}_h) = \sqrt{MSE \left( \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum (x_i - \bar{x})^2} \right)}$, where, as usual, $MSE = \frac{\sum (y_i - \hat{y}_i)^2}{n - 2}$. The standard error is a combination of the mean square error, the sample size, and two types of distance: the distance between the predictor value being considered and the mean of all predictor values, $(x_h - \bar{x})$; and the distance between all predictor values and the mean of all predictor values, $\sum (x_i - \bar{x})^2$. So if the mean of all predictor values is 5, and we are calculating the confidence interval of the response when the predictor is 3, then $x_h - \bar{x} = 3 - 5 = -2$. The degrees of freedom for the multiplier are $n - 2$. When we have the covariance available, then another formula for the standard error is $se(\hat{y}_h) = \sqrt{s^2(b_0) + x_h^2 \, s^2(b_1) + 2 x_h \, s(b_0, b_1)}$, where $s^2(b_0)$ and $s^2(b_1)$ are the estimated variances of the intercept and slope and $s(b_0, b_1)$ is their estimated covariance. Generally, we just use the "SE Fit" result given by Minitab through Stat > Regression > Regression > Predict. The standard error used in the confidence interval for a mean is slightly different from the standard error used in the prediction interval for a response. They use the same estimate, just different standard errors to establish an interval around that estimate.
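Confidence interval of the y-intercept	The standard error for the y-intercept $b_0$ is calculated using the above formula, with $x_h = 0$ and $\hat{y}_h = b_0$, which gives $se(b_0) = \sqrt{MSE \left( \frac{1}{n} + \frac{\bar{x}^2}{\sum (x_i - \bar{x})^2} \right)}$. However, this is just one example of how we could construct a confidence interval for the mean response at any chosen predictor value.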
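Prediction interval for a response	The standard error for a prediction interval is almost the same, except that it has an added MSE term compared to the confidence interval, as below: $se(\text{pred}) = \sqrt{MSE \left( 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum (x_i - \bar{x})^2} \right)}$. Or alternatively, $se(\text{pred}) = \sqrt{MSE + se(\hat{y}_h)^2}$, where $se(\hat{y}_h)$ is the standard error for the mean response at $x_h$.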
Boundary values
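
To make the two standard errors concrete, here is a minimal sketch in Python that fits a simple linear regression by hand and then builds both intervals at one predictor value. The data set, the function name `fit_and_intervals`, and the choice of predictor value are invented for illustration. The multiplier 3.182 is the tabulated two-tailed 95% t value for 3 degrees of freedom (n - 2 with n = 5).

```python
import math


def fit_and_intervals(x, y, x_h, multiplier):
    """Simple linear regression with a CI for the mean and a PI at x_h."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx              # slope
    b0 = y_bar - b1 * x_bar     # intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)         # two degrees of freedom lost

    fit = b0 + b1 * x_h         # the estimate shared by both intervals
    se_fit = math.sqrt(mse * (1 / n + (x_h - x_bar) ** 2 / sxx))
    se_pred = math.sqrt(mse + se_fit ** 2)   # the added MSE term

    ci = (fit - multiplier * se_fit, fit + multiplier * se_fit)
    pi = (fit - multiplier * se_pred, fit + multiplier * se_pred)
    return {"fit": fit, "mse": mse, "se_fit": se_fit,
            "se_pred": se_pred, "ci": ci, "pi": pi}


# Invented data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
result = fit_and_intervals(x, y, x_h=4, multiplier=3.182)
print(result["ci"])
print(result["pi"])
```

Both intervals are centered on the same fitted value; the prediction interval is wider because of the extra MSE term.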

Comparison with the population

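The above calculations are the sample-based counterpart of describing the width of a normal population distribution: for a normally distributed population, about 95% of values lie within two standard deviations of the mean. However, the mean and variance are seldom known for the population, so they are estimated from a sample, and the t-multiplier takes the place of the fixed value of two.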
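
The "two standard deviations" rule can be checked with Python's standard-library `statistics.NormalDist`. This is only a quick numerical check of the population statement, assuming a normal distribution whose mean and standard deviation are known.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Share of a normal population within two standard deviations of the mean.
within_two_sd = standard_normal.cdf(2.0) - standard_normal.cdf(-2.0)

# Exact multiplier that leaves 2.5% in each tail (a 95% interval).
z_multiplier = standard_normal.inv_cdf(0.975)

print(round(within_two_sd, 4))   # 0.9545
print(round(z_multiplier, 2))    # 1.96
```

So "two standard deviations" is a rounding of 1.96, and it covers slightly more than 95% of the population.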

Confidence interval of the slope

The standard error for the slope is calculated as follows,

$$se(b_1) = \sqrt{\frac{MSE}{\sum (x_i - \bar{x})^2}}$$

Alternatively, we can get it from the Coefficients table, as the Standard Error of the Coefficient for that particular X.

And the degrees of freedom for the multiplier are $n - 2$, due to two degrees of freedom being lost in estimating the intercept and the slope.
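
As a sketch of the slope interval, the Python snippet below computes the slope, its standard error, and a 95% confidence interval from raw data. The data and the function name `slope_interval` are invented for illustration. The multiplier 3.182 is the tabulated two-tailed 95% t value for n - 2 = 3 degrees of freedom.

```python
import math


def slope_interval(x, y, multiplier):
    """Return (b1, se_b1, (lower, upper)) for a simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    # Mean square error, dividing by n - 2 degrees of freedom.
    mse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    se_b1 = math.sqrt(mse / sxx)
    return b1, se_b1, (b1 - multiplier * se_b1, b1 + multiplier * se_b1)


# Invented data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b1, se_b1, (lower, upper) = slope_interval(x, y, multiplier=3.182)
print(b1, se_b1, lower, upper)
```

The value of `se_b1` computed here should match the standard error reported for that predictor in a regression Coefficients table fitted to the same data.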