Hypothesis testing on linear regressions
The goal of a simple linear regression is to approximate the population as a whole, not just the sample given. There are two common ways of evaluating how well simple linear regression models are likely to resemble the overall population, and whether there is a relationship between two variables.
Each of the t-tests below produces a t-statistic, which we compare against a critical value from a t-distribution table. A t-distribution table is organized by degrees of freedom (rows) and significance levels (columns). If the absolute value of the t-statistic is greater than the critical value in the table, we reject the null hypothesis in favor of the alternative hypothesis.
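The table lookup described above can be sketched in a few lines of Python. This is only an illustration: the dictionary holds a small excerpt of standard two-tailed critical values at the 0.05 significance level, and the names `T_CRITICAL_05` and `reject_null` are hypothetical, as are the example t-statistics.

```python
# Two-tailed critical t-values at the 0.05 significance level,
# keyed by degrees of freedom (a small excerpt of a standard t-table).
T_CRITICAL_05 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
}


def reject_null(t_statistic, degrees_of_freedom, table=T_CRITICAL_05):
    """Return True when |t| exceeds the tabulated critical value."""
    critical_value = table[degrees_of_freedom]
    return abs(t_statistic) > critical_value


# Hypothetical t-statistics, both with 8 degrees of freedom.
print(reject_null(-3.1, 8))  # |t| = 3.1 is above 2.306
print(reject_null(1.5, 8))   # |t| = 1.5 is below 2.306
```

Taking the absolute value makes this a two-tailed comparison, so a large negative t-statistic counts as evidence just as a large positive one does.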
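Description | Condition |
---|---|
We will reject the null hypothesis. In other words, we will conclude: "There is sufficient evidence at this significance level to conclude that there is a relationship between the two variables." | $\lvert t \rvert > t_{\text{critical}}$ |
We will fail to reject the null hypothesis. In other words, we will conclude: "There is not enough evidence at this significance level to conclude that there is a relationship between the two variables." | $\lvert t \rvert \le t_{\text{critical}}$ |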
t-test for the population correlation coefficient
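The correlation coefficient for the sample is $r$, and the correlation coefficient for the population is $\rho$. This test asks whether $\rho$ differs from zero.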
Hypothesis | Statement | Description |
---|---|---|
$H_0$ | $\rho = 0$ | There is no relationship between the two variables. |
$H_A$ | $\rho \neq 0$ | There is a relationship between the two variables. |
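As a rough sketch of how this test could be computed by hand, the snippet below uses the standard statistic $t = r\sqrt{n-2}/\sqrt{1-r^2}$ with $n - 2$ degrees of freedom. The formula is the usual textbook form and is not quoted from this text; the function name `correlation_t_statistic` and the sample data are made up for illustration.

```python
import math


def correlation_t_statistic(x, y):
    """Return (r, t, df) for the test of H0: rho = 0.

    Assumes the sample is not perfectly correlated (|r| < 1);
    otherwise the denominator below is zero.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)      # sample correlation coefficient
    df = n - 2                          # degrees of freedom
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return r, t, df


# Hypothetical sample of five paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r, t, df = correlation_t_statistic(x, y)
print(f"r = {r:.3f}, t = {t:.3f}, df = {df}")
```

For this made-up sample, $r \approx 0.775$ and $t \approx 2.121$ with 3 degrees of freedom. The two-tailed critical value at the 0.05 level for 3 degrees of freedom is 3.182, so we would fail to reject the null hypothesis despite the fairly large sample correlation.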
t-test to determine linear association
This t-test is also known as the "slope" test. It tests whether a relationship exists, which makes it similar to the previous test. We fundamentally have either a hypothesis that the population slope is zero or a hypothesis that it is not.
The hypotheses are:
Hypothesis | Statement | Description |
---|---|---|
$H_0$ | $\beta_1 = 0$ | There is no relationship between the two variables. |
$H_A$ | $\beta_1 \neq 0$ | There is a relationship between the two variables. |
Calculating the likelihood that the population slope is some value
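In full notation, the test statistic is $t = \frac{b_1 - \beta_1}{SE(b_1)}$, where $b_1$ is the sample slope, $\beta_1$ is the hypothesized population slope, and $SE(b_1)$ is the standard error of the sample slope.
Generally, when we test if a relationship exists, we will use $\beta_1 = 0$, so the statistic reduces to $t = \frac{b_1}{SE(b_1)}$.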
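A minimal sketch of the slope test follows, assuming the standard statistic: the sample slope minus the hypothesized population slope, divided by the standard error of the sample slope, with $n - 2$ degrees of freedom. The function name `slope_t_statistic`, the parameter `beta1_null`, and the data are hypothetical and chosen only for illustration.

```python
import math


def slope_t_statistic(x, y, beta1_null=0.0):
    """Return (b1, se_b1, t, df) for the test of H0: beta1 = beta1_null."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx                      # least-squares slope
    b0 = mean_y - b1 * mean_x           # least-squares intercept
    # Sum of squared residuals around the fitted line.
    sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    df = n - 2
    se_b1 = math.sqrt(sse / df) / math.sqrt(sxx)  # standard error of b1
    t = (b1 - beta1_null) / se_b1
    return b1, se_b1, t, df


# Hypothetical sample of five paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b1, se_b1, t, df = slope_t_statistic(x, y)
print(f"b1 = {b1:.3f}, SE = {se_b1:.3f}, t = {t:.3f}, df = {df}")
```

With the default `beta1_null=0.0` this is the "does a relationship exist" version of the test. In simple linear regression it gives the same t-statistic as the correlation test on the same data, which is why the two tests lead to the same conclusion.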
F-test to determine if a line or a curve is the best fit
The F-test is also called the "analysis of variance" (ANOVA) test.
Hypothesis | F-test |
---|---|
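The F statistic itself is not spelled out here. As a hedged illustration, the sketch below computes the usual ANOVA F statistic for a straight-line fit, the regression mean square divided by the error mean square; it does not show a comparison between a line and a curve. The function name `regression_f_statistic` and the data are assumptions made for the example.

```python
def regression_f_statistic(x, y):
    """Return (F, df_regression, df_error) for a simple linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    fitted = [b0 + b1 * a for a in x]
    ssr = sum((f - mean_y) ** 2 for f in fitted)          # explained by the line
    sse = sum((b - f) ** 2 for b, f in zip(y, fitted))    # left in the residuals
    df_regression = 1           # one predictor
    df_error = n - 2
    f_stat = (ssr / df_regression) / (sse / df_error)
    return f_stat, df_regression, df_error


# Hypothetical sample of five paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
f_stat, df_regression, df_error = regression_f_statistic(x, y)
print(f"F = {f_stat:.3f} on ({df_regression}, {df_error}) degrees of freedom")
```

With a single predictor, this F statistic is the square of the slope t-statistic computed from the same data, so both tests agree about whether a linear relationship is present.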