Simple linear regression
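Regression measures the relationship between two variables. Both of the variables must be continuous, i.e. quantitative, meaning each one is a numeric measurement recorded across multiple observations. For example, there may be two variables (such as average daily water consumption and annual incidence of kidney stones) measured for each patient in a sample.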
There are two types of linear regression:
Regression  Description 

Simple linear regression  This examines the relationship between one predictor variable and one response variable. 
Multiple linear regression  This examines the relationship between two or more predictor variables and one response variable. For example, it could explore how average daily caloric intake and average daily minutes spent exercising relate to the annual number of doctor's visits. 
The relationship between two variables can be described in terms of four aspects:
Aspect  Description 

Deterministic  Some variables have a deterministic relationship, meaning one can be computed exactly from the other with a simple conversion equation. For example, regression is not appropriate for measurements of temperature in Celsius compared to Fahrenheit, nor for measurements of height in inches compared to centimeters. Plotting these variables will result in a perfect line with a fixed slope. 
Statistical  Some variables have a statistical relationship, meaning they are related in ways that can be measured but are not exact conversions from one unit to another. For example, regression can measure the statistical relationship between average daily water consumption and annual incidence of kidney stones. Plotting these variables will result in a chart where the points follow a trend but scatter around an estimated trend line. 
Trend  How closely the two variables gather around a central line. 
Scatter  How scattered the two variables are away from a central line. 
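To make the deterministic case concrete, here is a minimal Python sketch using the Celsius to Fahrenheit conversion mentioned in the table. The temperature readings are made-up values chosen for illustration. Because every point lies exactly on one line, the slope between any two neighbouring points is the same, and there is no scatter for regression to measure.

```python
# Celsius readings (hypothetical values chosen for illustration).
celsius = [-10.0, 0.0, 12.5, 25.0, 37.0]

# Deterministic relationship: Fahrenheit is an exact function of Celsius.
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

# Slope between each pair of neighbouring points.
# On a perfect line these all agree (up to floating-point rounding).
slopes = [
    (fahrenheit[i + 1] - fahrenheit[i]) / (celsius[i + 1] - celsius[i])
    for i in range(len(celsius) - 1)
]

print(slopes)
```

With a statistical relationship, the same calculation would give slopes that differ from pair to pair.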
Here are some variables and formulas:
Var. or form.  Description 

$i$  The experimental unit is the person, thing, or entity upon which an observation is made, e.g. a single patient, in this case patient $i$. 
$x$  Predictor variable. 
$y$  Response variable. 
$x_i$  Predictor value, e.g. that patient's amount of daily water consumption. 
$y_i$  Observed response, e.g. that patient's actual annual number of kidney stone incidents. 
$\hat{y}_i$  Expected response based on the predictor value, e.g. that patient's expected annual number of kidney stone incidents. 
$e_i = y_i - \hat{y}_i$  The residual error, i.e. the difference between the observed response and the expected response. 
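An ideal trend line, i.e. an ideal prediction model, will minimize the residual errors $e_i$ across all of the observations. 
To quantify overall similarity or dissimilarity between predictions and observations, the "Least Squares Criterion" is used. This measures overall dissimilarity as the sum of all squared residual errors, $Q = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$. 
The quantity $Q$ is smallest for the line that fits the data best, so the least squares line is the line that minimizes $Q$. 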
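Least squares line  The "least squares line", a.k.a. the "least squares regression line" or "estimated regression equation", is the formula $\hat{y}_i = b_0 + b_1 x_i$. 
$\hat{y}_i = b_0 + b_1 x_i$  This formula is used to calculate the least squares line. 
$b_1$  This is the slope of the trend line of the scatter plot of the predictor variable and the response variable. The slope can be positive or negative. If $b_1$ is positive, the response tends to increase as the predictor increases; if $b_1$ is negative, the response tends to decrease. In general, we can expect the mean response to increase or decrease by $b_1$ units for every one-unit increase in $x$. 
$b_0$  This is the intercept, a measurement of the shift of the trend line either up or down, vertically, to fit within the plot. It is the expected response when $x = 0$. 
$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$, $b_0 = \bar{y} - b_1 \bar{x}$  These formulas calculate the slope and the intercept, where $\bar{x}$ is the mean of the predictor values and $\bar{y}$ is the mean of the observed responses. 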
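$s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}$  The sample variance measures how spread out the observed responses are. It is calculated by comparing each observed response to the mean of all observed responses. In the formula, $\bar{y}$ is that mean and $n$ is the number of observations. 
$MSE = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}$  Mean Square Error is similar to the sample variance, except that each observed response is compared to its expected response $\hat{y}_i$ instead of to the mean $\bar{y}$, and the divisor is $n - 2$ because two parameters, $b_0$ and $b_1$, are estimated from the data. 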
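Scope  Sometimes, especially for hypothetical values of $x$ that fall outside the range of the observed data (for example $x = 0$), a prediction from the line is not meaningful. The scope of the model is the range of predictor values that the data actually cover. 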
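The quantities in the table above can be computed by hand for a small data set. The following is a minimal sketch in plain Python, not a definitive implementation: the function name `fit_least_squares` and the five data points are assumptions made up for illustration and do not come from the sources. It uses the standard least squares formulas for the slope and intercept, then derives the residuals, the least squares criterion $Q$, the sample variance, and the Mean Square Error.

```python
def fit_least_squares(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: makes the line pass through the point (x_bar, y_bar).
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical predictor values and observed responses (not real data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

b0, b1 = fit_least_squares(x, y)
y_hat = [b0 + b1 * xi for xi in x]                 # expected responses
residuals = [yi - yh for yi, yh in zip(y, y_hat)]  # e_i = y_i - y_hat_i
q = sum(e ** 2 for e in residuals)                 # least squares criterion

y_bar = sum(y) / n
sample_variance = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
mse = q / (n - 2)

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"Q = {q:.3f}, s^2 = {sample_variance:.3f}, MSE = {mse:.3f}")
```

For this made-up data the slope is 0.6 and the intercept is 2.2, so the expected response rises by 0.6 for each one-unit increase in the predictor. The residuals sum to zero, which is a general property of a least squares line that includes an intercept.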
There are a few "sums of squares", each with its own insight:
Sum of squares  Formula  Description 

SSR  $\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$  The "regression sum of squares" quantifies how far the regression line is from the horizontal line at the mean response, which would indicate no relationship. A lower SSR means a weaker relationship between the predictor and response variables, while a higher SSR means a stronger relationship. 
SSE  $\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$  The "error sum of squares" or "sum of residual error squares" quantifies how much the observed responses vary around the regression line. A higher value means more scatter, and if it is substantially higher than SSR, then most of the variation is scatter rather than trend, which indicates a weak linear relationship between the predictor variable and the response variable. 
SSTO  $\sum_{i=1}^{n} (y_i - \bar{y})^2$  The "total sum of squares" quantifies how much the observed responses vary around their mean. Also, SSTO = SSR + SSE. 
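The identity SSTO = SSR + SSE can be checked numerically. Below is a minimal sketch in plain Python, assuming a small made-up data set (the numbers are illustrative only). It fits the least squares line with the standard slope and intercept formulas, then computes each sum of squares from its definition.

```python
# Hypothetical predictor values and observed responses (not real data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

# Each sum of squares, computed directly from its definition.
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # regression line vs. mean
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # observations vs. line
ssto = sum((yi - y_bar) ** 2 for yi in y)              # observations vs. mean

print(f"SSR = {ssr:.3f}, SSE = {sse:.3f}, SSTO = {ssto:.3f}")
print(f"SSR + SSE = {ssr + sse:.3f}")
```

For this data SSR is 3.6 and SSE is 2.4, which add up to the SSTO of 6.0. SSR is the larger share, so more of the variation in the response is trend than scatter.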
The "coefficient of determination"

Square of correlation coefficient ($r^2$), known as coefficient of determination, represents the proportion of variation in one variable that is accounted for by the variation in the other variable. For example, if height and weight of a group of persons have a correlation coefficient of 0.80, one can estimate that 64% (0.80 × 0.80 = 0.64) of variation in their weights is accounted for by the variation in their heights. (Aggarwal and Ranganathan, 2016)
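In simple linear regression, the coefficient of determination can be reached two ways: by squaring the correlation coefficient $r$, or as the ratio SSR / SSTO. The sketch below, in plain Python with a small made-up data set (the numbers are assumptions for illustration), computes both and also reproduces the 0.80 example from the quotation.

```python
import math

# Hypothetical predictor values and observed responses (not real data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)  # this is also SSTO

# Route 1: square the Pearson correlation coefficient.
r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

# Route 2: share of the total variation explained by the regression line.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
ssr = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)
ratio = ssr / syy

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}, SSR/SSTO = {ratio:.4f}")

# The example from the quotation: a correlation of 0.80 explains 64%.
print(f"{0.80 * 0.80:.2f}")
```

Both routes give 0.6 for this data, meaning about 60% of the variation in the response is accounted for by the predictor.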
When interpreting the sums of squares, compare them to one another. If SSR is much higher than SSE, then most of the variation in the response is explained by the regression line, which indicates a strong linear relationship. If SSE is much higher than SSR, then most of the variation is scatter around the regression line, which indicates a weak linear relationship.
When working with an entire population, not just a sample from the population, there are some differences in notation:
Var. or form.  Description 

Population regression line  In instances where we have an entire population in our sample — e.g. all persons living in an apartment complex, or all students at a particular school — then we can obtain the population regression line by the same techniques as those for a sample. It is equivalent to the least squares regression line, although the LSR line is an approximation of the population regression line that is used when it is not possible or feasible to obtain data for an entire population, and so only a sample is taken. 
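$\mu_Y = \beta_0 + \beta_1 x$  This is the formula for the population regression line, comparable to the formula for the least squares line, and can be used when we have the results for the entire population. 
$\mu_Y$  The predicted value, i.e. the mean response at a given predictor value, equivalent to $\hat{y}_i$ for a sample. 
$\epsilon_i$  The error, equivalent to $e_i$ for a sample. 
$\beta_0$  The vertical shift in the trend line, equivalent to $b_0$ for a sample. 
$\beta_1$  The slope for the trend line, equivalent to $b_1$ for a sample. 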
$\sigma^2$  The "common variance" quantifies in one number how much the observed responses vary around the predicted responses. This is useful because if we make our own predictions, e.g. forecasts for events in the future, or filling in missing data, then we can create a confidence interval with the estimated common variance. 
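The link between population parameters and their sample estimates can be illustrated with a simulation. The sketch below is a minimal example in plain Python under stated assumptions: the population values `BETA_0`, `BETA_1` and `SIGMA`, the sample size, and the range of the predictor are all invented for illustration. It generates responses from a known population line plus normally distributed errors, then checks how close the least squares estimates and the Mean Square Error come to the values used to generate the data.

```python
import random

# Hypothetical population parameters, chosen only for illustration.
BETA_0 = 1.0  # population intercept
BETA_1 = 2.0  # population slope
SIGMA = 1.0   # standard deviation of the errors; SIGMA ** 2 is the common variance

rng = random.Random(0)  # fixed seed so the run is repeatable

# Draw a sample: each response is the population line plus a random error.
n = 2000
x = [rng.uniform(0.0, 10.0) for _ in range(n)]
y = [BETA_0 + BETA_1 * xi + rng.gauss(0.0, SIGMA) for xi in x]

# Least squares estimates b0 and b1 from the sample.
x_bar = sum(x) / n
y_bar = sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# MSE estimates the common variance from the residuals.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)

print(f"b0 = {b0:.3f} (population value {BETA_0})")
print(f"b1 = {b1:.3f} (population value {BETA_1})")
print(f"MSE = {mse:.3f} (population value {SIGMA ** 2})")
```

The estimates will not match the population values exactly, because they come from a sample, but with a sample this large they should land close to them.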
There are four assumptions behind the simple linear regression model, which can be remembered by the mnemonic L.I.N.E. as below:
Assumption  Description 

Linearly related  This means that the mean of the response variable at each value of the predictor variable is a linear function of the predictor, i.e. an intercept plus a coefficient multiplied by the predictor. Equivalently, the mean of the errors (the differences between the predicted and observed values) is zero. 
Independent  The errors are independent, meaning that they are not influencing one another. 
Normally distributed  The errors are normally distributed. 
Equal variance  For each value of the predictor variable, the errors for that value have variance that is equal to the variance for errors at other values. 
Remember that ultimately, the statistician is interested in drawing conclusions about the population as a whole, not just the observed sample. For this reason, there are separate but comparable parameters for populations and their samples. To recap, here are the population parameters and the corresponding sample parameters that estimate them:
Population parameter  Sample parameter 

Population regression line  Least squares regression line 
Endnotes
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5079093/
https://online.stat.psu.edu/stat501/lesson/1