
What Is a Linear Regression Model?

A linear regression model describes the relationship between a dependent variable, y, and one or more independent variables, X. The dependent variable is also called the response variable. Independent variables are also called explanatory or predictor variables. Continuous predictor variables are also called covariates, and categorical predictor variables are also called factors. The matrix X of observations on predictor variables is usually called the design matrix.

A multiple linear regression model is

$$y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip} + \varepsilon_i, \quad i = 1, \ldots, n,$$

where

  • y_i is the ith response.

  • β_k is the kth coefficient, where β_0 is the constant term in the model. Sometimes, design matrices might include information about the constant term. However, fitlm or stepwiselm by default includes a constant term in the model, so you must not enter a column of 1s into your design matrix X.

  • X_ij is the ith observation on the jth predictor variable, j = 1, ..., p.

  • ε_i is the ith noise term, that is, random error.

If a model includes only one predictor variable (p = 1), then the model is called a simple linear regression model.
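The multiple linear regression model above can be illustrated numerically. The following sketch uses Python with NumPy rather than the fitlm function the text refers to, and all data and coefficient values are illustrative: it generates observations from a known two-predictor model and recovers the coefficients by least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200                                    # number of observations
X = rng.normal(size=(n, 2))                # two predictor variables X1, X2
beta = np.array([1.0, 2.0, -3.0])          # true [beta0, beta1, beta2]
eps = rng.normal(scale=0.1, size=n)        # noise terms eps_i

# When solving the least-squares problem directly (unlike with fitlm),
# the constant term beta0 corresponds to a column of 1s in the design matrix.
design = np.column_stack([np.ones(n), X])
y = design @ beta + eps

# Least-squares estimates of the coefficients
b, *_ = np.linalg.lstsq(design, y, rcond=None)
print(np.round(b, 2))
```

With this much data and little noise, the printed estimates land close to the true coefficients (1, 2, −3).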

In general, a linear regression model can be a model of the form

$$y_i = \beta_0 + \sum_{k=1}^{K} \beta_k f_k(X_{i1}, X_{i2}, \ldots, X_{ip}) + \varepsilon_i, \quad i = 1, \ldots, n,$$

where each f_k(·) is a scalar-valued function of the independent variables, X_ij. The functions f_k might take any form, including nonlinear functions or polynomials. The linearity in linear regression models refers to the linearity of the coefficients β_k. That is, the response variable, y, is a linear function of the coefficients, β_k.

Some examples of linear models are:

$$y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i$$

$$y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{1i}^3 + \beta_4 X_{2i}^2 + \varepsilon_i$$

$$y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{1i} X_{2i} + \beta_4 \log X_{3i} + \varepsilon_i$$

The following, however, are not linear models since they are not linear in the unknown coefficients, β_k.

$$\log y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$$

$$y_i = \beta_0 + \beta_1 X_{1i} + \frac{1}{\beta_2 X_{2i}} + e^{\beta_3 X_{1i} X_{2i}} + \varepsilon_i$$
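The point that linearity refers to the coefficients can be checked numerically. The second linear example above includes X1³ and X2² terms, yet it is still fit by ordinary least squares once those nonlinear functions of the predictors are placed as columns of the design matrix. A minimal sketch in Python/NumPy, with illustrative data and coefficient values:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
x1 = rng.uniform(-1, 1, size=n)
x2 = rng.uniform(-1, 1, size=n)
beta = np.array([0.5, 1.0, -2.0, 3.0, 0.7])   # [b0, b1, b2, b3, b4]

# Design matrix for y = b0 + b1*x1 + b2*x2 + b3*x1^3 + b4*x2^2 + eps.
# Each column is a (possibly nonlinear) function f_k of the predictors,
# but the model stays linear in the coefficients b_k.
F = np.column_stack([np.ones(n), x1, x2, x1**3, x2**2])
y = F @ beta + rng.normal(scale=0.05, size=n)

# Ordinary least squares still applies, despite the nonlinear columns.
b, *_ = np.linalg.lstsq(F, y, rcond=None)
print(np.round(b, 2))
```

The nonlinearity lives entirely in how the design-matrix columns are built; the estimation problem itself remains linear.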

The usual assumptions for linear regression models are:

  • The noise terms, ε_i, are uncorrelated.

  • The noise terms, ε_i, have independent and identical normal distributions with mean zero and constant variance, σ². Thus,

    $$E(y_i) = E\left(\sum_{k=0}^{K} \beta_k f_k(X_{i1}, X_{i2}, \ldots, X_{ip}) + \varepsilon_i\right) = \sum_{k=0}^{K} \beta_k f_k(X_{i1}, X_{i2}, \ldots, X_{ip}) + E(\varepsilon_i) = \sum_{k=0}^{K} \beta_k f_k(X_{i1}, X_{i2}, \ldots, X_{ip})$$

    and

    $$V(y_i) = V\left(\sum_{k=0}^{K} \beta_k f_k(X_{i1}, X_{i2}, \ldots, X_{ip}) + \varepsilon_i\right) = V(\varepsilon_i) = \sigma^2$$

    So the variance of y_i is the same for all levels of X_ij.

  • The responses y_i are uncorrelated.
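The mean and variance identities above can be checked by simulation: fix one observation's predictor values, draw the noise term many times, and compare the sample mean and variance of y_i with the noise-free prediction and σ². A sketch under illustrative values:

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = 0.5
beta = np.array([1.0, 2.0, -1.0])      # [beta0, beta1, beta2], illustrative
x_row = np.array([1.0, 0.3, -0.8])     # one design-matrix row (leading 1 for beta0)

mean_y = x_row @ beta                  # E(y_i): the noise-free part of the model
draws = mean_y + rng.normal(scale=sigma, size=100_000)

print(round(draws.mean(), 2), round(mean_y, 2))   # sample mean vs E(y_i)
print(round(draws.var(), 2), sigma**2)            # sample variance vs sigma^2
```

Because E(ε_i) = 0 and V(ε_i) = σ², the sample mean matches the noise-free prediction and the sample variance matches σ², regardless of the predictor values chosen.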

The fitted linear function is

$$\hat{y}_i = \sum_{k=0}^{K} b_k f_k(X_{i1}, X_{i2}, \ldots, X_{ip}), \quad i = 1, \ldots, n,$$

where ŷ_i is the estimated response and the b_k are the fitted coefficients. The coefficients are estimated so as to minimize the mean squared difference between the prediction vector ŷ and the true response vector y, that is, ŷ − y. This method is called the method of least squares. Under the assumptions on the noise terms, these coefficients also maximize the likelihood of the prediction vector.
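The least-squares coefficients described above can be computed by solving the normal equations XᵀXb = Xᵀy. A short check (Python/NumPy, with arbitrary illustrative data) that solving the normal equations agrees with a generic least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # design with constant column
y = rng.normal(size=n)                                      # arbitrary response vector

# Normal equations: (X'X) b = X'y
b_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Generic least-squares solver for comparison
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(b_normal, b_lstsq))   # prints True: both minimize ||y - Xb||^2
```

In practice, solvers based on QR or SVD factorizations (as lstsq uses) are preferred over forming XᵀX explicitly, since the normal equations can be numerically ill-conditioned.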

In a linear regression model of the form y = β_1X_1 + β_2X_2 + ... + β_pX_p, the coefficient β_j expresses the impact of a one-unit change in the predictor variable X_j on the mean of the response, E(y), provided that all other variables are held constant. The sign of the coefficient gives the direction of the effect. For example, if the linear model is E(y) = 1.8 – 2.35X_1 + X_2, then –2.35 indicates a 2.35-unit decrease in the mean response for a one-unit increase in X_1, given X_2 is held constant. If the model is E(y) = 1.1 + 1.5X_1² + X_2, the coefficient of X_1² indicates a 1.5-unit increase in the mean of y with a one-unit increase in X_1², given all else held constant. However, in the case of E(y) = 1.1 + 2.1X_1 + 1.5X_1², it is difficult to interpret the coefficients similarly, since it is not possible to hold X_1 constant when X_1² changes, or vice versa.
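The one-unit-change interpretation can be verified directly from the first example in the paragraph above, E(y) = 1.8 – 2.35X_1 + X_2. A tiny sketch (the function name and the particular predictor values are illustrative):

```python
def mean_response(x1, x2):
    """E(y) for the example model in the text: 1.8 - 2.35*x1 + 1.0*x2."""
    return 1.8 - 2.35 * x1 + 1.0 * x2

# Increase X1 by one unit while holding X2 constant:
# the mean response changes by exactly the coefficient of X1, i.e. -2.35.
delta = mean_response(x1=3.0, x2=5.0) - mean_response(x1=2.0, x2=5.0)
print(round(delta, 2))   # -2.35
```

The difference is −2.35 no matter which baseline values of X_1 and X_2 are chosen, which is precisely what "holding all other variables constant" means for a model that is linear in the predictors.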

