removeTerms

Remove terms from linear regression model

Description

NewMdl = removeTerms(mdl,terms) returns a linear regression model fitted using the input data and settings in mdl, with the terms specified in terms removed.

Examples

Create a linear regression model using the hald data set. Remove terms that have high p-values.

Load the data set.

load hald
X = ingredients; % predictor variables
y = heat; % response variable

Fit a linear regression model to the data.

mdl = fitlm(X,y)

mdl = 
Linear regression model:
    y ~ 1 + x1 + x2 + x3 + x4

Estimated Coefficients:
                   Estimate      SE        tStat       pValue 
                   ________    _______    ________    ________

    (Intercept)      62.405     70.071      0.8906     0.39913
    x1               1.5511    0.74477      2.0827    0.070822
    x2              0.51017    0.72379     0.70486      0.5009
    x3              0.10191    0.75471     0.13503     0.89592
    x4             -0.14406    0.70905    -0.20317     0.84407

Number of observations: 13, Error degrees of freedom: 8
Root Mean Squared Error: 2.45
R-squared: 0.982, Adjusted R-Squared: 0.974
F-statistic vs. constant model: 111, p-value = 4.76e-07

Remove the x3 and x4 terms because their p-values are high.

terms = 'x3 + x4'; % terms to remove
NewMdl = removeTerms(mdl,terms)

NewMdl = 
Linear regression model:
    y ~ 1 + x1 + x2

Estimated Coefficients:
                   Estimate       SE        tStat       pValue  
                   ________    ________    ______    __________

    (Intercept)      52.577      2.2862    22.998    5.4566e-10
    x1               1.4683      0.1213    12.105    2.6922e-07
    x2              0.66225    0.045855    14.442     5.029e-08

Number of observations: 13, Error degrees of freedom: 10
Root Mean Squared Error: 2.41
R-squared: 0.979, Adjusted R-Squared: 0.974
F-statistic vs. constant model: 230, p-value = 4.41e-09

NewMdl has the same adjusted R-squared value (0.974) as the previous model, so the reduced model fits the data about as well as the full model while using two fewer terms. All the terms in the new model have extremely low p-values.
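
The term selection in this example can also be done programmatically. The following is only a minimal sketch: the cutoff of 0.8 is a hypothetical value chosen so that the selection matches this example, not a recommended rule, and the approach assumes that every predictor is continuous so that each coefficient name is also a valid term name.

```matlab
load hald
mdl = fitlm(ingredients, heat);

threshold = 0.8;                     % hypothetical cutoff, chosen for illustration only
names = mdl.CoefficientNames(:);     % coefficient names as a column cell array
pvals = mdl.Coefficients.pValue;     % p-values in the same order as names

% Flag non-intercept coefficients whose p-value exceeds the cutoff.
isHigh = pvals > threshold & ~strcmp(names, '(Intercept)');

if any(isHigh)
    % Build a Wilkinson-notation formula such as 'x3 + x4'.
    terms = strjoin(names(isHigh)', ' + ');
    NewMdl = removeTerms(mdl, terms);
else
    terms = '';
    NewMdl = mdl;                    % nothing to remove
end
```

Because p-values change each time the model is refitted, a selection like this is best treated as a starting point and rechecked on the new model.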

Input Arguments

Linear regression model, specified as a LinearModel object created using fitlm or stepwiselm.

Terms to remove from the regression model mdl, specified as one of the following:

  • Character vector or string scalar formula in Wilkinson Notation representing one or more terms. The variable names in the formula must be valid MATLAB® identifiers.

  • Terms matrix T of size t-by-p, where t is the number of terms and p is the number of predictor variables in mdl. The value of T(i,j) is the exponent of variable j in term i.

    For example, suppose mdl has three variables A, B, and C in that order. Each row of T represents one term:

    • [0 0 0]: Constant term or intercept

    • [0 1 0]: B; equivalently, A^0 * B^1 * C^0

    • [1 0 1]: A*C

    • [2 0 0]: A^2

    • [0 1 2]: B*(C^2)

removeTerms treats a group of indicator variables for a categorical predictor as a single variable. Therefore, you cannot specify an individual indicator variable to remove from the model. If you specify a categorical predictor to remove from the model, removeTerms removes the group of indicator variables for the predictor in one step. See Modify Linear Regression Model Using step for an example that describes how to create indicator variables manually and treat each one as a separate variable.
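
As an illustration of the terms matrix form, the following sketch removes x3 and x4 from a model fitted to the hald data. It assumes the t-by-p layout described above, with one column per predictor (x1 through x4) and no column for the response; the variable name T is chosen here only for illustration.

```matlab
load hald
mdl = fitlm(ingredients, heat);   % predictors are named x1, x2, x3, x4

% Each row is one term; each column holds the exponent of one predictor.
T = [0 0 1 0;    % x3
     0 0 0 1];   % x4

NewMdl = removeTerms(mdl, T);     % intended to match removeTerms(mdl,'x3 + x4')
```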

Output Arguments

Linear regression model with fewer terms, returned as a LinearModel object. NewMdl is a newly fitted model that uses the input data and settings in mdl with the terms specified in terms removed from mdl.

To overwrite the input argument mdl, assign the newly fitted model to mdl:

mdl = removeTerms(mdl,terms);

More About

Wilkinson Notation

Wilkinson notation describes the terms present in a model. The notation relates to the terms present in a model, not to the multipliers (coefficients) of those terms.

Wilkinson notation uses these symbols:

  • + means include the next variable.

  • - means do not include the next variable.

  • : defines an interaction, which is a product of terms.

  • * defines an interaction and all lower-order terms.

  • ^ raises the predictor to a power, exactly as in * repeated, so ^ includes lower-order terms as well.

  • () groups terms.

This table shows typical examples of Wilkinson notation.

Wilkinson Notation                       Terms in Standard Notation
-------------------------------------    ----------------------------------
1                                        Constant (intercept) term
x1^k, where k is a positive integer      x1, x1^2, ..., x1^k
x1 + x2                                  x1, x2
x1*x2                                    x1, x2, x1*x2
x1:x2                                    x1*x2 only
-x2                                      Do not include x2
x1*x2 + x3                               x1, x2, x3, x1*x2
x1 + x2 + x3 + x1:x2                     x1, x2, x3, x1*x2
x1*x2*x3 - x1:x2:x3                      x1, x2, x3, x1*x2, x1*x3, x2*x3
x1*(x2 + x3)                             x1, x2, x3, x1*x2, x1*x3

For more details, see Wilkinson Notation.
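
To see how the notation expands in practice, the following sketch fits a model with the formula 'y ~ x1*x2' and then removes only the interaction. It assumes the hald data placed in a table whose variable names (x1 through x4 and y) are chosen here for illustration.

```matlab
load hald
tbl = array2table([ingredients heat], ...
    'VariableNames', {'x1','x2','x3','x4','y'});

% x1*x2 expands to x1, x2, and the interaction x1:x2 (plus the intercept).
mdl = fitlm(tbl, 'y ~ x1*x2');
disp(mdl.CoefficientNames)

% Remove only the interaction; the main effects x1 and x2 remain.
NewMdl = removeTerms(mdl, 'x1:x2');
disp(NewMdl.CoefficientNames)
```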

Algorithms

  • removeTerms treats a categorical predictor as follows:

    • A model with a categorical predictor that has L levels (categories) includes L - 1 indicator variables. The model uses the first category as a reference level, so it does not include the indicator variable for the reference level. If the data type of the categorical predictor is categorical, then you can check the order of categories by using categories and reorder the categories by using reordercats to customize the reference level. For more details about creating indicator variables, see Automatic Creation of Dummy Variables.

    • removeTerms treats the group of L - 1 indicator variables as a single variable. If you want to treat the indicator variables as distinct predictor variables, create indicator variables manually by using dummyvar. Then use the indicator variables, except the one corresponding to the reference level of the categorical variable, when you fit a model. For the categorical predictor X, if you specify all columns of dummyvar(X) and an intercept term as predictors, then the design matrix becomes rank deficient.

    • Interaction terms between a continuous predictor and a categorical predictor with L levels consist of the element-wise product of the L - 1 indicator variables with the continuous predictor.

    • Interaction terms between two categorical predictors with L and M levels consist of the (L - 1)*(M - 1) indicator variables to include all possible combinations of the two categorical predictor levels.

    • You cannot specify higher-order terms for a categorical predictor because the square of an indicator is equal to itself.
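
The grouping behavior described above can be observed with a small sketch. It assumes the carsmall sample data set, in which Model_Year takes three distinct values, so the categorical predictor contributes two indicator variables; the table variable name Year is chosen here for illustration.

```matlab
load carsmall
Year = categorical(Model_Year);          % three levels, so L - 1 = 2 indicators
tbl = table(MPG, Weight, Year);

mdl = fitlm(tbl, 'MPG ~ Weight + Year');
disp(mdl.CoefficientNames)               % intercept, Weight, and two Year indicators

% Removing the categorical predictor drops both indicators in one step.
NewMdl = removeTerms(mdl, 'Year');
disp(NewMdl.CoefficientNames)            % intercept and Weight only
```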

Alternative Functionality

  • Use stepwiselm to specify terms in a starting model and continue improving the model until no single step of adding or removing a term is beneficial.

  • Use addTerms to add specific terms to a model.

  • Use step to optimally improve a model by adding or removing terms.

Version History

Introduced in R2012a