F-statistic and t-statistic

F-statistic

Purpose

In linear regression, the F-statistic is the test statistic for the analysis of variance (ANOVA) approach to test the significance of the model or the components in the model.

Definition

The F-statistic in the linear model output display is the test statistic for testing the statistical significance of the model. The model property ModelFitVsNullModel contains the same statistic.

The F-statistic values in the anova display allow you to assess the significance of the terms or components in the model.
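
For reference, the model F-statistic can be written as a ratio of mean squares. This is the standard textbook form, not a formula quoted from this page. The symbols are assumptions introduced here: SSR is the model sum of squares, SSE is the residual sum of squares, n is the number of observations, and p is the number of estimated coefficients including the intercept.

```latex
F = \frac{SSR / (p - 1)}{SSE / (n - p)} = \frac{MS_{\text{Model}}}{MS_{\text{Residual}}}
```

Under the null hypothesis that all coefficients except the intercept are zero, this ratio follows an F distribution with p - 1 and n - p degrees of freedom.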

How To

After fitting a regression model, mdl, using fitlm or stepwiselm, you can:

  • Find the F-statistic vs. constant model in the output display or by using

    disp(mdl)
  • Display the F-statistic for the model by entering

    mdl.ModelFitVsNullModel
  • Display the ANOVA for the model using

    anova(mdl,'summary')
  • Obtain the F-statistic values for the components, except for the constant term, using

    anova(mdl)
    For details, see the anova method of the LinearModel class.

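Assess Fit of Model Using F-statistic

This example shows how to use the F-statistic to assess the fit of a model and the significance of the regression coefficients.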

Load the sample data.

load hospital
tbl = table(hospital.Age,hospital.Weight,hospital.Smoker,hospital.BloodPressure(:,1), ...
    'VariableNames',{'Age','Weight','Smoker','BloodPressure'});
tbl.Smoker = categorical(tbl.Smoker);

Fit a linear regression model.

mdl = fitlm(tbl,'BloodPressure ~ Age*Weight + Smoker + Weight^2')
mdl = 
Linear regression model:
    BloodPressure ~ 1 + Smoker + Age*Weight + Weight^2

Estimated Coefficients:
                    Estimate        SE          tStat       pValue
                   __________    _________    ________    __________

    (Intercept)        168.02       27.694       6.067    2.7149e-08
    Age              0.079569      0.39861     0.19962       0.84221
    Weight           -0.69041       0.3435     -2.0099      0.047305
    Smoker_true        9.8027       1.0256      9.5584    1.5969e-15
    Age:Weight     0.00021796    0.0025258    0.086294       0.93142
    Weight^2        0.0021877    0.0011037      1.9822      0.050375

Number of observations: 100, Error degrees of freedom: 94
Root Mean Squared Error: 4.73
R-squared: 0.528,  Adjusted R-Squared: 0.503
F-statistic vs. constant model: 21, p-value = 4.81e-14

The F-statistic of the linear fit versus the constant model is 21, with a p-value of 4.81e-14. The model is significant at the 5% significance level. The R-squared value of 0.528 means the model explains about 53% of the variability in the response. There might be other predictor (explanatory) variables that are not included in the current model.

You can also programmatically access the F-statistic of the model.

mdl.ModelFitVsNullModel
ans = struct with fields:
        Fstat: 21.0120
       Pvalue: 4.8099e-14
    NullModel: 'constant'

Display the ANOVA table for the fitted model.

anova(mdl,'summary')
ans=5×5 table
                    SumSq     DF    MeanSq      F         pValue
                    ______    __    ______    ______    __________

    Total           4461.2    99    45.062
    Model           2354.5     5     470.9    21.012    4.8099e-14
    . Linear        2263.3     3    754.42    33.663    7.2417e-15
    . Nonlinear     91.248     2    45.624    2.0358        0.1363
    Residual        2106.6    94    22.411

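This display separates the variability in the model into linear and nonlinear terms. Because there are two nonlinear terms (Weight^2 and the interaction between Weight and Age), the nonlinear degrees of freedom in the DF column is 2. There are three linear terms in the model (one Smoker indicator variable, Weight, and Age). The corresponding F-statistics in the F column test the significance of the linear and nonlinear terms as separate groups.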

When there are replicated observations, the residual term is also separated into two parts; first is the error due to the lack of fit, and second is the pure error independent from the model, obtained from the replicated observations. In that case, the F-statistic is for testing the lack of fit, that is, whether the fit is adequate or not. But, in this example, there are no replicated observations.
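
The entries of the summary table can be checked by hand: each mean square is a sum of squares divided by its degrees of freedom, and each F-statistic is a mean square divided by the residual mean square. The arithmetic below uses the rounded values shown in the display, so the last digit may differ slightly from the printed output. The subscripted names are labels introduced here for illustration.

```latex
\begin{aligned}
MS_{\text{Model}}    &= 2354.5 / 5  = 470.9 \\
MS_{\text{Residual}} &= 2106.6 / 94 \approx 22.411 \\
F_{\text{Model}}     &= 470.9 / 22.411 \approx 21.012 \\
F_{\text{Linear}}    &= 754.42 / 22.411 \approx 33.663 \\
F_{\text{Nonlinear}} &= 45.624 / 22.411 \approx 2.0358 \\
R^2                  &= 2354.5 / 4461.2 \approx 0.528
\end{aligned}
```

The model F-statistic and R-squared recovered this way agree with the values reported in the linear model display.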

Display the ANOVA table for the model terms.

anova(mdl)
ans=6×5 table
                   SumSq      DF     MeanSq         F          pValue
                  ________    __    ________    _________    __________

    Age             62.991     1      62.991       2.8107      0.096959
    Weight        0.064104     1    0.064104    0.0028604       0.95746
    Smoker          2047.5     1      2047.5       91.363    1.5969e-15
    Age:Weight     0.16689     1     0.16689    0.0074466       0.93142
    Weight^2        88.057     1      88.057       3.9292      0.050375
    Error           2106.6    94      22.411

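This display decomposes the ANOVA table into the model terms. The corresponding F-statistics in the F column assess the statistical significance of each term. For example, the F-test for Smoker tests whether the coefficient of the indicator variable for Smoker is different from zero. That is, the F-test determines whether being a smoker has a significant effect on BloodPressure. The degrees of freedom for each model term is the numerator degrees of freedom for the corresponding F-test. All the terms have one degree of freedom. In the case of a categorical variable, the degrees of freedom is the number of indicator variables. Smoker has only one indicator variable, so it also has one degree of freedom.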
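
For several of the single-degree-of-freedom terms in this example, the F-statistic in the ANOVA table equals the square of the t-statistic in the coefficient table, and the p-values coincide. The check below uses the rounded values from the two displays, so the last digit can differ.

```latex
\begin{aligned}
\text{Smoker:}     \quad & t^2 = 9.5584^2   \approx 91.363    & (F = 91.363) \\
\text{Weight\^{}2:} \quad & t^2 = 1.9822^2   \approx 3.9291    & (F = 3.9292) \\
\text{Age:Weight:} \quad & t^2 = 0.086294^2 \approx 0.0074467 & (F = 0.0074466)
\end{aligned}
```

The Age and Weight rows do not follow this pattern in the displayed output. One plausible reason, which this page does not state, is that both variables also appear in the higher-order terms Age:Weight and Weight^2.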

t-statistic

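Purpose

In linear regression, the t-statistic is useful for making inferences about the regression coefficients. The hypothesis test on coefficient i tests the null hypothesis that it is equal to zero (meaning the corresponding term is not significant) versus the alternate hypothesis that the coefficient is different from zero.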

Definition

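For a hypothesis test on coefficient i, with

$H_0: \beta_i = 0$

$H_1: \beta_i \neq 0$,

the t-statistic is:

$$t = \frac{b_i}{SE(b_i)},$$

where $SE(b_i)$ is the standard error of the estimated coefficient $b_i$.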

How To

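After obtaining a fitted model, say, mdl, using fitlm or stepwiselm, you can:

  • Find the coefficient estimates, the standard errors of the estimates (SE), and the t-statistic values of hypothesis tests for the corresponding coefficients (tStat) in the output display.

  • Call for the display using

    display(mdl)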

Assess Significance of Regression Coefficients Using t-statistic

This example shows how to test for the significance of the regression coefficients using the t-statistic.

Load the sample data and fit a linear regression model.

load hald
mdl = fitlm(ingredients,heat)
mdl = 
Linear regression model:
    y ~ 1 + x1 + x2 + x3 + x4

Estimated Coefficients:
                   Estimate      SE        tStat       pValue
                   ________    _______    ________    ________

    (Intercept)      62.405     70.071      0.8906     0.39913
    x1               1.5511    0.74477      2.0827    0.070822
    x2              0.51017    0.72379     0.70486      0.5009
    x3              0.10191    0.75471     0.13503     0.89592
    x4             -0.14406    0.70905    -0.20317     0.84407

Number of observations: 13, Error degrees of freedom: 8
Root Mean Squared Error: 2.45
R-squared: 0.982,  Adjusted R-Squared: 0.974
F-statistic vs. constant model: 111, p-value = 4.76e-07

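You can see that for each coefficient, tStat = Estimate/SE. The p-values for the hypothesis tests are in the pValue column. Each t-statistic tests the significance of its term given the other terms in the model. According to these results, none of the coefficients seem significant at the 5% significance level, although the R-squared value for the model is very high at 0.98. This often indicates possible multicollinearity among the predictor variables.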
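
The relation tStat = Estimate/SE can also be checked from the Coefficients table of the fitted model. The following is a minimal sketch, assuming Statistics and Machine Learning Toolbox is available and that the pValue column holds two-sided p-values from a t distribution with the error degrees of freedom. The variable names tManual and pManual are illustrative and do not come from this page.

```matlab
% Refit the model from the example above.
load hald
mdl = fitlm(ingredients,heat);

% Recompute each t-statistic as Estimate/SE.
tManual = mdl.Coefficients.Estimate ./ mdl.Coefficients.SE;

% Two-sided p-values from the t distribution with DFE degrees of freedom
% (assumed here to be how the pValue column is obtained).
pManual = 2*tcdf(abs(tManual),mdl.DFE,'upper');

% Largest absolute differences from the values stored in the model.
% Both should be close to zero if the assumptions above hold.
max(abs(tManual - mdl.Coefficients.tStat))
max(abs(pManual - mdl.Coefficients.pValue))
```

For the x1 row of the display, the same relation gives 1.5511/0.74477 ≈ 2.0827, matching the tStat column.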

Use stepwise regression to decide which variables to include in the model.

load hald
mdl = stepwiselm(ingredients,heat)
1. Adding x4, FStat = 22.7985, pValue = 0.000576232
2. Adding x1, FStat = 108.2239, pValue = 1.105281e-06
mdl = 
Linear regression model:
    y ~ 1 + x1 + x4

Estimated Coefficients:
                   Estimate       SE        tStat       pValue
                   ________    ________    _______    __________

    (Intercept)       103.1       2.124      48.54    3.3243e-13
    x1                 1.44     0.13842     10.403    1.1053e-06
    x4             -0.61395    0.048645    -12.621    1.8149e-07

Number of observations: 13, Error degrees of freedom: 10
Root Mean Squared Error: 2.73
R-squared: 0.972,  Adjusted R-Squared: 0.967
F-statistic vs. constant model: 177, p-value = 1.58e-08

In this example, stepwiselm starts with the constant model (default) and uses forward selection to incrementally add x4 and x1. Each predictor variable in the final model is significant given that the other one is in the model. The algorithm stops when adding any of the remaining predictor variables no longer significantly improves the model. For details on stepwise regression, see stepwiselm.
