Information Criteria for Model Selection

Misspecification tests, such as the likelihood ratio (`lratiotest`), Lagrange multiplier (`lmtest`), and Wald (`waldtest`) tests, are appropriate only for comparing nested models. In contrast, information criteria are model selection tools that can compare any models fit to the same data; the models being compared do not need to be nested.

Information criteria are likelihood-based measures of model fit that include a penalty for complexity (specifically, the number of parameters). Different information criteria are distinguished by the form of the penalty, and can favor different models.

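Let $\log L(\hat{\theta})$ denote the value of the maximized loglikelihood objective function for a model with $k$ parameters fit to $T$ data points. The `aicbic` function returns these information criteria: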

Akaike information criterion (AIC): The AIC compares models from the perspective of information entropy, as measured by Kullback-Leibler divergence. The AIC for a given model is

$-2 \log L(\hat{\theta}) + 2k.$

Bayesian (Schwarz) information criterion (BIC): The BIC compares models from the perspective of decision theory, as measured by expected loss. The BIC for a given model is

$-2 \log L(\hat{\theta}) + k \log(T).$

Corrected AIC (AICc): In small samples, the AIC tends to overfit. The AICc adds a second-order bias-correction term to the AIC for better performance in small samples. The AICc for a given model is

$\mathrm{AIC} + \frac{2k(k + 1)}{T - k - 1}.$

The bias-correction term increases the penalty on the number of parameters relative to the AIC. Because the term approaches 0 with increasing sample size, the AICc approaches the AIC asymptotically.

The analysis in [3] suggests using the AICc when `numObs/numParam` < 40.

Consistent AIC (CAIC): The CAIC imposes an additional penalty for complex models, as compared to the BIC. The CAIC for a given model is

$-2 \log L(\hat{\theta}) + k(\log(T) + 1) = \mathrm{BIC} + k.$

Hannan-Quinn criterion (HQC): The HQC imposes a smaller penalty on complex models than the BIC in large samples. The HQC for a given model is

$-2 \log L(\hat{\theta}) + 2k \log(\log(T)).$

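Regardless of the information criterion, when you compare values for multiple models, smaller values of the criterion indicate a better, more parsimonious fit.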
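As a rough illustration of how the five penalties differ, the following sketch evaluates each formula directly in base MATLAB. The inputs `logL`, `k`, and `T` are made-up values chosen for illustration, not output from a fitted model, and the variable names are assumptions rather than anything `aicbic` defines.

```matlab
% Illustrative inputs (assumed values, not taken from a fitted model)
logL = -42.05;   % maximized loglikelihood
k    = 2;        % number of estimated parameters
T    = 50;       % number of observations

dev  = -2*logL;                        % likelihood term shared by all criteria
aic  = dev + 2*k;
bic  = dev + k*log(T);
aicc = aic + 2*k*(k + 1)/(T - k - 1);  % requires T > k + 1
caic = bic + k;                        % same as dev + k*(log(T) + 1)
hqc  = dev + 2*k*log(log(T));

fprintf('AIC  = %.4f\nBIC  = %.4f\nAICc = %.4f\nCAIC = %.4f\nHQC  = %.4f\n', ...
    aic,bic,aicc,caic,hqc)
```

With `T = 50`, the penalty per parameter is 2 for the AIC, about 2.73 for the HQC, about 3.91 for the BIC, and about 4.91 for the CAIC, so the criteria order the cost of an extra parameter differently even though they share the same likelihood term.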

Some experts scale information criteria values by $T$. `aicbic` scales results when you set the `'Normalize'` name-value pair argument to `true`.
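The sketch below assumes that scaling means dividing each criterion by the sample size `T`. Under that assumption, normalization changes the magnitude of the values but not which model is selected, because every model is fit to the same data. The AIC values are invented for illustration.

```matlab
T   = 50;                          % sample size (assumed)
aic = [88.10 90.08 90.10 92.08];   % illustrative unnormalized AIC values

aicNorm = aic/T;                   % scale by the sample size

[~,best]     = min(aic);           % model chosen from raw values
[~,bestNorm] = min(aicNorm);       % model chosen from scaled values
sameChoice   = isequal(best,bestNorm)
```

Because dividing by a positive constant preserves order, the scaled values are mainly useful for reporting criteria on a per-observation basis.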

Compute Information Criteria Using `aicbic`


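This example shows how to use `aicbic` to compute information criteria for several competing GARCH models fit to simulated data. Although this example uses `aicbic`, some Statistics and Machine Learning Toolbox™ and Econometrics Toolbox™ model fitting functions also return information criteria in their estimation summaries.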

Simulate Data

Simulate a random path of length 50 from the ARCH(1) data generating process (DGP)

$\begin{array}{rcl} y_{t} & = & \varepsilon_{t} \\ \varepsilon_{t}^{2} & = & 0.5 + 0.1 \varepsilon_{t - 1}^{2}, \end{array}$

where $\varepsilon_{t}$ is a random Gaussian series of innovations.

    rng(1) % For reproducibility
    DGP = garch('ARCH',{0.1},'Constant',0.5);
    T = 50;
    y = simulate(DGP,T);
    plot(y)
    ylabel('Innovation')
    xlabel('Time')

[Figure: line plot of the simulated series, with "Innovation" on the y-axis and "Time" on the x-axis.]
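To make the DGP concrete, here is a minimal hand-rolled sketch of an ARCH(1) recursion in base MATLAB. It assumes the usual reading in which the second DGP equation gives the conditional variance of the innovation, and it starts the recursion at the unconditional variance. The names `omega`, `alpha`, `sigma2`, and `prevSq` are illustrative, and the resulting path is not expected to match the one that `simulate` returns, since presample handling and random number consumption may differ.

```matlab
rng(1)            % for reproducibility of this sketch only
T     = 50;       % path length
omega = 0.5;      % constant term
alpha = 0.1;      % ARCH(1) coefficient

z      = randn(T,1);     % standard Gaussian shocks
sigma2 = zeros(T,1);     % conditional variances
y      = zeros(T,1);     % simulated innovations

% Assumption: initialize with the unconditional variance omega/(1 - alpha)
prevSq = omega/(1 - alpha);
for t = 1:T
    sigma2(t) = omega + alpha*prevSq;   % variance given the previous squared innovation
    y(t)      = sqrt(sigma2(t))*z(t);   % scale the Gaussian shock
    prevSq    = y(t)^2;
end
```

Each conditional variance is at least `omega`, and a large innovation raises the variance of the next one only slightly because `alpha` is small.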

Create Competing Models

Assume that the DGP is unknown, and that the ARCH(1), GARCH(1,1), ARCH(2), and GARCH(1,2) models are appropriate for describing the DGP.

For each competing model, create a `garch` model template for estimation.

    Mdl(1) = garch(0,1);
    Mdl(2) = garch(1,1);
    Mdl(3) = garch(0,2);
    Mdl(4) = garch(1,2);

Estimate Models

Fit each model to the simulated data `y`, compute the loglikelihood, and suppress the estimation display.

    numMdl = numel(Mdl);
    logL = zeros(numMdl,1); % Preallocate
    numParam = zeros(numMdl,1);
    for j = 1:numMdl
        [EstMdl,~,logL(j)] = estimate(Mdl(j),y,'Display','off');
        results = summarize(EstMdl);
        numParam(j) = results.NumEstimatedParameters;
    end

Compute and Compare Information Criteria

For each model, compute all available information criteria. Normalize the results by the sample size `T`.

    [~,~,ic] = aicbic(logL,numParam,T,'Normalize',true)

    ic = struct with fields:
         aic: [1.7619 1.8016 1.8019 1.8416]
         bic: [1.8384 1.9163 1.9167 1.9946]
        aicc: [1.7670 1.8121 1.8124 1.8594]
        caic: [1.8784 1.9763 1.9767 2.0746]
         hqc: [1.7911 1.8453 1.8456 1.8999]

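`ic` is a 1-D structure array with a field for each information criterion. Each field contains a vector of measurements; element `j` corresponds to the model yielding loglikelihood `logL(j)`.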
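As a sanity check on how the fields relate, the sketch below rebuilds the normalized BIC values from the displayed normalized AIC values using the formulas given earlier. The parameter counts in `k` are an assumption (one constant plus one coefficient per ARCH or GARCH lag); the example output does not display `numParam`.

```matlab
T = 50;
k = [2 3 3 4];   % assumed parameter counts: constant plus ARCH/GARCH lags

aicNorm = [1.7619 1.8016 1.8019 1.8416];   % displayed normalized AIC
bicNorm = [1.8384 1.9163 1.9167 1.9946];   % displayed normalized BIC

dev      = T*aicNorm - 2*k;        % recover -2*logL from the normalized AIC
bicCheck = (dev + k.*log(T))/T;    % rebuild the normalized BIC from its formula

maxGap = max(abs(bicCheck - bicNorm))   % limited by the four displayed decimals
```

The recovered values agree with the displayed BIC to within the rounding of the printed digits, which is consistent with the normalized criteria being the formula values divided by `T`.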

For each criterion, determine the model that yields the minimum value.

    [~,minIdx] = structfun(@min,ic);
    [Mdl(minIdx).Description]'

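    ans = 5x1 string
        "GARCH(0,1) Conditional Variance Model (Gaussian Distribution)"
        "GARCH(0,1) Conditional Variance Model (Gaussian Distribution)"
        "GARCH(0,1) Conditional Variance Model (Gaussian Distribution)"
        "GARCH(0,1) Conditional Variance Model (Gaussian Distribution)"
        "GARCH(0,1) Conditional Variance Model (Gaussian Distribution)"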

The model that minimizes all criteria is the ARCH(1) model, which has the same structure as the DGP.
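Beyond picking the minimum, it can help to see how far each competing model is from the best one. The following sketch re-enters the normalized values from the displayed output so that it runs without refitting, then computes the gap to the minimum for every criterion. The variable `delta` is an illustrative name, not an `aicbic` output.

```matlab
% Normalized criteria copied from the displayed output
ic.aic  = [1.7619 1.8016 1.8019 1.8416];
ic.bic  = [1.8384 1.9163 1.9167 1.9946];
ic.aicc = [1.7670 1.8121 1.8124 1.8594];
ic.caic = [1.8784 1.9763 1.9767 2.0746];
ic.hqc  = [1.7911 1.8453 1.8456 1.8999];

% Index of the minimizing model for each criterion (5-by-1 vector)
[~,minIdx] = structfun(@min,ic);

% Gap between each model and the best model, per criterion
delta = structfun(@(v) v - min(v),ic,'UniformOutput',false);

disp(minIdx')
disp(delta.bic)
```

In these values, the GARCH(1,1) and ARCH(2) models (elements 2 and 3) differ from each other by less than 0.001 under every criterion, while both trail the ARCH(1) model by a visibly larger margin.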

References

[1] Akaike, Hirotugu. "Information Theory and an Extension of the Maximum Likelihood Principle." In Selected Papers of Hirotugu Akaike, edited by Emanuel Parzen, Kunio Tanabe, and Genshiro Kitagawa, 199–213. New York: Springer, 1998. https://doi.org/10.1007/978-1-4612-1694-0_15.

[2] Akaike, Hirotugu. "A New Look at the Statistical Model Identification." IEEE Transactions on Automatic Control 19, no. 6 (December 1974): 716–23. https://doi.org/10.1109/TAC.1974.1100705.

[3] Burnham, Kenneth P., and David R. Anderson. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. 2nd ed. New York: Springer, 2002.

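[4] Hannan, Edward J., and Barry G. Quinn. "The Determination of the Order of an Autoregression." Journal of the Royal Statistical Society: Series B (Methodological) 41, no. 2 (January 1979): 190–95. https://doi.org/10.1111/j.2517-6161.1979.tb01072.x.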

[5] Lütkepohl, Helmut, and Markus Krätzig, editors. Applied Time Series Econometrics. 1st ed. Cambridge University Press, 2004. https://doi.org/10.1017/CBO9780511606885.

[6] Schwarz, Gideon. "Estimating the Dimension of a Model." The Annals of Statistics 6, no. 2 (March 1978): 461–64. https://doi.org/10.1214/aos/1176344136.