
Bayesian Optimization Workflow

What Is Bayesian Optimization?

Optimization, in its most general form, is the process of locating a point that minimizes a real-valued function called the objective function. Bayesian optimization is the name of one such process. Bayesian optimization internally maintains a Gaussian process model of the objective function, and uses objective function evaluations to train the model. One innovation in Bayesian optimization is the use of an acquisition function, which the algorithm uses to determine the next point to evaluate. The acquisition function can balance sampling at points that have low modeled objective function values and exploring areas that have not yet been modeled well. For details, see Bayesian Optimization Algorithm.

Bayesian optimization is part of Statistics and Machine Learning Toolbox™ because it is well suited to optimizing hyperparameters of classification and regression algorithms. A hyperparameter is an internal parameter of a classifier or regression function, such as the box constraint of a support vector machine or the learning rate of a robust classification ensemble. These parameters can strongly affect the performance of a classifier or regressor, and yet optimizing them is typically difficult or time-consuming. See Bayesian Optimization Characteristics.

Typically, optimizing the hyperparameters means that you try to minimize the cross-validation loss of a classifier or regression model.
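For instance, the following sketch (using the ionosphere sample data and arbitrary hand-picked hyperparameter values) computes the kind of cross-validation loss that the optimization searches over:

    % Cross-validation loss of an SVM for one fixed choice of hyperparameters.
    % Hyperparameter optimization searches over such values to minimize this loss.
    load ionosphere                          % sample data: predictors X, class labels Y
    CVSVM = fitcsvm(X,Y,'KFold',5, ...
        'BoxConstraint',1,'KernelScale',2);  % hand-picked values, for illustration only
    loss = kfoldLoss(CVSVM)                  % 5-fold misclassification rate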

Ways to Perform Bayesian Optimization

You can perform a Bayesian optimization in several ways:

  • fitcauto and fitrauto — Pass predictor and response data to the fitcauto or fitrauto function to optimize across a selection of model types and hyperparameter values. Unlike the other approaches, using fitcauto or fitrauto does not require you to specify a single model before the optimization; model selection is part of the optimization process. The optimization minimizes cross-validation loss, which is modeled using a multi-TreeBagger model in fitcauto and a multi-RegressionGP model in fitrauto, rather than a single Gaussian process regression model as used in other approaches. See Bayesian Optimization for fitcauto and Bayesian Optimization for fitrauto.

  • Classification Learner and Regression Learner apps — Choose Optimizable models in the machine learning apps and automatically tune their hyperparameter values by using Bayesian optimization. The optimization minimizes the model loss based on the selected validation options. This approach has fewer tuning options than using a fit function, but allows you to perform Bayesian optimization directly in the apps. See Hyperparameter Optimization in Classification Learner App and Hyperparameter Optimization in Regression Learner App.

  • Fit functions — Include the OptimizeHyperparameters name-value argument in many fitting functions to apply Bayesian optimization automatically. The optimization minimizes cross-validation loss. This approach gives you fewer tuning options than using bayesopt, but enables you to perform Bayesian optimization more easily. See Bayesian Optimization Using a Fit Function.

  • bayesopt — Exert the most control over your optimization by calling bayesopt directly. This approach requires you to write an objective function, which does not have to represent cross-validation loss. See Bayesian Optimization Using bayesopt.
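As a small illustration of the first approach, a minimal fitcauto call can be a one-liner; in this sketch (the ionosphere sample data is an arbitrary choice) the optimization selects both the model type and its hyperparameter values:

    % Joint model selection and hyperparameter tuning with fitcauto.
    load ionosphere                  % sample data: predictors X, class labels Y
    rng default                      % for reproducibility
    [Mdl,OptResults] = fitcauto(X,Y) % best model found, plus the optimization results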

Bayesian Optimization Using a Fit Function

To minimize the error in a cross-validated response via Bayesian optimization, follow these steps.

  1. Choose your classification or regression solver among the fit functions that accept the OptimizeHyperparameters name-value argument.

  2. Decide on the hyperparameters to optimize, and pass them in the OptimizeHyperparameters name-value argument. For each fit function, you can choose from a set of hyperparameters. See Eligible Hyperparameters for Fit Functions, use the hyperparameters function, or consult the fit function reference page.

    You can pass a cell array of parameter names. You can also set 'auto' as the OptimizeHyperparameters value, which chooses a typical set of hyperparameters to optimize, or 'all' to optimize all available parameters.

  3. For the ensemble fit functions fitcecoc, fitcensemble, and fitrensemble, also include parameters of the weak learners in the OptimizeHyperparameters cell array.

  4. Optionally, create an options structure for the HyperparameterOptimizationOptions name-value argument. See Hyperparameter Optimization Options for Fit Functions.

  5. Call the fit function with the appropriate name-value arguments.

For examples, see Optimize Classifier Fit Using Bayesian Optimization and Optimize a Boosted Regression Ensemble. Also, every fit function reference page contains a Bayesian optimization example. A minimal sketch of the workflow appears below.
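The following sketch applies steps 1–5 to an SVM classifier on the ionosphere sample data; the evaluation budget and other option values are illustrative choices, not recommendations:

    % Steps 1-5: optimize a typical set of SVM hyperparameters on sample data.
    load ionosphere                            % predictors X, class labels Y
    rng default                                % for reproducibility
    opts = struct('MaxObjectiveEvaluations',30,'ShowPlots',false);
    Mdl = fitcsvm(X,Y, ...
        'OptimizeHyperparameters','auto', ...  % typically BoxConstraint and KernelScale
        'HyperparameterOptimizationOptions',opts);
    Mdl.HyperparameterOptimizationResults      % examine the optimization results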

Bayesian Optimization Using bayesopt

To perform a Bayesian optimization using bayesopt, follow these steps.

  1. Prepare your variables. See Variables for a Bayesian Optimization.

  2. Create your objective function. See Bayesian Optimization Objective Functions. If necessary, create constraints, too. See Constraints in Bayesian Optimization. To include extra parameters in an objective function, see Parameterizing Functions.

  3. Decide on options, meaning the bayesopt Name,Value pairs. You are not required to pass any options to bayesopt, but you typically do, especially when trying to improve a solution.

  4. Call bayesopt.

  5. Examine the solution. You can decide to resume the optimization by using resume, or restart the optimization, usually with modified options.

For an example, see Optimize Cross-Validated Classifier Using bayesopt. A condensed sketch of the same pattern follows.
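This sketch follows steps 1–4 for a cross-validated SVM; the variable ranges, fold count, and evaluation budget are illustrative assumptions:

    % Step 1: variables to optimize, searched on log scales.
    box   = optimizableVariable('box',[1e-3,1e3],'Transform','log');
    sigma = optimizableVariable('sigma',[1e-3,1e3],'Transform','log');

    % Step 2: objective function returning 5-fold cross-validation loss.
    load ionosphere                      % predictors X, class labels Y
    rng default                          % for reproducibility
    c = cvpartition(numel(Y),'KFold',5); % fixed partition, reused at every evaluation
    fun = @(z) kfoldLoss(fitcsvm(X,Y,'CVPartition',c, ...
        'BoxConstraint',z.box,'KernelScale',z.sigma));

    % Steps 3 and 4: choose options and call bayesopt.
    results = bayesopt(fun,[box,sigma],'MaxObjectiveEvaluations',30);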

Bayesian Optimization Characteristics

The Bayesian optimization algorithm is best suited to these problem types.

Characteristic    Details
Low dimension

Bayesian optimization works best in a low number of dimensions, typically 10 or fewer. While Bayesian optimization can solve some problems with a few dozen variables, it is not recommended for dimensions above about 50.

Expensive objective

Bayesian optimization is designed for objective functions that are slow to evaluate. It has considerable overhead, typically several seconds for each iteration.

Low accuracy

Bayesian optimization does not necessarily give very accurate results. If you have a deterministic objective function, you can sometimes improve the accuracy by starting a standard optimization algorithm from the bayesopt solution, as in the sketch after this table.

Global solution

Bayesian optimization is a global technique. Unlike many other algorithms, you do not have to start the algorithm from various initial points in order to search for a global solution.

Hyperparameters

Bayesian optimization is well suited to optimizing hyperparameters of another function. A hyperparameter is a parameter that controls the behavior of a function. For example, the fitcsvm function fits an SVM model to data. It has the hyperparameters BoxConstraint and KernelScale for its 'rbf' KernelFunction. For an example of Bayesian optimization applied to hyperparameters, see Optimize Cross-Validated Classifier Using bayesopt.
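As a sketch of the local refinement mentioned under Low accuracy, assuming results and fun are the bayesopt output and objective function from the sketch above, and using fminsearch as just one possible local solver:

    % Refine a bayesopt solution with a local optimizer (deterministic objectives only).
    x0 = bestPoint(results);            % best point found by bayesopt, as a one-row table
    vars = x0.Properties.VariableNames;
    f = @(v) fun(array2table(v,'VariableNames',vars));  % wrap the table-based objective
    xRefined = fminsearch(f,table2array(x0));           % start a standard optimizer there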


Eligible Hyperparameters for Fit Functions

Function Name     Eligible Parameters
fitcdiscr         Delta
                  Gamma
                  DiscrimType
fitcecoc          Coding
                  Eligible fitcdiscr parameters for 'Learners','discriminant'
                  Eligible fitckernel parameters for 'Learners','kernel'
                  Eligible fitcknn parameters for 'Learners','knn'
                  Eligible fitclinear parameters for 'Learners','linear'
                  Eligible fitcsvm parameters for 'Learners','svm'
                  Eligible fitctree parameters for 'Learners','tree'
fitcensemble      Method
                  NumLearningCycles
                  LearnRate
                  Eligible fitcdiscr parameters for 'Learners','discriminant'
                  Eligible fitcknn parameters for 'Learners','knn'
                  Eligible fitctree parameters for 'Learners','tree'
fitcgam           InitialLearnRateForInteractions
                  InitialLearnRateForPredictors
                  Interactions
                  MaxNumSplitsPerInteraction
                  MaxNumSplitsPerPredictor
                  NumTreesPerInteraction
                  NumTreesPerPredictor
fitckernel        Learner
                  KernelScale
                  Lambda
                  NumExpansionDimensions
fitcknn           NumNeighbors
                  Distance
                  DistanceWeight
                  Exponent
                  Standardize
fitclinear        Lambda
                  Learner
                  Regularization
fitcnb            DistributionNames
                  Width
                  Kernel
fitcnet           Activations
                  Lambda
                  LayerBiasesInitializer
                  LayerWeightsInitializer
                  LayerSizes
                  Standardize
fitcsvm           BoxConstraint
                  KernelScale
                  KernelFunction
                  PolynomialOrder
                  Standardize
fitctree          MinLeafSize
                  MaxNumSplits
                  SplitCriterion
                  NumVariablesToSample
fitrensemble      Method
                  NumLearningCycles
                  LearnRate
                  Eligible fitrtree parameters for 'Learners','tree':
                  MinLeafSize
                  MaxNumSplits
                  NumVariablesToSample
fitrgam           InitialLearnRateForInteractions
                  InitialLearnRateForPredictors
                  Interactions
                  MaxNumSplitsPerInteraction
                  MaxNumSplitsPerPredictor
                  NumTreesPerInteraction
                  NumTreesPerPredictor
fitrgp            Sigma
                  BasisFunction
                  KernelFunction
                  KernelScale
                  Standardize
fitrkernel        Learner
                  KernelScale
                  Lambda
                  NumExpansionDimensions
                  Epsilon
fitrlinear        Lambda
                  Learner
                  Regularization
fitrnet           Activations
                  Lambda
                  LayerBiasesInitializer
                  LayerWeightsInitializer
                  LayerSizes
                  Standardize
fitrsvm           BoxConstraint
                  KernelScale
                  Epsilon
                  KernelFunction
                  PolynomialOrder
                  Standardize
fitrtree          MinLeafSize
                  MaxNumSplits
                  NumVariablesToSample

Hyperparameter Optimization Options for Fit Functions

When optimizing using a fit function, you have these options available in the HyperparameterOptimizationOptions name-value argument. Pass the value as a structure. All fields in the structure are optional.

Optimizer

  • 'bayesopt' — Use Bayesian optimization. Internally, this setting calls bayesopt.

  • 'gridsearch' — Use grid search with NumGridDivisions values per dimension.

  • 'randomsearch' — Search at random among MaxObjectiveEvaluations points.

'gridsearch' searches in a random order, using uniform sampling without replacement from the grid. After optimization, you can get a table in grid order by using the command sortrows(Mdl.HyperparameterOptimizationResults).

Default: 'bayesopt'

AcquisitionFunctionName

  • 'expected-improvement-per-second-plus'

  • 'expected-improvement'

  • 'expected-improvement-plus'

  • 'expected-improvement-per-second'

  • 'lower-confidence-bound'

  • 'probability-of-improvement'

Acquisition functions whose names include per-second do not yield reproducible results because the optimization depends on the runtime of the objective function. Acquisition functions whose names include plus modify their behavior when they are overexploiting an area. For more details, see Acquisition Function Types.

Default: 'expected-improvement-per-second-plus'

MaxObjectiveEvaluations

Maximum number of objective function evaluations.

Default: 30 for 'bayesopt' and 'randomsearch', and the entire grid for 'gridsearch'

MaxTime

Time limit, specified as a positive real scalar. The time limit is in seconds, as measured by tic and toc. The run time can exceed MaxTime because MaxTime does not interrupt function evaluations.

Default: Inf

NumGridDivisions

For 'gridsearch', the number of values in each dimension. The value can be a vector of positive integers giving the number of values for each dimension, or a scalar that applies to all dimensions. This field is ignored for categorical variables.

Default: 10

ShowPlots

Logical value indicating whether to show plots. If true, this field plots the best observed objective function value against the iteration number. If you use Bayesian optimization (Optimizer is 'bayesopt'), then this field also plots the best estimated objective function value. The best observed objective function values and best estimated objective function values correspond to the values in the BestSoFar (observed) and BestSoFar (estim.) columns of the iterative display, respectively. You can find these values in the properties ObjectiveMinimumTrace and EstimatedObjectiveMinimumTrace of Mdl.HyperparameterOptimizationResults. If the problem includes one or two optimization parameters for Bayesian optimization, then ShowPlots also plots a model of the objective function against the parameters.

Default: true

SaveIntermediateResults

Logical value indicating whether to save the results when Optimizer is 'bayesopt'. If true, this field overwrites a workspace variable named 'BayesoptResults' at each iteration. The variable is a BayesianOptimization object.

Default: false

Verbose

Display at the command line:

  • 0 — No iterative display

  • 1 — Iterative display

  • 2 — Iterative display with extra information

For details, see the bayesopt Verbose name-value argument and the example Optimize Classifier Fit Using Bayesian Optimization.

Default: 1

UseParallel

Logical value indicating whether to run the Bayesian optimization in parallel, which requires Parallel Computing Toolbox™. Due to the nonreproducibility of parallel timing, parallel Bayesian optimization does not necessarily yield reproducible results. For details, see Parallel Bayesian Optimization.

Default: false

Repartition

Logical value indicating whether to repartition the cross-validation at every iteration. If this field is false, the optimizer uses a single partition for the optimization. The setting true usually gives the most robust results because it takes partitioning noise into account. However, for good results, true requires at least twice as many function evaluations.

Default: false

Use no more than one of the following three fields.

CVPartition

A cvpartition object, as created by cvpartition.

Default: 'Kfold',5 if you do not specify a cross-validation field

Holdout

A scalar in the range (0,1) representing the holdout fraction.

Kfold

An integer greater than 1.
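A sketch of building such an options structure and passing it to a fit function (the field values here are arbitrary illustrations, not recommendations):

    % Pass optimization options as a structure to a fit function.
    load ionosphere                      % predictors X, class labels Y
    rng default                          % for reproducibility
    opts = struct('Optimizer','bayesopt', ...
        'AcquisitionFunctionName','expected-improvement-plus', ...
        'MaxObjectiveEvaluations',60, ...
        'Repartition',false, ...
        'Kfold',10);
    Mdl = fitctree(X,Y,'OptimizeHyperparameters','auto', ...
        'HyperparameterOptimizationOptions',opts);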

See Also


Related Topics