
FeatureSelectionNCARegression class

Feature selection for regression using neighborhood component analysis (NCA)

Description

FeatureSelectionNCARegression contains the data, fitting information, feature weights, and other model parameters of a neighborhood component analysis (NCA) model. fsrnca learns the feature weights using a diagonal adaptation of NCA and returns a FeatureSelectionNCARegression object. The function achieves feature selection by regularizing the feature weights.

Construction

Create a FeatureSelectionNCARegression object using fsrnca.
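
A minimal construction sketch, where X (an n-by-p predictor matrix) and y (an n-by-1 response vector) are placeholder variables rather than data from this page:

% Fit a diagonal-adaptation NCA regression model; mdl is a
% FeatureSelectionNCARegression object.
mdl = fsrnca(X,y);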

Properties


Number of observations in the training data (X and Y) after removing NaN or Inf values, stored as a scalar.

Data Types: double

Model parameters used for training the model, stored as a structure.

You can access the fields of ModelParameters using dot notation.

For example, for a FeatureSelectionNCARegression object named mdl, you can access the LossFunction value using mdl.ModelParameters.LossFunction.

Data Types: struct
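
For instance, a minimal sketch assuming a fitted model named mdl:

% Read the loss function that fsrnca used during fitting from the
% stored model parameters.
lossFun = mdl.ModelParameters.LossFunction;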

Regularization parameter used for training this model, stored as a scalar. For n observations, the best Lambda value that minimizes the generalization error of the NCA model is expected to be a multiple of 1/n.

Data Types: double
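
A hedged sketch of supplying a regularization value on the order of 1/n, with X and y as placeholder training data:

n = size(X,1);                  % number of observations
% Try a Lambda value that is a multiple of 1/n, as suggested above.
mdl = fsrnca(X,y,'Lambda',1/n);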

Name of the fitting method used to fit this model, stored as one of the following:

  • 'exact' — Perform fitting using all of the data.

  • 'none' — No fitting. Use this option to evaluate the generalization error of the NCA model using the initial feature weights supplied in the call to fsrnca.

  • 'average' — The software divides the data into partitions (subsets), fits each partition using the exact method, and returns the average of the feature weights. You can specify the number of partitions using the NumPartitions name-value pair argument (see the sketch after this list).
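
A minimal sketch of the 'average' option, with placeholder X and y; the number of partitions is only illustrative:

% Fit each of 5 partitions with the exact method and average the
% resulting feature weights.
mdl = fsrnca(X,y,'FitMethod','average','NumPartitions',5);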

Name of the solver used to fit this model, stored as one of the following:

  • 'lbfgs' — Limited memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) algorithm

  • 'sgd' — Stochastic gradient descent (SGD) algorithm

  • 'minibatch-lbfgs' — Stochastic gradient descent with LBFGS algorithm applied to mini-batches (see the sketch after this list)
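
A minimal sketch of selecting a solver explicitly, with placeholder X and y:

% Request the stochastic gradient descent solver.
mdl = fsrnca(X,y,'Solver','sgd');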

Relative convergence tolerance on the gradient norm for the 'lbfgs' and 'minibatch-lbfgs' solvers, stored as a positive scalar value.

Data Types: double

Maximum number of iterations for optimization, stored as a positive integer value.

Data Types: double

Maximum number of passes for the 'sgd' and 'minibatch-lbfgs' solvers. Every pass processes all of the observations in the data.

Data Types: double

Initial learning rate for the 'sgd' and 'minibatch-lbfgs' solvers. The learning rate decays over iterations, starting at the value specified for InitialLearningRate.

Use the NumTuningIterations and TuningSubsetSize name-value pair arguments to control the automatic tuning of the initial learning rate in the call to fsrnca.

Data Types: double
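
A hedged sketch of letting fsrnca tune the initial learning rate for the SGD solver, with placeholder X and y and illustrative tuning settings:

% Omit InitialLearningRate so that fsrnca tunes it automatically, using
% 20 tuning iterations on a random subset of 500 observations.
mdl = fsrnca(X,y,'Solver','sgd', ...
    'NumTuningIterations',20,'TuningSubsetSize',500);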

Verbosity level indicator, stored as a nonnegative integer. Possible values are:

  • 0 — No convergence summary

  • 1 — Convergence summary, including norm of gradient and objective function value

  • >1 — More convergence information, depending on the fitting algorithm. When you use the 'minibatch-lbfgs' solver and verbosity level > 1, the convergence information includes the iteration log from intermediate mini-batch LBFGS fits.

Data Types: double

Initial feature weights, stored as a p-by-1 vector of positive real scalars, where p is the number of predictors in X.

Data Types: double

Feature weights, stored as a p-by-1 vector of real scalar values, where p is the number of predictors in X.

For 'FitMethod' equal to 'average', FeatureWeights is a p-by-m matrix, where m is the number of partitions specified via the 'NumPartitions' name-value pair argument in the call to fsrnca.

The absolute value of FeatureWeights(k) is a measure of the importance of predictor k. If FeatureWeights(k) is close to 0, then predictor k does not influence the response in Y.

Data Types: double
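
For example, a minimal sketch that keeps only the predictors whose learned weights exceed a tolerance, assuming FitMethod is not 'average'; the threshold value is an illustrative assumption, not a recommendation:

tol = 0.02;                                    % illustrative threshold
selectedIdx = find(mdl.FeatureWeights > tol)   % indices of relevant predictors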

Fit information, stored as a structure with the following fields.

Field Name               Meaning
Iteration                Iteration index
Objective                Regularized objective function for minimization
UnregularizedObjective   Unregularized objective function for minimization
Gradient                 Gradient of the regularized objective function for minimization

  • For classification, UnregularizedObjective represents the negative of the leave-one-out accuracy of the NCA classifier on the training data.

  • For regression, UnregularizedObjective represents the leave-one-out loss between the true response and the predicted response when using the NCA regression model.

  • For the 'lbfgs' solver, Gradient is the final gradient. For the 'sgd' and 'minibatch-lbfgs' solvers, Gradient is the final mini-batch gradient.

  • If FitMethod is 'average', then FitInfo is an m-by-1 structure array, where m is the number of partitions specified via the 'NumPartitions' name-value pair argument.

You can access the fields of FitInfo using dot notation. For example, for a FeatureSelectionNCARegression object named mdl, you can access the Objective field using mdl.FitInfo.Objective.

Data Types: struct
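
A minimal sketch, assuming FitMethod is not 'average' so that FitInfo is a scalar structure:

% Read the final value of the regularized objective recorded during fitting.
finalObjective = mdl.FitInfo.Objective(end)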

Predictor means, stored as a p-by-1 vector for standardized training data. In this case, the predict method centers predictor matrix X by subtracting the respective element of Mu from every column.

If the data is not standardized during training, then Mu is empty.

Data Types: double

Predictor standard deviations, stored as a p-by-1 vector for standardized training data. In this case, the predict method scales predictor matrix X by dividing every column by the respective element of Sigma after centering the data using Mu.

If the data is not standardized during training, then Sigma is empty.

Data Types: double
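
A hedged sketch of the centering and scaling described above, assuming mdl was trained with standardization and Xnew is a placeholder matrix of new observations:

% Center each column by Mu and scale it by Sigma (reshaped to row vectors
% so that implicit expansion works for an n-by-p Xnew).
XnewStd = (Xnew - mdl.Mu(:)') ./ mdl.Sigma(:)';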

Predictor values used to train this model, stored as an n-by-p matrix, where n is the number of observations and p is the number of predictor variables in the training data.

Data Types: double

Response values used to train this model, stored as a numeric vector of size n, where n is the number of observations.

Data Types: double

Observation weights used to train this model, stored as a numeric vector of size n. The sum of the observation weights is n.

Data Types: double

Methods

loss Evaluate accuracy of learned feature weights on test data
predict Predict responses using neighborhood component analysis (NCA) regression model
refit Refit neighborhood component analysis (NCA) model for regression
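
A minimal sketch of the predict and loss methods, with placeholder test data Xtest and Ytest:

Ypred = predict(mdl,Xtest);   % predicted responses for the test predictors
L = loss(mdl,Xtest,Ytest)     % regression loss of the learned feature weights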

Examples


Load the sample data.

load imports-85

The first 15 columns contain continuous predictor variables, whereas the 16th column contains the response variable, which is the price of a car. Define the variables for the neighborhood component analysis model.

Predictors = X(:,1:15);
Y = X(:,16);

Fit a neighborhood component analysis (NCA) model for regression to detect the relevant features.

mdl = fsrnca(Predictors,Y);

The returned NCA model, mdl, is a FeatureSelectionNCARegression object. This object stores information about the training data, model, and optimization. You can access the object properties, such as the feature weights, using dot notation.

Plot the feature weights.

figure()
plot(mdl.FeatureWeights,'ro')
xlabel('Feature Index')
ylabel('Feature Weight')
grid on

[Figure: Feature weights plotted against feature index]

The weights of the irrelevant features are zero. The 'Verbose',1 option in the call to fsrnca displays the optimization information on the command line. You can also visualize the optimization process by plotting the objective function versus the iteration number.

figure()
plot(mdl.FitInfo.Iteration,mdl.FitInfo.Objective,'ro-')
grid on
xlabel('Iteration Number')
ylabel('Objective')

[Figure: Objective function value plotted against iteration number]

The ModelParameters property is a struct that contains more information about the model. You can access the fields of this property using dot notation. For example, check whether the data was standardized.

mdl.ModelParameters.Standardize
ans = logical
   0

0 means that the data was not standardized before fitting the NCA model. You can standardize the predictors when they are on very different scales using the 'Standardize',1 name-value pair argument in the call to fsrnca.
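
A minimal sketch of refitting with standardization, reusing the Predictors and Y variables from this example:

% Standardize each predictor before fitting the NCA regression model.
mdlStd = fsrnca(Predictors,Y,'Standardize',1);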

Copy Semantics

Value. To learn how value classes affect copy operations, see Copying Objects.

Version History

Introduced in R2016b