mdscale
Nonclassical multidimensional scaling
Syntax
Y = mdscale(D,p)
[Y,stress] = mdscale(D,p)
[Y,stress,disparities] = mdscale(D,p)
[...] = mdscale(D,p,'Name',value)
Description
Y = mdscale(D,p)
performs nonmetric multidimensional scaling on the n-by-n dissimilarity matrix D, and returns Y, a configuration of n points (rows) in p dimensions (columns). The Euclidean distances between points in Y approximate a monotonic transformation of the corresponding dissimilarities in D. By default, mdscale uses Kruskal's normalized stress1 criterion.
You can specify D as either a full n-by-n matrix, or in upper triangle form such as is output by pdist. A full dissimilarity matrix must be real and symmetric, and have zeros along the diagonal and non-negative elements everywhere else. A dissimilarity matrix in upper triangle form must have real, non-negative entries. mdscale treats NaNs in D as missing values and ignores those elements. Inf is not accepted.
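As a rough illustration of the accepted input forms, the sketch below passes the same dissimilarities to mdscale first as a pdist vector and then as a full matrix built with squareform. The data in X are random and purely illustrative. The use of 'Start','random' for the input containing NaN is an assumption, made because the classical starting configuration may not be available when entries are missing.

```matlab
% Illustrative data only: 20 random points in 4 dimensions.
rng(0);                      % fix the seed so the sketch is repeatable
X = randn(20,4);

% Upper triangle (vector) form, as produced by pdist.
Dvec = pdist(X);

% Full symmetric n-by-n form with zeros on the diagonal.
Dfull = squareform(Dvec);

% Both forms describe the same dissimilarities.
Yvec  = mdscale(Dvec,2);
Yfull = mdscale(Dfull,2);

% Mark one dissimilarity as missing. A random start is assumed to be
% needed here because the classical start may reject missing entries.
Dmiss = Dvec;
Dmiss(1) = NaN;
Ymiss = mdscale(Dmiss,2,'Start','random');
```

Each call returns a 20-by-2 configuration, one row per point.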
You can also specify D as a full similarity matrix, with ones along the diagonal and all other elements less than one. mdscale transforms a similarity matrix to a dissimilarity matrix in such a way that distances between the points returned in Y approximate sqrt(1-D). To use a different transformation, transform the similarities prior to calling mdscale.
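The following sketch contrasts the default handling of a similarity matrix with a transformation applied by hand. The Gaussian similarity used to build S and the alternative transformation 1-S are illustrative choices only, not something mdscale prescribes.

```matlab
% Illustrative similarity matrix built from random data. The Gaussian
% kernel is a hypothetical choice: it gives ones on the diagonal and
% values below one elsewhere.
rng(0);
X = randn(15,3);
S = exp(-squareform(pdist(X)).^2 / 2);

% Default handling: pass the similarities directly, so that distances
% in Y approximate sqrt(1-S).
Ydefault = mdscale(S,2);

% Custom handling: convert to dissimilarities before the call.
% Here 1-S has zeros on the diagonal and positive values elsewhere.
D = 1 - S;
Ycustom = mdscale(D,2);
```

Transforming before the call makes the chosen mapping from similarity to dissimilarity explicit, which may matter most for the metric criteria.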
[Y,stress] = mdscale(D,p)
returns the minimized stress, that is, the stress evaluated at Y.
[Y,stress,disparities] = mdscale(D,p)
returns the disparities, that is, the monotonic transformation of the dissimilarities D.
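To show how the three outputs relate, the sketch below recomputes the default stress1 value from Y and the disparities. It assumes the usual definition of Kruskal's stress1 (squared differences between distances and disparities, normalized by the sum of squared inter-point distances). The data are random and purely illustrative, and the recomputed value is expected to agree with the returned stress only up to numerical tolerance.

```matlab
% Illustrative data only.
rng(1);
X = randn(25,5);
D = pdist(X);

[Y,stress,disparities] = mdscale(D,2);

% Inter-point distances of the fitted configuration, in the same
% ordering as the vector D.
distances = pdist(Y);

% Assumed stress1 formula: residual sum of squares between distances
% and disparities, normalized by the sum of squared distances.
resid = distances(:) - disparities(:);
stressCheck = sqrt(sum(resid.^2) / sum(distances(:).^2));

fprintf('mdscale stress: %.6f, recomputed: %.6f\n',stress,stressCheck);
```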
[...] = mdscale(D,p,'Name',value)
specifies one or more optional parameter name/value pairs that control further details of mdscale. Specify Name in single quotes. Available parameters are:
Criterion — The goodness-of-fit criterion to minimize. This also determines the type of scaling, either non-metric or metric, that mdscale performs. Choices for non-metric scaling are:
'stress' — Stress normalized by the sum of squares of the inter-point distances, also known as stress1. This is the default.
'sstress' — Squared stress, normalized with the sum of 4th powers of the inter-point distances.
Choices for metric scaling are:
'metricstress' — Stress, normalized with the sum of squares of the dissimilarities.
'metricsstress' — Squared stress, normalized with the sum of 4th powers of the dissimilarities.
'sammon' — Sammon's nonlinear mapping criterion. Off-diagonal dissimilarities must be strictly positive with this criterion.
'strain' — A criterion equivalent to that used in classical multidimensional scaling.
Weights — A matrix or vector the same size as D, containing nonnegative dissimilarity weights. You can use these to weight the contribution of the corresponding elements of D in computing and minimizing stress. Elements of D corresponding to zero weights are effectively ignored.
Note
When you specify weights as a full matrix, its diagonal elements are ignored and have no effect, since the corresponding diagonal elements of D do not enter into the stress calculation.
Start — Method used to choose the initial configuration of points for Y. The choices are:
'cmdscale' — Use the classical multidimensional scaling solution. This is the default. 'cmdscale' is not valid when there are zero weights.
'random' — Choose locations randomly from an appropriately scaled p-dimensional normal distribution with uncorrelated coordinates.
An n-by-p matrix of initial locations, where n is the size of the matrix D and p is the number of columns of the output matrix Y. In this case, you can pass in [] for p and mdscale infers p from the second dimension of the matrix. You can also supply a 3-D array, implying a value for 'Replicates' from the array's third dimension.
Replicates — Number of times to repeat the scaling, each with a new initial configuration. The default is 1.
Options — Options for the iterative algorithm used to minimize the fitting criterion. Pass in an options structure created by statset. For example,
opts = statset(param1,val1,param2,val2, ...);
[...] = mdscale(...,'Options',opts)
The choices of statset parameters are:
'Display' — Level of display output. The choices are 'off' (the default), 'iter', and 'final'.
'MaxIter' — Maximum number of iterations allowed. The default is 200.
'TolFun' — Termination tolerance for the stress criterion and its gradient. The default is 1e-4.
'TolX' — Termination tolerance for the configuration location step size. The default is 1e-4.
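The name/value pairs described above can be combined in a single call. The following is a minimal sketch under stated assumptions: the data, the weight pattern, and the option values are all illustrative. Because some weights are zero, the default 'cmdscale' start is not valid, so the sketch uses random starts with several replicates.

```matlab
% Illustrative data only.
rng(2);
X = randn(30,4);
D = pdist(X);

% Weights the same size as D: ignore the first five dissimilarities.
W = ones(size(D));
W(1:5) = 0;

% Iteration options built with statset (values chosen for illustration).
opts = statset('Display','off','MaxIter',500,'TolFun',1e-5);

% Metric scaling with zero weights, random starts, and five replicates.
[Y,stress] = mdscale(D,2, ...
    'Criterion','metricstress', ...
    'Weights',W, ...
    'Start','random', ...
    'Replicates',5, ...
    'Options',opts);

% Explicit starting configuration: pass [] for p and let mdscale infer
% it from the second dimension of the start matrix.
Y0 = randn(30,2);
Y2 = mdscale(D,[],'Criterion','metricstress','Start',Y0);
```

Fixing the random seed with rng, as done here, keeps results from random starts repeatable between runs.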
Examples
load cereal.mat
X = [Calories Protein Fat Sodium Fiber ...
    Carbo Sugars Shelf Potass Vitamins];

% Take a subset from a single manufacturer.
X = X(strcmp('K',cellstr(Mfg)),:);

% Create a dissimilarity matrix.
dissimilarities = pdist(X);

% Use non-metric scaling to recreate the data in 2D,
% and make a Shepard plot of the results.
[Y,stress,disparities] = mdscale(dissimilarities,2);
distances = pdist(Y);
[dum,ord] = sortrows([disparities(:) dissimilarities(:)]);
plot(dissimilarities,distances,'bo', ...
     dissimilarities(ord),disparities(ord),'r.-');
xlabel('Dissimilarities'); ylabel('Distances/Disparities')
legend({'Distances' 'Disparities'},'Location','NW');

% Do metric scaling on the same dissimilarities.
figure
[Y,stress] = ...
    mdscale(dissimilarities,2,'criterion','metricsstress');
distances = pdist(Y);
plot(dissimilarities,distances,'bo', ...
     [0 max(dissimilarities)],[0 max(dissimilarities)],'r.-');
xlabel('Dissimilarities'); ylabel('Distances')