mdscale

Nonclassical multidimensional scaling

Syntax

Y = mdscale(D,p)
[Y,stress] = mdscale(D,p)
[Y,stress,disparities] = mdscale(D,p)
[...] = mdscale(D,p,'Name',value)

Description

Y = mdscale(D,p) performs nonmetric multidimensional scaling on the n-by-n dissimilarity matrix D, and returns Y, a configuration of n points (rows) in p dimensions (columns). The Euclidean distances between points in Y approximate a monotonic transformation of the corresponding dissimilarities in D. By default, mdscale uses Kruskal's normalized stress1 criterion.

You can specify D as either a full n-by-n matrix, or in upper triangle form such as is output by pdist. A full dissimilarity matrix must be real and symmetric, and have zeros along the diagonal and non-negative elements everywhere else. A dissimilarity matrix in upper triangle form must have real, non-negative entries. mdscale treats NaNs in D as missing values and ignores those elements. Inf is not accepted.
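The two input forms described above can be sketched as follows. This is a minimal illustration, assuming Statistics and Machine Learning Toolbox is available; the random data and the variable names (X, Dvec, Dfull, Yvec, Yfull) are hypothetical and not part of the original page.

```matlab
% Hypothetical data, used only to illustrate the two accepted input forms.
rng(0);                      % fixed seed so the illustrative data is repeatable
X = rand(10,4);              % 10 observations, 4 variables

Dvec  = pdist(X);            % upper triangle (vector) form, 10*9/2 = 45 entries
Dfull = squareform(Dvec);    % full 10-by-10 symmetric matrix with zero diagonal

Yvec  = mdscale(Dvec,2);     % pass the vector form directly
Yfull = mdscale(Dfull,2);    % or pass the full matrix

size(Yvec)                   % 10-by-2 configuration
size(Yfull)                  % 10-by-2 configuration
```

squareform converts between the vector and full-matrix forms in either direction, which is convenient when other code expects one form or the other.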

You can also specify D as a full similarity matrix, with ones along the diagonal and all other elements less than one. mdscale transforms a similarity matrix to a dissimilarity matrix in such a way that distances between the points returned in Y approximate sqrt(1-D). To use a different transformation, transform the similarities prior to calling mdscale.
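As a rough sketch of the similarity-matrix case, the code below passes a similarity matrix directly and then applies a custom transformation before the call. The Gaussian-kernel similarity and the choice D = 1 - S are assumptions made purely for illustration; neither comes from the original page.

```matlab
% Hypothetical data and an illustrative similarity measure.
rng(1);
X = rand(8,3);
S = exp(-squareform(pdist(X)).^2);   % ones on the diagonal, values below one elsewhere

% Default handling: pass the similarity matrix as is.
Y1 = mdscale(S,2);

% Custom transformation (an illustrative choice): convert to dissimilarities first.
D  = 1 - S;                          % zero diagonal, symmetric, non-negative
Y2 = mdscale(D,2);
```

Under a nonmetric criterion the fit depends largely on the rank order of the dissimilarities, so the choice of transformation tends to matter more when a metric criterion is used.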

[Y,stress] = mdscale(D,p) returns the minimized stress, i.e., the stress evaluated at Y.

[Y,stress,disparities] = mdscale(D,p) returns the disparities, that is, the monotonic transformation of the dissimilarities D.
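To make the relationship between the three outputs concrete, the following sketch recomputes a stress value from Y and disparities. It assumes the default criterion follows the standard Kruskal stress1 form, sqrt(sum((d - dhat).^2) / sum(d.^2)), and that disparities is returned in the same form as D; both are assumptions rather than statements from this page, so the two printed values are shown side by side rather than asserted equal.

```matlab
% Hypothetical data for illustration.
rng(2);
X = rand(12,5);
D = pdist(X);

[Y,stress,disparities] = mdscale(D,2);

% Distances between the fitted points, and the disparities as a row vector.
d    = pdist(Y);
dhat = disparities(:)';

% Assumed stress1 form: residual sum of squares over sum of squared distances.
stressCheck = sqrt(sum((d - dhat).^2) / sum(d.^2));

fprintf('reported stress:   %.6f\n', stress);
fprintf('recomputed stress: %.6f\n', stressCheck);
```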

[...] = mdscale(D,p,'Name',value) specifies one or more optional parameter name/value pairs that control further details of mdscale. Specify Name in single quotes. Available parameters are:

  • Criterion — The goodness-of-fit criterion to minimize. This also determines the type of scaling, either non-metric or metric, that mdscale performs. Choices for non-metric scaling are:

    • 'stress' — Stress normalized by the sum of squares of the inter-point distances, also known as stress1. This is the default.

    • 'sstress' — Squared stress, normalized with the sum of 4th powers of the inter-point distances.

    Choices for metric scaling are:

    • 'metricstress' — Stress, normalized with the sum of squares of the dissimilarities.

    • 'metricsstress' — Squared stress, normalized with the sum of 4th powers of the dissimilarities.

    • 'sammon' — Sammon's nonlinear mapping criterion. Off-diagonal dissimilarities must be strictly positive with this criterion.

    • 'strain' — A criterion equivalent to that used in classical multidimensional scaling.

  • Weights — A matrix or vector the same size as D, containing nonnegative dissimilarity weights. You can use these to weight the contribution of the corresponding elements of D in computing and minimizing stress. Elements of D corresponding to zero weights are effectively ignored.

    Note

    When you specify weights as a full matrix, its diagonal elements are ignored and have no effect, since the corresponding diagonal elements of D do not enter into the stress calculation.

  • Start — Method used to choose the initial configuration of points for Y. The choices are:

    • 'cmdscale' — Use the classical multidimensional scaling solution. This is the default. 'cmdscale' is not valid when there are zero weights.

    • 'random' — Choose locations randomly from an appropriately scaled p-dimensional normal distribution with uncorrelated coordinates.

    • An n-by-p matrix of initial locations, where n is the size of the matrix D and p is the number of columns of the output matrix Y. In this case, you can pass in [] for p and mdscale infers p from the second dimension of the matrix. You can also supply a 3-D array, implying a value for 'Replicates' from the array's third dimension.

  • Replicates — Number of times to repeat the scaling, each with a new initial configuration. The default is 1.

  • Options — Options for the iterative algorithm used to minimize the fitting criterion. Pass in an options structure created by statset. For example,

    opts = statset(param1,val1,param2,val2, ...);
    [...] = mdscale(...,'Options',opts)

    The choices of statset parameters are:

    • 'Display' — Level of display output. The choices are 'off' (the default), 'iter', and 'final'.

    • 'MaxIter' — Maximum number of iterations allowed. The default is 200.

    • 'TolFun' — Termination tolerance for the stress criterion and its gradient. The default is 1e-4.

    • 'TolX' — Termination tolerance for the configuration location step size. The default is 1e-4.
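The name/value parameters listed above can be combined in a single call. The sketch below is illustrative only: the data, the weight vector W, the starting configuration Y0, and the specific parameter values are hypothetical choices, not recommendations from this page.

```matlab
% Hypothetical data for illustration.
rng(3);
X = rand(15,4);
D = pdist(X);                         % vector form, 15*14/2 = 105 entries

% Weights the same size as D; a zero weight makes that pair effectively ignored.
W = ones(size(D));
W(1) = 0;

% Iteration options created with statset.
opts = statset('MaxIter',500,'Display','off');

% Metric scaling with random starts ('cmdscale' is not valid with zero weights).
[Y,stress] = mdscale(D,2, ...
    'Criterion','metricstress', ...
    'Weights',W, ...
    'Start','random', ...
    'Replicates',5, ...
    'Options',opts);

% Explicit starting configuration: pass [] for p and let mdscale infer it from Y0.
Y0 = randn(15,3);
Y3 = mdscale(D,[],'Start',Y0);
size(Y3)                              % 15-by-3
```

Using several replicates with random starts is a common way to reduce the chance of reporting a poor local minimum, at the cost of proportionally more computation.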

Examples

load cereal.mat
X = [Calories Protein Fat Sodium Fiber ...
    Carbo Sugars Shelf Potass Vitamins];

% Take a subset from a single manufacturer.
X = X(strcmp('K',cellstr(Mfg)),:);

% Create a dissimilarity matrix.
dissimilarities = pdist(X);

% Use non-metric scaling to recreate the data in 2D,
% and make a Shepard plot of the results.
[Y,stress,disparities] = mdscale(dissimilarities,2);
distances = pdist(Y);
[dum,ord] = sortrows([disparities(:) dissimilarities(:)]);
plot(dissimilarities,distances,'bo', ...
    dissimilarities(ord),disparities(ord),'r.-');
xlabel('Dissimilarities'); ylabel('Distances/Disparities')
legend({'Distances' 'Disparities'},'Location','NW');

% Do metric scaling on the same dissimilarities.
figure
[Y,stress] = ...
    mdscale(dissimilarities,2,'criterion','metricsstress');
distances = pdist(Y);
plot(dissimilarities,distances,'bo', ...
    [0 max(dissimilarities)],[0 max(dissimilarities)],'r.-');
xlabel('Dissimilarities'); ylabel('Distances')

Version History

Introduced before R2006a