zscore

Standardized z-scores

Syntax

Z = zscore(X)
Z = zscore(X,flag)
Z = zscore(X,flag,dim)
[Z,mu,sigma] = zscore(___)

Description

Z = zscore(X) returns the z-score for each element of X such that columns of X are centered to have mean 0 and scaled to have standard deviation 1. Z is the same size as X.

  • If X is a vector, then Z is a vector of z-scores.

  • If X is a matrix, then Z is a matrix of the same size as X, and each column of Z has mean 0 and standard deviation 1.

  • For multidimensional arrays, z-scores in Z are computed along the first nonsingleton dimension of X.
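
The column-wise behavior described above can be sketched by hand. The matrix values below are hypothetical, and the manual formula is an illustration of the definition rather than the internal implementation of zscore. The subtraction and division rely on implicit expansion, which assumes R2016b or later.

```matlab
% Hypothetical 4-by-3 data matrix (illustrative values only).
X = [1 10 100;
     2 20 300;
     3 30 200;
     4 40 400];

% Manual standardization of each column: subtract the column mean,
% then divide by the column sample standard deviation.
% Implicit expansion assumes R2016b or later.
Zmanual = (X - mean(X)) ./ std(X);

% Compare with zscore; any difference should be at round-off level.
Z = zscore(X);
maxDiff = max(abs(Z(:) - Zmanual(:)));
disp(maxDiff)
```

Z and Zmanual have the same size as X, and each of their columns has mean 0 and standard deviation 1 up to round-off.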

Z = zscore(X,flag) scales X using the standard deviation indicated by flag.

  • If flag is 0 (default), then zscore scales X using the sample standard deviation, with n - 1 in the denominator of the standard deviation formula. zscore(X,0) is the same as zscore(X).

  • If flag is 1, then zscore scales X using the population standard deviation, with n in the denominator of the standard deviation formula.
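
Because both settings of flag center the data on the same mean, the two results differ only by a constant factor, sqrt(n/(n-1)). The following sketch checks this on a small hypothetical vector; the data values are made up for illustration.

```matlab
x = [2; 4; 4; 4; 5; 5; 7; 9];   % hypothetical data, n = 8
n = numel(x);

Z0 = zscore(x,0);   % scaled by the sample standard deviation (n - 1)
Z1 = zscore(x,1);   % scaled by the population standard deviation (n)

% The centering is identical, so Z1 = Z0 * sqrt(n/(n-1)).
ratio = Z1(1) / Z0(1);
fprintf('ratio = %.4f, sqrt(n/(n-1)) = %.4f\n', ratio, sqrt(n/(n-1)));
```

The factor approaches 1 as n grows, so the choice of flag matters mostly for small samples.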

Z = zscore(X,flag,dim) standardizes X along dimension dim. For example, for a matrix X, if dim = 1, then zscore uses the means and standard deviations along the columns of X; if dim = 2, then zscore uses the means and standard deviations along the rows of X.
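
As a minimal sketch of the effect of dim, the following standardizes a small hypothetical matrix along each dimension and checks which means become zero. The matrix values are illustrative only.

```matlab
X = [1 2 6;
     4 9 5];             % hypothetical 2-by-3 matrix

Zcols = zscore(X,0,1);   % standardize each column
Zrows = zscore(X,0,2);   % standardize each row

disp(mean(Zcols,1))      % column means, approximately 0
disp(mean(Zrows,2))      % row means, approximately 0
```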

[Z,mu,sigma] = zscore(___) also returns the means and standard deviations used for centering and scaling, mu and sigma, respectively. You can use any of the input arguments in the previous syntaxes.
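
One common use of mu and sigma is to apply the same centering and scaling to other data, or to undo the standardization. The sketch below assumes hypothetical matrices Xtrain and Xnew (names and values are illustrative) and relies on implicit expansion, which assumes R2016b or later.

```matlab
Xtrain = [1 10; 2 20; 3 60; 6 30];   % hypothetical reference data
Xnew   = [4 25; 0 50];               % hypothetical new observations

[Ztrain,mu,sigma] = zscore(Xtrain);

% Apply the centering and scaling of Xtrain to the new observations.
Znew = (Xnew - mu) ./ sigma;

% Undo the standardization to recover the original values.
Xback = Ztrain .* sigma + mu;
disp(max(abs(Xback(:) - Xtrain(:))))   % round-off level difference
```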

Examples

Compute and plot the $z$-scores of two data vectors, and then compare the results.

Load the sample data.

load lawdata

Two variables load into the workspace: gpa and lsat.

Plot both variables on the same axes.

plot([gpa,lsat])
legend('gpa','lsat','Location','East')

It is difficult to compare these two measures because they are on very different scales.

Plot the $z$-scores of gpa and lsat on the same axes.

Zgpa = zscore(gpa);
Zlsat = zscore(lsat);
plot([Zgpa, Zlsat])
legend('gpa z-scores','lsat z-scores','Location','Northeast')

Now you can see the relative performance of individuals with respect to both their gpa and lsat results. For example, the third individual's gpa and lsat results are both one standard deviation below the sample mean. The eleventh individual's gpa is around the sample mean, but their lsat score is almost 1.25 standard deviations above the sample average.

Check the mean and standard deviation of the $z$-scores you created.

mean([Zgpa,Zlsat])
ans =
   1.0e-14 *
   -0.1088    0.0357
std([Zgpa,Zlsat])
ans =
     1     1

By definition, $z$-scores of gpa and lsat have mean 0 and standard deviation 1.

Load the sample data.

load lawdata

Two variables load into the workspace: gpa and lsat.

Compute the $z$-scores of gpa using the population formula for standard deviation.

Z1 = zscore(gpa,1); % population formula
Z0 = zscore(gpa,0); % sample formula
disp([Z1 Z0])
    1.2554    1.2128
    0.8728    0.8432
   -1.2100   -1.1690
   -0.2749   -0.2656
    1.4679    1.4181
   -0.1049   -0.1013
   -0.4024   -0.3888
    1.4254    1.3771
    1.1279    1.0896
    0.1502    0.1451
    0.1077    0.1040
   -1.5076   -1.4565
   -1.4226   -1.3743
   -0.9125   -0.8815
   -0.5724   -0.5530

For a sample from a population, the population standard deviation formula with $n$ in the denominator corresponds to the maximum likelihood estimate of the population standard deviation, and might be biased. The sample standard deviation formula, on the other hand, is the square root of the unbiased estimator of the population variance.

Compute $z$-scores using the mean and standard deviation computed along the columns or rows of a data matrix.

Load the sample data.

load flu

The dataset array flu is loaded in the workspace. flu has 52 observations on 11 variables. The first variable contains dates (in weeks). The other variables contain the flu estimates for different regions in the U.S.

Convert the dataset array to a data matrix.

flu2 = double(flu(:,2:end));

The new data matrix, flu2, is a 52-by-10 double data matrix. The rows correspond to the weeks and the columns correspond to the U.S. regions in the dataset array flu.

Standardize the flu estimate for each region (the columns of flu2).

Z1 = zscore(flu2,[ ],1);

You can see the $z$-scores in the variable editor by double-clicking on the matrix Z1 created in the workspace.

Standardize the flu estimate for each week (the rows of flu2).

Z2 = zscore(flu2,[ ],2);

Return the mean and standard deviation used to compute the $z$-scores.

Load the sample data.

load lawdata

Two variables load into the workspace: gpa and lsat.

Return the $z$-scores, mean, and standard deviation of gpa.

[Z,gpamean,gpastdev] = zscore(gpa)
Z =
    1.2128
    0.8432
   -1.1690
   -0.2656
    1.4181
   -0.1013
   -0.3888
    1.3771
    1.0896
    0.1451
    0.1040
   -1.4565
   -1.3743
   -0.8815
   -0.5530
gpamean =
    3.0947
gpastdev =
    0.2435

Input Arguments

X: Input data, specified as a vector, matrix, or multidimensional array.

Data Types: double | single

flag: Indicator for the standard deviation used to compute the z-scores, specified as 0 or 1.

dim: Dimension along which to calculate the z-scores of X, specified as a positive integer. For example, for a matrix X, if dim = 1, then zscore uses the means and standard deviations along the columns of X; if dim = 2, then zscore uses the means and standard deviations along the rows of X.

Output Arguments

Z: z-scores, returned as a vector, matrix, or multidimensional array. A vector of z-scores has mean 0 and variance 1.

  • If X is a vector, then Z is a vector of z-scores.

  • If X is an array, then Z is an array, with each column or row standardized to have mean 0 and variance 1 (depending on dim). If dim is not specified, zscore standardizes along the first nonsingleton dimension of X.

mu: Mean of X used to compute the z-scores, returned as a scalar or vector.

  • If X is a vector, then mu is a scalar.

  • If X is a matrix, then mu is a row vector if zscore calculates the means along the columns of X (dim = 1), and a column vector if zscore calculates the means along the rows of X (dim = 2).

sigma: Standard deviation of X used to compute the z-scores, returned as a scalar or vector.

  • If X is a vector, then sigma is a scalar.

  • If X is a matrix, then sigma is a row vector if zscore calculates the standard deviations along the columns of X (dim = 1), and a column vector if zscore calculates the standard deviations along the rows of X (dim = 2).

More About

Z-Score

For a random variable X with mean μ and standard deviation σ, the z-score of a value x is

$$z = \frac{x - \mu}{\sigma}.$$

For sample data with mean $\bar{X}$ and standard deviation S, the z-score of a data point x is

$$z = \frac{x - \bar{X}}{S}.$$

z-scores measure the distance of a data point from the mean in terms of the standard deviation. This is also called standardization of data. The standardized data set has mean 0 and standard deviation 1, and retains the shape properties of the original data set (same skewness and kurtosis).

You can use z-scores to put data on the same scale before further analysis. This lets you compare two or more data sets that have different units.
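
The claim that standardization preserves skewness and kurtosis can be checked numerically. The sketch below draws a hypothetical right-skewed sample with exprnd (the distribution and sample size are arbitrary choices for illustration) and compares the shape statistics before and after standardization.

```matlab
rng(0);                    % fixed seed so the sketch is repeatable
x = exprnd(2,100,1);       % hypothetical right-skewed sample
z = zscore(x);

% Skewness and kurtosis are invariant to shifting and positive scaling,
% so the values agree up to round-off.
fprintf('skewness: %.4f (x)  %.4f (z)\n', skewness(x), skewness(z));
fprintf('kurtosis: %.4f (x)  %.4f (z)\n', kurtosis(x), kurtosis(z));
```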

Multidimensional Array

A multidimensional array is an array with more than two dimensions. For example, if X is a 1-by-3-by-4 array, then X is a three-dimensional array.

First Nonsingleton Dimension

The first nonsingleton dimension is the first dimension of an array whose size is not equal to 1. For example, if X is a 1-by-2-by-3-by-4 array, then the second dimension is the first nonsingleton dimension of X.
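
The definition above translates directly into a one-line lookup on size(X). The following sketch uses a hypothetical random array and confirms that passing the resulting dimension explicitly gives the same result as the default behavior.

```matlab
X = rand(1,2,3,4);               % hypothetical 1-by-2-by-3-by-4 array
dim = find(size(X) ~= 1, 1);     % first dimension whose size is not 1
disp(dim)                        % displays 2

Zdefault  = zscore(X);           % standardizes along dimension 2
Zexplicit = zscore(X,0,dim);     % same dimension, given explicitly
disp(max(abs(Zdefault(:) - Zexplicit(:))))
```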

Sample Standard Deviation

The sample standard deviation, S, is given by

$$S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}}.$$

S is the square root of an unbiased estimator of the variance of the population from which X is drawn, as long as X consists of independent, identically distributed samples.

Notice that the denominator in this variance formula is n - 1.

Population Standard Deviation

If the data is the entire population of values, then you can use the population standard deviation,

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}.$$

If X is a random sample from a population, then μ is estimated by the sample mean, and σ is the biased maximum likelihood estimator of the population standard deviation.

Notice that the denominator in this variance formula is n.

Algorithms

zscore returns NaNs for any sample containing NaNs.

zscore returns 0s for any sample that is constant (all values are the same). For example, if X is a vector of the same numeric value, then Z is a vector of 0s. If X is a matrix with a column consisting of the same value, then that column of Z consists of 0s.
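
Both special cases can be seen on small hypothetical vectors. The last part of the sketch shows one possible workaround for missing values, computing the mean and standard deviation with the 'omitnan' option of mean and std (assumes R2015a or later); this is an illustration, not something zscore does on its own.

```matlab
xNaN = [1; NaN; 3; 5];
zNaN = zscore(xNaN);       % every element is NaN
disp(zNaN')

xConst = [7; 7; 7];
zConst = zscore(xConst);   % every element is 0
disp(zConst')

% Possible workaround (an assumption, not zscore behavior): ignore NaNs
% when computing the centering and scaling values.
mu    = mean(xNaN,'omitnan');
sigma = std(xNaN,'omitnan');
zOmit = (xNaN - mu) ./ sigma;   % NaN stays NaN; other entries are scaled
disp(zOmit')
```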

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.


Introduced before R2006a