Linear Correlation

Introduction

Correlation quantifies the strength of a linear relationship between two variables. When there is no correlation between two variables, there is no tendency for the values of the variables to increase or decrease in tandem. Two variables that are uncorrelated are not necessarily independent, however, because they might have a nonlinear relationship.
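As a minimal sketch of that last point, the following uses a hypothetical pair of vectors (x and y are illustrative names, not part of any sample data set) in which y is completely determined by x, yet the sample correlation coefficient is zero:

```matlab
% Illustrative data (an assumption for this sketch): y depends on x exactly,
% but the relationship is quadratic rather than linear.
x = -3:3;
y = x.^2;

% corrcoef(x,y) returns a 2-by-2 matrix; the off-diagonal element is the
% correlation coefficient between x and y.
R = corrcoef(x, y);
r = R(1,2);
```

Here r is zero up to floating-point round-off, because the contributions from negative and positive values of x cancel. The variables are uncorrelated but clearly not independent.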

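You can use linear correlation to investigate whether a linear relationship exists between variables without having to assume or fit a specific model to your data. Two variables that have a small or no linear correlation might have a strong nonlinear relationship. However, calculating linear correlation before fitting a model is a useful way to identify variables that have a simple relationship. Another way to explore how variables are related is to make scatter plots of your data.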

Covariance quantifies the strength of a linear relationship between two variables in units relative to their variances. Correlations are standardized covariances, giving a dimensionless quantity that measures the degree of a linear relationship, separate from the scale of either variable.

The following MATLAB® functions compute sample correlation coefficients and covariance. These sample coefficients are estimates of the true covariance and correlation coefficients of the population from which the data sample is drawn.

Function    Description
corrcoef    Correlation coefficient matrix
cov         Covariance matrix
xcorr       Cross-correlation sequence of a random process (includes autocorrelation)

Covariance

Use the MATLAB cov function to calculate the sample covariance matrix for a data matrix (where each column represents a separate quantity).

The sample covariance matrix has the following properties:

  • cov(X) is symmetric.

  • diag(cov(X)) is a vector of variances for each data column. The variances represent a measure of the spread or dispersion of data in the corresponding column. (The var function calculates variance.)

  • sqrt(diag(cov(X))) is a vector of standard deviations. (The std function calculates standard deviation.)

  • The off-diagonal elements of the covariance matrix represent the covariances between the individual data columns.

Here, X can be a vector or a matrix. For an m-by-n matrix, the covariance matrix is n-by-n.
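The properties listed above can be checked numerically. The following is a minimal sketch that assumes a small made-up data matrix X and a comparison tolerance tol, both chosen only for illustration:

```matlab
% Hypothetical 5-by-3 data matrix: rows are observations, columns are quantities.
X = [ 2  8  1
      4  6  3
      6  5  2
      8  3  5
     10  1  4];

C   = cov(X);    % 3-by-3 sample covariance matrix
tol = 1e-12;     % tolerance for floating-point comparisons (an assumption)

isSym   = norm(C - C') < tol;                       % cov(X) is symmetric
sameVar = norm(diag(C) - var(X)') < tol;            % diagonal holds the column variances
sameStd = norm(sqrt(diag(C)) - std(X)') < tol;      % square roots give the standard deviations
sizeOK  = isequal(size(C), [size(X,2) size(X,2)]);  % n-by-n for an m-by-n input
```

Each of the four logical values should be true. The transposes are needed because var and std return row vectors for a matrix input, while diag returns a column vector.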

For an example of calculating the covariance, load the sample data in count.dat, which contains a 24-by-3 matrix:

load count.dat

Calculate the covariance matrix for this data:

cov(count)

MATLAB responds with the following result:

ans =
   1.0e+003 *
    0.6437    0.9802    1.6567
    0.9802    1.7144    2.6908
    1.6567    2.6908    4.6278

The covariance matrix for this data has the following form:

\[
\begin{bmatrix}
s^2_{11} & s^2_{12} & s^2_{13} \\
s^2_{21} & s^2_{22} & s^2_{23} \\
s^2_{31} & s^2_{32} & s^2_{33}
\end{bmatrix},
\qquad s^2_{ij} = s^2_{ji}
\]

Here, $s^2_{ij}$ is the sample covariance between column i and column j of the data. Because the count matrix contains three columns, the covariance matrix is 3-by-3.
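For reference, the sample covariance between two columns can be written out as follows. The symbols m (the number of rows), $x_{ki}$ (the element in row k of column i), and $\bar{x}_i$ (the mean of column i) are notation introduced here for illustration. The normalization by m-1 reflects the default behavior of cov.

```latex
s^2_{ij} = \frac{1}{m-1} \sum_{k=1}^{m} \left( x_{ki} - \bar{x}_i \right) \left( x_{kj} - \bar{x}_j \right),
\qquad
\bar{x}_i = \frac{1}{m} \sum_{k=1}^{m} x_{ki}
```

Swapping i and j leaves each product in the sum unchanged, which is why the matrix is symmetric. Setting i = j gives the sample variance of column i.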

Note

In the special case when a vector is the argument of cov, the function returns the variance.
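A short sketch of this special case, using a made-up vector v chosen only for illustration:

```matlab
% Hypothetical vector of observations.
v = [2 4 4 4 5 5 7 9];

c  = cov(v);   % scalar: for a vector input, cov returns the variance
s2 = var(v);   % sample variance, normalized by numel(v)-1 by default
```

The two values agree up to floating-point round-off.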

Correlation Coefficients

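The function corrcoef produces a matrix of sample correlation coefficients for a data matrix (where each column represents a separate quantity). The correlation coefficients range from -1 to 1, where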

  • Values close to 1 indicate that there is a positive linear relationship between the data columns.

  • Values close to -1 indicate that one column of data has a negative linear relationship to another column of data (anticorrelation).

  • Values close to or equal to 0 suggest there is no linear relationship between the data columns.
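The two extreme cases can be illustrated with exact linear relationships. The following sketch assumes a made-up vector x and arbitrary slopes and offsets, which are not part of the sample data:

```matlab
x = (1:10)';   % hypothetical data, for illustration only

Rpos = corrcoef(x,  2*x + 1);   % exact increasing linear relationship
Rneg = corrcoef(x, -3*x + 5);   % exact decreasing linear relationship

rPos = Rpos(1,2);   % close to  1
rNeg = Rneg(1,2);   % close to -1
```

The slope and offset affect only the scale and location of the second variable, so the magnitude of the coefficient stays at 1 and only the sign of the slope matters.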

For an m-by-n matrix, the correlation-coefficient matrix is n-by-n. The arrangement of the elements in the correlation coefficient matrix corresponds to the location of the elements in the covariance matrix, as described in Covariance.
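Because correlations are standardized covariances, the correlation coefficient matrix can be recovered from the covariance matrix by dividing each element by the product of the two corresponding standard deviations. The following is a minimal sketch under that definition, using a made-up data matrix X; the names s, Rmanual, and maxDiff are illustrative, and this is not necessarily how corrcoef is implemented internally:

```matlab
% Hypothetical 5-by-3 data matrix (rows: observations, columns: quantities).
X = [ 2  8  1
      4  6  3
      6  5  2
      8  3  5
     10  1  4];

C = cov(X);                % sample covariance matrix
s = sqrt(diag(C));         % column standard deviations (3-by-1)
Rmanual = C ./ (s * s');   % element (i,j) is C(i,j) / (s(i)*s(j))

R = corrcoef(X);           % built-in result, for comparison
maxDiff = max(abs(R(:) - Rmanual(:)));
```

maxDiff should be at the level of floating-point round-off, and the diagonal of R is all ones because each column is perfectly correlated with itself.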

For an example of calculating correlation coefficients, load the sample data in count.dat, which contains a 24-by-3 matrix:

load count.dat

Type the following syntax to calculate the correlation coefficients:

corrcoef(count)

This results in the following 3-by-3 matrix of correlation coefficients:

ans =
    1.0000    0.9331    0.9599
    0.9331    1.0000    0.9553
    0.9599    0.9553    1.0000

Because all correlation coefficients are close to 1, there is a strong positive correlation between each pair of data columns in the count matrix.