Main Content

svdsketch

Compute SVD of low-rank matrix sketch

Description

example

[U,S,V] = svdsketch(A)returns the singular value decomposition (SVD) of a low-rank matrix sketch of input matrixA. The matrix sketch is a low-rank approximation that only reflects the most important features ofA(up to a tolerance), which enables faster calculation of a partial SVD of large matrices compared to usingsvds.

example

[U,S,V] = svdsketch(A,tol)specifies a tolerance for the matrix sketch.svdsketchusestolto adaptively determine the rank of the matrix sketch approximation. As the tolerance gets larger, fewer features ofAare used in the matrix sketch.

example

[U,S,V] = svdsketch(A,tol,Name,Value)specifies additional options with one or moreName,Valuepair arguments. For example, you can specify'MaxIterations'and a scalar to adjust the number of iterations used to form the matrix sketch.

example

[U,S,V,apxErr] = svdsketch(___)additionally returns a vectorapxErrthat contains the relative approximation error in each iteration,norm(U*S*V'-A,'fro')/norm(A,'fro'). The final entry in the vectorapxErr(end)is the relative approximation error of the output returned bysvdsketch.

Examples

collapse all

Usesvdsketchto compute the SVD factors of a low-rank matrix approximation.

Usegalleryto create a 200-by-200 random matrix with geometrically distributed singular values.

A = gallery('randsvd',200);

Usesvdsketchto calculate the SVD of a low-rank approximation ofA.

[U,S,V] = svdsketch(A);

Check the size of the outputs.

size(S)
ans =1×2120 120

The results indicate that the low-rank matrix approximation ofAhas a rank of 120.

Specify a tolerance withsvdsketchto compute the SVD factors of a low-rank matrix approximation.svdsketchadaptively determines the appropriate rank of the matrix sketch based on the specified tolerance.

Usegalleryto create a 200-by-200 random matrix with geometrically distributed singular values.

A = gallery('randsvd',200);

Usesvdsketchto calculate the SVD of a low-rank approximation of A. Specify a tolerance of1e-2, and find the size of the outputSto determine the ranksvdsketchuses for the matrix sketch.

tol = 1e-2; [U,S,V] = svdsketch(A,tol); size(S)
ans =1×260 60

The results indicate that the low-rank matrix approximation ofAhas a rank of 60.

Examine how well the matrix sketch approximatesAby comparingAtoU*S*V'.

norm(U*S*V' - A,'fro')/norm(A,'fro')
ans = 0.0048

The result indicates that the matrix sketch approximatesAwithin the specified tolerance of1e-2.

Usesvdsketchwith theMaxSubspaceDimensionoption on a matrix that has slowly decaying singular values. You can use this option to forcesvdsketchto use a subset of the features ofAin the matrix sketch.

Create a 5000-by-5000 matrix with values drawn from the standard normal distribution. View the distribution of matrix singular values.

A = randn(5000); semilogy(svd(A),'-o')

Figure contains an axes object. The axes object contains an object of type line.

Since the singular values inAdecay slowly,Ahas many important features and does not lend itself to low-rank approximations. To form a matrix sketch that reasonably approximatesA, most or nearly all of the features need to be preserved.

Usesvdsketchon the matrix with a tolerance of1e-5. Specify four outputs to return the SVD factors as well as the relative approximation error in each iteration.

tol = 1e-5; [U1,S1,V1,apxError1] = svdsketch(A,tol); size(S1)
ans =1×25000 5000

The size ofSindicates that to satisfy the tolerance,svdsketchneeds to preserve all of the features ofA. For large sparse input matrices, this can present memory issues since the low-rank approximation ofAformed bysvdsketchis roughly the same size asAand thus might not fit in memory as a dense matrix.

Check the approximation error of the outputs. Sincesvdsketchpreserves everything inA, the computed answer is accurate, but the calculation was just an expensive way to calculatesvd(X).

apxError1(end)
ans = 1.9075e-08

Now, do the same calculation but specifyMaxSubspaceDimensionas 650 to limit the size of the subspace used to sketchA. This is useful to forcesvdsketchto use only a subset of the features inAto form the matrix sketch, reducing the size of the outputs.

[U2,S2,V2,apxError2] = svdsketch(A,tol,'MaxSubspaceDimension',650); size(S2)
ans =1×2650 650

The outputs now have a smaller size.

Check the approximation error of the new outputs. The tradeoff for forcing the outputs to be smaller is that many important features of A need to be omitted in the matrix sketch, and the resulting rank 650 approximation ofAdoes not meet the specified tolerance.

apxError2(end)
ans = 0.8214

Input Arguments

collapse all

Input matrix, specified as a sparse or full matrix.Ais typically, but not always, a large and sparse matrix.svdsketchis best suited to operate on rank-deficient matrices that have a relatively small number of features.

Data Types:single|double
Complex Number Support:Yes

Matrix sketch tolerance, specified as a real numeric scalar in the rangesqrt(eps(class(A))) <= tol < 1.

svdsketchuses the value oftolto adaptively determine which features ofAto use in the low-rank approximation (matrix sketch) ofA. As the value oftolincreases,svdsketchuses fewer features ofAto form the matrix sketch.

Example:[U,S,V] = svdsketch(A,1e-4)

Data Types:single|double

Name-Value Arguments

Specify optional comma-separated pairs ofName,Valuearguments.Nameis the argument name andValueis the corresponding value.Namemust appear inside quotes. You can specify several name and value pair arguments in any order asName1,Value1,...,NameN,ValueN.

Example:[U,S,V] = svdsketch(A,1e-10,'MaxIterations',100)uses 100 iterations in thesvdsketch算法。

Maximum subspace dimension, specified as a positive integer scalar. The subspace dimension controls the memory consumption of thesvdsketch算法。如果算法耗尽内存a specified tolerance, then you can specify a smaller value forMaxSubspaceDimensionso that the algorithm stays within memory. For example, adjusting this parameter can be useful whenAhas singular values that decay slowly.

When you specify theMaxSubspaceDimensionoption, you set a maximum value on the rank of the matrix sketch used bysvdsketch. Therefore, ifsvdsketchcannot satisfy the specified tolerance with a rank smaller thanMaxSubspaceDimension, it uses the maximum allowed rank, and the resulting outputs might not satisfy the specified tolerance.

Example:[U,S,V] = svdsketch(A,1e-7,'MaxSubspaceDimension',150)

Data Types:single|double

Initial algorithm block size, specified as a positive integer scalar. The block size is the number by which the rank of the matrix sketch increases each iteration. A larger block size reduces the number of needed iterations by doing more work per iteration, but might also add more information to the calculation than is necessary to achieve convergence.

As the algorithm proceeds,svdsketchmight adjust the block size from the initial value to speed up convergence if the relative approximation errorapxErris not decaying fast enough.

If you specifyBlockSize, the value should be smaller thanMaxSubspaceDimension.

Example:[U,S,V] = svdsketch(A,1e-7,'BlockSize',10)

Data Types:single|double

Maximum number of algorithm iterations, specified as a positive integer scalar. More iterations can produce a higher-quality matrix sketch at the cost of more execution time and higher memory consumption.

Example:[U,S,V] = svdsketch(A,1e-7,'MaxIterations',25)

Data Types:single|double

Number of power iterations, specified as a nonnegative integer scalar. Power iterations improve the orthogonality of theUandVoutputs. You should generally select the number of power iterations to be0,1, or2, as larger values can adversely contribute to round-off error.

Example:[U,S,V] = svdsketch(A,1e-7,'NumPowerIterations',2)

Data Types:single|double

输出参数

collapse all

Left singular vectors of matrix sketch, returned as a matrix. The columns ofUare orthonormal, and they form a set of basis vectors for the range of the matrix sketch ofA.

The size ofUdepends on the value oftol. Astol变大,svdsketchuses fewer features of the input matrix to form the matrix sketch, soUandValso have fewer columns.

Different machines and releases of MATLAB®can produce different singular vectors that are still numerically accurate. Corresponding columns inUandVcan flip their signs, since this does not affect the value of the expressionA = U*S*V'.

Singular values of matrix sketch, returned as a square diagonal matrix. The diagonal elements ofSare the strictly positive singular values of the matrix sketch in decreasing order.

The size ofSdepends on the value oftol. As the tolerance increases,svdsketchuses fewer features of the input matrix to form the matrix sketch, soShas correspondingly fewer rows and columns.

Right singular vectors of matrix sketch, returned as a matrix. The columns ofVare orthonormal, and they form a set of basis vectors for the null space of the matrix sketch ofA.

The size ofVdepends on the value oftol. Astol变大,svdsketchuses fewer features of the input matrix to form the matrix sketch, soUandValso have fewer columns.

Different machines and releases of MATLAB can produce different singular vectors that are still numerically accurate. Corresponding columns inUandVcan flip their signs, since this does not affect the value of the expressionA = U*S*V'.

Relative approximation error in each iteration, returned as a vector. The length ofapxErris equal to the number of iterations used in thesvdsketch算法。UseMaxIterationsto adjust the number of iterations.

The entries ofapxErrare the relative approximation errors in each iteration,norm(U*S*V'-A,'fro')/norm(A,'fro'). The final entry in the vectorapxErr(end)is the relative approximation error of the output returned bysvdsketch.

Tips

  • Usesvdsketchwhen you do not know ahead of time what rank to specify withsvds, but you know what tolerance the approximation of the SVD should satisfy.

  • svdscomputes the best possible rank-k approximation of the SVD (using the default"largest"method).svdsketchdoes not guarantee its rank-k approximation is the best one, which accounts for its speed advantage oversvds.

Algorithms

svdsketchapplies a tolerance to form a low-rank matrix approximation A Q B of the input matrixA. This low-rank approximation is called amatrix sketch. The matrix sketch only preserves important features fromA, filtering unnecessary information out. The relative approximation errorapxErrof the matrix sketch aims to satisfy the specified tolerancetol:

e sketch = Q B A F A F t o l

The processsvdsketchfollows to form the matrix sketch is:

  • svdsketchforms the matrix sketch iteratively, with each iteration adding new columns toQand new rows toB. The new columns and rows are created by extracting features fromAusing a random sample matrix. You can control the number of columns and rows added in each iteration with theBlockSizename-value pair.

  • During each iteration,svdsketchuses power iterations to improve the orthogonality of the new columns inQ. You can adjust the number of power iterations with theNumPowerIterationsname-value pair.

  • The iterations to form the matrix sketch stop when: the number of columns inQand rows inBreach the specified value forMaxSubspaceDimension, the number of iterations reachesMaxIterations, or the relative approximation error converges (apxErr <= tol).

  • To improve convergence speed,svdsketchmight increase the specified initial value forBlockSizefrom iteration to iteration if the decay inapxErris not sufficient.

After the matrix sketch A Q B is formed,svdsketch计算奇异值分解)the matrix sketch via[U1,S,V] = svd(B,'econ'), such that

A Q B = ( Q U 1 ) S V H = U S V H .

Ifsvdsketchis able to filter out some features ofAbased on the specified tolerance, then the resulting SVD factors contain fewer singular values and singular vectors than if a full SVD was performed onA.

References

[1] Yu, Wenjian, Yu Gu, and Yaohang Li. “Efficient Randomized Algorithms for the Fixed-Precision Low-Rank Matrix Approximation.”SIAM Journal on Matrix Analysis and Applications39, no. 3 (August 2018): 1339–1359.https://doi.org/10.1137/17M1141977.

Extended Capabilities

Introduced in R2020b