Main Content

Cubic Smoothing Splines

This example shows how to use thecsapsandspapscommands from Curve Fitting Toolbox™ to construct cubic smoothing splines.

The CSAPS Command

The commandcsapsprovides thesmoothingspline. This is a cubic spline that more or less follows the presumed underlying trend in noisy data. A smoothing parameter, to be chosen by you, determines just how closely the smoothing spline follows the given data. Here is the basic information, an abbreviated version of the documentation:

CSAPS Cubic smoothing spline.

VALUES = CSAPS(X, Y, P, XX)

Returns the values at XX of the cubic smoothing spline for the

given data (X,Y) and depending on the smoothing parameter P, chosen

from the interval [0 .. 1]. This smoothing spline f minimizes

P * sum_i W(i)(Y(i) - f(X(i)))^2 + (1-P) * integral (D^2 f)^2

Example: Noisy Data From a Cubic Polynomial

Here are some trial runs. We start with data from a simple cubic,q(x) := x^3, contaminate the values with some noise, and choose the value of the smoothing parameter to be .5. Then plot the resulting smoothed values, along with the underlying cubic, and the contaminated data.

xi = (0:.05:1); q = @(x) x.^3; yi = q(xi); randomStream = RandStream.create('mcg16807','Seed', 23 ); ybad = yi+.3*(rand(randomStream, size(xi))-.5); p = .5; xxi = (0:100)/100; ys = csaps(xi,ybad,p,xxi); plot(xi,yi,':',xi,ybad,'x',xxi,ys,'r-') title('Clean Data, Noisy Data, Smoothed Values') legend('Exact','Noisy','Smoothed','Location','NorthWest')

Figure contains an axes object. The axes object with title Clean Data, Noisy Data, Smoothed Values contains 3 objects of type line. These objects represent Exact, Noisy, Smoothed.

The smoothing is way overdone here. By choosing the smoothing parameterpcloser to 1, we obtain a smoothing spline closer to the given data. We tryp = .6, .7, .8, .9, 1, and plot the resulting smoothing splines.

yy = zeros(5,length(xxi)); p = [.6 .7 .8 .9 1];forj=1:5 yy(j,:) = csaps(xi,ybad,p(j),xxi);endholdonplot(xxi,yy); holdofftitle('Smoothing Splines for Various Values of the Smoothing Parameter') legend({'Exact','Noisy','p = 0.5','p = 0.6','p = 0.7','p = 0.8',...“p = 0.9”,'p = 1.0'},'Location','NorthWest')

Figure contains an axes object. The axes object with title Smoothing Splines for Various Values of the Smoothing Parameter contains 8 objects of type line. These objects represent Exact, Noisy, p = 0.5, p = 0.6, p = 0.7, p = 0.8, p = 0.9, p = 1.0.

We see that the smoothing spline can be very sensitive to the choice of the smoothing parameter. Even forp= 0.9, the smoothing spline is still far from the underlying trend, while forp= 1, we get the interpolant to the (noisy) data.

In fact, the formulation used bycsapi(p.235ff ofA Practical Guide to Splines) is very sensitive to scaling of the independent variable. A simple analysis of the equations used shows that the sensitive range forpis around1/(1+epsilon), withepsilon := h^3/16, andhthe average difference between neighboring sites. Specifically, you would expect a close following of the data whenp = 1/(1+epsilon/100)and some satisfactory smoothing whenp = 1/(1+epsilon*100).

The plot below shows the smoothing spline for values ofpnear this magic number1/(1+epsilon). For this case, it is more informative to look at1-psince the magic number,1/(1+epsilon), is very close to 1.

epsilon = ((xi(end)-xi(1))/(numel(xi)-1))^3/16; 1 - 1/(1+epsilon)
ans = 7.8124e-06
plot(xi,yi,':',xi,ybad,'x') holdonlabels = cell(1,5);forj = 1:5 p = 1 /(1 +ε* 10 ^ (j-3));yy (j) = csaps (xi,ybad,p,xxi); labels{j} = ['1-p= ',num2str(1-p)];endplot(xxi,yy) title('Smoothing Splines for Smoothing Parameter Near Its ''Magic'' Value') legend( [{'Exact','Noisy'}, labels],'Location','NorthWest') holdoff

Figure contains an axes object. The axes object with title Smoothing Splines for Smoothing Parameter Near Its 'Magic' Value contains 7 objects of type line. These objects represent Exact, Noisy, 1-p= 7.8125e-08, 1-p= 7.8125e-07, 1-p= 7.8124e-06, 1-p= 7.8119e-05, 1-p= 0.00078064.

In this example, the smoothing spline is very sensitive to variation of the smoothing parameter near the magic number. The one farthest from 1 seems the best choice from these, but you may prefer the one beyond that.

p = 1/(1+epsilon*10^3); yy = csaps(xi,ybad,p,xxi); holdonplot( xxi, yy,'y','LineWidth', 2 ) title( sprintf('The Smoothing Spline For 1-p = %s is Added, in Yellow', num2str(1-p) ) ) holdoff

Figure contains an axes object. The axes object with title The Smoothing Spline For 1-p = 0.0077519 is Added, in Yellow contains 8 objects of type line. These objects represent Exact, Noisy, 1-p= 7.8125e-08, 1-p= 7.8125e-07, 1-p= 7.8124e-06, 1-p= 7.8119e-05, 1-p= 0.00078064.

You can also supplycsaps与错误的重量,多注意一些data points than others. Also, if you do not supply the evaluation sitesxx, thencsapsreturns the ppform of the smoothing spline.

Finally,csapscan also handle vector-valued data and even multivariate, gridded data.

The SPAPS Command

The cubic smoothing spline provided by the commandspapsdiffers from the one constructed incsapsonly in the way it is selected. Here is an abbreviated version of the documentation forspaps:

SPAPS Smoothing spline.

[SP,VALUES] = SPAPS(X,Y,TOL) returns the B-form and, if asked,

the values at X, of a cubic smoothing spline f for the given

data (X(i),Y(:,i)), i=1,2, ..., n.

The smoothing spline f minimizes the roughness measure

F(D^2 f) := integral ( D^2 f(t) )^2 dt on X(1) < t < X(n)

over all functions f for which the error measure

E(f) := sum_j { W(j)*( Y(:,j) - f(X(j)) )^2 : j=1,...,n }

is no bigger than the given TOL. Here, D^M f denotes the M-th

derivative of f. The weights W are chosen so that E(f) is the

Composite Trapezoid Rule approximation for F(y-f).

f is constructed as the unique minimizer of

ρ* E (f) + f (f D ^ 2),

with the smoothing parameter RHO so chosen so that E(f) equals

TOL. Hence, FN2FM(SP,'pp') should be (up to roundoff) the same

as the output from CPAPS(X,Y,RHO/(1+RHO)).

Tolerance vs. Smoothing Parameter

It may be easier to supply a suitable tolerance forspapsthan the smoothing parameterprequired bycsaps. In our earlier example, we added uniformly-distributed random noise from the interval0.3*[-0.5 .. 0.5]. Hence, we can estimate a reasonable value fortolas the value of the error measureeat such noise.

tol = sum((.3*(rand(randomStream, size(yi))-.5)).^2);

This plot shows the resulting smoothing spline constructed byspaps. Note that the error weights are specified to be uniform, which is their default value incsaps.

[sp,ys,rho] = spaps(xi,ybad,tol,ones(size(xi))); plot(xi,yi,':',xi,ybad,'x',xi,ys,'r-') title( sprintf('Clean Data, Noisy Data, Smoothed Values (1-p = %s )', num2str(1/(1+rho)) ) ); legend( {'Exact','Noisy','Smoothed'},'location','NorthWest')

Figure contains an axes object. The axes object with title Clean Data, Noisy Data, Smoothed Values (1-p = 0.013761 ) contains 3 objects of type line. These objects represent Exact, Noisy, Smoothed.

The figure title shows the value ofpyou would use incsapsto obtain exactly this smoothing spline for these data.

Here, in addition, is the smoothing spline provided bycsapswhen not given a smoothing parameter. In this casecsapschooses the parameter by a certain ad hoc procedure that attempts to locate the region where the smoothing spline is most sensitive to the smoothing parameter (similar to the earlier discussion).

holdonplot(xxi,fnval(csaps(xi,ybad),xxi),'-') title('Clean Data, Noisy Data, Smoothed Values') legend({'Exact''Noisy''spaps, specified tolerance'...'csaps, default smoothing parameter'},'Location','NorthWest') holdoff

Figure contains an axes object. The axes object with title Clean Data, Noisy Data, Smoothed Values contains 4 objects of type line. These objects represent Exact, Noisy, spaps, specified tolerance, csaps, default smoothing parameter.

CSAPS vs. SPAPS

Thecsapsandspapscommands differ in the way in which you specify a particular smoothing spline, via a smoothing parameter vs. a tolerance. Another difference is thatspapscan provide a linear or a quintic smoothing spline, in addition to the cubic smoothing spline.

The quintic smoothing spline is better than the cubic smoothing spline in the situation when you would like the second derivative to move as little as possible.