ランダム部分空間の分類

次の例は、ランダム部分空間アンサンブルを使用して分類の精度を向上させる方法を示しています。また、交差検証を使用して、弱学習器テンプレートとアンサンブルの両方に適切なパラメーターを判別する方法も示します。

データの読み込み

ionosphereデータを読み込みます。このデータには 34 の予測子に対する 351 の二項反応が含まれます。

loadionosphere; [N,D] = size(X)

N = 351

D = 34

resp = unique(Y)

resp =2x1 cell{'b'} {'g'}

アンサンブルの作成

さまざまな次元数をもつ2最近傍分類のアンサンブルを作成し、結果アンサンブルの交差検証損失を調べます。

この手順の実行には時間がかかります。進捗を追跡するには、各次元が終了するごとにメッセージを表示させます。

NPredToSample = round(linspace(1,D,10));% linear spacing of dimensionscvloss = zeros(numel(NPredToSample),1); learner = templateKNN('NumNeighbors',2);fornpred=1:numel(NPredToSample) subspace = fitcensemble(X,Y,'Method','Subspace',“学习者”,learner,...'NPredToSample',NPredToSample(npred),'CrossVal','On'); cvloss(npred) = kfoldLoss(subspace); fprintf('Random Subspace %i done.\n',npred);end

Random Subspace 1 done. Random Subspace 2 done. Random Subspace 3 done. Random Subspace 4 done. Random Subspace 5 done. Random Subspace 6 done. Random Subspace 7 done. Random Subspace 8 done. Random Subspace 9 done. Random Subspace 10 done.

figure;% plot the accuracy versus dimensionplot(NPredToSample,cvloss); xlabel('Number of predictors selected at random'); ylabel('10 fold classification error'); title(基于事例的抚慰心灵fication with Random Subspace');

Figure contains an axes object. The axes object with title k-NN classification with Random Subspace contains an object of type line.

学習器ごとに 5 つの予測子があるアンサンブルと 8 つの予測子があるアンサンブルは、交差検証誤差が最も小さくなっています。これらのアンサンブルでは誤差率が約 0.06 ですが、他のアンサンブルでは交差検証誤差率が約 0.1 以上になっています。

適切なアンサンブルサイズの検出

アンサンブル内で最も数が少なく、なおかつ適切な分類が可能な学習器を求めます。

ens = fitcensemble(X,Y,'Method','Subspace',“学习者”,learner,...'NPredToSample',5,'CrossVal','on'); figure;% Plot the accuracy versus number in ensembleplot(kfoldLoss(ens,“模式”,'Cumulative')) xlabel('Number of learners in ensemble'); ylabel('10 fold classification error'); title(基于事例的抚慰心灵fication with Random Subspace');

Figure contains an axes object. The axes object with title k-NN classification with Random Subspace contains an object of type line.

学習器が 50 を超えるアンサンブルには利点がないと考えられます。25 の学習器が含まれる場合に適正な予測ができる可能性があります。

最終的なアンサンブルの作成

50 の学習器を使用して最終的なアンサンブルを作成します。アンサンブルをコンパクト化し、コンパクト化したバージョンでそれなりのメモリ量が節約されるかどうかを確認します。

ens = fitcensemble(X,Y,'Method','Subspace','NumLearningCycles',50,...“学习者”,learner,'NPredToSample',5); cens = compact(ens); s1 = whos('ens'); s2 = whos('cens'); [s1.bytes s2.bytes]% si.bytes = size in bytes