
Classify Text Data Using Custom Training Loop

This example shows how to classify text data using a deep learning bidirectional long short-term memory (BiLSTM) network with a custom training loop.

When training a deep learning network using the trainNetwork function, if trainingOptions does not provide the options you need (for example, a custom learning rate schedule), then you can define your own custom training loop using automatic differentiation. For an example showing how to classify text data using the trainNetwork function, see Classify Text Data Using Deep Learning (Deep Learning Toolbox).

This example trains a network to classify text data with the time-based decay learning rate schedule: for each iteration, the solver uses the learning rate given by $\rho_t = \frac{\rho_0}{1 + kt}$, where $t$ is the iteration number, $\rho_0$ is the initial learning rate, and $k$ is the decay.
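To see how this schedule behaves, the following short sketch (not part of the original example) plots the learning rate over the first 500 iterations, using the initial learning rate of 0.001 and decay of 0.01 that this example specifies later.

% Sketch: plot the time-based decay learning rate schedule.
initialLearnRate = 0.001;   % rho_0, as specified later in this example
decay = 0.01;               % k, as specified later in this example
t = 1:500;                  % iteration numbers
learnRate = initialLearnRate./(1 + decay*t);

figure
plot(t,learnRate)
xlabel("Iteration")
ylabel("Learning Rate")
title("Time-Based Decay Schedule")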

Import Data

Import the factory reports data. This data contains labeled textual descriptions of factory events. To import the text data as strings, specify the text type to be "string".

filename = "factoryReports.csv";
data = readtable(filename,TextType="string");
head(data)
ans = 8×5 table

                               Description                                       Category           Urgency          Resolution         Cost
    _____________________________________________________________________    ____________________    ________    ____________________    _____

    "Items are occasionally getting stuck in the scanner spools."            "Mechanical Failure"    "Medium"    "Readjust Machine"         45
    "Loud rattling and banging sounds are coming from assembler pistons."    "Mechanical Failure"    "Medium"    "Readjust Machine"         35
    "There are cuts to the power when starting the plant."                   "Electronic Failure"    "High"      "Full Replacement"      16200
    "Fried capacitors in the assembler."                                     "Electronic Failure"    "High"      "Replace Components"      352
    "Mixer tripped the fuses."                                               "Electronic Failure"    "Low"       "Add to Watch List"        55
    "Burst pipe in the constructing agent is spraying coolant."              "Leak"                  "High"      "Replace Components"      371
    "A fuse is blown in the mixer."                                          "Electronic Failure"    "Low"       "Replace Components"      441
    "Things continue to tumble off of the belt."                             "Mechanical Failure"    "Low"       "Readjust Machine"         38

The goal of this example is to classify events by the label in the Category column. To divide the data into classes, convert these labels to categorical.

data.Category = categorical(data.Category);

View the distribution of the classes in the data using a histogram.

figure
histogram(data.Category);
xlabel("Class")
ylabel("Frequency")
title("Class Distribution")

The next step is to partition it into sets for training and validation. Partition the data into a training partition and a held-out partition for validation and testing. Specify the holdout percentage to be 20%.

cvp = cvpartition(data.Category,Holdout=0.2);
dataTrain = data(training(cvp),:);
dataValidation = data(test(cvp),:);

Extract the text data and labels from the partitioned tables.

textDataTrain = dataTrain.Description;
textDataValidation = dataValidation.Description;
TTrain = dataTrain.Category;
TValidation = dataValidation.Category;

To check that you have imported the data correctly, visualize the training text data using a word cloud.

figure
wordcloud(textDataTrain);
title("Training Data")

View the number of classes.

classes = categories(TTrain);
numClasses = numel(classes)
numClasses = 4

Preprocess Text Data

Create a function that tokenizes and preprocesses the text data. The function preprocessText, listed at the end of the example, performs these steps:

  1. Tokenize the text using tokenizedDocument.

  2. Convert the text to lowercase using lower.

  3. Erase the punctuation using erasePunctuation.

Preprocess the training data and the validation data using the preprocessText function.

documentsTrain = preprocessText(textDataTrain);
documentsValidation = preprocessText(textDataValidation);

View the first few preprocessed training documents.

documentsTrain(1:5)
ans = 
  5×1 tokenizedDocument:

     9 tokens: items are occasionally getting stuck in the scanner spools
    10 tokens: loud rattling and banging sounds are coming from assembler pistons
     5 tokens: fried capacitors in the assembler
     4 tokens: mixer tripped the fuses
     9 tokens: burst pipe in the constructing agent is spraying coolant

Create a single datastore that contains both the documents and the labels by creating arrayDatastore objects, then combining them using the combine function.

dsDocumentsTrain = arrayDatastore(documentsTrain,OutputType="cell");
dsTTrain = arrayDatastore(TTrain,OutputType="cell");
dsTrain = combine(dsDocumentsTrain,dsTTrain);

Create an array datastore for the validation documents.

dsDocumentsValidation = arrayDatastore(documentsValidation,OutputType="cell");

Create Word Encoding

To input the documents into a BiLSTM network, use a word encoding to convert the documents into sequences of numeric indices.

To create a word encoding, use the wordEncoding function.

enc = wordEncoding(documentsTrain)
enc = 
  wordEncoding with properties:

      NumWords: 417
    Vocabulary: ["items" "are" "occasionally" "getting" "stuck" "in" "the" "scanner" "spools" "loud" "rattling" "and" "banging" "sounds" "coming" "from" "assembler" "pistons" "fried" … ]
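To see the sequences that this encoding produces, you can convert a document to its word indices using doc2sequence. This quick check is not a step of the original example; it assumes the encoding enc and the preprocessed documentsTrain from above.

% Sketch: view the numeric indices for the first training document.
sequence = doc2sequence(enc,documentsTrain(1));
sequence{1}

Each word maps to its index in enc.Vocabulary, so the document becomes a row vector of integers that the network can consume.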

Define Network

Define the BiLSTM network architecture. To input sequence data into the network, include a sequence input layer and set the input size to 1. Next, include a word embedding layer of dimension 25 and the same number of words as the word encoding. Next, include a BiLSTM layer and set the number of hidden units to 40. To use the BiLSTM layer for a sequence-to-label classification problem, set the output mode to "last". Finally, add a fully connected layer with the same size as the number of classes, and a softmax layer.

inputSize = 1;
embeddingDimension = 25;
numHiddenUnits = 40;

numWords = enc.NumWords;

layers = [
    sequenceInputLayer(inputSize)
    wordEmbeddingLayer(embeddingDimension,numWords)
    bilstmLayer(numHiddenUnits,OutputMode="last")
    fullyConnectedLayer(numClasses)
    softmaxLayer]
layers = 
  5×1 Layer array with layers:

     1   ''   Sequence Input         Sequence input with 1 dimensions
     2   ''   Word Embedding Layer   Word embedding layer with 25 dimensions and 417 unique words
     3   ''   BiLSTM                 BiLSTM with 40 hidden units
     4   ''   Fully Connected        4 fully connected layer
     5   ''   Softmax                softmax

Convert the layer array to a dlnetwork object.

net = dlnetwork(层)
net = 
  dlnetwork with properties:

         Layers: [5×1 nnet.cnn.layer.Layer]
    Connections: [4×2 table]
     Learnables: [6×3 table]
          State: [2×3 table]
     InputNames: {'sequenceinput'}
    OutputNames: {'softmax'}
    Initialized: 1

Define Model Loss Function

Create the function modelLoss, listed at the end of the example, which takes as input a dlnetwork object and a mini-batch of input data with corresponding labels, and returns the loss and the gradients of the loss with respect to the learnable parameters in the network.

Specify Training Options

Train for 30 epochs with a mini-batch size of 16.

numEpochs = 30;
miniBatchSize = 16;

Specify the options for Adam optimization. Specify an initial learning rate of 0.001 with a decay of 0.01, a gradient decay factor of 0.9, and a squared gradient decay factor of 0.999.

initialLearnRate = 0.001;
decay = 0.01;
gradientDecayFactor = 0.9;
squaredGradientDecayFactor = 0.999;

Train Model

Create a minibatchqueue object that processes and manages the mini-batches of data. For each mini-batch:

  • Use the custom mini-batch preprocessing function preprocessMiniBatch (defined at the end of this example) to convert documents to sequences and one-hot encode the labels. To pass the word encoding to the mini-batch, create an anonymous function that takes two inputs.

  • Format the predictors with the dimension labels "BTC" (batch, time, channel). The minibatchqueue object, by default, converts the data to dlarray objects with underlying type single.

  • Train on a GPU if one is available. The minibatchqueue object, by default, converts each output to gpuArray if a GPU is available. Using a GPU requires Parallel Computing Toolbox™ and a supported GPU device. For information on supported devices, see GPU Support by Release (Parallel Computing Toolbox).

mbq = minibatchqueue(dsTrain, ...
    MiniBatchSize=miniBatchSize, ...
    MiniBatchFcn=@(X,T) preprocessMiniBatch(X,T,enc), ...
    MiniBatchFormat=["BTC" ""]);

Create a minibatchqueue object for the validation documents. For each mini-batch:

  • Use the custom mini-batch preprocessing function preprocessMiniBatchPredictors (defined at the end of this example) to convert documents to sequences. This preprocessing function does not require label data. To pass the word encoding to the mini-batch, create an anonymous function that takes one input only.

  • Format the predictors with the dimension labels "BTC" (batch, time, channel). The minibatchqueue object, by default, converts the data to dlarray objects with underlying type single.

  • To make predictions for all observations, return any partial mini-batches.

mbqValidation = minibatchqueue(dsDocumentsValidation, ...
    MiniBatchSize=miniBatchSize, ...
    MiniBatchFcn=@(X) preprocessMiniBatchPredictors(X,enc), ...
    MiniBatchFormat="BTC", ...
    PartialMiniBatch="return");

To easily calculate the validation loss, convert the validation labels to one-hot encoded vectors and transpose the encoded labels to match the network output format.

TValidation = onehotencode(TValidation,2);
TValidation = TValidation';

Initialize the training progress plot.

figure
C = colororder;
lineLossTrain = animatedline(Color=C(2,:));
lineLossValidation = animatedline( ...
    LineStyle="--", ...
    Marker="o", ...
    MarkerFaceColor="black");

ylim([0 inf])
xlabel("Iteration")
ylabel("Loss")
grid on

Initialize the parameters for Adam.

trailingAvg = [];
trailingAvgSq = [];

Train the network. For each epoch, shuffle the data and loop over mini-batches of data. At the end of each iteration, display the training progress. At the end of each epoch, validate the network using the validation data.

For each mini-batch:

  • Convert the documents to sequences of integers and one-hot encode the labels.

  • Convert the data to dlarray objects with underlying type single and specify the dimension labels "BTC" (batch, time, channel).

  • For GPU training, convert to gpuArray objects.

  • Evaluate the model loss and gradients using dlfeval and the modelLoss function.

  • Determine the learning rate for the time-based decay learning rate schedule.

  • Update the network parameters using the adamupdate function.

  • Update the training plot.

iteration = 0;
start = tic;

% Loop over epochs.
for epoch = 1:numEpochs

    % Shuffle data.
    shuffle(mbq);

    % Loop over mini-batches.
    while hasdata(mbq)
        iteration = iteration + 1;

        % Read mini-batch of data.
        [X,T] = next(mbq);

        % Evaluate the model loss and gradients using dlfeval and the
        % modelLoss function.
        [loss,gradients] = dlfeval(@modelLoss,net,X,T);

        % Determine the learning rate for the time-based decay learning rate schedule.
        learnRate = initialLearnRate/(1 + decay*iteration);

        % Update the network parameters using the Adam optimizer.
        [net,trailingAvg,trailingAvgSq] = adamupdate(net, gradients, ...
            trailingAvg, trailingAvgSq, iteration, learnRate, ...
            gradientDecayFactor, squaredGradientDecayFactor);

        % Display the training progress.
        D = duration(0,0,toc(start),Format="hh:mm:ss");
        loss = double(loss);
        addpoints(lineLossTrain,iteration,loss)
        title("Epoch: " + epoch + ", Elapsed: " + string(D))
        drawnow

        % Validate network.
        if iteration == 1 || ~hasdata(mbq)
            [~,scoresValidation] = modelPredictions(net,mbqValidation,classes);
            lossValidation = crossentropy(scoresValidation,TValidation);

            % Update plot.
            lossValidation = double(lossValidation);
            addpoints(lineLossValidation,iteration,lossValidation)
            drawnow
        end
    end
end

Test Model

Test the classification accuracy of the model by comparing the predictions on the validation set with the true labels.

Classify the validation data using the modelPredictions function, listed at the end of the example.

YNew = modelPredictions(net,mbqValidation,classes);

To easily calculate the validation accuracy, convert the one-hot encoded validation labels to categorical and transpose.

TValidation = onehotdecode(TValidation,classes,1)';

Evaluate the classification accuracy.

accuracy = mean(YNew == TValidation)
accuracy = 0.8854
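To see which classes the network confuses, you can additionally plot a confusion chart. This step is not part of the original example; it uses the categorical predictions YNew and the decoded labels TValidation from above.

% Sketch: visualize per-class performance on the validation set.
figure
confusionchart(TValidation,YNew)
title("Validation Confusion Matrix")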

Predict Using New Data

Classify the event type of three new reports. Create a string array containing the new reports.

reportsNew = [
    "Coolant is pooling underneath sorter."
    "Sorter blows fuses at start up."
    "There are some very loud rattling sounds coming from the assembler."];

Preprocess the text data using the same preprocessing steps as the training documents.

documentsNew = preprocessText(reportsNew);
dsNew = arrayDatastore(documentsNew,OutputType="cell");

Create a minibatchqueue object that processes and manages the mini-batches of data. For each mini-batch:

  • Use the custom mini-batch preprocessing function preprocessMiniBatchPredictors (defined at the end of this example) to convert documents to sequences. This preprocessing function does not require label data. To pass the word encoding to the mini-batch, create an anonymous function that takes one input only.

  • Format the predictors with the dimension labels "BTC" (batch, time, channel). The minibatchqueue object, by default, converts the data to dlarray objects with underlying type single.

  • To make predictions for all observations, return any partial mini-batches.

mbqNew = minibatchqueue(dsNew, ...
    MiniBatchSize=miniBatchSize, ...
    MiniBatchFcn=@(X) preprocessMiniBatchPredictors(X,enc), ...
    MiniBatchFormat="BTC", ...
    PartialMiniBatch="return");

Classify the text data using the modelPredictions function, listed at the end of the example, and find the classes with the highest scores.

YNew = modelPredictions(net,mbqNew,classes)
YNew = 3×1 categorical
     Leak 
     Electronic Failure 
     Mechanical Failure 

Supporting Functions

Text Preprocessing Function

The function preprocessText performs these steps:

  1. Tokenize the text using tokenizedDocument.

  2. Convert the text to lowercase using lower.

  3. Erase the punctuation using erasePunctuation.

function documents = preprocessText(textData)

% Tokenize the text.
documents = tokenizedDocument(textData);

% Convert to lowercase.
documents = lower(documents);

% Erase punctuation.
documents = erasePunctuation(documents);

end

Mini-Batch Preprocessing Function

The preprocessMiniBatch function converts a mini-batch of documents to sequences of integers and one-hot encodes the label data.

function [X,T] = preprocessMiniBatch(dataX,dataT,enc)

% Preprocess predictors.
X = preprocessMiniBatchPredictors(dataX,enc);

% Extract labels from cell and concatenate.
T = cat(1,dataT{1:end});

% One-hot encode labels.
T = onehotencode(T,2);

% Transpose the encoded labels to match the network output.
T = T';

end

Mini-Batch Predictors Preprocessing Function

The preprocessMiniBatchPredictors function converts a mini-batch of documents to sequences of integers.

function X = preprocessMiniBatchPredictors(dataX,enc)

% Extract documents from cell and concatenate.
documents = cat(4,dataX{1:end});

% Convert documents to sequences of integers.
X = doc2sequence(enc,documents);
X = cat(1,X{:});

end

Model Loss Function

The modelLoss function takes as input a dlnetwork object net and a mini-batch of input data X with corresponding target labels T, and returns the loss and the gradients of the loss with respect to the learnable parameters in net. To compute the gradients automatically, use the dlgradient function.

function [loss,gradients] = modelLoss(net,X,T)

Y = forward(net,X);

loss = crossentropy(Y,T);
gradients = dlgradient(loss,net.Learnables);

end

Model Predictions Function

The modelPredictions function takes as input a dlnetwork object net and a mini-batch queue, and outputs the model predictions and scores by iterating over the mini-batches in the queue.

function [predictions,scores] = modelPredictions(net,mbq,classes)

% Initialize predictions.
predictions = [];
scores = [];

% Reset mini-batch queue.
reset(mbq);

% Loop over mini-batches.
while hasdata(mbq)

    % Make predictions.
    X = next(mbq);
    Y = predict(net,X);
    scores = [scores Y];

    Y = onehotdecode(Y,classes,1)';
    predictions = [predictions; Y];

end

end
