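Activity Recognition from Video and Optical Flow Data Using Deep Learning

This example first shows how to perform activity recognition using a pretrained Inflated 3-D (I3D) two-stream convolutional neural network video classifier, and then shows how to use transfer learning to train such a video classifier using RGB and optical flow data from videos [1].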

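Overview

Vision-based activity recognition involves predicting the action of an object, such as walking, swimming, or sitting, from a set of video frames. Activity recognition from video has many applications, including human-computer interaction, robot learning, anomaly detection, surveillance, and object detection. For example, online prediction of multiple actions in incoming video from multiple cameras is important for robot learning. Compared to image classification, action recognition from video is challenging to model because of inaccurate ground truth data in video data sets, the variety of gestures that actors in a video can perform, heavily class-imbalanced data sets, and the large amount of data required to train a robust classifier from scratch. Deep learning techniques, such as I3D two-stream convolutional networks [1], R(2+1)D [4], and SlowFast [5], have shown improved performance on smaller data sets when using transfer learning with networks pretrained on large video activity recognition data sets, such as Kinetics-400 [6].

Note: This example requires the Computer Vision Toolbox™ Model for Inflated-3D Video Classification. You can install the Computer Vision Toolbox Model for Inflated-3D Video Classification from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons.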

Perform Activity Recognition Using a Pretrained Inflated-3D Video Classifier

Download the pretrained Inflated-3D video classifier along with a video file on which to perform activity recognition. The size of the downloaded zip file is around 89 MB.

downloadFolder = fullfile(tempdir,"hmdb51","pretrained","I3D");
if ~isfolder(downloadFolder)
    mkdir(downloadFolder);
end

filename = "activityRecognition-I3D-HMDB51-21b.zip";
zipFile = fullfile(downloadFolder,filename);

if ~isfile(zipFile)
    disp("Downloading the pretrained network...");
    downloadURL = "https://ssd.mathworks.com/supportfiles/vision/data/" + filename;
    websave(zipFile,downloadURL);
    unzip(zipFile,downloadFolder);
end

Load the pretrained Inflated-3D video classifier.

pretrainedDataFile = fullfile(downloadFolder,"inflated3d-FiveClasses-hmdb51.mat");
pretrained = load(pretrainedDataFile);
inflated3dPretrained = pretrained.data.inflated3d;

Display the class label names of the pretrained video classifier.

classes = inflated3dPretrained.Classes
classes = 5×1 categorical
     kiss
     laugh
     pick
     pour
     pushup

Read and display the video pour.avi using VideoReader and vision.VideoPlayer.

videoFilename = fullfile(downloadFolder,"pour.avi");

videoReader = VideoReader(videoFilename);
videoPlayer = vision.VideoPlayer;
videoPlayer.Name = "pour";

while hasFrame(videoReader)
    frame = readFrame(videoReader);
    % Resize the frame for display.
    frame = imresize(frame,1.5);
    step(videoPlayer,frame);
end
release(videoPlayer);

To find the action class that is predominant in the video, classify 10 randomly selected video sequences that uniformly cover the entire file.

numSequences = 10;

Classify the video file using the classifyVideoFile function.

[actionLabel,score] = classifyVideoFile(inflated3dPretrained,videoFilename,"NumSequences",numSequences)

actionLabel = categorical
     pour
score = single
    0.4482

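Train a Video Classifier for Gesture Recognition

This section of the example shows how the video classifier shown above is trained using transfer learning. Set the doTraining variable to false to use the pretrained video classifier without having to wait for training to complete. Alternatively, if you want to train the video classifier, set the doTraining variable to true.

doTraining = false;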

Download Training and Validation Data

This example trains an Inflated-3D (I3D) video classifier using the HMDB51 data set. Use the downloadHMDB51 supporting function, listed at the end of this example, to download the HMDB51 data set to a folder named hmdb51.

downloadFolder = fullfile(tempdir,"hmdb51");
downloadHMDB51(downloadFolder);

After the download is complete, extract the RAR file hmdb51_org.rar to the hmdb51 folder. Next, use the checkForHMDB51Folder supporting function, listed at the end of this example, to confirm that the downloaded and extracted files are in place.

allClasses = checkForHMDB51Folder(downloadFolder);

The data set contains about 2 GB of video data for 7000 clips over 51 classes, such as drink, run, and shake hands. Each video frame has a height of 240 pixels and a minimum width of 176 pixels. The number of frames per clip ranges from 18 to approximately 1000.

To reduce training time, this example trains an activity recognition network to classify 5 action classes instead of all 51 classes in the data set. Set useAllData to true to train with all 51 classes.

useAllData = false;

if useAllData
    classes = allClasses;
end
dataFolder = fullfile(downloadFolder,"hmdb51_org");

Split the data set into a training set for training the classifier and a test set for evaluating the classifier. Use 80% of the data for the training set and the rest for the test set. Use folders2labels to create label information from the folders, and splitlabels to split the data into training and test sets by randomly selecting a proportion of files from each label.

[labels,files] = folders2labels(fullfile(dataFolder,string(classes)),...
    "IncludeSubfolders",true,...
    "FileExtensions",".avi");

indices = splitlabels(labels,0.8,"randomized");

trainFilenames = files(indices{1});
testFilenames = files(indices{2});

To normalize the input data for the network, the minimum and maximum values for the data set are provided in the MAT file inputStatistics.mat, attached to this example. To find the minimum and maximum values for a different data set, use the inputStatistics supporting function, listed at the end of this example.

inputStatsFilename = "inputStatistics.mat";
if ~exist(inputStatsFilename,"file")
    disp("Reading all the training data for input statistics...")
    inputStats = inputStatistics(dataFolder);
else
    d = load(inputStatsFilename);
    inputStats = d.inputStats;
end

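Load Dataset

This example uses a datastore to read the video scenes, the corresponding optical flow data, and the corresponding labels from the video files.

Specify the number of video frames the datastore should output each time data is read from it.

numFrames = 64;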

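A value of 64 is used here to balance memory usage and classification time. Common values to consider are 16, 32, 64, or 128. Using more frames helps capture additional temporal information, but requires more memory. You might need to lower this value depending on your system resources. Empirical analysis is required to determine the optimal number of frames.

Next, specify the height and width of the frames the datastore should output. The datastore automatically resizes the raw video frames to the specified size to enable batch processing of multiple video sequences.

frameSize = [112,112];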

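A value of [112 112] is used to capture longer temporal relationships in the video scene, which helps classify activities with long time durations. Common values for the size are [112 112], [224 224], or [256 256]. A smaller frame size lets you use more video frames within the same memory and processing budget, at the cost of spatial resolution. The minimum height and width of the video frames in the HMDB51 data set are 240 and 176, respectively. If you want the datastore to read frames at a size larger than these minimum values, such as [256, 256], first resize the frames using imresize. As with the number of frames, empirical analysis is required to determine the optimal values.

Specify the number of channels as 3 for the RGB video subnetwork, and 2 for the optical flow subnetwork of the I3D video classifier. The two channels of optical flow data are the x and y components of velocity, $V_x$ and $V_y$, respectively.

rgbChannels = 3;
flowChannels = 2;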
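As a rough illustration of the memory trade-off discussed above, the following sketch estimates how much input data one mini-batch holds. It assumes single-precision storage (4 bytes per element) and uses the mini-batch size of 20 from the training options later in this example. It counts only the raw input arrays, not activations or gradients, so treat it as a lower bound rather than a measurement.

```matlab
% Rough estimate of the input data held in one mini-batch.
% Assumption: inputs are stored as single precision (4 bytes per element).
frameSize     = [112, 112];
numFrames     = 64;
rgbChannels   = 3;
flowChannels  = 2;
miniBatchSize = 20;    % value used later in the training options

bytesPerElement = 4;   % single precision

% Elements in one RGB clip and one optical flow clip (H x W x C x T).
rgbElements  = prod([frameSize, rgbChannels,  numFrames]);
flowElements = prod([frameSize, flowChannels, numFrames]);

% Total bytes for both streams across the whole mini-batch.
batchBytes = (rgbElements + flowElements) * bytesPerElement * miniBatchSize;
batchMiB   = batchBytes / 2^20;

fprintf("Input data per mini-batch: %.2f MiB\n", batchMiB);   % prints 306.25 MiB
```

Doubling numFrames or doubling both frame dimensions scales this figure by 2 and 4 respectively, which is why these two settings are usually tuned together.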

Use the helper function createFileDatastore to configure two FileDatastore objects for loading the data, one for training and another for validation. The helper function is listed at the end of this example. Each datastore reads a video file to provide RGB data and the corresponding label information.

isDataForTraining = true;
dsTrain = createFileDatastore(trainFilenames,numFrames,rgbChannels,classes,isDataForTraining);

isDataForTraining = false;
dsVal = createFileDatastore(testFilenames,numFrames,rgbChannels,classes,isDataForTraining);

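Define Network Architecture

I3D Network

Using a 3-D CNN is a natural approach to extracting spatio-temporal features from videos. You can create an I3D network from a pretrained 2-D image classification network, such as Inception v1 or ResNet-50, by expanding the 2-D filters and pooling kernels into 3-D. This procedure reuses the weights learned from the image classification task to bootstrap the video recognition task.

The following figure is a sample showing how to inflate a 2-D convolution layer into a 3-D convolution layer. The inflation involves expanding the filter size, weights, and bias by adding a third dimension (the temporal dimension).

[Figure: inflating a 2-D convolution layer into a 3-D convolution layer]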
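The inflation step can be sketched with plain arrays. The snippet below is illustrative only: the filter sizes and the temporal extent T are hypothetical, and dividing by T follows the rescaling described in the I3D paper [1], not code taken from the toolbox. The weight layout (height, width, time, input channels, filters) is the one used by 3-D convolution layers in Deep Learning Toolbox.

```matlab
% Hypothetical 2-D convolution weights: height x width x inChannels x numFilters.
H = 3; W = 3; Cin = 3; Cout = 8;
w2d = rand(H, W, Cin, Cout);
b2d = rand(1, 1, Cout);

% Inflate to 3-D: height x width x time x inChannels x numFilters.
% Each 2-D filter is repeated T times along the new time dimension and
% divided by T, so a clip of T identical frames gives the same response
% as the 2-D filter gives on a single frame.
T = 3;   % temporal extent of the inflated filter (assumed for illustration)
w3d = repmat(reshape(w2d, [H, W, 1, Cin, Cout]), [1, 1, T, 1, 1]) / T;

% The bias only gains a singleton time dimension.
b3d = reshape(b2d, [1, 1, 1, Cout]);

% Sanity check: summing the inflated filter over time recovers the 2-D filter.
recovered = reshape(sum(w3d, 3), size(w2d));
maxError  = max(abs(recovered - w2d), [], "all");
```

The sanity check at the end is the property that makes the pretrained image weights a sensible starting point for video.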

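Two-Stream I3D Network

Video data can be considered to have two parts: a spatial component and a temporal component.

  • The spatial component comprises information about the shape, texture, and color of the objects in the video. RGB data contains this information.

  • The temporal component comprises information about the motion of objects across frames and depicts important movements between the camera and the objects in a scene. Computing optical flow is a common technique for extracting temporal information from a video.

A two-stream CNN incorporates a spatial subnetwork and a temporal subnetwork [2]. With limited training data, a convolutional neural network trained on dense optical flow together with a video data stream can achieve better performance than one trained on raw stacked RGB frames alone. The following illustration shows a typical two-stream I3D network.

[Figure: a typical two-stream I3D network]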

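Configure Inflated-3D (I3D) Video Classifier for Transfer Learning

In this example, you create an I3D video classifier based on the GoogLeNet architecture, a 3-D convolutional neural network video classifier pretrained on the Kinetics-400 data set.

Specify GoogLeNet as the backbone convolutional neural network architecture for the I3D video classifier, which contains two subnetworks, one for video data and another for optical flow data.

baseNetwork = "googlenet-video-flow";

Specify the input size for the Inflated-3D video classifier.

inputSize = [frameSize, rgbChannels, numFrames];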

Obtain the minimum and maximum values for the RGB and optical flow data from the inputStats structure loaded from the inputStatistics.mat file. These values are needed to normalize the input data.

oflowMin = squeeze(inputStats.oflowMin)';
oflowMax = squeeze(inputStats.oflowMax)';
rgbMin = squeeze(inputStats.rgbMin)';
rgbMax = squeeze(inputStats.rgbMax)';

stats.Video.Min = rgbMin;
stats.Video.Max = rgbMax;
stats.Video.Mean = [];
stats.Video.StandardDeviation = [];

stats.OpticalFlow.Min = oflowMin(1:flowChannels);
stats.OpticalFlow.Max = oflowMax(1:flowChannels);
stats.OpticalFlow.Mean = [];
stats.OpticalFlow.StandardDeviation = [];

Create the I3D video classifier by using the inflated3dVideoClassifier function.

i3d = inflated3dVideoClassifier(baseNetwork,string(classes),...
    "InputSize",inputSize,...
    "InputNormalizationStatistics",stats);

Specify a model name for the video classifier.

i3d.ModelName = "Inflated-3D Activity Recognizer Using Video and Optical Flow";

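Augment and Preprocess Training Data

Data augmentation provides a way to train with limited data sets. Augmentation of video data must be the same across a collection of frames, i.e. a video sequence, based on the network input size. Minor changes, such as translating, cropping, or transforming an image, provide new and distinct images that you can use to train a robust video classifier. Datastores are a convenient way to read and augment collections of data. Augment the training video data by using the augmentVideo supporting function, defined at the end of this example.

dsTrain = transform(dsTrain, @augmentVideo);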

Preprocess the training video data to resize it to the Inflated-3D video classifier input size by using the preprocessVideoClips function, defined at the end of this example. Pass the InputNormalizationStatistics property of the video classifier and the input size to the preprocessing function as field values in a struct, preprocessInfo. The InputNormalizationStatistics property is used to rescale the video frames and optical flow data between -1 and 1. The input size is used to resize the video frames using imresize, based on the SizingOption value in the info struct. Alternatively, you can use "randomcrop" or "centercrop" to randomly crop or center crop the input data to the input size of the video classifier. Note that data augmentation is not applied to the test and validation data. Ideally, test and validation data should be representative of the original data and left unmodified for unbiased evaluation.

preprocessInfo.Statistics = i3d.InputNormalizationStatistics;
preprocessInfo.InputSize = inputSize;
preprocessInfo.SizingOption = "resize";
dsTrain = transform(dsTrain, @(data)preprocessVideoClips(data, preprocessInfo));
dsVal = transform(dsVal, @(data)preprocessVideoClips(data, preprocessInfo));

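Define Model Gradients Function

Create the supporting function modelGradients, listed at the end of this example. The modelGradients function takes as input the I3D video classifier i3d, a mini-batch of input data dlRGB and dlFlow, and a mini-batch of ground truth label data dlY. The function returns the training loss value, the gradients of the loss with respect to the learnable parameters of the classifier, and the mini-batch accuracy of the classifier.

The loss is calculated by computing the average of the cross-entropy losses of the predictions from each of the subnetworks. The output predictions of the network are probabilities between 0 and 1 for each of the classes.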

$$\mathrm{rgbLoss} = \operatorname{crossentropy}(\mathrm{rgbPrediction})$$

$$\mathrm{flowLoss} = \operatorname{crossentropy}(\mathrm{flowPrediction})$$

$$\mathrm{loss} = \operatorname{mean}([\mathrm{rgbLoss},\ \mathrm{flowLoss}])$$

The accuracy of the classifier is calculated by averaging the RGB and optical flow predictions and comparing the result to the ground truth labels of the inputs.
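The fused loss and accuracy can be illustrated with plain numeric arrays. This is a minimal sketch, not the toolbox implementation: the probabilities are made-up values for three classes and two observations, and the anonymous function ce is a simple stand-in for crossentropy that averages over the batch.

```matlab
% One-hot ground truth, classes x batch (observation 1 is class 1,
% observation 2 is class 2). All values below are made up for illustration.
Y = [1 0;
     0 1;
     0 0];

% Hypothetical softmax outputs of the RGB and optical flow subnetworks.
pRGB  = [0.7 0.2;
         0.2 0.5;
         0.1 0.3];
pFlow = [0.5 0.3;
         0.3 0.3;
         0.2 0.4];

% Stand-in for crossentropy: mean over the batch of -sum(y .* log(p)).
ce = @(p, y) -mean(sum(y .* log(p), 1));

rgbLoss  = ce(pRGB,  Y);
flowLoss = ce(pFlow, Y);
loss     = mean([rgbLoss, flowLoss]);    % fused loss

% Fuse the predictions by averaging, then compare with the ground truth.
pFused = (pRGB + pFlow) / 2;
[~, yTrue] = max(Y, [], 1);
[~, yPred] = max(pFused, [], 1);
acc = mean(yTrue == yPred);
```

In this toy batch the flow stream alone misclassifies the second observation, while the averaged prediction is correct for both, which is the motivation for fusing the two streams.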

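Specify Training Options

Train with a mini-batch size of 20 for 600 iterations. Specify the iteration after which to save the video classifier with the best validation accuracy by using the SaveBestAfterIteration parameter.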

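Specify the cosine-annealing learning rate schedule [3] parameters:

  • A minimum learning rate of 1e-4.

  • A maximum learning rate of 1e-3.

  • Cosine cycle lengths of 100, 200, and 300 iterations, after each of which the learning rate schedule restarts. The CosineNumIterations option defines the width of each cosine cycle.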
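To make the schedule concrete, the sketch below evaluates the cosine curve over a single cycle of 100 iterations using the minimum and maximum rates listed above. It is a simplified illustration with a zero-based position t inside the cycle; the full cosineAnnealingLearnRate supporting function at the end of this example additionally handles the restarts between cycles.

```matlab
% Cosine annealing over one cycle, using the limits listed above.
minR     = 1e-4;
maxR     = 1e-3;
cycleLen = 100;            % width of the first cosine cycle

t  = 0:cycleLen-1;         % zero-based position within the cycle (assumed convention)
lr = minR + (maxR - minR) * (1 + cos(pi * t / cycleLen)) / 2;

lrStart = lr(1);           % t = 0: the cycle starts at the maximum rate
lrMid   = lr(51);          % t = 50: halfway between the two limits
lrEnd   = lr(end);         % t = 99: just above the minimum rate
```

The rate starts each cycle at the maximum, passes through the midpoint 5.5e-4 halfway, and approaches the minimum just before the next restart.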

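Specify the parameters for SGDM optimization. Initialize the SGDM optimization parameters at the beginning of training:

  • A momentum of 0.9.

  • An initial velocity parameter initialized as [].

  • An L2 regularization factor of 0.0005.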
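For intuition, one SGDM step with L2 regularization can be written out on a toy weight vector. This sketch uses the classical momentum form (velocity = momentum × velocity − learning rate × gradient), which is assumed here to be equivalent to what sgdmupdate computes; the weights and gradients are made-up numbers.

```matlab
% Hyperparameters from the list above; learnRate is the maximum rate.
momentum  = 0.9;
learnRate = 1e-3;
l2Factor  = 0.0005;

% Toy weights, gradients, and an all-zero initial velocity (made-up values).
w = [1; -2];
g = [0.5; 0.1];
v = zeros(size(w));

% L2 regularization adds l2Factor * w to the gradient of the weights.
g = g + l2Factor * w;

% Classical momentum update (assumed form of the SGDM step).
v = momentum * v - learnRate * g;
w = w + v;
```

With a zero initial velocity, the first step reduces to plain gradient descent on the regularized gradient; the momentum term only takes effect from the second step onward.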

Specify whether to dispatch the data in the background using a parallel pool. If DispatchInBackground is set to true, the example opens a parallel pool with the specified number of parallel workers and creates a DispatchInBackgroundDatastore, which dispatches the data in the background to speed up training through asynchronous data loading and preprocessing. By default, this example uses a GPU if one is available. Otherwise, it uses the CPU. Using a GPU requires Parallel Computing Toolbox™ and a CUDA® enabled NVIDIA® GPU. For information about the supported compute capabilities, see GPU Computing Requirements (Parallel Computing Toolbox).

params.Classes = classes;
params.MiniBatchSize = 20;
params.NumIterations = 600;
params.SaveBestAfterIteration = 400;
params.CosineNumIterations = [100, 200, 300];
params.MinLearningRate = 1e-4;
params.MaxLearningRate = 1e-3;
params.Momentum = 0.9;
params.VelocityRGB = [];
params.VelocityFlow = [];
params.L2Regularization = 0.0005;
params.ProgressPlot = true;
params.Verbose = true;
params.ValidationData = dsVal;
params.DispatchInBackground = false;
params.NumWorkers = 4;

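Train I3D Video Classifier

Train the I3D video classifier using the RGB video data and optical flow data.

For each epoch:

  • Shuffle the data before looping over mini-batches of data.

  • Use minibatchqueue to loop over the mini-batches. The supporting function createMiniBatchQueue, listed at the end of this example, uses the given training datastore to create a minibatchqueue.

  • Use the validation data dsVal to validate the networks.

  • Display the loss and accuracy results for each epoch using the supporting function displayVerboseOutputEveryEpoch, listed at the end of this example.

For each mini-batch:

  • Convert the video data or optical flow data and the labels to dlarray objects with the underlying type single.

  • To enable processing the time dimension of the video data using the I3D video classifier, specify the temporal sequence dimension, "T". Specify the dimension labels "SSCTB" (spatial, spatial, channel, temporal, batch) for the video data, and "CB" for the label data.

The minibatchqueue object uses the supporting function batchVideoAndFlow, listed at the end of this example, to batch the RGB video and optical flow data.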

params.ModelFilename = "inflated3d-FiveClasses-hmdb51.mat";
if doTraining
    epoch = 1;
    bestLoss = realmax;

    accTrain = [];
    accTrainRGB = [];
    accTrainFlow = [];
    lossTrain = [];

    iteration = 1;
    start = tic;
    trainTime = start;
    shuffled = shuffleTrainDs(dsTrain);

    % Number of outputs is three: one for RGB frames, one for optical flow
    % data, and one for ground truth labels.
    numOutputs = 3;
    mbq = createMiniBatchQueue(shuffled, numOutputs, params);

    % Use the initializeTrainingProgressPlot and initializeVerboseOutput
    % supporting functions, listed at the end of the example, to initialize
    % the training progress plot and verbose output to display the training
    % loss, training accuracy, and validation accuracy.
    plotters = initializeTrainingProgressPlot(params);
    initializeVerboseOutput(params);

    while iteration <= params.NumIterations

        % Iterate through the data set.
        [dlVideo,dlFlow,dlY] = next(mbq);

        % Evaluate the model gradients and loss using dlfeval.
        [gradRGB,gradFlow,loss,acc,accRGB,accFlow,stateRGB,stateFlow] = ...
            dlfeval(@modelGradients,i3d,dlVideo,dlFlow,dlY);

        % Accumulate the loss and accuracies.
        lossTrain = [lossTrain, loss];
        accTrain = [accTrain, acc];
        accTrainRGB = [accTrainRGB, accRGB];
        accTrainFlow = [accTrainFlow, accFlow];

        % Update the network state.
        i3d.VideoState = stateRGB;
        i3d.OpticalFlowState = stateFlow;

        % Update the gradients and parameters for the RGB and optical flow
        % subnetworks using the SGDM optimizer.
        [i3d.VideoLearnables,params.VelocityRGB] = ...
            updateLearnables(i3d.VideoLearnables,gradRGB,params,params.VelocityRGB,iteration);
        [i3d.OpticalFlowLearnables,params.VelocityFlow,learnRate] = ...
            updateLearnables(i3d.OpticalFlowLearnables,gradFlow,params,params.VelocityFlow,iteration);

        if ~hasdata(mbq) || iteration == params.NumIterations
            % Current epoch is complete. Do validation and update progress.
            trainTime = toc(trainTime);

            [validationTime,cmat,lossValidation,accValidation,accValidationRGB,accValidationFlow] = ...
                doValidation(params, i3d);

            accTrain = mean(accTrain);
            accTrainRGB = mean(accTrainRGB);
            accTrainFlow = mean(accTrainFlow);
            lossTrain = mean(lossTrain);

            % Update the training progress.
            displayVerboseOutputEveryEpoch(params,start,learnRate,epoch,iteration,...
                accTrain,accTrainRGB,accTrainFlow,...
                accValidation,accValidationRGB,accValidationFlow,...
                lossTrain,lossValidation,trainTime,validationTime);
            updateProgressPlot(params,plotters,epoch,iteration,start,lossTrain,accTrain,accValidation);

            % Save the trained video classifier and the parameters that gave
            % the best validation loss so far. Use the saveData supporting
            % function, listed at the end of this example.
            bestLoss = saveData(i3d,bestLoss,iteration,cmat,lossTrain,lossValidation,...
                accTrain,accValidation,params);
        end

        if ~hasdata(mbq) && iteration < params.NumIterations
            % Current epoch is complete. Initialize the training loss, accuracy
            % values, and minibatchqueue for the next epoch.
            accTrain = [];
            accTrainRGB = [];
            accTrainFlow = [];
            lossTrain = [];

            trainTime = tic;
            epoch = epoch + 1;
            shuffled = shuffleTrainDs(dsTrain);
            numOutputs = 3;
            mbq = createMiniBatchQueue(shuffled, numOutputs, params);
        end

        iteration = iteration + 1;
    end

    % Display a message when training is complete.
    endVerboseOutput(params);

    disp("Model saved to: " + params.ModelFilename);
end

Evaluate Trained Network

Use the test data set to evaluate the accuracy of the trained video classifier.

Load the best model saved during training or use the pretrained model.

if doTraining
    transferLearned = load(params.ModelFilename);
    inflated3dPretrained = transferLearned.data.inflated3d;
end

Create a minibatchqueue object to load batches of the test data.

numOutputs = 3;
mbq = createMiniBatchQueue(params.ValidationData, numOutputs, params);

For each batch of test data, make predictions using the RGB and optical flow networks, take the average of the predictions, and compute the prediction accuracy using a confusion matrix.

numClasses = numel(classes);
cmat = sparse(numClasses,numClasses);
while hasdata(mbq)
    [dlRGB, dlFlow, dlY] = next(mbq);

    % Pass the video input as RGB and optical flow data through the
    % two-stream I3D video classifier to get the separate predictions.
    [dlYPredRGB,dlYPredFlow] = predict(inflated3dPretrained,dlRGB,dlFlow);

    % Fuse the predictions by calculating the average of the predictions.
    dlYPred = (dlYPredRGB + dlYPredFlow)/2;

    % Calculate the accuracy of the predictions.
    [~,YTest] = max(dlY,[],1);
    [~,YPred] = max(dlYPred,[],1);

    cmat = aggregateConfusionMetric(cmat,YTest,YPred);
end

Compute the average classification accuracy for the trained networks.

accuracyEval = sum(diag(cmat))./sum(cmat,"all")
accuracyEval = 0.8850

Display the confusion matrix.

figure
chart = confusionchart(cmat,classes);

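An Inflated-3D video classifier pretrained on the Kinetics-400 data set provides better performance for human activity recognition with transfer learning. The training above was run on a 24 GB Titan-X GPU for about 100 minutes. When training from scratch on a small activity recognition video data set, training and convergence take much longer than with the pretrained video classifier. Transfer learning with the Kinetics-400 pretrained Inflated-3D video classifier also avoids overfitting the classifier when training runs for a large number of epochs. However, the SlowFast video classifier and the R(2+1)D video classifier pretrained on the Kinetics-400 data set provide better performance and faster convergence during training than the Inflated-3D video classifier. To learn more about video recognition using deep learning, see Getting Started with Video Classification Using Deep Learning.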

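Supporting Functions

inputStatistics

The inputStatistics function takes as input the name of the folder containing the HMDB51 data, and calculates the minimum and maximum values for the RGB data and the optical flow data. The minimum and maximum values are used as normalization inputs to the input layer of the networks. This function also obtains the number of frames in each of the video files to use later when training and testing the network. To find the minimum and maximum values for a different data set, call this function with the name of the folder containing that data set.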

function inputStats = inputStatistics(dataFolder)
    ds = createDatastore(dataFolder);
    ds.ReadFcn = @getMinMax;

    tic;
    tt = tall(ds);
    varnames = {'rgbMax','rgbMin','oflowMax','oflowMin'};
    stats = gather(groupsummary(tt,[],{'max','min'}, varnames));
    inputStats.Filename = gather(tt.Filename);
    inputStats.NumFrames = gather(tt.NumFrames);
    inputStats.rgbMax = stats.max_rgbMax;
    inputStats.rgbMin = stats.min_rgbMin;
    inputStats.oflowMax = stats.max_oflowMax;
    inputStats.oflowMin = stats.min_oflowMin;
    save('inputStatistics.mat','inputStats');
    toc;
end

function data = getMinMax(filename)
    reader = VideoReader(filename);
    opticFlow = opticalFlowFarneback;
    data = [];
    while hasFrame(reader)
        frame = readFrame(reader);
        [rgb,oflow] = findMinMax(frame,opticFlow);
        data = assignMinMax(data, rgb, oflow);
    end

    totalFrames = floor(reader.Duration * reader.FrameRate);
    totalFrames = min(totalFrames, reader.NumFrames);

    [labelName, filename] = getLabelFilename(filename);
    data.Filename = fullfile(labelName, filename);
    data.NumFrames = totalFrames;

    data = struct2table(data,'AsArray',true);
end

function [labelName, filename] = getLabelFilename(filename)
    fileNameSplit = split(filename,'/');
    labelName = fileNameSplit{end-1};
    filename = fileNameSplit{end};
end

function data = assignMinMax(data, rgb, oflow)
    if isempty(data)
        data.rgbMax = rgb.Max;
        data.rgbMin = rgb.Min;
        data.oflowMax = oflow.Max;
        data.oflowMin = oflow.Min;
        return;
    end
    data.rgbMax = max(data.rgbMax, rgb.Max);
    data.rgbMin = min(data.rgbMin, rgb.Min);

    data.oflowMax = max(data.oflowMax, oflow.Max);
    data.oflowMin = min(data.oflowMin, oflow.Min);
end

function [rgbMinMax,oflowMinMax] = findMinMax(rgb, opticFlow)
    rgbMinMax.Max = max(rgb,[],[1,2]);
    rgbMinMax.Min = min(rgb,[],[1,2]);

    gray = rgb2gray(rgb);
    flow = estimateFlow(opticFlow,gray);
    oflow = cat(3,flow.Vx,flow.Vy,flow.Magnitude);

    oflowMinMax.Max = max(oflow,[],[1,2]);
    oflowMinMax.Min = min(oflow,[],[1,2]);
end

function ds = createDatastore(folder)
    ds = fileDatastore(folder,...
        'IncludeSubfolders', true,...
        'FileExtensions', '.avi',...
        'UniformRead', true,...
        'ReadFcn', @getMinMax);
    disp("NumFiles: " + numel(ds.Files));
end

createFileDatastore

The createFileDatastore function creates a FileDatastore object using the given file names. The FileDatastore object reads the data in 'partialfile' mode, so every read can return a partial set of frames from a video. This feature helps with reading large video files when all of the frames do not fit in memory.

function datastore = createFileDatastore(trainingFolder,numFrames,numChannels,classes,isDataForTraining)
    readFcn = @(f,u)readVideo(f,u,numFrames,numChannels,classes,isDataForTraining);
    datastore = fileDatastore(trainingFolder,...
        'IncludeSubfolders',true,...
        'FileExtensions','.avi',...
        'ReadFcn',readFcn,...
        'ReadMode','partialfile');
end

shuffleTrainDs

The shuffleTrainDs function shuffles the files present in the training datastore dsTrain.

function shuffled = shuffleTrainDs(dsTrain)
    shuffled = copy(dsTrain);
    transformed = isa(shuffled, 'matlab.io.datastore.TransformedDatastore');
    if transformed
        files = shuffled.UnderlyingDatastores{1}.Files;
    else
        files = shuffled.Files;
    end
    n = numel(files);
    shuffledIndices = randperm(n);
    if transformed
        shuffled.UnderlyingDatastores{1}.Files = files(shuffledIndices);
    else
        shuffled.Files = files(shuffledIndices);
    end

    reset(shuffled);
end

readVideo

The readVideo function reads video frames and the corresponding label values for a given video file. During training, the read function reads a specific number of frames as per the network input size, with a randomly chosen starting frame. During testing, all the frames are read sequentially. The video frames are resized to the required classifier network input size for training, and for testing and validation.

function [data,userdata,done] = readVideo(filename,userdata,numFrames,numChannels,classes,isDataForTraining)
    if isempty(userdata)
        userdata.reader = VideoReader(filename);
        userdata.batchesRead = 0;

        userdata.label = getLabel(filename,classes);

        totalFrames = floor(userdata.reader.Duration * userdata.reader.FrameRate);
        totalFrames = min(totalFrames, userdata.reader.NumFrames);
        userdata.totalFrames = totalFrames;
        userdata.datatype = class(read(userdata.reader,1));
    end
    reader = userdata.reader;
    totalFrames = userdata.totalFrames;
    label = userdata.label;
    batchesRead = userdata.batchesRead;

    if isDataForTraining
        video = readForTraining(reader, numFrames, totalFrames);
    else
        video = readForValidation(reader, userdata.datatype, numChannels, numFrames, totalFrames);
    end

    data = {video, label};

    batchesRead = batchesRead + 1;

    userdata.batchesRead = batchesRead;

    if numFrames > totalFrames
        numBatches = 1;
    else
        numBatches = floor(totalFrames/numFrames);
    end
    % Set the done flag to true if the reader has read all the frames or
    % if it is training.
    done = batchesRead == numBatches || isDataForTraining;
end

readForTraining

The readForTraining function reads the video frames for training the video classifier. The function reads a specific number of frames as per the network input size, with a randomly chosen starting frame. If there are not enough frames, the video sequence is repeated to pad the required number of frames.

function video = readForTraining(reader, numFrames, totalFrames)
    if numFrames >= totalFrames
        startIdx = 1;
        endIdx = totalFrames;
    else
        startIdx = randperm(totalFrames - numFrames + 1);
        startIdx = startIdx(1);
        endIdx = startIdx + numFrames - 1;
    end
    video = read(reader,[startIdx,endIdx]);
    if numFrames > totalFrames
        % Add more frames to fill in the network input size.
        additional = ceil(numFrames/totalFrames);
        video = repmat(video,1,1,1,additional);
        video = video(:,:,:,1:numFrames);
    end
end
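The padding step is easier to follow on a toy example. The sketch below replaces real frames with the frame indices 1 to 5 (a 1-by-1-by-1-by-5 array, purely for illustration) and pads the clip to 12 frames in the same way as the repmat-and-trim logic above.

```matlab
% Toy "video" whose frames are just their own indices: 1 x 1 x 1 x 5.
totalFrames = 5;
numFrames   = 12;          % frames the network expects (illustrative value)
video = reshape(1:totalFrames, 1, 1, 1, totalFrames);

% Repeat the whole clip enough times to cover numFrames, then trim.
additional = ceil(numFrames / totalFrames);      % 3 repetitions
video = repmat(video, 1, 1, 1, additional);      % 15 frames
video = video(:, :, :, 1:numFrames);             % keep the first 12

frameOrder = reshape(video, 1, []);   % 1 2 3 4 5 1 2 3 4 5 1 2
```

The clip loops from its beginning, so the padded sequence contains a temporal discontinuity each time it wraps from the last frame back to the first.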

readForValidation

The readForValidation function reads the video frames for evaluating the trained video classifier. The function reads a specific number of frames sequentially as per the network input size. If there are not enough frames, the video sequence is repeated to pad the required number of frames.

function video = readForValidation(reader, datatype, numChannels, numFrames, totalFrames)
    H = reader.Height;
    W = reader.Width;
    toRead = min([numFrames,totalFrames]);
    video = zeros([H,W,numChannels,toRead], datatype);
    frameIndex = 0;
    while hasFrame(reader) && frameIndex < numFrames
        frame = readFrame(reader);
        frameIndex = frameIndex + 1;
        video(:,:,:,frameIndex) = frame;
    end

    if frameIndex < numFrames
        video = video(:,:,:,1:frameIndex);
        additional = ceil(numFrames/frameIndex);
        video = repmat(video,1,1,1,additional);
        video = video(:,:,:,1:numFrames);
    end
end

getLabel

The getLabel function obtains the label name from the full path of a file name. The label for a file is the folder in which it exists. For example, for a file path such as "/path/to/dataset/clapping/video_0001.avi", the label name is "clapping".

function label = getLabel(filename,classes)
    folder = fileparts(string(filename));
    [~,label] = fileparts(folder);
    label = categorical(string(label), string(classes));
end

augmentVideo

The augmentVideo function uses the augmentation function provided by the augmentTransform supporting function to apply the same augmentation across a video sequence.

function data = augmentVideo(data)
    numSequences = size(data,1);
    for ii = 1:numSequences
        video = data{ii,1};
        % HxWxC
        sz = size(video,[1,2,3]);
        % One augmentation per sequence
        augmentFcn = augmentTransform(sz);
        data{ii,1} = augmentFcn(video);
    end
end

augmentTransform

The augmentTransform function creates an augmentation method with random left-right flipping and scaling factors.

function augmentFcn = augmentTransform(sz)
    % Randomly flip and scale the image.
    tform = randomAffine2d('XReflection',true,'Scale',[1 1.1]);
    rout = affineOutputView(sz,tform,'BoundsStyle','CenterOutput');

    augmentFcn = @(data)augmentData(data,tform,rout);

    function data = augmentData(data,tform,rout)
        data = imwarp(data,tform,'OutputView',rout);
    end
end

preprocessVideoClips

The preprocessVideoClips function preprocesses the training video data to resize it to the Inflated-3D video classifier input size. It takes the InputNormalizationStatistics and InputSize properties of the video classifier in a struct, info. The InputNormalizationStatistics property is used to rescale the video frames and optical flow data between -1 and 1. The input size is used to resize the video frames using imresize, based on the SizingOption value in the info struct. Alternatively, you can use "randomcrop" or "centercrop" as values for SizingOption to randomly crop or center crop the input data to the input size of the video classifier.

function preprocessed = preprocessVideoClips(data, info)
    inputSize = info.InputSize(1:2);
    sizingOption = info.SizingOption;
    switch sizingOption
        case "resize"
            sizingFcn = @(x)imresize(x,inputSize);
        case "randomcrop"
            sizingFcn = @(x)cropVideo(x,@randomCropWindow2d,inputSize);
        case "centercrop"
            sizingFcn = @(x)cropVideo(x,@centerCropWindow2d,inputSize);
    end
    numClips = size(data,1);

    rgbMin = info.Statistics.Video.Min;
    rgbMax = info.Statistics.Video.Max;
    oflowMin = info.Statistics.OpticalFlow.Min;
    oflowMax = info.Statistics.OpticalFlow.Max;

    numChannels = length(rgbMin);
    rgbMin = reshape(rgbMin, 1, 1, numChannels);
    rgbMax = reshape(rgbMax, 1, 1, numChannels);

    numChannels = length(oflowMin);
    oflowMin = reshape(oflowMin, 1, 1, numChannels);
    oflowMax = reshape(oflowMax, 1, 1, numChannels);

    preprocessed = cell(numClips, 3);
    for ii = 1:numClips
        video = data{ii,1};
        resized = sizingFcn(video);
        oflow = computeFlow(resized,inputSize);

        % Cast the input to single.
        resized = single(resized);
        oflow = single(oflow);

        % Rescale the input between -1 and 1.
        resized = rescale(resized,-1,1,"InputMin",rgbMin,"InputMax",rgbMax);
        oflow = rescale(oflow,-1,1,"InputMin",oflowMin,"InputMax",oflowMax);

        preprocessed{ii,1} = resized;
        preprocessed{ii,2} = oflow;
        preprocessed{ii,3} = data{ii,2};
    end
end

function outData = cropVideo(data, cropFcn, inputSize)
    imsz = size(data,[1,2]);
    cropWindow = cropFcn(imsz, inputSize);
    numFrames = size(data,4);
    sz = [inputSize, size(data,3), numFrames];
    outData = zeros(sz, 'like', data);
    for f = 1:numFrames
        outData(:,:,:,f) = imcrop(data(:,:,:,f), cropWindow);
    end
end
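The per-channel rescaling to the range -1 to 1 can be tried on a tiny array. In this sketch the clip, the channel minimums, and the channel maximums are made-up values; the call pattern mirrors the rescale calls in preprocessVideoClips, with the statistics reshaped to 1-by-1-by-C so that they apply per channel.

```matlab
% Toy clip with 1 x 3 pixels and 2 channels (values are made up).
clip = single(cat(3, [0 127.5 255], [10 20 30]));

% Per-channel input range, shaped 1 x 1 x C so it expands across pixels.
inMin = reshape([0   10], 1, 1, 2);
inMax = reshape([255 30], 1, 1, 2);

% Map [inMin, inMax] of each channel onto [-1, 1].
scaled = rescale(clip, -1, 1, "InputMin", inMin, "InputMax", inMax);

channel1 = scaled(:, :, 1);   % approximately [-1 0 1]
channel2 = scaled(:, :, 2);   % approximately [-1 0 1]
```

Both channels end up on the same scale even though their original ranges differ, which is the point of supplying data-set-wide statistics instead of letting rescale use the range of each individual clip.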

computeFlow

The computeFlow function takes as input a video sequence, videoFrames, and computes the corresponding optical flow data, opticalFlowData, using opticalFlowFarneback. The optical flow data contains two channels, which correspond to the x and y components of velocity.

function opticalFlowData = computeFlow(videoFrames, inputSize)
    opticalFlow = opticalFlowFarneback;
    numFrames = size(videoFrames,4);
    sz = [inputSize, 2, numFrames];
    opticalFlowData = zeros(sz, 'like', videoFrames);
    for f = 1:numFrames
        gray = rgb2gray(videoFrames(:,:,:,f));
        flow = estimateFlow(opticalFlow,gray);

        opticalFlowData(:,:,:,f) = cat(3,flow.Vx,flow.Vy);
    end
end

createMiniBatchQueue

The createMiniBatchQueue function creates a minibatchqueue object that provides miniBatchSize amount of data from the given datastore. It also creates a DispatchInBackgroundDatastore if a parallel pool is open.

function mbq = createMiniBatchQueue(datastore, numOutputs, params)
    if params.DispatchInBackground && isempty(gcp('nocreate'))
        % Start a parallel pool, if DispatchInBackground is true, to dispatch
        % data in the background using the parallel pool.
        c = parcluster('local');
        c.NumWorkers = params.NumWorkers;
        parpool('local',params.NumWorkers);
    end
    p = gcp('nocreate');
    if ~isempty(p)
        datastore = DispatchInBackgroundDatastore(datastore, p.NumWorkers);
    end
    inputFormat(1:numOutputs-1) = "SSCTB";
    outputFormat = "CB";
    mbq = minibatchqueue(datastore, numOutputs, ...
        "MiniBatchSize", params.MiniBatchSize, ...
        "MiniBatchFcn", @batchVideoAndFlow, ...
        "MiniBatchFormat", [inputFormat,outputFormat]);
end

batchVideoAndFlow

The batchVideoAndFlow function batches the video, optical flow, and label data from cell arrays. It uses the onehotencode function to encode ground truth categorical labels into one-hot arrays. The one-hot encoded array contains a 1 in the position corresponding to the class of the label, and 0 in every other position.

function [video,flow,labels] = batchVideoAndFlow(video, flow, labels)
    % Batch dimension: 5
    video = cat(5,video{:});
    flow = cat(5,flow{:});

    % Batch dimension: 2
    labels = cat(2,labels{:});

    % Feature dimension: 1
    labels = onehotencode(labels,1);
end
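To show what the batched labels look like, the following sketch builds the same classes-by-batch one-hot layout using only base MATLAB. It is a stand-in for onehotencode(labels,1), not a replacement for it, and the three labels are an arbitrary example drawn from the five classes of this example.

```matlab
% The five classes used in this example and an arbitrary batch of three labels.
classes = ["kiss" "laugh" "pick" "pour" "pushup"];
labels  = categorical(["pour" "kiss" "pour"], classes);   % 1 x 3, batch along dimension 2

% double(labels) gives each label's category index; comparing against a
% column of class indices expands to a numClasses x batchSize logical array.
classIdx = double(labels);                                % [4 1 4]
onehot   = double((1:numel(classes))' == classIdx);       % 5 x 3
```

Each column contains a single 1 in the row of its class, so taking max over dimension 1 recovers the class index, as the accuracy computations in this example do.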

modelGradients

The modelGradients function takes as input a mini-batch of RGB data dlRGB, the corresponding optical flow data dlFlow, and the corresponding target Y, and returns the corresponding loss, the gradients of the loss with respect to the learnable parameters, and the training accuracy. To compute the gradients, evaluate the modelGradients function using the dlfeval function in the training loop.

function [gradientsRGB,gradientsFlow,loss,acc,accRGB,accFlow,stateRGB,stateFlow] = modelGradients(i3d,dlRGB,dlFlow,Y)

    % Pass video input as RGB and optical flow data through the two-stream
    % network.
    [dlYPredRGB,dlYPredFlow,stateRGB,stateFlow] = forward(i3d,dlRGB,dlFlow);

    % Calculate fused loss, gradients, and accuracy for the two-stream
    % predictions.
    rgbLoss = crossentropy(dlYPredRGB,Y);
    flowLoss = crossentropy(dlYPredFlow,Y);
    % Fuse the losses.
    loss = mean([rgbLoss,flowLoss]);

    gradientsRGB = dlgradient(rgbLoss,i3d.VideoLearnables);
    gradientsFlow = dlgradient(flowLoss,i3d.OpticalFlowLearnables);

    % Fuse the predictions by calculating the average of the predictions.
    dlYPred = (dlYPredRGB + dlYPredFlow)/2;

    % Calculate the accuracy of the predictions.
    [~,YTest] = max(Y,[],1);
    [~,YPred] = max(dlYPred,[],1);

    acc = gather(extractdata(sum(YTest == YPred)./numel(YTest)));

    % Calculate the accuracy of the RGB and flow predictions.
    [~,YTest] = max(Y,[],1);
    [~,YPredRGB] = max(dlYPredRGB,[],1);
    [~,YPredFlow] = max(dlYPredFlow,[],1);

    accRGB = gather(extractdata(sum(YTest == YPredRGB)./numel(YTest)));
    accFlow = gather(extractdata(sum(YTest == YPredFlow)./numel(YTest)));
end

updateLearnables

The updateLearnables function updates the provided learnables with the gradients and other parameters using the SGDM optimization function sgdmupdate.

function [learnables,velocity,learnRate] = updateLearnables(learnables,gradients,params,velocity,iteration)
    % Determine the learning rate using the cosine-annealing learning rate schedule.
    learnRate = cosineAnnealingLearnRate(iteration, params);

    % Apply L2 regularization to the weights.
    idx = learnables.Parameter == "Weights";
    gradients(idx,:) = dlupdate(@(g,w) g + params.L2Regularization*w, gradients(idx,:), learnables(idx,:));

    % Update the network parameters using the SGDM optimizer.
    [learnables, velocity] = sgdmupdate(learnables, gradients, velocity, learnRate, params.Momentum);
end

cosineAnnealingLearnRate

The cosineAnnealingLearnRate function computes the learning rate based on the current iteration number, the minimum learning rate, the maximum learning rate, and the number of iterations for annealing [3].

function lr = cosineAnnealingLearnRate(iteration, params)
    if iteration == params.NumIterations
        lr = params.MinLearningRate;
        return;
    end
    cosineNumIter = [0, params.CosineNumIterations];
    csum = cumsum(cosineNumIter);
    block = find(csum >= iteration, 1,'first');
    cosineIter = iteration - csum(block - 1);
    annealingIteration = mod(cosineIter, cosineNumIter(block));
    cosineIteration = cosineNumIter(block);
    minR = params.MinLearningRate;
    maxR = params.MaxLearningRate;
    cosMult = 1 + cos(pi * annealingIteration / cosineIteration);
    lr = minR + ((maxR - minR) * cosMult / 2);
end

aggregateConfusionMetric

The aggregateConfusionMetric function incrementally fills a confusion matrix based on the predicted results YPred and the expected results YTest.

function cmat = aggregateConfusionMetric(cmat,YTest,YPred)
    YTest = gather(extractdata(YTest));
    YPred = gather(extractdata(YPred));
    [m,n] = size(cmat);
    % Add one count at (true class, predicted class) for each observation.
    cmat = cmat + full(sparse(YTest,YPred,1,m,n));
end
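The accumulation step relies on sparse summing the values of repeated (row, column) index pairs. The sketch below shows that behavior on plain numeric class indices, without dlarray inputs. The labels and the three-class setup are invented for illustration.

```matlab
% Hypothetical setup: 3 classes, labels given as class indices.
numClasses = 3;
cmat = zeros(numClasses);

% First mini-batch: true labels and predicted labels.
YTest = [1 2 2 3];
YPred = [1 2 1 3];
cmat = cmat + full(sparse(YTest,YPred,1,numClasses,numClasses));

% Second mini-batch: counts are added to the existing matrix.
YTest = [2 3];
YPred = [2 3];
cmat = cmat + full(sparse(YTest,YPred,1,numClasses,numClasses));

% Rows index the true class and columns index the predicted class.
% Accuracy is the fraction of counts on the diagonal.
accuracy = sum(diag(cmat)) / sum(cmat(:));
disp(cmat)
disp(accuracy)
```

Here five of the six observations fall on the diagonal, so the accuracy is 5/6. The doValidation function computes its accuracy values from the aggregated matrices in the same way.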

doValidation

The doValidation function validates the video classifier using the validation data.

function [validationTime,cmat,lossValidation,accValidation,accValidationRGB,accValidationFlow] = doValidation(params,i3d)

    validationTime = tic;

    numOutputs = 3;
    mbq = createMiniBatchQueue(params.ValidationData,numOutputs,params);

    lossValidation = [];
    numClasses = numel(params.Classes);
    cmat = sparse(numClasses,numClasses);
    cmatRGB = sparse(numClasses,numClasses);
    cmatFlow = sparse(numClasses,numClasses);

    while hasdata(mbq)
        [dlX1,dlX2,dlY] = next(mbq);

        [loss,YTest,YPred,YPredRGB,YPredFlow] = predictValidation(i3d,dlX1,dlX2,dlY);

        lossValidation = [lossValidation,loss];
        cmat = aggregateConfusionMetric(cmat,YTest,YPred);
        cmatRGB = aggregateConfusionMetric(cmatRGB,YTest,YPredRGB);
        cmatFlow = aggregateConfusionMetric(cmatFlow,YTest,YPredFlow);
    end

    lossValidation = mean(lossValidation);
    accValidation = sum(diag(cmat))./sum(cmat,"all");
    accValidationRGB = sum(diag(cmatRGB))./sum(cmatRGB,"all");
    accValidationFlow = sum(diag(cmatFlow))./sum(cmatFlow,"all");

    validationTime = toc(validationTime);
end

predictValidation

The predictValidation function calculates the loss and prediction values using the provided video classifier for RGB and optical flow data.

function [loss,YTest,YPred,YPredRGB,YPredFlow] = predictValidation(i3d,dlRGB,dlFlow,Y)

    % Pass the video input through the two-stream Inflated-3D video classifier.
    [dlYPredRGB,dlYPredFlow] = predict(i3d,dlRGB,dlFlow);

    % Calculate the cross-entropy separately for the two stream outputs.
    rgbLoss = crossentropy(dlYPredRGB,Y);
    flowLoss = crossentropy(dlYPredFlow,Y);

    % Fuse the losses.
    loss = mean([rgbLoss,flowLoss]);

    % Fuse the predictions by calculating the average of the predictions.
    dlYPred = (dlYPredRGB + dlYPredFlow)/2;

    % Convert the labels and predictions to class indices.
    [~,YTest] = max(Y,[],1);
    [~,YPred] = max(dlYPred,[],1);
    [~,YPredRGB] = max(dlYPredRGB,[],1);
    [~,YPredFlow] = max(dlYPredFlow,[],1);
end
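The fusion step can be illustrated with ordinary numeric arrays in place of dlarray outputs. In this sketch the score matrices are invented, with one row per class and one column per observation, which is the layout implied by taking max along dimension 1.

```matlab
% Hypothetical class scores: 3 classes (rows) x 2 observations (columns).
scoresRGB  = [0.7 0.2; 0.2 0.5; 0.1 0.3];
scoresFlow = [0.5 0.1; 0.3 0.3; 0.2 0.6];

% Fuse the two streams by averaging their scores.
scoresFused = (scoresRGB + scoresFlow)/2;

% Take the index of the highest score in each column as the prediction.
[~,predRGB]   = max(scoresRGB,[],1);
[~,predFlow]  = max(scoresFlow,[],1);
[~,predFused] = max(scoresFused,[],1);

disp([predRGB; predFlow; predFused])
```

For the second observation the two streams disagree (class 2 for RGB, class 3 for flow), and the averaged scores favor class 3. This is why the example reports separate accuracies for the fused, RGB, and flow predictions.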

saveData

The saveData function saves the given Inflated-3D video classifier, accuracy, loss, and other training parameters to a MAT-file.

function bestLoss = saveData(inflated3d,bestLoss,iteration,cmat,lossTrain,lossValidation,...
        accTrain,accValidation,params)
    if iteration >= params.SaveBestAfterIteration
        lossValidation = extractdata(gather(lossValidation));
        % Save only when the validation loss improves on the best so far.
        if lossValidation < bestLoss
            params = rmfield(params,"VelocityRGB");
            params = rmfield(params,"VelocityFlow");
            bestLoss = lossValidation;
            inflated3d = gatherFromGPUToSave(inflated3d);
            data.BestLoss = bestLoss;
            data.TrainingLoss = extractdata(gather(lossTrain));
            data.TrainingAccuracy = accTrain;
            data.ValidationAccuracy = accValidation;
            data.ValidationConfmat = cmat;
            data.inflated3d = inflated3d;
            data.Params = params;
            save(params.ModelFilename,"data");
        end
    end
end

gatherFromGPUToSave

The gatherFromGPUToSave function gathers data from the GPU in order to save the video classifier to disk.

function classifier = gatherFromGPUToSave(classifier)
    if ~canUseGPU
        return;
    end

    % Select the properties that hold learnables or state tables.
    p = string(properties(classifier));
    p = p(endsWith(p,["Learnables","State"]));
    for jj = 1:numel(p)
        prop = p(jj);
        classifier.(prop) = gatherValues(classifier.(prop));
    end

    function tbl = gatherValues(tbl)
        for ii = 1:height(tbl)
            tbl.Value{ii} = gather(tbl.Value{ii});
        end
    end
end

checkForHMDB51Folder

The checkForHMDB51Folder function checks for the downloaded data in the download folder.

function classes = checkForHMDB51Folder(dataLoc)
    hmdbFolder = fullfile(dataLoc,"hmdb51_org");
    if ~isfolder(hmdbFolder)
        error("Download 'hmdb51_org.rar' file using the supporting function 'downloadHMDB51' before running the example and extract the RAR file.");
    end

    classes = ["brush_hair","cartwheel","catch","chew","clap","climb","climb_stairs",...
        "dive","draw_sword","dribble","drink","eat","fall_floor","fencing",...
        "flic_flac","golf","handstand","hit","hug","jump","kick","kick_ball",...
        "kiss","laugh","pick","pour","pullup","punch","push","pushup","ride_bike",...
        "ride_horse","run","shake_hands","shoot_ball","shoot_bow","shoot_gun",...
        "sit","situp","smile","smoke","somersault","stand","swing_baseball","sword",...
        "sword_exercise","talk","throw","turn","walk","wave"];
    expectFolders = fullfile(hmdbFolder,classes);
    if ~all(arrayfun(@(x)exist(x,"dir"),expectFolders))
        error("Download hmdb51_org.rar using the supporting function 'downloadHMDB51' before running the example and extract the RAR file.");
    end
end

downloadHMDB51

The downloadHMDB51 function downloads the data set and saves it to a directory.

function downloadHMDB51(dataLoc)

    if nargin == 0
        dataLoc = pwd;
    end
    dataLoc = string(dataLoc);

    if ~isfolder(dataLoc)
        mkdir(dataLoc);
    end

    dataUrl = "http://serre-lab.clps.brown.edu/wp-content/uploads/2013/10/hmdb51_org.rar";
    options = weboptions("Timeout",Inf);
    rarFileName = fullfile(dataLoc,"hmdb51_org.rar");

    % Download the RAR file and save it to the download folder.
    if ~isfile(rarFileName)
        disp("Downloading hmdb51_org.rar (2 GB) to the folder:")
        disp(dataLoc)
        disp("This download can take a few minutes...")
        websave(rarFileName,dataUrl,options);
        disp("Download complete.")
        disp("Extract the hmdb51_org.rar file contents to the folder:")
        disp(dataLoc)
    end
end

initializeTrainingProgressPlot

The initializeTrainingProgressPlot function configures two plots for displaying the training loss, training accuracy, and validation accuracy.

function plotters = initializeTrainingProgressPlot(params)
    if params.ProgressPlot
        % Plot the loss, training accuracy, and validation accuracy.
        figure

        % Loss plot
        subplot(2,1,1)
        plotters.LossPlotter = animatedline;
        xlabel("Iteration")
        ylabel("Loss")

        % Accuracy plot
        subplot(2,1,2)
        plotters.TrainAccPlotter = animatedline("Color","b");
        plotters.ValAccPlotter = animatedline("Color","g");
        legend("Training Accuracy","Validation Accuracy","Location","northwest");
        xlabel("Iteration")
        ylabel("Accuracy")
    else
        plotters = [];
    end
end

updateProgressPlot

The updateProgressPlot function updates the progress plot with loss and accuracy information during training.

function updateProgressPlot(params,plotters,epoch,iteration,start,lossTrain,accuracyTrain,accuracyValidation)
    if params.ProgressPlot
        % Update the training progress.
        D = duration(0,0,toc(start),"Format","hh:mm:ss");
        title(plotters.LossPlotter.Parent,"Epoch: " + epoch + ", Elapsed: " + string(D));
        addpoints(plotters.LossPlotter,iteration,double(gather(extractdata(lossTrain))));
        addpoints(plotters.TrainAccPlotter,iteration,accuracyTrain);
        addpoints(plotters.ValAccPlotter,iteration,accuracyValidation);
        drawnow
    end
end

initializeVerboseOutput

The initializeVerboseOutput function displays the column headings for the table of training values, which shows the epoch, mini-batch accuracy, and other training values.

function initializeVerboseOutput(params)
    if params.Verbose
        disp(" ")
        if canUseGPU
            disp("Training on GPU.")
        else
            disp("Training on CPU.")
        end
        p = gcp("nocreate");
        if ~isempty(p)
            disp("Training on parallel cluster '" + p.Cluster.Profile + "'. ")
        end
        disp("NumIterations:" + string(params.NumIterations));
        disp("MiniBatchSize:" + string(params.MiniBatchSize));
        disp("Classes:" + join(string(params.Classes),","));
        disp("|=======================================================================================================================================================================|")
        disp("| Epoch | Iteration | Time Elapsed |    Mini-Batch Accuracy     |    Validation Accuracy     | Mini-Batch | Validation | Base Learning | Train Time | Validation Time |")
        disp("|       |           |  (hh:mm:ss)  |       (Avg:RGB:Flow)       |       (Avg:RGB:Flow)       |    Loss    |    Loss    |     Rate      | (hh:mm:ss) |   (hh:mm:ss)    |")
        disp("|=======================================================================================================================================================================|")
    end
end

displayVerboseOutputEveryEpoch

The displayVerboseOutputEveryEpoch function displays the verbose output of the training values, such as the epoch, mini-batch accuracy, validation accuracy, and mini-batch loss.

function displayVerboseOutputEveryEpoch(params,start,learnRate,epoch,iteration,...
        accTrain,accTrainRGB,accTrainFlow,accValidation,accValidationRGB,accValidationFlow,lossTrain,lossValidation,trainTime,validationTime)
    if params.Verbose
        D = duration(0,0,toc(start),"Format","hh:mm:ss");
        trainTime = duration(0,0,trainTime,"Format","hh:mm:ss");
        validationTime = duration(0,0,validationTime,"Format","hh:mm:ss");

        lossValidation = gather(extractdata(lossValidation));
        lossValidation = compose("%.4f",lossValidation);

        accValidation = composePadAccuracy(accValidation);
        accValidationRGB = composePadAccuracy(accValidationRGB);
        accValidationFlow = composePadAccuracy(accValidationFlow);

        accVal = join([accValidation,accValidationRGB,accValidationFlow],": ");

        lossTrain = gather(extractdata(lossTrain));
        lossTrain = compose("%.4f",lossTrain);

        accTrain = composePadAccuracy(accTrain);
        accTrainRGB = composePadAccuracy(accTrainRGB);
        accTrainFlow = composePadAccuracy(accTrainFlow);

        accTrain = join([accTrain,accTrainRGB,accTrainFlow],": ");
        learnRate = compose("%.13f",learnRate);

        disp("| " + ...
            pad(string(epoch),5,"both") + " | " + ...
            pad(string(iteration),9,"both") + " | " + ...
            pad(string(D),12,"both") + " | " + ...
            pad(string(accTrain),26,"both") + " | " + ...
            pad(string(accVal),26,"both") + " | " + ...
            pad(string(lossTrain),10,"both") + " | " + ...
            pad(string(lossValidation),10,"both") + " | " + ...
            pad(string(learnRate),13,"both") + " | " + ...
            pad(string(trainTime),10,"both") + " | " + ...
            pad(string(validationTime),15,"both") + " |")
    end

    function acc = composePadAccuracy(acc)
        % Format the accuracy as a percentage with two decimal places.
        acc = compose("%.2f",acc*100) + "%";
        acc = pad(string(acc),6,"left");
    end
end

endVerboseOutputgydF4y2Ba

的gydF4y2BaendVerboseOutputgydF4y2Ba函数在训练期间显示详细输出的结束。gydF4y2Ba

函数gydF4y2BaendVerboseOutput (params)gydF4y2Ba如果gydF4y2Ba参数个数。Verbo年代edisp (gydF4y2Ba"|=======================================================================================================================================================================|"gydF4y2Ba)gydF4y2Ba结束gydF4y2Ba结束gydF4y2Ba

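References

[1] Carreira, Joao, and Andrew Zisserman. "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR): 6299-6308. Honolulu, HI: IEEE, 2017.

[2] Simonyan, Karen, and Andrew Zisserman. "Two-Stream Convolutional Networks for Action Recognition in Videos." Advances in Neural Information Processing Systems 27, Long Beach, CA: NIPS, 2017.

[3] Loshchilov, Ilya, and Frank Hutter. "SGDR: Stochastic Gradient Descent with Warm Restarts." International Conference on Learning Representations 2017. Toulon, France: ICLR, 2017.

[4] Du Tran, Heng Wang, Lorenzo Torresani, Jamie Ray, Yann LeCun, Manohar Paluri. "A Closer Look at Spatiotemporal Convolutions for Action Recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 6450-6459.

[5] Christoph Feichtenhofer, Haoqi Fan, Jitendra Malik, and Kaiming He. "SlowFast Networks for Video Recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

[6] Will Kay, Joao Carreira, Karen Simonyan, Brian Zhang, Chloe Hillier, Sudheendra Vijayanarasimhan, Fabio Viola, Tim Green, Trevor Back, Paul Natsev, Mustafa Suleyman, Andrew Zisserman. "The Kinetics Human Action Video Dataset." arXiv preprint arXiv:1705.06950, 2017.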