Getting Started with Mask R-CNN for Instance Segmentation
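Instance segmentation is an enhanced type of object detection that generates a segmentation map for each detected instance of an object. Instance segmentation treats individual objects as distinct entities, regardless of the class of the objects. In contrast, semantic segmentation considers all objects of the same class as belonging to a single entity.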
Mask R-CNN is a popular deep learning instance segmentation technique that performs pixel-level segmentation on detected objects [1]. The Mask R-CNN algorithm can accommodate multiple classes and overlapping objects.
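You can create a pretrained Mask R-CNN network using the maskrcnn object. The network is trained on the MS-COCO data set and can detect objects of 80 different classes. To perform instance segmentation, pass the pretrained network to the segmentObjects function.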
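As a minimal sketch of this workflow, the following code loads the pretrained network and segments one image. It assumes that the Computer Vision Toolbox support package for Mask R-CNN is installed and that the sample image visionteam.jpg (shipped with Computer Vision Toolbox) is on the path. The threshold value is an illustrative choice, not a recommendation.

```matlab
% Load the Mask R-CNN network pretrained on MS-COCO.
% Assumption: the Mask R-CNN support package is installed.
detector = maskrcnn("resnet50-coco");

% Read a test image (assumed to be available on the MATLAB path).
I = imread("visionteam.jpg");

% Segment object instances. masks is H-by-W-by-NumObjects, and
% labels, scores, and boxes have one entry (row) per detected instance.
[masks,labels,scores,boxes] = segmentObjects(detector,I,Threshold=0.95);

% Overlay the masks and the labeled boxes on the image.
overlaid = insertObjectMask(I,masks);
imshow(overlaid)
showShape("rectangle",boxes,Label=labels,Color="red");
```

Raising the threshold keeps only high-confidence detections, while lowering it returns more instances at the cost of more false positives.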
If you want to modify the network to detect additional classes, or to adjust other parameters of the network, then you can perform transfer learning. For an example that shows how to train a Mask R-CNN, see Perform Instance Segmentation Using Mask R-CNN.
Mask R-CNN Network Architecture
The Mask R-CNN network consists of two stages. The first stage is a region proposal network (RPN), which predicts object proposal bounding boxes based on anchor boxes. The second stage is an R-CNN detector that refines these proposals, classifies them, and computes the pixel-level segmentation for these proposals.
The Mask R-CNN model builds on the Faster R-CNN model. Mask R-CNN replaces the ROI max pooling layer in Faster R-CNN with an roiAlignLayer that provides more accurate sub-pixel level ROI pooling. The Mask R-CNN network also adds a mask branch for pixel-level object segmentation. For more information about the Faster R-CNN network, see Getting Started with R-CNN, Fast R-CNN, and Faster R-CNN.
This diagram shows a modified Faster R-CNN network on the left and a mask branch on the right.
To configure a Mask R-CNN network for transfer learning, specify the class names and anchor boxes when you create a maskrcnn object. You can optionally specify additional network properties including the network input size and the ROI pooling sizes.
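A configuration for transfer learning might look like the following sketch. The class names, input size, and anchor box values are hypothetical placeholders. The InputSize and AnchorBoxes name-value arguments reflect my understanding of the maskrcnn interface, so check them against the reference page for your release.

```matlab
% Hypothetical classes for the new task.
classNames = ["person","vehicle"];

% Hypothetical network input size, [height width channels].
inputSize = [800 800 3];

% Hypothetical anchor boxes, one [height width] pair per row.
% In practice these are usually estimated from the training boxes.
anchorBoxes = [32 16; 64 32; 128 64; 256 128; 64 64; 128 128];

% Start from the COCO-pretrained network and reconfigure it.
% Assumption: the Mask R-CNN support package is installed.
net = maskrcnn("resnet50-coco",classNames, ...
    InputSize=inputSize, ...
    AnchorBoxes=anchorBoxes);
```

Starting from the pretrained weights rather than a randomly initialized network generally lets training converge with far less labeled data.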
Prepare Mask R-CNN Training Data
Load Data
To train a Mask R-CNN, you need the following data.
Data | Description |
---|---|
RGB image | RGB images that serve as network inputs, specified as H-by-W-by-3 numeric arrays. For example, this sample RGB image is a modified image from the CamVid data set [2] that has been edited to remove personally identifiable information. |
Ground-truth bounding boxes | Bounding boxes for objects in the RGB images, specified as a NumObjects-by-4 matrix, with rows in the format [x y w h]. For example, bboxes = [394 442 36 101; 436 457 32 88; 619 293 209 281; 460 441 210 234; 862 375 190 314; 816 271 235 305] |
Instance labels | Label of each instance, specified as a NumObjects-by-1 string vector or a NumObjects-by-1 cell array of character vectors. For example, labels = 6×1 cell array {'person'}; {'person'}; {'vehicle'}; {'vehicle'}; {'vehicle'}; {'vehicle'} |
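Instance masks | Masks for instances of objects. Mask data comes in two formats: binary masks, specified as a logical array of size H-by-W-by-NumObjects, where each mask is the segmentation of one instance in the image; and polygon coordinates, specified as a NumObjects-by-2 cell array, where each row contains the (x,y) coordinates of a polygon along the boundary of one instance. For example, this montage shows the binary masks of six objects in the sample RGB image. |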
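If your annotations are stored as polygon outlines, one way to obtain binary masks is the poly2mask function from Image Processing Toolbox. The sketch below builds the bounding boxes, labels, and an H-by-W-by-NumObjects logical mask array for a single image. The image size, polygon vertices, labels, and the one-matrix-per-object polygon layout are all made up for illustration.

```matlab
% Hypothetical image size, [H W].
imageSize = [480 640];

% Hypothetical polygon outlines, one P-by-2 matrix of [x y] vertices per object.
polygons = {
    [100 120; 180 120; 180 300; 100 300]
    [300 200; 500 200; 500 400; 300 400]};
labels = {'person'; 'vehicle'};

numObjects = numel(polygons);
masks = false([imageSize numObjects]);   % H-by-W-by-NumObjects logical array
bboxes = zeros(numObjects,4);            % rows in the format [x y w h]

for k = 1:numObjects
    xy = polygons{k};
    % Rasterize the polygon into a binary mask the size of the image.
    masks(:,:,k) = poly2mask(xy(:,1),xy(:,2),imageSize(1),imageSize(2));
    % Axis-aligned box that encloses the polygon vertices.
    bboxes(k,:) = [min(xy) max(xy)-min(xy)];
end
```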
Create Datastore that Reads Data
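Use a datastore to read data. The datastore must return data as a 1-by-4 cell array in the format {RGB images, bounding boxes, labels, masks}. You can create a datastore in this format using these steps:
1. Create an imageDatastore that returns RGB image data.
2. Create a boxLabelDatastore that returns bounding box data and instance labels as a two-column cell array.
3. Create an imageDatastore and specify a custom read function that returns mask data as a binary matrix.
4. Combine the three datastores using the combine function.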
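The steps above can be sketched as follows. To keep the example self-contained, it first writes one synthetic image and one MAT file of masks to a temporary folder. The folder layout, file names, and the convention of storing masks in a variable called masks are assumptions made for this sketch, not requirements of the toolbox.

```matlab
% Create a hypothetical on-disk data set with a single sample.
workDir = tempname;
imageDir = fullfile(workDir,"images");
maskDir = fullfile(workDir,"masks");
mkdir(imageDir);
mkdir(maskDir);

im = zeros(100,120,3,"uint8");
imwrite(im,fullfile(imageDir,"img1.png"));

masks = false(100,120,2);
masks(10:40,20:50,1) = true;
masks(50:90,60:110,2) = true;
save(fullfile(maskDir,"img1.mat"),"masks");

% Step 1: datastore that returns RGB images.
imds = imageDatastore(imageDir);

% Step 2: datastore that returns boxes ([x y w h]) and labels.
boxes = {[20 10 31 31; 60 50 51 41]};
labels = {categorical(["person";"vehicle"])};
blds = boxLabelDatastore(table(boxes,labels));

% Step 3: datastore with a custom read function that loads the masks variable.
readMask = @(filename) getfield(load(filename,"masks"),"masks");
mds = imageDatastore(maskDir,FileExtensions=".mat",ReadFcn=readMask);

% Step 4: combine the three datastores.
ds = combine(imds,blds,mds);

% Each read returns {RGB image, bounding boxes, labels, masks}.
data = read(ds);
```

The three datastores must list their files in the same order, because combine pairs the reads by position rather than by file name.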
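The size of the images, bounding boxes, and masks must match the input size of the network. If you need to resize the data, then you can use the imresize function to resize the RGB images and masks, and the bboxresize function to resize the bounding boxes.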
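A resizing step for one training sample could look like this sketch. The sample data and the target size are invented for illustration. Nearest-neighbor interpolation is used for the masks so that they stay binary.

```matlab
% Hypothetical sample: a 480-by-640 image with two instances.
im = zeros(480,640,3,"uint8");
masks = false(480,640,2);
masks(100:200,150:300,1) = true;
masks(250:400,350:600,2) = true;
bboxes = [150 100 151 101; 350 250 251 151];   % rows in the format [x y w h]

% Hypothetical network input size, [height width].
targetSize = [800 800];

% Vertical and horizontal scale factors, in [sy sx] order.
scale = targetSize ./ size(im,[1 2]);

% Resize the image, the masks, and the boxes consistently.
imResized = imresize(im,targetSize);
masksResized = imresize(masks,targetSize,"nearest");
bboxesResized = bboxresize(bboxes,scale);
```

To apply the same step to every sample, you can wrap it in a function and pass that function to transform on the combined datastore.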
For more information, see Datastores for Deep Learning (Deep Learning Toolbox).
Visualize Training Data
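To display the instance masks over the image, use the insertObjectMask function. You can specify a colormap so that each instance appears in a different color. This sample code shows how to display the instance masks in the masks variable over the RGB image in the im variable using the lines colormap.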
imOverlay = insertObjectMask(im,masks,Color=lines(numObjects));
imshow(imOverlay);
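To show the bounding boxes with labels over the image, use the showShape function. This sample code shows how to show labeled rectangular shapes with bounding box size and position data in the bboxes variable and label data in the labels variable.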
imshow(imOverlay)
showShape("rectangle",bboxes,Label=labels,Color="red");
Train Mask R-CNN Model
Train a Mask R-CNN network using the trainMaskRCNN function. For an example, see Perform Instance Segmentation Using Mask R-CNN.
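A training call might be set up as in the sketch below. The hyperparameter values are illustrative only. The code assumes that the workspace already contains ds, a combined datastore that returns {RGB image, bounding boxes, labels, masks}, and net, a maskrcnn object configured for the same classes. The training call is skipped if either variable is missing.

```matlab
% Illustrative training options (values are placeholders, not tuned settings).
options = trainingOptions("sgdm", ...
    InitialLearnRate=0.001, ...
    Momentum=0.9, ...
    MaxEpochs=10, ...
    MiniBatchSize=2, ...
    VerboseFrequency=50);

% Assumption: ds and net were created beforehand. Training is
% computationally heavy and is typically run on a GPU.
if exist("ds","var") && exist("net","var")
    [trainedNet,info] = trainMaskRCNN(ds,net,options);
end
```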
References
[1] He, Kaiming, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. "Mask R-CNN." ArXiv:1703.06870 [Cs], January 24, 2018. https://arxiv.org/pdf/1703.06870.
[2] Brostow, Gabriel J., Julien Fauqueur, and Roberto Cipolla. "Semantic Object Classes in Video: A High-Definition Ground Truth Database." Pattern Recognition Letters 30, no. 2 (January 2009): 88–97. https://doi.org/10.1016/j.patrec.2008.04.005.
See Also
More About
- Deep Learning in MATLAB (Deep Learning Toolbox)
- Datastores for Deep Learning (Deep Learning Toolbox)