Grad-CAM Reveals the Why Behind Deep Learning Decisions

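This example shows how to use the gradient-weighted class activation mapping (Grad-CAM) technique to understand why a deep learning network makes its classification decisions. Grad-CAM, invented by Selvaraju and coauthors [1], uses the gradient of the classification score with respect to the convolutional features determined by the network to identify which parts of the image are most important for classification. This example uses the GoogLeNet pretrained network for images.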

Grad-CAM is a generalization of the class activation mapping (CAM) technique. For activation mapping techniques on live webcam data, see Investigate Network Predictions Using Class Activation Mapping. Grad-CAM can also be applied to nonclassification examples such as regression or semantic segmentation. For an example showing how to use Grad-CAM to investigate the predictions of a semantic segmentation network, see Explore Semantic Segmentation Network Using Grad-CAM.

Load Pretrained Network

Load the GoogLeNet network.

net = googlenet;

Classify Image

Get the image input size that GoogLeNet expects.

inputSize = net.Layers(1).InputSize(1:2);

Load sherlock.jpg, an image of a golden retriever included with this example.

img = imread("sherlock.jpg");

Resize the image to the network input dimensions.

img = imresize(img,inputSize);

Classify the image and display it, along with its classification and classification score.

[classfn,score] = classify(net,img);
imshow(img);
title(sprintf("%s (%.2f)", classfn, score(classfn)));

GoogLeNet correctly classifies the image as a golden retriever. But why? What characteristics of the image cause the network to make this classification?

Grad-CAM Explains Why

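The Grad-CAM technique uses the gradient of the classification score with respect to the final convolutional feature map to identify the parts of an input image that most affect the classification score. The places where this gradient is large are exactly the places where the final score depends most on the data.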
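For readers who prefer a formula, the idea can be written compactly in the notation of [1]. The symbols below do not appear elsewhere in this example and are introduced only for illustration: y^c is the score for class c, A^k is the k-th channel of the convolutional feature map, and N is the number of spatial locations in that map. The gradCAM function may differ in detail, for instance in which layer output it differentiates.

```latex
\alpha_k^{c} = \frac{1}{N} \sum_{i} \sum_{j} \frac{\partial y^{c}}{\partial A^{k}_{i,j}},
\qquad
M^{c} = \mathrm{ReLU}\!\left( \sum_{k} \alpha_k^{c} \, A^{k} \right)
```

Each weight \alpha_k^c is the spatial average of the gradient for one channel, and the map M^c keeps only the locations where the weighted channels add positive evidence for class c.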

The gradCAM function computes the importance map by taking the derivative of the reduction layer output for a given class with respect to a convolutional feature map. For classification tasks, the gradCAM function automatically selects suitable layers to compute the importance map for. You can also specify the layers with the 'ReductionLayer' and 'FeatureLayer' name-value arguments.
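To make the derivative-based computation concrete, here is a minimal sketch on made-up numbers. It assumes the weighting scheme described in [1] (channel weights are the spatial average of the gradient, followed by a weighted sum and a ReLU) and is not the internal implementation of gradCAM. The arrays A and dYdA and the variable toyMap are hypothetical stand-ins for a real feature map and its gradient.

```matlab
% Toy Grad-CAM computation on made-up data (no toolbox required).
% A    : H-by-W-by-K feature maps from a hypothetical final conv layer
% dYdA : gradient of the class score with respect to A (same size as A)
A    = cat(3, [1 2; 3 4], [0 1; 1 0]);
dYdA = cat(3, 0.5*ones(2), -1*ones(2));

% Channel weights: average the gradient over the two spatial dimensions.
alpha = mean(mean(dYdA, 1), 2);        % 1-by-1-by-K

% Weighted sum over channels, then ReLU to keep positive evidence only.
toyMap = max(sum(alpha .* A, 3), 0);   % H-by-W importance map

disp(toyMap)
```

In this toy case the second channel has a negative gradient, so locations where it is active are pushed down in the map, while locations dominated by the first channel stay high.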

Compute the Grad-CAM map.

map = gradCAM(net,img,classfn);
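The call above lets gradCAM choose the layers. As a hedged variant, the sketch below names them explicitly through the 'FeatureLayer' and 'ReductionLayer' arguments. The layer names 'inception_5b-output' and 'prob' are assumptions about the GoogLeNet layer graph and should be checked against net.Layers. The variable mapExplicit is introduced here only for illustration.

```matlab
% Setup repeated from the earlier steps so the block stands on its own.
net = googlenet;
inputSize = net.Layers(1).InputSize(1:2);
img = imresize(imread("sherlock.jpg"), inputSize);
classfn = classify(net, img);

% Assumed layer names; verify them by inspecting net.Layers.
featureLayer   = 'inception_5b-output';  % assumed last layer with spatial extent
reductionLayer = 'prob';                 % assumed softmax layer

% Compute the Grad-CAM map with the layers named explicitly.
mapExplicit = gradCAM(net, img, classfn, ...
    'FeatureLayer', featureLayer, ...
    'ReductionLayer', reductionLayer);
```

Choosing an earlier feature layer would be expected to give a finer but less class-specific map, which is one reason to set these arguments by hand.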

Show the Grad-CAM map on top of the image by using an 'AlphaData' value of 0.5. The 'jet' colormap has deep blue as the lowest value and deep red as the highest.

imshow(img);
hold on;
imagesc(map,'AlphaData',0.5);
colormap jet
hold off;
title("Grad-CAM");

Clearly, the upper face and ear of the dog have the greatest impact on the classification.

For a different approach to investigating the reasons for deep network classifications, see occlusionSensitivity and imageLIME.

References

[1] Selvaraju, R. R., M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra. "Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization." In IEEE International Conference on Computer Vision (ICCV), 2017, pp. 618–626. Available at Grad-CAM on the Computer Vision Foundation Open Access website.
