
Grad-CAM Reveals the Why Behind Deep Learning Decisions


Grad-CAM is a generalization of the class activation mapping (CAM) technique. For activation mapping techniques on live webcam data, see Investigate Network Predictions Using Class Activation Mapping. Grad-CAM can also be applied to nonclassification examples such as regression or semantic segmentation. For an example showing how to use Grad-CAM to investigate the predictions of a semantic segmentation network, see Explore Semantic Segmentation Network Using Grad-CAM.


Load the GoogLeNet network.

net = googlenet;

Classify Image

Read the input image size (height and width) that GoogLeNet expects from its first layer.

inputSize = net.Layers(1).InputSize(1:2);

Load sherlock.jpg, an image of a golden retriever included with this example.

img = imread("sherlock.jpg");

Resize the image to the network input dimensions.

img = imresize(img,inputSize);

Classify the image and display it, along with its classification and classification score.

[classfn,score] = classify(net,img);
imshow(img);
title(sprintf("%s (%.2f)", classfn, score(classfn)));

GoogLeNet correctly classifies the image as a golden retriever. But why? What characteristics of the image cause the network to make this classification?

Grad-CAM Explains Why


The gradCAM function computes the importance map by taking the derivative of the reduction layer output for a given class with respect to a convolutional feature map. For classification tasks, the gradCAM function automatically selects suitable layers for computing the importance map. You can also specify the layers with the 'ReductionLayer' and 'FeatureLayer' name-value arguments.
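
For reference, the computation described above can be written out as in the original Grad-CAM paper (Selvaraju et al., 2017). The notation is borrowed from that paper and is an assumption here, not something this example defines: A^k is the k-th channel of the feature layer output, y^c is the reduction layer output for class c, and Z is the number of spatial positions in the feature map.

```latex
\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A^k_{ij}},
\qquad
L^c_{\text{Grad-CAM}} = \operatorname{ReLU}\!\left( \sum_{k} \alpha_k^c \, A^k \right)
```

Each channel weight averages the gradient over all spatial positions, and the map is the weighted sum of the channels. This example does not state whether gradCAM applies every step exactly as written (for instance the 1/Z normalization or the final ReLU), so treat the formula as a description of the technique and not of the function's internals.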


map = gradCAM(net,img,classfn);
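
The same call with the layers named explicitly might look like the sketch below. The layer names "inception_5b-output" and "prob" are assumptions about GoogLeNet's layer naming and do not come from this example; inspect net.Layers to confirm them for your network. The setup lines repeat the earlier steps so that the snippet runs on its own, and the variable names are illustrative.

```matlab
% Setup repeated from the steps above so the snippet is standalone.
net = googlenet;
inputSize = net.Layers(1).InputSize(1:2);
img = imresize(imread("sherlock.jpg"),inputSize);
classfn = classify(net,img);

% Assumed layer names; verify them with net.Layers before relying on them.
featureLayer = "inception_5b-output";   % assumed: last layer with a spatial feature map
reductionLayer = "prob";                % assumed: softmax layer

% Compute the Grad-CAM map with both layers specified explicitly.
mapExplicit = gradCAM(net,img,classfn, ...
    'FeatureLayer',featureLayer, ...
    'ReductionLayer',reductionLayer);
```

Naming the layers is mainly useful when the automatic selection does not pick the feature map you want to inspect, for example when you want to look at an earlier convolutional block.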

Show the Grad-CAM map on top of the image by using an 'AlphaData' value of 0.5. The 'jet' colormap has deep blue as the lowest value and deep red as the highest.

imshow(img);
hold on
imagesc(map,'AlphaData',0.5);
colormap jet
hold off
title("Grad-CAM");

The map shows that the upper face and ear of the dog have the greatest impact on the classification.
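
One way to probe this observation further is to compute the map for a different class and compare where the network looks. The sketch below uses the second-highest-scoring class. Reading the class names from net.Layers(end).Classes assumes that the final layer is a classification output layer, and the variable names are illustrative.

```matlab
% Setup repeated from the steps above so the snippet is standalone.
net = googlenet;
inputSize = net.Layers(1).InputSize(1:2);
img = imresize(imread("sherlock.jpg"),inputSize);
[classfn,score] = classify(net,img);

% Rank the classes by score and pick the runner-up.
[~,idx] = sort(score,"descend");
classes = net.Layers(end).Classes;   % assumes a classification output layer
secondClass = classes(idx(2));

% Grad-CAM maps for the top class and for the runner-up.
mapTop = gradCAM(net,img,classfn);
mapSecond = gradCAM(net,img,secondClass);

% Show both overlays side by side.
figure
subplot(1,2,1)
imshow(img)
hold on
imagesc(mapTop,'AlphaData',0.5)
colormap jet
hold off
title(string(classfn))

subplot(1,2,2)
imshow(img)
hold on
imagesc(mapSecond,'AlphaData',0.5)
colormap jet
hold off
title(string(secondClass))
```

If the two maps highlight different regions, that suggests which image features separate the two classes for this network.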

For a different approach to investigating the reasons for deep network classifications, see occlusionSensitivity and imageLIME.


[1] Selvaraju, R. R., M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra. "Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization." In IEEE International Conference on Computer Vision (ICCV), 2017, pp. 618–626. Available at Grad-CAM on the Computer Vision Foundation Open Access website.


