August 14, 2017

Explainable AI: 3 Deep Explanations Approaches to XAI

The Deep Explanations Approach to Explainable AI

In a recent post, I described the importance of thinking about what we hope to achieve from building more transparent neural nets and explainable AI models. But today I want to talk about one of three emerging methods that researchers are using to make headway in this space.

Let's start with deep explanations, a method of building explainable AI where we try to tease apart what is happening within the neural network itself. Within deep explanations, I’d like to highlight three techniques:

Learning Semantic Associations

In this first technique, SRI tried to learn semantic associations in video. The goal was to “count” occurrences of items of interest in frames of a long video and generate a caption for that event. The eventual goal was to later be able to search video content.

In one example, they took a video sequence of a skateboarder. They identified frames of a skateboarder standing, jumping or falling. When composited into one event, they wanted the model to say “This is a skateboarder attempting a board trick”.

They retrieved as much information as possible from the scenes using visual recognition, audio-based recognition, and even OCR techniques to read closed captioning when available. This information was then fed into stacked convolutional layers in a network for classification, and the network was trained multiple times: they trained it to associate semantic attributes with specific hidden layers, and they trained it to associate hidden nodes with human-labeled ontologies. They also used this to generate examples of prominent but unlabeled nodes to see which labels should apply.

Another example used a video of a wedding. They wanted the model to look at the evidence supporting this event being a wedding: a bride, a groom, an exchange of rings.

This technique mixes human subject matter expertise with a learned system. Classifying the picture as a ring being placed on a finger is not enough in this case; a subject matter expert also has to provide the ontology stating that these events imply something specific (i.e., two people getting married). This has to be done as part of building the model itself.

SRI completed this task successfully. They later used metadata derived from scenes to build a system for searching video for certain moments (i.e., "show me the vows"). You can see the original paper detailing this work here.
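To make the idea concrete, here is a minimal, hypothetical sketch of ontology-guided event labeling: frame-level attribute detections (as might come from visual, audio, or OCR recognizers) are aggregated and matched against a hand-built ontology mapping attribute sets to higher-level events. All names and rules below are illustrative, not taken from the SRI system.

```python
# Ontology supplied by a subject matter expert: event -> required attributes.
# The model itself would learn to fire the attribute detectors; the ontology
# is the human-provided piece that says what the attributes imply together.
ONTOLOGY = {
    "skateboard trick attempt": {"skateboarder", "jumping"},
    "wedding": {"bride", "groom", "ring exchange"},
}

def label_event(frame_attributes):
    """Aggregate per-frame attribute sets and return all matching events."""
    observed = set().union(*frame_attributes)
    return [event for event, required in ONTOLOGY.items()
            if required <= observed]  # event fires if all its attributes occur

# Three frames of the skateboarder example: standing, jumping, falling.
frames = [{"skateboarder", "standing"},
          {"skateboarder", "jumping"},
          {"skateboarder", "falling"}]
print(label_event(frames))  # -> ['skateboard trick attempt']
```

Because the ontology is explicit, the system's output ("this is a skateboard trick attempt because a skateboarder and a jump were detected") is inspectable in a way a raw classification score is not.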

Generating Visual Explanations

The second technique, out of Berkeley, expanded on these efforts.

Their goal was to generate a caption explaining why an image belongs to a specific class. They didn't want to just generate a textual caption; they wanted the system to be discriminative and describe its reasons for thinking a scene shows something.

One example looks at an image of a bird with a long white neck and yellow beak. This system is not able to discriminate this bird from other birds until you bring in the detail that there is a red eye.

They did this by working with a model that classifies images. They passed in categories and target sentences to train an LSTM to generate descriptions. Then they rewarded it on its ability to discriminate between that particular classification and any other classification it could have produced (as opposed to rewarding it for relevance alone). It's interesting to note that, again, subject matter expertise and an ontology had to be passed in. The model then learned to generate appropriate descriptions, repeatedly generating more examples to compete with itself, discriminate, and be rewarded accordingly.

One important thing to note is that, while it's discriminative in this way, these explanations are just justifications. The technique does not peer into the innards of a neural net and say what a derived feature is actually representing. They trained the system to competitively discriminate between features that could be used to identify and explain different types of animals in the scene. But there's no way to know whether the underlying neural representation was actually identifying a red eye, or instead something else entirely, such as a bird flying over water. You can identify correlation, but you can't look in and say for certain that you know what the model is doing. You can find the original paper here.
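A toy illustration may help show the difference between a relevance reward and a discriminative reward for candidate caption phrases. This is not Berkeley's actual model; the phrases, bird classes, and evidence scores below are made up for demonstration.

```python
# Hypothetical phrase -> per-class evidence scores, e.g. p(class | phrase).
EVIDENCE = {
    "has a long white neck": {"white pelican": 0.9, "great egret": 0.9},
    "has a red eye":         {"white pelican": 0.8, "great egret": 0.1},
}

def relevance(phrase, target):
    """Relevance reward: how well the phrase fits the target class alone."""
    return EVIDENCE[phrase][target]

def discriminativeness(phrase, target):
    """Discriminative reward: evidence for the target class relative to
    the strongest competing class."""
    scores = EVIDENCE[phrase]
    best_other = max(v for c, v in scores.items() if c != target)
    return scores[target] - best_other

target = "white pelican"
# Relevance alone prefers the long white neck (0.9 vs 0.8), but that
# detail fails to separate the pelican from the egret.
best = max(EVIDENCE, key=lambda p: discriminativeness(p, target))
print(best)  # -> has a red eye
```

Rewarding discriminativeness is what pushes the generated caption toward class-separating details like the red eye, rather than merely accurate but uninformative ones.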

Rationalizing Neural Predictions

This last technique highlights work done out of MIT. They focused on extracting snippets of text that corresponded to scores.

In one example, they looked at beer reviews, aiming to take the textual review, identify the ratings, and associate the two. One case showed a five-star rating for a beer's appearance being associated with the phrase "a pleasant ruby red amber color". This mapping is fairly objective. But they also wanted to highlight the salient parts of free-text answers that corresponded to categories in forms.

They did this with two pieces: an encoder and a generator.

The encoder builds a mapping between the text and the specific scores. The generator is trained simultaneously and looks to pull out text. It takes words, assigns each a probability, and then selects phrases that are not just highly correlated with the scores, but that also possess continuity. Effectively, they're looking for strings of words that run together, all associated with that score. In this system, continuity applies not just to text, but also to images.
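The selection step can be sketched in a few lines. This is a simplified stand-in for the MIT generator, not the trained model itself: given per-word association scores with a rating (which in the real system come from the learned encoder; the numbers here are invented), it picks the contiguous span whose total score, plus a small per-word continuity bonus, is highest, rather than cherry-picking isolated high-scoring words.

```python
def best_rationale(words, scores, continuity_bonus=0.1):
    """Return the contiguous span of words maximizing score + continuity."""
    best_span, best_score = None, float("-inf")
    for i in range(len(words)):
        total = 0.0
        for j in range(i, len(words)):
            # Each added word contributes its score plus a continuity bonus,
            # so adjacent mildly-scored words can beat scattered strong ones.
            total += scores[j] + continuity_bonus
            if total > best_score:
                best_score, best_span = total, (i, j + 1)
    return " ".join(words[best_span[0]:best_span[1]])

review = "poured with a pleasant ruby red amber color and thin head".split()
# Invented per-word association scores with the "appearance" rating.
scores = [-0.3, -0.2, -0.3, 0.6, 0.7, 0.8, 0.9, 0.8, -0.4, -0.2, -0.1]
print(best_rationale(review, scores))  # -> pleasant ruby red amber color
```

The continuity bonus is what turns a bag of salient words into a readable snippet a person can check against the score.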

With this mechanism, they were able to create brief, coherent explanations tying the scores back to the text. This is not a way of explaining what the neural network is actually doing, but it is a method for rationalizing the behavior and justifying its predictions. You can read the original paper here.

You can view my full talk on Explainable AI techniques in the video below. To learn how Bonsai is building explainability into reinforcement learning models, you can visit

