The way in which artificial intelligence reaches insights and makes decisions is often mysterious, raising concerns about how trustworthy machine learning can be. Now, in a new study, researchers have revealed a method for measuring how closely the reasoning of AI software matches that of a human, allowing its behavior to be analyzed rapidly.
As machine learning increasingly finds real-world applications, it becomes critical to understand how it reaches its conclusions and whether it does so correctly. For example, an AI program may appear to have accurately predicted that a skin lesion was cancerous, but it may have done so by focusing on an unrelated blot in the background of a clinical image.
“Machine-learning models are infamously challenging to understand,” says Angie Boggust, a computer science researcher at MIT and lead author of a new study concerning AI’s trustworthiness. “Knowing a model’s decision is easy, but knowing why that model made that decision is hard.”
A common strategy to make sense of AI reasoning examines which features of the data (say, an image or a sentence) the program focused on in order to make its decision. However, such so-called saliency methods often yield insights on just one decision at a time, and each must be manually inspected. AI software is often trained using millions of instances of data, making it nearly impossible for a person to analyze enough decisions to identify patterns of correct or incorrect behavior.
Now scientists at MIT and IBM Research have created a way to collect and inspect the explanations an AI gives for its decisions, thus allowing a quick analysis of its behavior. The new technique, named Shared Interest, compares saliency analyses of an AI’s decisions with human-annotated databases.
For example, an image-recognition program might classify a picture as that of a dog, and saliency methods might show that the program highlighted the pixels of the dog’s head and body to make its decision. The Shared Interest approach would then compare the results of these saliency methods with databases of images where people annotated which parts of pictures were those of dogs.
Based on these comparisons, the Shared Interest method then calls for computing how much an AI’s decision-making aligned with human reasoning, classifying it as one of eight patterns. On one end of the spectrum, the AI may prove completely human-aligned, with the program making the correct prediction and highlighting the same features in the data as humans did. On the other end, the AI is completely distracted, with the AI making an incorrect prediction and highlighting none of the features that humans did.
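To make the comparison concrete, here is a minimal sketch of how the overlap between a model's highlighted features and a human's annotations might be scored. The article does not say which scores Shared Interest computes or how its eight patterns are defined, so the three overlap ratios, the function names, and the "intermediate" label below are illustrative assumptions, not the researchers' implementation.

```python
def overlap_scores(salient, annotated):
    """Compare the features a model highlighted with those a human annotated.

    Both arguments are sets of feature identifiers, such as (row, col)
    pixel coordinates or token positions. The choice of these three
    ratios is an assumption made for illustration.
    """
    shared = salient & annotated
    union = salient | annotated
    return {
        # Overlap relative to everything either party marked.
        "iou": len(shared) / len(union) if union else 0.0,
        # Fraction of the human-annotated region the model highlighted.
        "human_coverage": len(shared) / len(annotated) if annotated else 0.0,
        # Fraction of the model's highlights that fall inside the annotation.
        "saliency_coverage": len(shared) / len(salient) if salient else 0.0,
    }


def coarse_pattern(scores, prediction_correct):
    """Label only the two extremes described in the text.

    Everything in between is lumped into a hypothetical "intermediate"
    bucket, since the remaining patterns are not specified here.
    """
    if prediction_correct and scores["iou"] == 1.0:
        return "human-aligned"
    if not prediction_correct and scores["iou"] == 0.0:
        return "distracted"
    return "intermediate"


# Toy image: a human marked the 2x2 block in the top-left corner as "dog".
annotated = {(0, 0), (0, 1), (1, 0), (1, 1)}
# The model highlighted only the top row of that block.
salient = {(0, 0), (0, 1)}

scores = overlap_scores(salient, annotated)
print(scores, coarse_pattern(scores, prediction_correct=True))
```

In this toy case the model's highlights fall entirely inside the annotated region but cover only half of it, which is the kind of partial agreement the intermediate patterns are meant to capture.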
The other patterns into which AI decision-making might fall highlight the ways in which a machine-learning model correctly or incorrectly interprets details in the data. For example, Shared Interest might find that an AI correctly recognizes a tractor in an image based solely on a fragment of it—say, its tire—instead of identifying the whole vehicle, as a human might, or find that an AI might recognize a snowmobile helmet in an image only if a snowmobile was also in the picture.
In experiments, Shared Interest helped reveal how AI programs worked and whether they were reliable or not. For example, Shared Interest helped a dermatologist quickly see examples of a program’s correct and incorrect predictions of cancer diagnosis from photos of skin lesions. Ultimately, the dermatologist decided he could not trust the program because it made too many predictions based on unrelated details rather than actual lesions.
In another experiment, a machine-learning researcher used Shared Interest to test a saliency method he was applying to the BeerAdvocate data set, helping him analyze thousands of correct and incorrect decisions in a fraction of the time needed with traditional manual methods. Shared Interest helped show that the saliency method generally behaved as hoped but also revealed previously unknown pitfalls, such as overvaluing certain words in reviews in ways that led to incorrect predictions.
“Providing human users with tools to interrogate and understand their machine-learning models is crucial to ensuring machine-learning models can be safely deployed in the real world,” Boggust says.
The researchers caution that Shared Interest performs only as well as the saliency methods it employs. Each saliency method possesses its own limitations, which Shared Interest inherits, Boggust notes.
In the future, the scientists would like to apply Shared Interest to more kinds of data, such as the tabular data used in medical records. Another potential area of research could be automating the estimation of uncertainty in AI results, Boggust adds.
The scientists have made the source code for Shared Interest and live demos of it publicly available. They will detail their findings 3 May at the ACM CHI Conference on Human Factors in Computing Systems.