Error-Detection Tool Makes AI Mistakes Easy to Spot

When neural networks get confused, a good map can untangle the mess


Certain data maps of a neural network's inferences, researchers have found, contain spikes where ambiguities and errors occur—rendering possible failure points of the system relatively easy to spot.

Purdue University

Brain-mimicking neural networks promise to perform more quickly, accurately, and impartially than humans on a wide range of problems, from analyzing mutations associated with cancer to deciding who receives a loan. However, these AI systems work in notoriously mysterious ways, raising concerns over their trustworthiness. Now a new study has found a way to reveal when neural networks might get confused, potentially shedding light on what they may be doing when they make mistakes.

As neural networks run computations on sets of data, such as collections of images, they focus on details, such as potential facial features, within each sample making up a set. The strings of numbers encoding these details are used to calculate the probability that a sample belongs to a specific category—whether, in this case, the image is of a person’s face.
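As a rough illustration of that last step, many classifiers turn raw per-category scores into probabilities with a softmax function. The sketch below assumes that setup; the article does not say which function the networks in the study use, and the scores and category names are invented for illustration.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtract the largest score first so exp() cannot overflow.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for two categories: "face" and "not a face".
scores = [2.0, 0.5]
probabilities = softmax(scores)
print(probabilities)
```

The higher raw score receives the larger share of probability, and the values always sum to 1.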

However, the way in which neural networks learn which details help them find solutions is often a mystery. This “black box” nature of neural networks makes it difficult to know if a neural network’s answers are right or wrong.

“When a person solves a problem, you can ask them how they solved it and hopefully get an answer you can understand,” says study senior author David Gleich, a professor of computer science at Purdue University in West Lafayette, Ind. “Neural networks don’t work like that.”

In the new study, instead of attempting to follow the decision-making process for any one sample on which neural networks are tested, Gleich and his colleagues sought to visualize the relationships these AI systems detect in all samples in an entire database.

“I’m still amazed at how helpful this technique is to help us understand what a neural network might be doing to make a prediction,” Gleich says.

The scientists experimented with a neural network trained to analyze roughly 1.3 million images in the ImageNet database. They developed a method of splitting and overlapping classifications to identify images that had a high probability of belonging to more than one classification.
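The idea of an image that plausibly belongs to more than one classification can be sketched with a simple threshold on the second most likely class. This is not the researchers' splitting-and-overlapping method, only a simplified stand-in; the function name, the 0.3 threshold, and the sample probabilities are all hypothetical.

```python
def ambiguous_samples(predictions, threshold=0.3):
    """Return (sample, top class, runner-up class) for every sample whose
    runner-up class has a probability of at least `threshold`."""
    flagged = []
    for name, probs in predictions.items():
        # Rank the classes from most to least probable.
        ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
        (first, _), (second, second_prob) = ranked[0], ranked[1]
        if second_prob >= threshold:
            flagged.append((name, first, second))
    return flagged


# Invented predicted probabilities for two images.
predictions = {
    "img_001": {"car": 0.52, "cassette player": 0.45, "truck": 0.03},
    "img_002": {"car": 0.97, "cassette player": 0.02, "truck": 0.01},
}
flagged = ambiguous_samples(predictions)
print(flagged)
```

Only the first image is flagged here, because the network's confidence is split almost evenly between two classes.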

The researchers then drew from the mathematical field of topology—which studies properties of geometric objects—to map the relationships the neural network inferred between each image and each classification. Analytical techniques from topology can help scientists identify similarities between sets of data despite any seeming differences. “Tools based on topological data analysis have in the past been used to analyze gene expression levels and identify specific subpopulations in breast cancer, among other really interesting insights,” Gleich says.

In the maps this new study generated, each group of images a network thinks are related is represented by a single dot. Each dot is color-coded by classification. The closer the dots, the more similar the network considers groups to be. Most areas of these maps show clusters of dots in a single color.

However, groups of images with a high probability of belonging to more than one classification are represented by two differently colored overlapping dots. “Our tool allows us to build something like a map that makes it possible to zoom in on regions of data,” Gleich says. “These are often places where predictions show a border between types, where solutions may not be so clear. It highlights specific data predictions that are worth investigating further.”
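One way to picture such a map is a stripped-down sketch in the style of Mapper, a common technique in topological data analysis: cover a range of scores with overlapping bins, treat each non-empty bin as one dot, and link dots that share a sample. This is an assumption-laden toy, not the study's algorithm. It omits the clustering step a full Mapper would run inside each bin, and the function names, bin settings, and scores are hypothetical.

```python
def overlapping_bins(n_bins, overlap):
    """Cover [0, 1] with n_bins intervals, each widened by `overlap` per side."""
    width = 1.0 / n_bins
    return [
        (max(0.0, i * width - overlap), min(1.0, (i + 1) * width + overlap))
        for i in range(n_bins)
    ]


def build_map(samples, n_bins=4, overlap=0.05):
    """Group samples by score and link any two groups that share a sample."""
    groups = []
    for low, high in overlapping_bins(n_bins, overlap):
        members = {name for name, score in samples.items() if low <= score <= high}
        if members:
            groups.append(members)
    # A link means the two groups overlap, hinting at a border region.
    links = [
        (i, j)
        for i in range(len(groups))
        for j in range(i + 1, len(groups))
        if groups[i] & groups[j]
    ]
    return groups, links


# Invented probabilities that each image shows a car.
samples = {"img_a": 0.05, "img_b": 0.10, "img_c": 0.50, "img_d": 0.92, "img_e": 0.95}
groups, links = build_map(samples)
print(groups)
print(links)
```

In this toy data, the image scored at 0.50 lands in two neighboring bins, so the only link in the map marks the region where the prediction is least clear.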

A person looking at the maps this new tool generates can see areas where the network cannot distinguish between two classifications. This approach “allowed us to apply our innate human pattern recognition to hypothesize or guess at why or how the network might work,” Gleich says. “This allowed us to forecast how the network would respond to totally new inputs based on the relationships suggested.”

The research team’s experiments caught neural networks mistaking the identity of images in databases of everything from chest X-rays and gene sequences to apparel. For example, when one neural network was tested on the Imagenette database (a subset of ImageNet), it repeatedly misclassified images of cars as cassette players. The scientists found this was because the pictures were drawn from online sales listings and included tags for the cars’ stereo equipment.

The team’s new method helps reveal “where the mistakes are being made,” Gleich says. “Analyzing data at this level is what allows scientists to go from just getting a bunch of useful predictions on new data to deeply understanding how the neural network might be working with their data.”

In addition, “our tool seems very good at helping us see when the training data itself contained errors,” Gleich says. “People do make mistakes when they hand-label data.”

Potential uses for this analytical strategy may include especially high-stakes neural-network applications—“say, something like a neural network in health care or medicine to study sepsis or skin cancer,” Gleich says.

Gleich and his colleagues tried applying their new tool to a neural network designed to predict whether a convicted criminal might go on to commit another crime. This kind of neural network can help decide the severity of the sentence a person receives, or whether they are granted parole. “The results were inconclusive, because we didn’t have the entire set of data underlying the predictions,” Gleich says.

Critics argue that since most neural networks are trained on past decisions that reflect biases based on race and other factors, AI systems end up copying the mistakes of the past. Gleich says that finding a way to use their new tool “to understand bias or prejudice in predictions could be a powerful development.”

Currently, this new tool works with neural networks that generate specific predictions from small sets of data, such as “whether or not a gene mutation might be harmful,” Gleich says. So far, the researchers do not have a way of applying it to large language models such as those powering ChatGPT, or to diffusion models such as those driving the text-to-image system DALL-E, he notes.

The scientists detailed their findings 17 November in the journal Nature Machine Intelligence.
