A New Way to Find Bugs in Self-Driving AI Could Save Lives

Most software bugs won’t kill you. A possibly lethal exception could be the error that leads a self-driving car’s AI to make the wrong decision at the wrong time. That is why researchers developed a bug-hunting method that can systematically expose bad decision-making by the deep learning algorithms deployed in online services and autonomous vehicles.

The new DeepXplore method [PDF] uses at least three neural networks—the basic architecture of deep learning algorithms—to act as “cross-referencing oracles” in checking each other’s accuracy. Researchers at Columbia University and Lehigh University designed DeepXplore to solve an optimization problem in which they looked to strike the best balance between two objectives: maximizing the number of neurons activated within neural networks, and triggering as many conflicting decisions as possible among different neural networks. By assuming that the majority of neural networks will generally make the right decision, DeepXplore automatically retrains the neural network that made the lone dissenting decision to follow the example of the majority in a given scenario.

“This is a differential testing framework that can find thousands of errors in self-driving systems and in similar neural network systems,” says Yinzhi Cao, assistant professor of computer science at Lehigh University in Bethlehem, Pa.

Cao and his colleagues on the DeepXplore team recently won best paper after presenting their research at the 2017 Symposium on Operating Systems Principles (SOSP) held in Shanghai, China, from 28-31 Oct. Their win may signal a growing recognition of the need for debugging tools such as DeepXplore in deep learning AI.

Typically, deep learning algorithms become better at certain tasks by filtering huge amounts of training data that humans have labeled with the correct answers. That has enabled such algorithms to achieve accuracies of well over 90 percent on certain test datasets that involve tasks such as identifying the correct human faces in Facebook photos or choosing the correct phrase in a Google translation between, say, Chinese and English. In these cases, it’s not the end of the world if a friend occasionally gets misidentified or if a certain esoteric phrase gets translated incorrectly.

Example photos of errors in self-driving car software. The left photo shows an arrow correctly turning left. The right photo, a slightly darker version, has an arrow indicating the car turns right, into the guard rail. This example from DeepXplore shows an error found in Nvidia's DAVE-2 self-driving car software, which would cause the car to crash into a guardrail due to a darker version of the image.Images: Columbia University/Lehigh University/ACM

But the consequences of mistakes rise sharply once tech companies begin using deep learning algorithms in applications such as one where a two-ton machine is moving at highway speeds. A wrong decision by a self-driving AI could lead to the car crashing into a guard rail, colliding with another vehicle or running down pedestrians and cyclists. Government regulators will want to know for sure that self-driving cars can meet a certain safety level—and random test datasets may not uncover all those rare “corner cases” that could lead an algorithm to make a catastrophic mistake.

“I think this push toward secure and reliable AI kind of fits in nicely with explainable AI,” says Suman Jana, an assistant professor of computer science at Columbia University in New York City. “Transparency, explanation and robustness all have to be improved a lot in machine learning systems before these systems can start working together with human beings or start running on roads.”

Jana and Cao come from a group of researchers who share backgrounds in software security and debugging. In their world, even software that is 99-percent error free could still be vulnerable if malicious hackers can exploit that one lone bug in the system. That has made them far less tolerant of errors than many deep learning researchers who see mistakes as a natural part of the training process. It also made them fairly ideal candidates to figure out a new and more comprehensive approach for debugging deep learning.

Until now, debugging of the neural networks in self-driving cars has involved fairly tedious or random methods. One random testing approach involves human researchers manually creating test images and feeding those into the networks until they triggered a wrong decision. A second approach, called adversarial testing, can automatically create a sequence of test images by slightly tweaking one particular image until it trips up the neural network.

DeepXplore took a different approach by automatically creating test images most likely to cause three or more neural networks to make conflicting decisions. For example, DeepXplore might look for just the right amount of lighting in a given image that could lead two neural networks to identify a vehicle as a car while a third neural network identifies it as a face.

At the same time, DeepXplore also aimed to maximize neuron coverage in its testing by activating the maximum number of neurons and different neural network pathways. Such neuron coverage is based on a similar concept in traditional software testing called code coverage, Cao explains. This process was able to activate 100 percent of network neurons, or about 30 percent more on average than either the random or adversarial testing methods previously used in deep learning algorithms.

Testing with 15 state-of-the-art neural networks looking at five different public datasets showed how DeepXplore could find thousands of previously undiscovered errors in a wide variety of deep learning applications. The test datasets included scenarios for self-driving car AI, automatic object recognition in online images, and automatic detection of malware masquerading as ordinary software.

DeepXplore cannot yet guarantee that it has found every single possible bug in a system, but it’s far more comprehensive in testing large-scale neural networks than previous methods, Jana says. By comparison, a Stanford University team has taken an almost opposite approach by showing how to guarantee a small cluster of neurons is free of errors. Neither is a complete solution, but both represent promising and crucial steps toward debugging the future of deep learning.

self-driving cars debugging software deep learning robot software software security ai machine learning

Topics

Sections

More

For IEEE Members

For IEEE Members

IEEE Spectrum

Follow IEEE Spectrum

Support IEEE Spectrum

A New Way to Find Bugs in Self-Driving AI Could Save Lives

A debugging method for deep learning AI pits neural networks against each other to find errors

7 Bell Labs Breakthroughs Honored as IEEE Milestones

Video Friday: Musculoskeletal Robot Dog

The Untold History of the RESISTORS

Related Stories

DeepMind's Robots Play Infinite Table Tennis

Why the Nobel Prize in Physics Went to AI Research

15 Graphs That Explain the State of AI in 2024

Topics

Sections

More

For IEEE Members

For IEEE Members

IEEE Spectrum

Follow IEEE Spectrum

Support IEEE Spectrum

Enjoy more free content and benefits by creating an account

Saving articles to read later requires an IEEE Spectrum account

The Institute content is only available for members

Downloading full PDF issues is exclusive for IEEE Members

Downloading this e-book is exclusive for IEEE Members

Access to Spectrum 's Digital Edition is exclusive for IEEE Members

Following topics is a feature exclusive for IEEE Members

Adding your response to an article requires an IEEE Spectrum account

Create an account to access more content and features on IEEE Spectrum , including the ability to save articles to read later, download Spectrum Collections, and participate in conversations with readers and editors. For more exclusive content and features, consider Joining IEEE .

Join the world’s largest professional organization devoted to engineering and applied sciences and get access to all of Spectrum’s articles, archives, PDF downloads, and other benefits. Learn more about IEEE →

Join the world’s largest professional organization devoted to engineering and applied sciences and get access to this e-book plus all of IEEE Spectrum’s articles, archives, PDF downloads, and other benefits. Learn more about IEEE →

Access Thousands of Articles — Completely Free

Create an account and get exclusive content and features: Save articles, download collections, and post comments — all free! For full access and benefits, subscribe to Spectrum.

A New Way to Find Bugs in Self-Driving AI Could Save Lives

A debugging method for deep learning AI pits neural networks against each other to find errors

7 Bell Labs Breakthroughs Honored as IEEE Milestones

Video Friday: Musculoskeletal Robot Dog

The Untold History of the RESISTORS

Related Stories

DeepMind's Robots Play Infinite Table Tennis

Why the Nobel Prize in Physics Went to AI Research

15 Graphs That Explain the State of AI in 2024