Nvidia Opens Up The "Black Box" of Its Robocar's Deep Neural Network

Peering into a deep neural network just got a lot easier


The features marked in green form high-priority focus points for the deep neural network.
Images: NVIDIA

A deep neural network’s ability to teach itself is a strength, because the machine gets better with experience, and a weakness, because it’s got no code that an engineer can tweak. It’s a black box.

That’s why the creators of Google DeepMind’s AlphaGo couldn’t explain how it played the game of Go. All they could do was watch their brainchild rise from beginner status to defeat one of the best players in the world. 

Such opacity’s okay in a game-playing machine but not in a self-driving car. If a robocar makes a mistake, engineers must be able to look under the hood, find the flaw, and fix it so that the car never makes the same mistake again. One way to do this is through simulation: show the AI one feature, then another, and observe which ones change its decisions. 

Nvidia, a supplier of automotive AI chipsets, now says it has found a simpler way of instilling transparency. “While the technology lets us build systems that learn to do things we can’t manually program, we can still explain how the systems make decisions,” wrote Danny Shapiro, Nvidia’s head of automotive, in a blog post.

And, because the work is done right inside the layers of processing arrays that make up the neural network, the results can be displayed in real time as a “visualization mask” superimposed on the image coming straight from the car’s forward-looking camera. So far, the results cover only the machine’s steering decisions as it turns the wheel to keep the car within its lane.  

The method works by taking the analytical output of a high layer in the network—one that has already extracted the important features from the image fed in by the camera. It then superimposes that output onto the layer below, averages the result, superimposes that on a still-lower layer, and repeats until it gets all the way back to the original camera image. 
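The article doesn’t include code, but the cascade of averaging and upsampling is simple enough to sketch. Below is a minimal NumPy illustration, under stated assumptions: the activation maps from each layer have already been captured during a forward pass, the “superimposing” is taken to be element-wise multiplication of the averaged maps, and a plain nearest-neighbor resize stands in for whatever upsampling Nvidia’s actual network geometry dictates.

```python
# Minimal sketch of the layer-averaging visualization described above.
# Assumption: `activations` is a list of arrays shaped [channels, height, width],
# one per layer, ordered from the layer nearest the input to the deepest layer.
import numpy as np

def upsample(mask, shape):
    """Nearest-neighbor resize of a 2-D mask to the target (height, width)."""
    h, w = mask.shape
    rows = (np.arange(shape[0]) * h // shape[0]).clip(0, h - 1)
    cols = (np.arange(shape[1]) * w // shape[1]).clip(0, w - 1)
    return mask[np.ix_(rows, cols)]

def visualization_mask(activations):
    """Average each layer's feature maps, then propagate the deepest averaged
    map back toward the input, multiplying layer by layer so that only regions
    active at every level of the network survive."""
    averaged = [a.mean(axis=0) for a in activations]   # one 2-D map per layer
    mask = averaged[-1]                                # start at the deepest layer
    for layer_avg in reversed(averaged[:-1]):
        mask = upsample(mask, layer_avg.shape) * layer_avg
    # Normalize to [0, 1] so the mask can be overlaid on the camera image.
    mask -= mask.min()
    if mask.max() > 0:
        mask /= mask.max()
    return mask
```

Because every step is just averaging, resizing, and multiplying, a mask like this can be computed alongside the network’s normal forward pass, which is what makes a real-time overlay practical.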

The result is a camera image on which the AI’s opinion of what’s significant is highlighted. And, in fact, those parts turn out to be just what a human driver would consider significant—lane markings, road edges, parked vehicles, hedges alongside the route, and so forth. But, just to make sure that these features really were key to decision making, the researchers sorted all the pixels into two classes: Class 1 contains “salient” features that clearly have to do with driving decisions; Class 2 contains non-salient features, typically in the background. The researchers then manipulated the two classes digitally and found that it is overwhelmingly the salient features that drive the steering decision.

“Shifting the salient objects results in a linear change in steering angle that is nearly as large as that which occurs when we shift the entire image,” the researchers write in a white paper. “Shifting just the background pixels has a much smaller effect on the steering angle.”
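That comparison can be sketched in a few lines. In the illustration below, `predict_steering` is a hypothetical stand-in for running the trained network and reading out its steering angle, `salient_mask` is the thresholded visualization mask marking the Class 1 pixels, and the shift itself is reduced to a simple sideways translation of just the masked region.

```python
# Rough sketch of the salient-vs-background shift test, assuming a boolean
# salient_mask of shape (H, W) and a hypothetical predict_steering(image)
# callable that returns the network's steering angle for an (H, W, 3) frame.
import numpy as np

def shift_region(image, mask, dx):
    """Translate only the masked pixels dx columns sideways; everything else
    keeps its original value."""
    shifted = image.copy()
    rolled_img = np.roll(image, dx, axis=1)
    rolled_mask = np.roll(mask, dx, axis=1)
    shifted[rolled_mask] = rolled_img[rolled_mask]
    return shifted

def steering_sensitivity(image, salient_mask, predict_steering, dx=10):
    """Compare how much the steering angle changes when we shift the whole
    frame, only the salient pixels, or only the background pixels."""
    base = predict_steering(image)
    return {
        "whole image":     predict_steering(np.roll(image, dx, axis=1)) - base,
        "salient only":    predict_steering(shift_region(image, salient_mask, dx)) - base,
        "background only": predict_steering(shift_region(image, ~salient_mask, dx)) - base,
    }
```

If the mask really does capture what the network attends to, the first two numbers should be close to each other and the third much smaller, which is exactly the pattern the researchers report.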

True, engineers can’t reach into the system to fix a “bug,” because deep neural nets, lacking code, can’t properly be said to have a bug. What they have are features. And now we can visualize them, at least up to a point.
