A Deep Learning AI Chip for Your Phone

A chip designed to run powerful neural networks for image analysis uses one-tenth the energy a mobile GPU would

Illustration: iStockphoto

Neural networks learn to recognize objects in images and perform other artificial intelligence tasks with a very low error rate. (Just last week, a neural network built by Google’s DeepMind lab in London beat a master of the complex game of Go, one of the grand challenges of AI.) But they’re typically too complex to run on a smartphone, where, you have to admit, they’d be pretty useful. Perhaps no more. At the IEEE International Solid-State Circuits Conference (ISSCC) in San Francisco on Tuesday, MIT engineers presented a chip designed to run sophisticated image-processing neural network software on a smartphone’s power budget.

The great performance of neural networks doesn’t come free. In image processing, for example, neural networks like AlexNet work so well because they put an image through a huge number of filters, first finding image edges, then identifying objects, then figuring out what’s happening in a scene. All that requires moving data around a computer again and again, which takes a lot of energy, says Vivienne Sze, an electrical engineering professor at MIT. Sze collaborated with MIT computer science professor Joel Emer, who is also a senior research scientist at GPU-maker Nvidia.
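To make the data-movement cost concrete, here is a minimal sketch of a single filter pass that counts pixel fetches. It is illustrative only: the 6x6 image, the hand-written vertical-edge kernel, and the `apply_filter` helper are assumptions made for this example, not AlexNet's learned filters or anything from the MIT design.

```python
# Illustrative only: a naive 3x3 filter pass that counts pixel fetches.

def apply_filter(image, kernel):
    """Slide `kernel` over every valid position of `image`.

    Returns (output, reads), where `reads` counts how many times a
    pixel value was fetched from the image.
    """
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    reads = 0
    output = []
    for r in range(rows - k + 1):
        out_row = []
        for c in range(cols - k + 1):
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += image[r + i][c + j] * kernel[i][j]
                    reads += 1  # one pixel fetch per multiply
            out_row.append(acc)
        output.append(out_row)
    return output, reads


# Hypothetical 6x6 image: dark left half (0), bright right half (9).
image = [[0, 0, 0, 9, 9, 9] for _ in range(6)]

# Hypothetical vertical-edge kernel, chosen by hand for the example.
edge_kernel = [[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]]

edges, reads = apply_filter(image, edge_kernel)
unique_pixels = len(image) * len(image[0])

print(edges[0])              # the filter responds only where dark meets bright
print(reads, unique_pixels)  # fetches performed vs. distinct pixels in the image
```

In this toy case, one small filter fetches 144 pixel values from an image that holds only 36 distinct pixels. A real network applies many filters across many layers, which is why repeated fetches from distant memory come to dominate the energy cost.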

Eyeriss has 168 processing elements (PEs), each with its own memory. Image: MIT

“On our chip we bring the data as close as possible to the processing units, and move the data as little as possible,” says Sze. When run on an ordinary GPU, neural networks fetch the same image data multiple times. The MIT chip has 168 processing elements, each with its own dedicated memory nearby. Neighboring elements can talk to each other directly, and this proximity saves power. There’s also a larger, primary storage bank farther off, of course. “We try to go there as little as possible,” says Emer. To limit data movement further, the hardware compresses the data it does send and uses statistics about the data to do fewer calculations on it than a GPU would.
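The article does not say which statistics or which compression scheme the chip uses. As a rough sketch only, assume the useful statistic is that many values flowing through a network are zero. Under that assumption, the hypothetical helpers below skip multiplications by zero and run-length encode the runs of zeros before data is sent. All names and numbers are made up for illustration.

```python
# A sketch under stated assumptions, not the Eyeriss hardware's actual scheme.

def sparse_dot(activations, weights):
    """Dot product that skips zero activations.

    Returns (result, multiplies_done).
    """
    total = 0
    multiplies = 0
    for a, w in zip(activations, weights):
        if a == 0:
            continue  # a zero contributes nothing, so skip the multiply
        total += a * w
        multiplies += 1
    return total, multiplies


def run_length_encode(values):
    """Encode as (zeros_before, nonzero_value) pairs.

    A trailing run of zeros is stored as (count, None).
    """
    encoded = []
    zeros = 0
    for v in values:
        if v == 0:
            zeros += 1
        else:
            encoded.append((zeros, v))
            zeros = 0
    if zeros:
        encoded.append((zeros, None))
    return encoded


def run_length_decode(encoded):
    """Invert run_length_encode."""
    values = []
    for zeros, v in encoded:
        values.extend([0] * zeros)
        if v is not None:
            values.append(v)
    return values


# Hypothetical data: mostly-zero activations and arbitrary weights.
activations = [0, 0, 5, 0, 0, 0, 2, 0]
weights = [3, 1, 4, 1, 5, 9, 2, 6]

result, multiplies = sparse_dot(activations, weights)
encoded = run_length_encode(activations)

print(result, multiplies)  # same answer as a dense dot product, fewer multiplies
print(encoded)             # fewer entries to move than the raw list
```

With these made-up numbers, two multiplications replace eight and three encoded entries replace eight raw values. The real savings depend entirely on how sparse the actual data is.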

All that means that when running a powerful neural network program, the MIT chip, called Eyeriss, uses one-tenth the energy (0.3 watts) of a typical mobile GPU (5 to 10 W). “This is the first custom chip capable of demonstrating a full, state-of-the-art neural network,” says Sze. Eyeriss can run AlexNet, a highly accurate and computationally demanding neural network. Previous such chips could run only specific algorithms, says the MIT group. The researchers chose to test AlexNet because it’s so demanding, and they say they are confident Eyeriss can run other networks of arbitrary size.

Besides a use in smartphones, this kind of chip could help self-driving cars navigate and play a role in other portable electronics. At ISSCC, Hoi-Jun Yoo’s group at the Korea Advanced Institute of Science and Technology showed a pair of augmented reality glasses that use a neural network to train a gesture- and speech-based user interface to a particular user’s gestures, hand size, and dialect.

Yoo says the MIT chip may be able to run neural networks at low power once they’re trained, but he notes that the even more computationally intensive learning process for AlexNet can’t be done on it. The MIT chip could in theory run any kind of trained neural network, whether it analyzes images, sounds, medical data, or whatever else. Yoo says it’s also important to design chips that may be more specific to a particular category of task, such as following hand gestures, and are better at learning those tasks on the fly. He says this could make for a better user experience in wearable electronics, for example. These systems need to be able to learn on the fly because the world is unpredictable and each user is different. Your computer should start to fit you like your favorite pair of jeans.

