Two Startups Use Processing in Flash Memory for AI at the Edge

Mythic AI and Syntiant sound similar on the surface, but they’re after different markets

2 min read
A zoomed-in photo shows the flash array developed by the startup called Mythic.
Photo: Courtesy of Mythic

Irvine, Calif.–based Syntiant thinks it can use embedded flash memory to greatly reduce the amount of power needed to perform deep-learning computations. Austin, Texas–based Mythic thinks it can use embedded flash memory to greatly reduce the amount of power needed to perform deep-learning computations. They both might be right.

A growing crowd of companies is hoping to deliver chips that accelerate otherwise onerous deep-learning applications, and to some degree they all have similarities because “these are solutions that are created by the shape of the problem,” explains Mythic founder and CTO Dave Fick.

When executed in a CPU, that problem is shaped like a traffic jam of data. A neural network is made up of connections and “weights” that denote how strong those connections are. Shuttling those weights between memory and processor so the right values are in the right place at the right time is the major energy expenditure in deep learning today.

“Our approach is to completely eliminate both memory bandwidth and memory power penalties by doing computation in memory,” explains Syntiant CEO Kurt Busch.

In both companies’ approaches, the network weights are actually levels of charge stored in an array of flash memory cells. The charge alters the amount of current that flows through the cell, and the cells are arranged in a way that the current produces the crucial “multiply and accumulate” computations needed for a network to tell a stop sign from a sunset, or “OK Google” from “big gray poodle.”
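The physics behind this can be sketched in a few lines of code. The following is an illustrative model, not either company’s actual design: each flash cell’s stored charge sets a conductance (the weight), an input voltage on each row drives a current proportional to voltage times conductance through the cell (Ohm’s law), and the currents merging on each column sum automatically (Kirchhoff’s current law), yielding a multiply-and-accumulate.

```python
# Illustrative sketch of analog in-memory multiply-accumulate.
# Conductances play the role of network weights; row voltages are inputs.

def crossbar_mac(voltages, conductances):
    """One column's output current: sum of V * G over every cell in it."""
    return sum(v * g for v, g in zip(voltages, conductances))

# Hypothetical 3-input, 2-column flash array (values chosen arbitrarily).
weights = [[0.2, 0.9],   # row 0: conductance presented to each column
           [0.5, 0.1],   # row 1
           [0.7, 0.4]]   # row 2
inputs = [1.0, 0.0, 1.0]  # activations applied as row voltages

columns = list(zip(*weights))  # regroup cells by column
outputs = [crossbar_mac(inputs, col) for col in columns]
print(outputs)  # the column currents are the layer's dot products
```

The key point the sketch makes concrete: the summation costs no data movement at all, because it happens in the wires where the weights already sit.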

Because the weights are always just where they need to be, there’s no need to expend any time or energy to move them. The analog nature of the computation also keeps power low. While training neural networks is typically done by computing with fairly precise (8- or 16-bit) numbers, actually using the trained network—called inferencing—can be done faster and at lower power using much less precise 5-bit or even 3-bit numbers as the weights. “With analog computation, you can build multiply and accumulate that is low precision but very, very accurate,” says Busch.
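The idea of squeezing trained weights into a few bits can be illustrated with a simple uniform quantizer. This is a generic sketch, not either company’s scheme: 3 bits gives 2³ = 8 distinct levels, so each floating-point weight is snapped to the nearest of 8 representable values, which in a flash cell would correspond to 8 distinguishable charge levels.

```python
# Generic uniform quantization sketch: map a trained weight in
# [-w_max, w_max] onto one of 2**bits evenly spaced levels.

def quantize(w, bits, w_max=1.0):
    levels = 2 ** bits - 1          # number of steps between extremes
    step = 2 * w_max / levels       # spacing between adjacent levels
    q = round((w + w_max) / step)   # index of the nearest level
    return q * step - w_max         # back to a representable weight

weights = [0.83, -0.41, 0.07, -0.99]
print([quantize(w, 3) for w in weights])  # each snapped to a 3-bit level
```

Inference then runs entirely on these coarse values; the quote from Busch is the claim that the analog circuit realizes each coarse multiply-accumulate faithfully.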

Mythic is aiming for a mere 0.5 picojoules per multiply and accumulate, which would work out to about 4 trillion operations per second per watt (TOPS/W). Syntiant is hoping to get to 20 TOPS/W. An Nvidia Volta V100 GPU can do 0.4 TOPS/W, according to Syntiant. However, real apples-to-apples comparisons in the machine learning world are difficult to make, Fick points out.
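The conversion between those two figures of merit is simple arithmetic, shown below as a quick back-of-the-envelope check (counting one multiply-accumulate as two operations, a multiply plus an add, as is conventional):

```python
# Convert energy per multiply-accumulate (MAC) into TOPS/W.
# One watt is one joule per second, so ops/joule equals ops/sec per watt.

def tops_per_watt(picojoules_per_mac):
    macs_per_joule = 1.0 / (picojoules_per_mac * 1e-12)
    ops_per_joule = 2 * macs_per_joule   # one MAC = 2 operations
    return ops_per_joule / 1e12          # tera-operations per watt

print(tops_per_watt(0.5))  # Mythic's 0.5 pJ/MAC target -> 4.0 TOPS/W
```

Running the same conversion backward, Syntiant’s 20 TOPS/W goal corresponds to roughly 0.1 picojoules per multiply-accumulate.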

The extent of analog circuitry each startup uses is a key difference between them. Syntiant’s entire network is analog, but Mythic surrounds the analog flash array with programmable digital circuitry. Mythic uses the surrounding circuitry to add flexibility to the size and types of networks its chips can run. “All network topologies run at approximately the same efficiency on our chip,” says Fick. 

This difference also influences the two companies’ target customers and applications. At Syntiant “we often say that Mythic comes up at 100 percent of investment meetings and [with] zero percent of customers,” says Busch. Both companies say they are seeking to add AI at “the edge.” But that’s a broad category that includes everything from self-driving cars to AI-enhanced hearing aids.

Syntiant is going after smaller applications that often draw just milliwatts of power. Its first device will do things like spotting wake words and identifying speakers.

Mythic is going for applications that need more complex networks capable of handling high-resolution video in systems with low-single-digit watt power, such as autonomous drones and smartphones. Fick says there are orders of magnitude differences between the operations per second needed for these applications and those Syntiant is after.


Why Functional Programming Should Be the Future of Software Development

It’s hard to learn, but your code will produce fewer nasty surprises

11 min read
A plate of spaghetti made from code
Shira Inbar

You’d expect the longest and most costly phase in the lifecycle of a software product to be the initial development of the system, when all those great features are first imagined and then created. In fact, the hardest part comes later, during the maintenance phase. That’s when programmers pay the price for the shortcuts they took during development.

So why did they take shortcuts? Maybe they didn’t realize that they were cutting any corners. Only when their code was deployed and exercised by a lot of users did its hidden flaws come to light. And maybe the developers were rushed. Time-to-market pressures would almost guarantee that their software contained more bugs than it otherwise would.
