For some years now, electrical engineers and computer scientists have been trying hard to figure out how to perform neural-network calculations faster and more efficiently. Indeed, the design of accelerators suitable for neural-network calculations has lately become a hotbed of activity, with the most common solution, GPUs, vying with various application-specific ICs (think Google’s Tensor Processing Unit) and field-programmable gate arrays.
Well, another contender has just entered the arena, one based on an entirely different paradigm: computing with light. An MIT spinoff called Lightmatter described its “Mars” device at last week’s Hot Chips virtual conference. Lightmatter is not the only company pursuing this novel strategy, but it seems to be ahead of its competition.
It’s somewhat misleading, though, for me to call this approach “novel.” Optical computing, in fact, has a long history. It was used as far back as the late 1950s, to process some of the first synthetic-aperture radar (SAR) images, which were constructed at a time when digital computers were not up to the task of carrying out the necessary mathematical calculations. The lack of suitable digital computers back in the day explains why engineers built analog computers of various kinds, ones based on spinning disks, sloshing fluids, continuous amounts of electric charge, or even light.
Over the decades, researchers have from time to time resurrected the idea of computing things with light, but the concept hasn’t proven widely practical for anything. Lightmatter is trying to change that now when it comes to neural-network calculations. Its Mars device has at its heart a chip that includes an analog optical processor, designed specifically to perform the mathematical operations that are fundamental to neural networks.
The key operations here are matrix multiplications, which consist of multiplying pairs of numbers and adding up the results. That you can perform addition with light isn’t surprising, given that the electromagnetic waves that constitute light add together when two light beams are combined.
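To make that concrete, here is a minimal sketch, in plain Python, of what a matrix-vector product reduces to: many pairwise multiplications, each followed by an addition into a running sum. (The function name and toy numbers here are illustrative, not anything from Lightmatter; the point is just the multiply-accumulate pattern that an optical processor carries out with light.)

```python
def matvec(matrix, vector):
    """Matrix-vector product as repeated multiply-accumulate steps."""
    result = []
    for row in matrix:
        acc = 0.0
        for weight, x in zip(row, vector):
            acc += weight * x  # multiply a pair of numbers, add to the sum
        result.append(acc)
    return result

weights = [[0.5, -1.0],
           [2.0,  0.25]]
inputs = [4.0, 8.0]
print(matvec(weights, inputs))  # [-6.0, 10.0]
```

Every output element is built from exactly these two primitives, which is why a device that can multiply and add optically maps so directly onto neural-network workloads.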
What’s trickier to understand is how you can do multiplication using light. Let me sketch that out here, although for a fuller account I’d recommend reading the very nice description of its technology that Lightmatter has provided on its blog.
The basic unit of Lightmatter’s optical processor is what’s known as a Mach-Zehnder interferometer. Ludwig Mach and Ludwig Zehnder invented this device in the 1890s, so we’re not talking about something exactly modern here. What’s new is the notion of shrinking many Mach-Zehnder interferometers down to a size that’s measured in nanometers and integrating them together on one chip for the purpose of speeding up neural-network calculations.
Such an interferometer splits incoming light into two beams, which then take two different paths. The resulting two beams are then recombined. If the two paths are identical, the output looks just like the input. If, however, one of the two beams must travel farther than the other or is slowed, it falls out of phase with the other beam. At an extreme, it can be a full 180 degrees (one half wavelength) out of phase, in which case the two beams interfere destructively when recombined, and the output is entirely nulled.
More generally, the field amplitude of the light at the output will be the amplitude of the light at the input times the cosine of half of the phase difference between the light traveling in its two arms. If you can control that phase difference in some convenient way, you then have a device that works for multiplication.
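That cosine rule is easy to verify numerically. The sketch below, which is a toy model rather than anything resembling Lightmatter’s actual design, splits an input field into two equal half-amplitude beams, delays one by a chosen phase, recombines them as complex phasors, and checks that the output magnitude matches the input times the cosine of half the phase difference:

```python
import numpy as np

def mzi_output(input_amplitude, phase_difference):
    """Recombine two equal-split beams, one delayed by phase_difference."""
    arm_a = 0.5 * input_amplitude                              # reference arm
    arm_b = 0.5 * input_amplitude * np.exp(1j * phase_difference)  # delayed arm
    return arm_a + arm_b  # fields add as complex phasors

# Output magnitude follows input * cos(phase_difference / 2):
for dphi in (0.0, np.pi / 3, np.pi):
    print(f"{dphi:.3f} rad -> {abs(mzi_output(1.0, dphi)):.4f}")
# 0.000 rad -> 1.0000   (arms in phase: full transmission)
# 1.047 rad -> 0.8660   (cos(pi/6))
# 3.142 rad -> 0.0000   (180 degrees out of phase: fully nulled)

# To multiply an input by a weight w between 0 and 1, pick the phase
# that makes cos(dphi/2) equal w, i.e. dphi = 2 * arccos(w):
w = 0.7
print(f"{abs(mzi_output(3.0, 2 * np.arccos(w))):.4f}")  # 2.1000, i.e. 3.0 * 0.7
```

Setting the phase difference is thus equivalent to dialing in a multiplier, which is exactly the role the interferometer plays in the chip.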
Lightmatter’s tiny Mach-Zehnder interferometers are constructed by fashioning appropriately small waveguides for light inside a nanophotonic chip. By using materials whose refractive index depends on the electric field they are subjected to, the relative phase of the split beam can be controlled simply by applying a voltage to create an electric field, as happens when you charge a capacitor. In Lightmatter’s chip, that’s done by applying an electric field of one polarity to one arm of the interferometer and an electric field of the opposite polarity to the other arm.
As is true for a capacitor, current flows only while charge is being built up. Once there is sufficient charge to provide an electric field of the desired strength, no more current flows and thus no more energy is required. That’s important here, because it means that once you have set the value of the multiplier you want to apply, no more energy is needed if that value (a “weight” in the neural-network calculation) doesn’t subsequently change. The flow of light through the chip similarly doesn’t consume energy. So you have here a very efficient system for performing multiplication, one that operates at, well, the speed of light.
One of the weaknesses of analog computers of all kinds has been the limited accuracy of the calculations they can perform. That, too, is a shortcoming of Lightmatter’s chip—you just can’t specify numbers with as fine resolution as you can using digital circuitry. Fortunately, the “inference” calculations that neural networks carry out once they have been trained don’t need much resolution. Training a neural network, however, does. “Training requires higher dynamic range; we’re focused on inference because of that,” says Nicholas Harris, Lightmatter’s CEO and one of the company’s founders. “We have an 8-bit-equivalent system.”
You might imagine that Lightmatter’s revolutionary new equipment for performing neural-network calculations with light is at this stage just a laboratory prototype, but that would be a mistake. The company is quite far along in producing a practical product, one that can be added to any server motherboard with a PCI Express slot and immediately programmed to start cranking out neural-network inference calculations. “We are very focused on making it so that it doesn’t look like alien technology,” says Harris. He explains that Lightmatter not only has this hardware built, it has also created the necessary software toolchains to support its use with standard neural-network frameworks (TensorFlow and PyTorch).
Lightmatter expects to go into production with a commercial unit based on its Mars device in late 2021. Harris says that the company’s chips, sophisticated as they are, have good yields, in large part because the nanophotonic components involved are not really that small compared with what’s found in cutting-edge electronic devices. “You don’t get the same point defects that destroy things.” So it shouldn’t be difficult to keep yields high and the pricing for the Mars device low enough to be competitive with GPUs.
And who knows, perhaps other companies, such as Lightelligence, LightOn, Optalysys, or Fathom Computing, will introduce their own light-based neural-network accelerator cards by then. Harris isn’t worried about that, though. “I’d say we’re pretty far ahead.”
David Schneider is a senior editor at IEEE Spectrum. His beat focuses on computing, and he contributes frequently to Spectrum's Hands On column. He holds a bachelor's degree in geology from Yale, a master's in engineering from UC Berkeley, and a doctorate in geology from Columbia.