The IoT Needs a New Set of Eyes

Cameras for the Internet of Things will have to be fast, cheap, and powerful—and might not look like cameras at all


The rise of computer vision has given us robot chefs and cameras that detect gas flares in fuel production. It has also led to a growing number of connected cameras designed to run at the edge of the network.

“Running at the edge” means these cameras are not only communicating wirelessly with the cloud but also communicating with local gateways and working with built-in logic boards to complete a task. The task might be as simple as notifying a manufacturer when a production line produces a defective item or as complex as identifying a person to determine if the system should sound an alarm.
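To make that division of labor concrete, here is a minimal sketch of the defective-item case, assuming a camera that classifies each frame on the device and sends only a short alert upstream. Everything named here (`run_at_edge`, `Alert`, `toy_classifier`, the brightness threshold, the sample frames) is hypothetical and invented for illustration; a real camera would run a trained model in place of the toy classifier.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Alert:
    frame_id: int
    label: str


def run_at_edge(
    frames: Iterable[List[float]],
    classify: Callable[[List[float]], str],
    notify: Callable[[Alert], None],
    alert_label: str = "defective",
) -> int:
    """Classify every frame locally and notify only on frames that matter.

    Returns the number of frames processed.
    """
    processed = 0
    for frame_id, frame in enumerate(frames):
        label = classify(frame)  # the decision is made on the device
        if label == alert_label:
            # Send a small message upstream, not the raw image.
            notify(Alert(frame_id, label))
        processed += 1
    return processed


def toy_classifier(frame: List[float]) -> str:
    """Hypothetical stand-in for an on-device model.

    Flags a frame when its mean brightness falls below a made-up threshold.
    """
    return "defective" if sum(frame) / len(frame) < 0.5 else "ok"


sent: List[Alert] = []
frames = [[0.9, 0.8, 0.7], [0.2, 0.1, 0.3], [0.6, 0.7, 0.9]]
count = run_at_edge(frames, toy_classifier, sent.append)
print(count, sent)
```

The point of the structure is that the network carries one alert rather than three images, which is what lets the camera keep working when bandwidth or connectivity is limited.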

But as we connect more cameras and ask them to perform more complicated tasks, their fundamental architecture is changing. Today we see changes in the silicon that handles image processing and computing. In a few years, we may see our notion of cameras change to meet the needs of digital eyes, not human ones.

There are two challenges driving the silicon shift. First, processing power: Many of these cameras try to identify specific objects by using machine learning. For example, an oil company might want a drone that can identify leaks as it flies over remote oil pipelines. Typically, training these identification models is done in the cloud because of the enormous computing power required. Some of the more ambitious chip providers believe that in a few years, not only will edge-based chips be able to match images using these models, but they will also be able to train models directly on the device.
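The split between training and matching can be sketched in a few lines. The example below is an illustration only: it swaps in a nearest-centroid classifier, which is far simpler than the machine-learning models these cameras actually use, and the two-number "features" and the labels `leak` and `no_leak` are made up. What it shows is the asymmetry: training has to look at every labeled example, while matching a new image only compares it against a small stored model.

```python
import math
from typing import Dict, List

Vector = List[float]


def train_centroids(examples: Dict[str, List[Vector]]) -> Dict[str, Vector]:
    """The expensive step (done in the cloud today): average each class."""
    centroids: Dict[str, Vector] = {}
    for label, vectors in examples.items():
        dims = len(vectors[0])
        centroids[label] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dims)
        ]
    return centroids


def match(features: Vector, centroids: Dict[str, Vector]) -> str:
    """The cheap step (done on the device): pick the nearest class centroid."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


# Hypothetical training data: two invented features per image.
training = {
    "leak": [[0.9, 0.1], [0.8, 0.2]],
    "no_leak": [[0.1, 0.9], [0.2, 0.8]],
}
model = train_centroids(training)    # would run where compute is plentiful
result = match([0.7, 0.3], model)    # would run on the drone
print(result)
```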

That’s not happening yet, because of the second challenge that silicon providers face. Comparing images with models requires not just computing power but electrical power. Silicon providers are trying to build chips that sip power while still doing their job. Qualcomm has one such chip, called Glance, in its research labs. The chip combines a lens, an image processor, and a Bluetooth radio on a module smaller than a sugar cube.

Glance can manage only three or four simple models, such as identifying a shape as a person, but it can do it using fewer than 2 milliwatts of power. Qualcomm hasn’t commercialized this technology yet, but some of its latest computer-vision chips combine on-chip image processing with an emphasis on reducing power consumption.
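A rough calculation shows why 2 milliwatts matters. The sketch below assumes an idealized 3-volt coin cell rated at about 225 milliamp-hours (roughly the nominal rating of a common CR2032) and ignores the radio, self-discharge, and conversion losses. The 1-watt comparison figure is a hypothetical round number chosen for scale, not a measurement of any particular camera.

```python
def runtime_hours(capacity_mah: float, voltage_v: float, load_mw: float) -> float:
    """Ideal runtime: stored energy (mWh) divided by a constant load (mW)."""
    energy_mwh = capacity_mah * voltage_v
    return energy_mwh / load_mw


# Assumed battery: 225 mAh at 3.0 V, about 675 mWh of stored energy.
CAPACITY_MAH = 225.0
VOLTAGE_V = 3.0

# 2 mW is the upper bound quoted for Glance, so this is a lower bound on runtime.
glance_hours = runtime_hours(CAPACITY_MAH, VOLTAGE_V, 2.0)
glance_days = glance_hours / 24

# Hypothetical 1 W (1,000 mW) camera on the same cell, for comparison only.
one_watt_hours = runtime_hours(CAPACITY_MAH, VOLTAGE_V, 1000.0)

print(f"2 mW load: {glance_hours:.1f} hours (about {glance_days:.0f} days)")
print(f"1 W load:  {one_watt_hours:.2f} hours")
```

Under these assumptions, the 2-milliwatt load runs for about two weeks on a single coin cell, while the hypothetical 1-watt load drains it in well under an hour.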

But does a camera even need a lens? Researchers at the University of Utah suggest not. They have invented a lensless camera that does away with some of a traditional camera’s hardware and avoids its high data rates. Their camera consists of a photodetector placed against a pane of plexiglass; it captures basic images and converts them into shapes a computer can be trained to recognize.

This won’t work for jobs where high levels of detail are important, but it could provide a cheaper, more power-efficient view of the world for computers fulfilling basic functions. We can also apply this thinking to how we generate image data for computers. Researchers at the University of Washington, for example, have been studying ways to use disruptions in Wi-Fi signals to teach computers how to understand gestures.

A camera doesn’t have to look like a camera anymore. It just needs to match incoming data to a statistical model to tell us what something looks like. If it can do this cheaply and without sucking up too much power, it could change beyond our recognition, even as it becomes more essential.

This article appears in the November 2018 print issue as “New Eyes for the IoT.”