Aerial robotics research has brought us flapping hummingbirds, seagulls, bumblebees, and dragonflies. But if these robots are to do anything more than bear a passing resemblance to their animal models, there is one thing they’ll definitely need: better vision.
In February, at the International Solid-State Circuits Conference (ISSCC) in San Francisco, two teams presented new work aimed at building better-performing and lower-power vision systems that would help aerial robots navigate and aid them in identifying objects.
Dongsuk Jeon, a graduate student working with Zhengya Zhang and IEEE Fellows David Blaauw and Dennis Sylvester at the University of Michigan, in Ann Arbor, outlined an approach to drastically lower the power consumption of the very first stage of any vision system: the feature extractor. The feature extractor uses an algorithm to draw out potentially important features like circles and squares from an overall image.
To navigate and to determine whether a scene looks familiar, a micro air vehicle needs to be able to use its entire field of view. But existing full-image feature-extraction algorithms are built to run on desktop computers and servers, not battery-powered platforms, Blaauw says. He adds that although there are some low-power feature-extraction algorithms, those tend to focus on very specific applications such as face recognition.
The Michigan team’s solution was to pare down a traditional feature-extraction algorithm, reengineering it to work well on a specialized image-processing accelerator and optimizing factors like the number of times a portion of an image is accessed for analysis. Image sections in traditional feature extraction may be analyzed a handful of times, because areas of interest often overlap. The new accelerator pushes data through only once, as if the data were on a conveyor belt. “We feed a little bit of the image through a bus, and all the little processors watch,” Blaauw says. “When they see a part of the image they need, they grab it.”
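The conveyor-belt idea can be illustrated with a small software sketch. This is not the Michigan team's design, only a toy model under stated assumptions: the `Processor` class, the 8-by-8 test image, and the window layout are all hypothetical. It compares fetching every overlapping window separately with pushing each image row across a shared "bus" once while each processor grabs the slice it needs.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    """Hypothetical processing element that watches the bus for one window."""
    top: int
    left: int
    size: int
    rows: list = field(default_factory=list)

    def watch(self, row_index, row):
        # Grab only the slice of this row that falls inside the window.
        if self.top <= row_index < self.top + self.size:
            self.rows.append(row[self.left:self.left + self.size])


def stream_once(image, processors):
    """Push each image row over the 'bus' exactly once; return pixels read."""
    pixels_read = 0
    for row_index, row in enumerate(image):
        pixels_read += len(row)
        for p in processors:
            p.watch(row_index, row)
    return pixels_read


def fetch_per_window(image, windows):
    """Baseline: fetch every window separately, re-reading overlapped pixels."""
    pixels_read = 0
    patches = []
    for top, left, size in windows:
        patches.append([image[r][left:left + size]
                        for r in range(top, top + size)])
        pixels_read += size * size
    return patches, pixels_read


# Toy 8x8 image and nine overlapping 4x4 windows (illustrative values only).
image = [[r * 8 + c for c in range(8)] for r in range(8)]
windows = [(t, l, 4) for t in range(0, 5, 2) for l in range(0, 5, 2)]
processors = [Processor(t, l, s) for t, l, s in windows]

streamed_reads = stream_once(image, processors)
patches, baseline_reads = fetch_per_window(image, windows)
print(streamed_reads, baseline_reads)
```

In this toy setup both approaches hand each processor the same patch, but the streaming version reads every pixel from memory only once, while the per-window version reads overlapped pixels repeatedly.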
Jeon and his colleagues also incorporated a few hardware tricks to cut down on the power usage. One approach was to rework the shift registers that act as buffers for data that’s in process. Typical shift registers are made up of cells with two latches—circuit components that contain about 10 transistors each. The team found a way to rework the registers so that their cells each contained only one latch. The resulting registers were just as fast, but because they contained half as many latches they lost half as much power from leakage.
The team’s accelerator, which consumes just 2.7 milliwatts of power, was made using a 28-nanometer manufacturing process. The core clocks at a very low 27 megahertz, which keeps the power consumption down; the clock on a typical vision system-on-a-chip works at more than 100 MHz. Although feature extraction is just one aspect of vision systems, Jeon says, it can take up as much as 70 percent of the area of the core on a vision chip.
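As a rough back-of-the-envelope check, the two figures quoted above imply an average energy budget of about 100 picojoules per clock cycle. The short calculation below uses only the reported power and clock rate; the function name is illustrative, and the result is an average that says nothing about how the energy is split within the core.

```python
POWER_W = 2.7e-3   # accelerator power reported in the article, in watts
CLOCK_HZ = 27e6    # core clock reported in the article, in hertz


def energy_per_cycle_joules(power_w, clock_hz):
    """Average energy spent per clock cycle at a given power and clock rate."""
    return power_w / clock_hz


energy_pj = energy_per_cycle_joules(POWER_W, CLOCK_HZ) * 1e12
print(f"{energy_pj:.0f} pJ per cycle")
```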
Another approach, led by IEEE Fellow Hoi-Jun Yoo at the Korea Advanced Institute of Science and Technology (KAIST) in Daejeon, South Korea, is to attack the vision problem at the complete chip level. Yoo’s team has built an SoC with 21 image-processing cores. The chip, which has been demonstrated on a toy car and on a four-propeller flying robot called a quadrotor, is capable of distinguishing faces and differentiating between objects and pictures of objects—for instance, a car and a billboard carrying the image of a car.
To save power, the chip can dynamically adjust the voltage and frequency as well as the number of cores used to process each video frame. The team also incorporated a mix of digital and analog circuitry to save power. All told, they were able to cut the chip’s power consumption to 260 mW, down from the more than 300 mW typical of object-recognition SoCs, says graduate student Junyoung Park. That leaves more power to keep a robot aloft.
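A minimal sketch of per-frame scaling, assuming the textbook approximation that dynamic power in CMOS logic grows with the number of active cores times the voltage squared times the clock frequency. The operating points below are invented for illustration, apart from the 21-core maximum, and the selection policy is a hypothetical one, not the KAIST chip's actual controller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    volts: float
    mhz: float
    cores: int


def relative_dynamic_power(op):
    """Textbook CMOS estimate: dynamic power scales with cores * V^2 * f."""
    return op.cores * op.volts ** 2 * op.mhz


def throughput(op):
    """Idealized work per second, assuming perfect scaling across cores."""
    return op.cores * op.mhz


def pick_operating_point(points, required_throughput):
    """Return the lowest-power point that still meets the frame's workload."""
    feasible = [p for p in points if throughput(p) >= required_throughput]
    if not feasible:
        # Nothing is fast enough: fall back to the fastest available point.
        return max(points, key=throughput)
    return min(feasible, key=relative_dynamic_power)


# Hypothetical operating points; only the 21-core count comes from the article.
POINTS = [
    OperatingPoint(volts=0.6, mhz=50, cores=4),
    OperatingPoint(volts=0.8, mhz=100, cores=8),
    OperatingPoint(volts=1.0, mhz=200, cores=21),
]

easy_frame = pick_operating_point(POINTS, required_throughput=150)
hard_frame = pick_operating_point(POINTS, required_throughput=3000)
print(easy_frame, hard_frame)
```

Under these assumptions, an undemanding frame runs on a few slow, low-voltage cores, and the chip pays for all 21 cores at full voltage only when a frame needs them.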
Both the Michigan research and the KAIST work “have a similar design methodology in that they’re taking vision algorithms developed over the last 20 years and squeezing them into new design architectures and circuit implementations,” says Mike Polley, director of the Vision, Video, and Image Processing R&D Labs at Texas Instruments, in Dallas. He adds that we could see some of these approaches emerging in personal robots, cars, and other applications within five years.
This article originally appeared in print as "Seeing on the Fly."