Today’s car crash-avoidance systems and experimental driverless cars rely on radar and other sensors to detect pedestrians on the road. The next improvement may come from engineers at the University of California, San Diego (UCSD), who have developed a pedestrian detection system that can perform in close to real-time based on visual cues alone. This video-only detection could make systems for spotting pedestrians both cheaper and more effective.
Such a vision-based safety system has remained elusive in cars because computers typically face a tradeoff between analyzing video images quickly and drawing the right conclusions. On the one hand, a simple “cascade detection” computer vision algorithm can quickly detect many pedestrians in certain images, but lacks the sophistication to distinguish between pedestrians and similar-looking objects in the toughest cases. On the other hand, machine learning algorithms called deep neural networks can handle such complex pattern recognition, but work too slowly for real-time pedestrian detection.
The researchers combined the best of both approaches in the new system, said Nuno Vasconcelos, a professor of electrical engineering at UCSD, in a press release:
No previous algorithms have been capable of optimizing the trade-off between detection accuracy and speed for cascades with stages of such different complexities. In fact, these are the first cascades to include stages of deep learning. The results we're obtaining with this new algorithm are substantially better for real-time, accurate pedestrian detection.
Vasconcelos and his colleagues developed their algorithm to the point where it can analyze 2 to 4 image frames per second. That’s not quick enough to keep up with the pace of real-time video, but it’s getting close. The researchers also report roughly half the error rate of comparable existing systems.
The new algorithm runs the simpler “cascade detection” stages early in the analysis to filter out obviously non-pedestrian parts of an image, such as the sky, and brings in the more sophisticated deep neural networks only in the final stages. In those later stages, it weighs detection accuracy against computational cost when deciding how much of the expensive deep-learning machinery to apply.
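The staged structure can be sketched in a few lines. This is a toy illustration of the general cascade idea, not the UCSD implementation: the `cheap_stage` and `deep_stage` functions, the precomputed `score` field, and the thresholds are all hypothetical stand-ins:

```python
def cheap_stage(window):
    """Stand-in for a fast, simple feature test (e.g., edge/gradient cues).
    Its job is to reject windows that obviously contain no pedestrian."""
    return window["score"] > 0.2  # hypothetical cheap confidence score

def deep_stage(window):
    """Stand-in for an expensive deep-network classifier, run only on the
    few candidate windows that survive the cheap early stages."""
    return window["score"] > 0.8  # hypothetical strict threshold

def detect(windows):
    # Early stages: cheaply discard the easy negatives (sky, road, etc.).
    survivors = [w for w in windows if cheap_stage(w)]
    # Final stage: spend deep-network compute only on the survivors.
    return [w for w in survivors if deep_stage(w)]

# Toy usage: five candidate windows with made-up confidence scores.
windows = [{"id": i, "score": s}
           for i, s in enumerate([0.05, 0.3, 0.9, 0.1, 0.85])]
detections = detect(windows)
print([w["id"] for w in detections])  # only the high-scoring windows remain
```

The design point is that most windows never reach the deep stage at all, so the overall cost per frame stays close to that of the cheap features while the hardest cases still get the deep network’s accuracy.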
Google’s experimental self-driving cars currently rely on a wide array of radar, lidar, and other sensors to detect pedestrians and other objects on the road. Getting rid of some of that equipment could make the cars cheaper and easier to design. But it’s not just driverless cars that would benefit; modern crash-avoidance systems found in existing cars could also potentially make use of such an algorithm.
In fact, Google has been developing its own video-based pedestrian detection system. Google’s version takes a slightly different approach, but with the same general philosophy of using the deep learning algorithms in a sparing but targeted manner. In 2015, its system was capable of accurately identifying pedestrians within 0.25 seconds, Anelia Angelova, a research scientist at Google working on computer vision and machine learning, told IEEE Spectrum at the time. She and her team hoped to eventually reach the 0.07-second identification benchmark needed for such a system to work in real-time.
Jeremy Hsu has been working as a science and technology journalist in New York City since 2008. He has written on subjects as diverse as supercomputing and wearable electronics for IEEE Spectrum. When he’s not trying to wrap his head around the latest quantum computing news for Spectrum, he also contributes to a variety of publications such as Scientific American, Discover, Popular Science, and others. He is a graduate of New York University’s Science, Health & Environmental Reporting Program.