Your Candy Wrappers are Listening

Visual microphone reconstructs nearby sound from silent videos of ordinary objects

Image: MIT
MIT's visual microphone system can reconstruct sound from silent video of candy wrappers, potted plants, and more.

“I had to double check I wasn’t playing the wrong audio file.”

The first time Abe Davis coaxed intelligible speech from a silent video of a bag of crab chips (an impassioned recitation of “Mary Had a Little Lamb”) he could hardly believe it was possible. Davis is a Ph.D. candidate at MIT, and his group’s image processing algorithm can turn everyday objects into visual microphones—deciphering the tiny vibrations they undergo as captured on video. 

The research, which will be presented at the computer graphics conference SIGGRAPH 2014 next week, builds on earlier work from MIT’s Computer Science and Artificial Intelligence Laboratory on capturing motion far smaller than a single pixel on video. By tracking how pixels at an object's borders fluctuate in color, the group’s algorithm can measure the object's minuscule movements (and even magnify a wine glass’s oscillations when a tone is played, or visually reveal a heartbeat under the skin).
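The intuition behind that sub-pixel sensitivity can be seen in a toy model. The sketch below is purely illustrative and is not the CSAIL algorithm (which operates on full video using far more sophisticated filtering): it assumes a single soft edge whose tiny displacement modulates the brightness of the pixel straddling it, so the pixel's intensity over time tracks the motion.

```python
import numpy as np

# Toy sketch (assumed setup, not the actual MIT pipeline): a soft edge
# whose sub-pixel displacement modulates one border pixel's brightness.

def edge_profile(x, shift):
    # A smooth step (sigmoid) edge centered at pixel 10 + shift.
    return 1.0 / (1.0 + np.exp(-(x - 10.0 - shift) / 0.8))

x = np.arange(20)                          # a 20-pixel scan line
t = np.linspace(0, 1, 400)                 # 400 video frames
motion = 0.05 * np.sin(2 * np.pi * 8 * t)  # sub-pixel "sound" vibration

# Brightness of the border pixel (index 10) in each frame.
border = np.array([edge_profile(x, s)[10] for s in motion])

# For small shifts the intensity change is (to first order) proportional
# to the displacement, so the two series are almost perfectly correlated.
r = np.corrcoef(border - border.mean(), motion)[0, 1]
print(f"|correlation|: {abs(r):.3f}")
```

Even though the motion here is only a twentieth of a pixel, the border pixel's intensity recovers it almost exactly, which is why averaging such fluctuations over many edges in a frame can yield a usable audio signal.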

“It was clear for us quickly that there’s a strong relation between sound and visual motion,” says Michael Rubinstein, a postdoc at Microsoft Research who worked on this and the earlier CSAIL research. “We had this crazy idea: can we actually use videos to recover sound?”

The first speech recovered from the chip bag can be played below. (Later recordings were much clearer, but probably less funny.)

