I’ve certainly shot videos with my phone only to find out later that the audio track captured the conversations around me rather than the sounds coming from the subject of the video. That has happened when I tried to record my kids performing on a school stage while someone coughed or chattered next to me, and at the beach, where a recording I thought would be full of the sound of waves and seagulls instead featured the voices of people shouting nearby. So I definitely understand the problem Nokia set out to solve with its OZO audio software. The product, demonstrated for me during CES in Las Vegas last week, is expected to be available for the world’s smartphones later this year.
Nokia’s software aims to give mobile phone users much of the same kind of control over audio recording that we are used to having when recording video or still images. That is, the technology allows users to selectively focus on different points in a scene so it can capture sound coming from in front of (or even behind) the cellphone’s cameras and microphones. What’s more, the software allows the audio recording to automatically track the selected person, animal, or object. It also allows users to zoom in on a particular sound, like someone talking or a bird singing, and sync the audio zoom with the video zoom.
Jyri Huopaniemi, head of product and technology for Nokia Technologies, the research and licensing arm of Finland-based Nokia, explained that the software came out of a project to develop a 360-degree camera. The hardware development effort was discontinued, he said, but the software took on a life of its own.
“There have been big advances in smartphones around user-generated content,” Huopaniemi said. “A lot of work has been done on the generation of videos, but much less on audio. We set out to see what kinds of audio capture we could do with phones that have multiple microphones.” (Today’s mobile phones typically have at least two microphones.) Audio zoom and selective audio focus during video recording are the first audio-processing tools to come out of this research, but Huopaniemi promises there will be more.
Because the algorithms Nokia is using for this effect have to be adjusted to account for the exact position of microphones on each model of mobile phone, this technology can’t be rolled out as an app, Huopaniemi explained. Rather, getting it into users’ hands will require phone manufacturers to license it for integration into their built-in audio and video recording software. It’s already available in some models of Nokia-branded phones manufactured by HMD and sold outside the United States, Huopaniemi said, and will be “quietly” moving into mobile devices from other manufacturers this year.
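To see why microphone placement matters so much, consider delay-and-sum beamforming, a textbook way to steer a microphone array toward one direction. This is not Nokia’s OZO algorithm, whose details Huopaniemi did not describe; it is a minimal sketch in which the microphone coordinates, function names, and sample rate are all hypothetical. The point is only that the steering delays fall directly out of the microphone geometry, so a different phone layout needs different numbers.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature


def steering_delays(mic_positions, angle_deg):
    """Delays (seconds) that align a distant source arriving from angle_deg.

    mic_positions: list of (x, y) coordinates in meters (hypothetical layout).
    angle_deg: direction of the source, measured from the +x axis.
    """
    theta = math.radians(angle_deg)
    ux, uy = math.cos(theta), math.sin(theta)  # unit vector toward the source
    # A mic farther along the source direction hears the wavefront earlier,
    # so it needs a larger delay to line up with the other mics.
    projections = [(x * ux + y * uy) / SPEED_OF_SOUND for x, y in mic_positions]
    earliest = min(projections)
    return [p - earliest for p in projections]


def delay_and_sum(channels, delays, sample_rate):
    """Delay each channel by a whole number of samples, then average them."""
    shifts = [round(d * sample_rate) for d in delays]
    length = min(len(ch) for ch in channels)
    out = []
    for n in range(length):
        total = 0.0
        for ch, shift in zip(channels, shifts):
            if 0 <= n - shift < length:
                total += ch[n - shift]
        out.append(total / len(channels))
    return out


# Two hypothetical phones with different microphone spacing (meters).
phone_a = [(0.0, 0.0), (0.15, 0.0)]
phone_b = [(0.0, 0.0), (0.07, 0.0)]

# Same target direction, different geometry, different steering delays.
delays_a = steering_delays(phone_a, 30.0)
delays_b = steering_delays(phone_b, 30.0)
```

Sound from the chosen direction adds up coherently after the delays are applied, while sound from other directions is partly averaged out. Because `delays_a` and `delays_b` differ for the same target angle, tuning done for one handset would point the “beam” the wrong way on another, which fits Huopaniemi’s explanation of why the software has to be integrated per model.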
The listening experience was impressive. I was able to touch the demonstration display to focus on different points in a scene and listen to different sounds, a sort of audio refocusing similar to the image refocusing available in some phones with multiple camera modules. I particularly liked the ability to hear “behind” me, and I might be tempted to use this feature as an enhanced eavesdropping tool for catching fragments of interesting conversations in a crowded coffee shop.
Huopaniemi promised to let me know when the technology gets into a wider range of mobile devices. Until then, you can check out the demo video below. Do wear headphones while listening to get the full effect.
Tekla S. Perry is a senior editor at IEEE Spectrum. Based in Palo Alto, Calif., she's been covering the people, companies, and technology that make Silicon Valley a special place for more than 40 years. An IEEE member, she holds a bachelor's degree in journalism from Michigan State University.