Cypher: The Deep-Learning Software That Will Help Siri, Alexa, and Cortana Hear You

This startup pivoted from cleaning up calls to helping machines discern speech

Photo-illustration: Jan Stromme/Getty Images

When John Walker founded Cypher four years ago, he had a simple premise: No one wants to waste time on a noisy phone call. So he built software for smartphones, incorporating a deep neural network, which could apply machine learning to deliver crystal-clear conversations devoid of background noise.

To show it off, John Yoon, Cypher’s head of strategy, recently stood at the busy intersection outside IEEE Spectrum’s office building in New York City. Yoon called CEO Walker, who was waiting in our office. They spoke on speakerphone for a few minutes as car horns honked and sirens blared.

Cypher

Location: Salt Lake City

Founded: 2012

Employees: 11

Funding: US $10 million

Once Yoon switched Cypher’s demo program on, the call became as quiet and clear as if he had dialed from a conference room. The company says it can cut out 99 percent of background noise at the cost of introducing a delay of just 24 milliseconds (far below the 200 ms that would be noticeable to a human listener).
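The two figures quoted above can be put into more familiar engineering units. The article does not say whether "99 percent" refers to noise power or noise amplitude, and it does not give a sample rate, so the short Python sketch below simply works out both readings and uses 16 kHz as an assumed rate for illustration. The function names are hypothetical.

```python
import math


def reduction_db(fraction_removed, quantity="power"):
    """Convert a 'fraction of noise removed' into decibels of attenuation.

    quantity="power" uses 10*log10; quantity="amplitude" uses 20*log10.
    Which reading Cypher intends is not stated in the article.
    """
    remaining = 1.0 - fraction_removed
    factor = 10.0 if quantity == "power" else 20.0
    return -factor * math.log10(remaining)


def delay_in_samples(delay_ms, sample_rate_hz):
    """Number of audio samples spanned by a processing delay."""
    return delay_ms * sample_rate_hz // 1000


# 99 percent removed: about 20 dB if it means power, about 40 dB if amplitude.
power_db = reduction_db(0.99, "power")
amplitude_db = reduction_db(0.99, "amplitude")

# 24 ms at an assumed 16 kHz sample rate spans 384 samples.
buffered = delay_in_samples(24, 16000)
```

Under either reading, the 24 ms delay is roughly one-eighth of the 200 ms threshold the article cites.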

To achieve clarity in conversations, Cypher’s program is primed to recognize a voice based on characteristics of human speech that algorithms can easily trace. For example, “all human speech incorporates vowels,” Yoon explains. “If you look at them on a spectral graph, because of the construction of your nasal cavity, tongue, and teeth, you get these really nice harmonic analyses. A jackhammer does not have the same type of harmonics.”
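Yoon's point about harmonics can be illustrated with a toy example. The sketch below is not Cypher's method, which relies on a deep neural network whose details the company has not published here. It only shows, under simplifying assumptions, how a vowel-like signal with harmonic structure scores differently from broadband noise on a basic periodicity measure (peak normalized autocorrelation). The synthetic signals, the 8 kHz sample rate, and all function names are assumptions made for illustration.

```python
import math
import random

SAMPLE_RATE = 8000  # Hz; an assumed telephone-band rate


def harmonicity(frame, min_hz=80, max_hz=400):
    """Peak normalized autocorrelation over lags in a typical pitch range.

    Values near 1 mean the frame repeats with some period (harmonic);
    values near 0 mean no clear period.
    """
    min_lag = SAMPLE_RATE // max_hz
    max_lag = SAMPLE_RATE // min_hz
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        a = frame[:-lag]
        b = frame[lag:]
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        if den > 0:
            best = max(best, num / den)
    return best


def vowel_like(f0=200, n=800):
    """Synthetic stand-in for a vowel: a fundamental plus weaker overtones."""
    return [
        sum(math.sin(2 * math.pi * f0 * k * t / SAMPLE_RATE) / k
            for k in range(1, 6))
        for t in range(n)
    ]


def broadband_noise(n=800, seed=0):
    """Synthetic stand-in for non-speech noise: seeded Gaussian samples."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


voice_score = harmonicity(vowel_like())
noise_score = harmonicity(broadband_noise())
```

In this toy setup the vowel-like frame scores close to 1 and the noise frame scores far lower. Real speech and real street noise are much messier, which is part of why a learned model is used rather than a fixed threshold.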

Once the program has picked out a voice, its algorithms strip away everything else. This technique departs from many other software-based approaches, which have often focused on first identifying noise and then subtracting it from speech. However, because noise is much more varied than speech, it’s difficult to correctly classify all unwanted sounds.

Deep neural networks have become very fashionable due to their ability to extract useful information from messy data. Sy Choudhury, senior director of product management at Qualcomm, says a wave of startups and manufacturers are working to integrate these networks into smartphones for a variety of purposes. In Cypher’s case, however, the greatest challenge has been not developing a successful program but figuring out how to sell it.

Cypher had hoped that the promise of exceptional voice quality would be enough to persuade smartphone manufacturers to install the program. It argued that the technology could set devices apart for consumers in a crowded market. “As phones become commoditized, who’s to say Huawei couldn’t use this as a differentiator?” says Yoon.

But customers care much more about camera quality and battery life than the clarity of calls. So Cypher tried another approach: convincing manufacturers to replace the dedicated hardware currently used for active noise cancellation with its algorithms, which can be incorporated into the phone’s operating system. Cypher estimates that doing so could take 50 U.S. cents to $1 off the cost of making each smartphone.

However, that argument also failed to gain much traction. Though Cypher has completed trials with Samsung, LG, and Huawei, it hasn’t inked a single licensing deal since its program launched last fall. Ronan de Renesse, a consumer technology analyst at Ovum, says top-of-the-line smartphone models cost $200 to $400 to manufacture, so saving 50 cents per phone isn’t enough to pique manufacturers’ interest.
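De Renesse's point is easy to check with the figures quoted in the article. The sketch below pairs the extremes of Cypher's estimated savings (50 cents to $1) with the extremes of the analyst's manufacturing cost range ($200 to $400). That pairing is an illustrative choice, not something either source spelled out, and the function name is hypothetical.

```python
def savings_share(saving_usd, build_cost_usd):
    """Per-phone saving as a percentage of manufacturing cost."""
    return 100.0 * saving_usd / build_cost_usd


# Smallest saving on the most expensive phone: 0.125 percent of build cost.
low = savings_share(0.50, 400)

# Largest saving on the cheapest phone in the range: 0.5 percent of build cost.
high = savings_share(1.00, 200)
```

Even in the most favorable case, the saving amounts to half a percent of the cost of building the phone.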

Now Cypher is rethinking its strategy again, this time to focus on helping the growing number of voice-activated digital assistants (such as Ok Google, Siri, and Cortana) hear commands in noisy households. Recently, the team applied its software to Alexa, the digital assistant inside the Amazon Echo, Amazon’s at-home taskmaster. In a test run by Cypher, the program improved Alexa’s word recognition (as measured by its error rate) by 116 percent as it analyzed hundreds of queries such as, “Alexa, what’s the weather like in Reno?”

The company says this improvement makes the difference between Alexa understanding a command in a noisy kitchen and missing it. But it remains to be seen whether customers consider background noise a true hindrance for digital assistants, and whether manufacturers will pay Cypher to reduce it.

This article appears in the November 2016 print issue as “Profile: Cypher.”
