The October 2022 issue of IEEE Spectrum is here!

Alzheimer’s disease is notoriously difficult to diagnose. Typically, doctors use a combination of cognitive tests, brain imaging, and observation of behavior that can be expensive and time-consuming. But what if a quick voice sample, easily taken at a person’s home, could help identify a patient with Alzheimer’s?

A company called Canary Speech is building technology to do just that. Using deep learning, its algorithms analyze short voice samples for signs of Alzheimer’s and other conditions. Syntiant, a maker of low-power deep-learning chips, recently announced a collaboration with Canary Speech that will allow Canary to take a technology mostly used in doctors’ offices and hospitals into a person’s home via a medical device. While some research has found deep learning techniques using voice and other types of data to be highly accurate at classifying people with Alzheimer’s and other conditions in a lab setting, it’s possible the results would differ in the real world. Nevertheless, AI and deep learning could become helpful tools in making a difficult diagnosis.

Most people think of Alzheimer’s disease, the most common form of dementia, as affecting memory. But research suggests that Alzheimer’s can impact speech and language even in the disease’s earliest stages, before most symptoms are noticeable. While people can’t usually pick up on these subtle effects, a deep learning model, trained on the voices of tens of thousands of people with and without these conditions, may be able to distinguish these differences.

“What you’re interested in is, what is the central nervous system telling you that is being conveyed through the creation of speech?” says Henry O’Connell, CEO and cofounder of Canary Speech. “That's what Canary Speech does—we analyze that data set.”

Until now, O’Connell says, the algorithm has been cloud-based, but Canary’s collaboration with Syntiant allows for a chip-based application, which is faster and has more memory and storage capacity. The new technology is meant to be incorporated into a wearable device and to take less than a second to analyze a 20- or 30-second sample of speech for conditions like Alzheimer’s, as well as anxiety, depression, and even general energy level. O’Connell says that Canary’s system is about 92.5 percent accurate at distinguishing between the voices of people with and without Alzheimer’s. There is some research to suggest that conditions like depression and anxiety also affect speech, and O’Connell says that Canary is working to test and improve the accuracy of algorithms to detect these conditions.

Other voice-based technologies have had similar success, says Frank Rudzicz, an associate professor of computer science at the University of Toronto and cofounder of Winterlight Labs, which makes a product similar to Canary Speech’s. In a 2016 study, Rudzicz and other researchers used simple machine-learning methods to classify the speech of people with and without Alzheimer’s with an accuracy of about 81 percent.

“With deep learning, you would just give the raw data to these deep neural networks, and then the deep neural networks automatically produce their own internal representations,” Rudzicz says. Like all deep learning models, these networks are “black boxes”: it’s impossible to know exactly which aspects of speech the algorithm is homing in on. With deep learning, he says, the accuracy of these algorithms has risen above 90 percent.
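To make that idea concrete, here is a minimal sketch, not Canary’s or Winterlight’s actual system: a tiny one-hidden-layer network, written in NumPy, trained end to end on invented stand-in feature vectors for two classes. The hidden-layer activations play the role of the learned “internal representation” Rudzicz describes; nothing about what separates the classes is hand-coded.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for acoustic feature vectors extracted from short
# speech samples (a real system would start from raw audio or spectrograms).
# Class 0 is "control," class 1 is "case," drawn as two separated clusters.
X = np.vstack([
    rng.normal(-1.0, 0.5, size=(100, 8)),
    rng.normal(+1.0, 0.5, size=(100, 8)),
])
y = np.array([0] * 100 + [1] * 100)

# One hidden layer: its activations are the learned internal representation.
W1 = rng.normal(scale=0.1, size=(8, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.1, size=(16, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)                 # internal representation
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))     # estimated P(class 1)
    return h, p.ravel()

lr = 0.5
for _ in range(300):
    h, p = forward(X)
    g = (p - y)[:, None] / len(y)            # grad of cross-entropy wrt logits
    gh = g @ W2.T * (1 - h ** 2)             # backprop through tanh
    W2 -= lr * h.T @ g;  b2 -= lr * g.sum(axis=0)
    W1 -= lr * X.T @ gh; b1 -= lr * gh.sum(axis=0)

_, p = forward(X)
accuracy = ((p > 0.5) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

On this easy synthetic data the network separates the classes almost perfectly; the hard part in practice is collecting enough labeled, representative speech for the representation to generalize.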

Researchers have also applied deep learning to medical imaging of the brain, such as MRI scans. In studies, many of these methods reach similar accuracy, usually above 90 percent. In a December 2021 study, researchers trained an algorithm to distinguish not only between the brains of cognitively normal people and those with Alzheimer’s, but also, among people with mild cognitive impairment (in many cases an early precursor to Alzheimer’s), between brains that looked more like those of healthy people and brains that looked more like those of people with Alzheimer’s. Distinguishing these subtypes is especially important because not everyone with mild cognitive impairment goes on to develop Alzheimer’s.

“We want to have methods to stratify individuals along the Alzheimer’s disease continuum,” says Eran Dayan, an assistant professor of radiology at the University of North Carolina at Chapel Hill and an author of the 2021 study. “These are subjects who are likely to progress to Alzheimer’s disease.”

Identifying these patients as early as possible, Dayan says, will likely be crucial in effectively treating their diseases. He also says that, generally, scan-based deep learning has a similarly high efficacy rate, at least in classification studies done in the lab. Whether these technologies will be just as effective in the real world is less clear, he says, though they are still likely to work well. He says more research is needed to know for sure.

Another reason for concern, Dayan says, is potential bias, which recent research has shown AI can harbor when there is not enough variety in the data an algorithm is trained on. For instance, Rudzicz says it’s possible that an algorithm trained on speech samples from people in Toronto would not work as well in a rural area. O’Connell says that Canary Speech’s algorithm analyzes nonlanguage elements of speech, and that the company has versions of the technology used in other countries, like Japan and China, trained on data from native speakers.

“We validate our model and train it in that system, in that environment, for performance,” he says.
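Rudzicz’s worry about geographic bias can be illustrated with a toy experiment; every feature and number here is invented. Fit the simplest possible classifier, a threshold on one acoustic feature, on one population, then evaluate it on a population whose baseline is shifted for reasons unrelated to the condition (accent, recording conditions):

```python
import numpy as np

rng = np.random.default_rng(1)

def make_data(n, shift):
    # One hypothetical acoustic feature per speaker; "cases" score higher
    # than "controls," and `shift` models a population-level offset that
    # has nothing to do with the condition itself.
    controls = rng.normal(0.0 + shift, 0.5, n)
    cases = rng.normal(2.0 + shift, 0.5, n)
    X = np.concatenate([controls, cases])
    y = np.array([0] * n + [1] * n)
    return X, y

X_train, y_train = make_data(200, shift=0.0)   # e.g. the training city

# Simplest classifier: a threshold halfway between the class means.
thr = (X_train[y_train == 0].mean() + X_train[y_train == 1].mean()) / 2

def accuracy(X, y):
    return ((X > thr) == y).mean()

X_new, y_new = make_data(200, shift=1.5)       # unseen population

print(accuracy(X_train, y_train))  # high: threshold fits this population
print(accuracy(X_new, y_new))      # much lower: threshold no longer fits
```

The model itself never changed; only the population did. This is why O’Connell’s point about retraining and validating per population (as with Canary’s Japanese and Chinese versions) matters.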

Though Canary’s collaboration with Syntiant may make remote, real-time monitoring possible, O’Connell personally believes a formal diagnosis should come from a doctor, with this technology serving as another tool in making the diagnosis. Dayan agrees.

“AI, in the coming years, I hope will help assist doctors, but absolutely not replace them,” he says.
