In 2017, Facebook’s Mark Chevillet gave himself two years to determine whether it was feasible to build a non-invasive technology able to read out 100 words per minute from brain activity.
It’s been two years, and the verdict is in: “The promise is there,” Chevillet told IEEE Spectrum. “We do think it will be possible.”
As research director of Facebook Reality Labs’ brain-computer interface program, Chevillet plans to push ahead with the project—and the company’s ultimate goal to develop augmented reality (AR) glasses that can be controlled without having to speak aloud.
Chevillet’s optimism is fueled in large part by a first in the field of brain-computer interfaces that hit the presses this morning: In the journal Nature Communications, a team at the University of California, San Francisco, funded by Facebook Reality Labs, has built a brain-computer interface that accurately decodes dialogue—words and phrases both heard and spoken by the person wearing the device—from brain signals in real time.
The results are an important step toward neural implants that could be used to restore natural communication to patients who have lost the ability to speak due to stroke, spinal cord injury, or other conditions, says senior author and UCSF neurosurgeon Edward Chang.
Facebook, however, is more interested in building augmented reality glasses than biomedical devices. This work provides a proof of principle that it is possible to decode imagined speech from brain signals by measuring the activity of large populations of neurons, says Chevillet. “This [result] helps set the specification of what type of a wearable device we need to build.”
In April, Chang’s team premiered a different brain-computer interface able to directly decode speech from brain signals. The goal of the work described in today’s release was to boost the accuracy of decoding brain activity. “We’re decoding two kinds of information from two different parts of the brain, and using that as context,” says Chang. The result is a “sizable impact” on the accuracy of the decoding, he says.
That improved accuracy is based on a simple concept: adding context. Using electrodes implanted in the brains of three patient volunteers undergoing treatment for epilepsy, Chang’s team recorded brain activity while the volunteers listened to a set of pre-recorded questions and spoke aloud their responses.
That brain data was then used to train machine learning algorithms. Later, when the study participants were asked to respond to questions again, the algorithms used brain activity alone to first determine whether a volunteer was listening or speaking, and then try to decode the speech.
Most speech decoders work by making a best guess at what sound a person is thinking, so a conventional decoder is likely to confuse similar-sounding words like ‘synthesizer’ and ‘fertilizer’. The new UCSF system adds context to help discriminate between those words. First, the algorithm predicts the question being heard from a known set of questions, such as “What do you spread on a field?” That information is then used as context to help predict the answer: “Fertilizer.”
Schematic of real-time speech decoding during a question (blue) and answer (red) task. Illustration: Edward Chang/Nature Communications
By adding context, answers are significantly easier for a brain-computer interface to predict, says Chang. The system was able to decode perceived (heard) and produced (spoken) speech with an accuracy of up to 76 percent and 61 percent, respectively, using a restricted set of specified questions and answers. But the team says it hopes to expand the system’s vocabulary in the future.
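To make the idea of context concrete, here is a minimal Python sketch of how a decoded question could tilt the odds between two similar-sounding answers. It is an illustration, not the UCSF team's implementation: the function names, the second question, and every probability below are hypothetical toy values. The sketch treats the decoder's belief about the question as a prior over answers and multiplies it by the answer decoder's own scores.

```python
def normalize(scores):
    """Scale non-negative scores so they sum to 1."""
    total = sum(scores.values())
    return {key: value / total for key, value in scores.items()}


def decode_answer(question_likelihoods, answers_given_question, answer_likelihoods):
    """Combine question context with answer evidence (toy Bayesian update)."""
    # Belief about which question was heard.
    question_probs = normalize(question_likelihoods)

    # Context prior over answers: P(a) = sum over q of P(q) * P(a | q).
    prior = {}
    for question, p_question in question_probs.items():
        for answer, p_answer in answers_given_question[question].items():
            prior[answer] = prior.get(answer, 0.0) + p_question * p_answer

    # Posterior is proportional to (answer evidence) * (context prior).
    posterior = normalize(
        {a: answer_likelihoods[a] * prior.get(a, 0.0) for a in answer_likelihoods}
    )
    best = max(posterior, key=posterior.get)
    return best, posterior


# Hypothetical decoder outputs; none of these numbers come from the study.
question_likelihoods = {
    "What do you spread on a field?": 0.8,
    "Which instrument do you play?": 0.2,  # invented second question
}
answers_given_question = {
    "What do you spread on a field?": {"fertilizer": 0.9, "synthesizer": 0.1},
    "Which instrument do you play?": {"fertilizer": 0.1, "synthesizer": 0.9},
}
# On its own, the answer decoder slightly prefers the wrong word.
answer_likelihoods = {"fertilizer": 0.45, "synthesizer": 0.55}

without_context = max(answer_likelihoods, key=answer_likelihoods.get)
best, posterior = decode_answer(
    question_likelihoods, answers_given_question, answer_likelihoods
)
print("without context:", without_context)
print("with context:", best, posterior)
```

In this toy setup the answer decoder alone leans toward the wrong word, and the question context is what flips the result. That is the kind of disambiguation the article describes, though the real system works from neural recordings rather than hand-picked numbers.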
Better algorithms and faster computers also improved the speed of decoding in the study: What used to take weeks to months of offline processing can now be done in real time, says Chang.
This quietly published, peer-reviewed study stands in marked contrast to the breathless media coverage in response to Elon Musk’s announcement earlier this month trumpeting the progress of his brain augmentation company, Neuralink. While Facebook intends to build AR glasses that externally listen to brain signals using infrared light (described at length today in a company blog post), Neuralink is developing an implantable array of 3,000 flexible electrodes to augment brain function.
The two announcements seem to set the companies on a race to be the first to offer a commercial brain-computer interface that decodes brain activity. But progress toward that goal is likely to be more of a slow plod than a dash. “We don’t have any actual product plans for this because this technology is such early stage research,” says Chevillet.
In the meantime, Chang hopes to soon bring meaningful change to patients who cannot speak. All the team’s work so far has been done with volunteers who are able to speak, so the team will now spend a year working with a single research participant with speech loss to generate text on a computer screen. All the data will be collected by UCSF and kept confidential on university servers. Meanwhile, all results from the collaboration with Facebook are being published and made accessible to the academic community, emphasizes Chang. “I hope it’s not just benefitting what we’re doing, but the entire field.”
This post was updated on 30 July.
Megan is an award-winning freelance journalist based in Boston, Massachusetts, specializing in the life sciences and biotechnology. She was previously a health columnist for the Boston Globe and has contributed to Newsweek, Scientific American, and Nature, among others. She is the co-author of a college biology textbook, “Biology Now,” published by W.W. Norton. Megan received an M.S. from the Graduate Program in Science Writing at the Massachusetts Institute of Technology and a B.A. from Boston College, and she worked as an educator at the Museum of Science, Boston.