Apps Put a Psychiatrist in Your Pocket

Mood trackers spot hazardous shifts in mental health before we do

Illustration of a person holding a phone, with many bubbles floating around them. Credit: Greg Mably

Nearly every day since she was a child, Alex Leow, a psychiatrist and computer scientist at the University of Illinois Chicago, has played the piano. Some days she plays well, and other days her tempo lags and her fingers hit the wrong keys. Over the years, she noticed a pattern: How well she plays depends on her mood. A bad mood or lack of sleep almost always leads to sluggish, mistake-prone music.

In 2015, Leow realized that a similar pattern might be true for typing. She wondered if she could help people with psychiatric conditions track their moods by collecting data about their typing style from their phones. She decided to turn her idea into an app.

After conducting a pilot study, in 2018 Leow launched BiAffect, a research app that aims to understand mood-related symptoms of bipolar disorder through keyboard dynamics and sensor data from users’ smartphones. Now in use by more than 2,700 people who have volunteered their data to the project, the app tracks typing speed and accuracy by swapping the phone’s onscreen keyboard with its own nearly identical one.

The software then generates feedback for users, such as a graph displaying hourly keyboard activity. Researchers get access to the donated data from users’ phones, which they use to develop and test machine learning algorithms that interpret data for clinical use. One of the things Leow’s team has observed: When people are manic—a state of being overly excited that accompanies bipolar disorder—they type “ferociously fast,” says Leow.

Three screenshots from the BiAffect app: Compared to a healthy user [top], a person experiencing symptoms of bipolar disorder [middle] or depression [bottom] may use their phone more than usual and late at night. BiAffect measures phone usage and orientation to help track those symptoms. Credit: BiAffect

BiAffect is one of the few mental-health apps that take a passive approach to collecting data from a phone to make inferences about users’ mental states. (Leow suspects that fewer than a dozen are currently available to consumers.) These apps run in the background on smartphones, collecting different sets of data not only on typing but also on the user’s movements, screen time, call and text frequency, and GPS location to monitor social activity and sleep patterns. If an app detects an abrupt change in behavior, indicating a potentially hazardous shift in mental state, it could be set up to alert the user, a caretaker, or a physician.

Such apps can’t legally claim to treat or diagnose disease, at least in the United States. Nevertheless, many researchers and people with mental illness have been using them as tools to track signs of depression, schizophrenia, anxiety, and bipolar disorder. “There’s tremendous, immediate clinical value in helping people feel better today by integrating these signals into mental-health care,” says John Torous, director of digital psychiatry at Beth Israel Deaconess Medical Center, in Boston. Globally, one in eight people live with a mental illness, including 40 million with bipolar disorder.

These apps differ from most of the more than 10,000 mental-health and mood apps available, which typically ask users to actively log how they’re feeling, help users connect to providers, or encourage mindfulness. The popular apps Daylio and Moodnotes, for example, require journaling or rating symptoms. This approach requires more of the user’s time and may make these apps less appealing for long-term use. A 2019 study found that among 22 mood-tracking apps, the median user-retention rate was just 6.1 percent at 30 days of use.

But despite years of research on passive mental-health apps, their success is far from guaranteed. App developers are trying to avoid the pitfalls of previous smartphone psychiatry startups, some of which oversold their capabilities before validating their technologies. For example, Mindstrong was an early startup with an app that tracked taps, swipes, and keystrokes to identify digital biomarkers of cognitive function. The company raised US $160 million in funding from investors, including $100 million in 2020 alone, and went bankrupt in February 2023.

Mindstrong may have folded because the company was operating on a different timeline from the research, according to an analysis by the health-care news website Stat. The slow, methodical pace of science did not match the startup’s need to return profits to its investors quickly, the report found. Mindstrong also struggled to figure out the marketplace and find enough customers willing to pay for the service. “We were first out of the blocks trying to figure this out,” says Thomas Insel, a psychiatrist who cofounded Mindstrong.

Now that the field has completed a “hype cycle,” Torous says, app developers are focused on conducting the research needed to prove their apps can actually help people. “We’re beginning to put the burden of proof more on those developers and startups, as well as academic teams,” he says. Passive mental-health apps need to prove they can reliably parse the data they’re collecting, while also addressing serious privacy concerns.

Passive sensing catches mood swings early

Mood Sensors

Seven metrics apps use to make inferences about your mood

Keyboard dynamics: Typing speed and accuracy can indicate a lot about a person’s mood. For example, people who are manic often type extremely fast.

Accelerometer: This sensor tracks how the user is oriented and moving. Lying in bed would suggest a different mood than going for a run.

Calls and texts: The frequency of text messages and phone conversations signifies a person’s social isolation or activity, which indicates a certain mood.

GPS location: Travel habits signal a person’s activity level and routine, which offer clues about mood. For example, a person experiencing depression may spend more time at home.

Mic and voice: Mood can affect how a person speaks. Microphone-based sensing tracks the rhythm and inflection of a person’s voice.

Sleep: Changes in sleep patterns signify a change in mood. Insomnia is a common symptom of bipolar disorder and can trigger or worsen mood disturbances.

Screen time: An increase in the amount of time a person spends on a phone can be a sign of depressive symptoms and can interfere with sleep.

Sidebar icons: Greg Mably

A crucial component of managing psychiatric illness is tracking changes in mental states that can lead to more severe episodes of the disease. Bipolar disorder, for example, causes intense swings in mood, from extreme highs during periods of mania to extreme lows during periods of depression. Between 30 and 50 percent of people with bipolar disorder will attempt suicide at least once in their lives. Catching early signs of a mood swing can enable people to take countermeasures or seek help before things get bad.

But detecting those changes early is hard, especially for people with mental illness. Observations by other people, such as family members, can be subjective, and doctor and counselor sessions are too infrequent.

That’s where apps come in. Algorithms can be trained to spot subtle deviations from a person’s normal routine that might indicate a change in mood—an objective measure based on data, like a diabetic tracking blood sugar. “The ability to think objectively about my own thinking is really key,” says retired U.S. major general Gregg Martin, who has bipolar disorder and is an advisor for BiAffect.
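To make the idea of "deviations from a person’s normal routine" concrete, here is a minimal Python sketch. It is an illustration only, not the method used by BiAffect or any other app named here: it assumes one numeric reading per day (say, hours of nighttime phone use) and flags days that fall far from a trailing personal baseline. The function name, the 14-day window, and the threshold of 2 standard deviations are all hypothetical choices, not clinically validated settings.

```python
from statistics import mean, stdev


def flag_deviations(daily_values, baseline_days=14, threshold=2.0):
    """Return indices of days that deviate sharply from the personal baseline.

    The baseline for each day is the mean and standard deviation of the
    preceding `baseline_days` values. Both parameters are illustrative
    assumptions, not validated clinical settings.
    """
    flagged = []
    for i in range(baseline_days, len(daily_values)):
        window = daily_values[i - baseline_days:i]
        mu = mean(window)
        sigma = stdev(window)
        if sigma == 0:
            continue  # baseline has no variation, so a z-score is undefined
        z = (daily_values[i] - mu) / sigma
        if abs(z) >= threshold:
            flagged.append(i)
    return flagged


# Hypothetical data: a stable two weeks, then one sharply different day.
hours = [7.0, 7.5, 6.5, 7.0, 7.5, 6.5, 7.0, 7.5, 6.5, 7.0, 7.5, 6.5, 7.0, 7.5, 3.0, 7.0]
print(flag_deviations(hours))
```

Comparing each person against their own history, rather than against a population average, is what makes a measure like this personal. A real system would also have to handle missing days and slow drifts in routine.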

The data from passive sensing apps could also be useful to doctors who want to see objective data on their patients in between office visits, or for people transitioning from inpatient to outpatient settings. These apps are “providing a service that doesn’t exist,” says Colin Depp, a clinical psychologist and professor at the University of California, San Diego. Providers can’t observe their patients around the clock, he says, but smartphone data can help close the gap.

Depp and his team have developed an app that uses GPS data and microphone-based sensing to determine the frequency of conversations and make inferences about a person’s social interactions and isolation. The app also tracks “location entropy,” a metric of how much a user moves around outside of routine locations. When someone is depressed and mostly stays home, location entropy decreases.
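The article does not give the formula CBT2go uses for location entropy. One common formulation, shown here as a hedged sketch, is the Shannon entropy of how a person’s location samples are distributed across distinct places. The function name and the place labels are hypothetical; in practice the labels would come from clustering raw GPS coordinates.

```python
import math
from collections import Counter


def location_entropy(visits):
    """Shannon entropy (in nats) of a sequence of place labels.

    `visits` holds one label per location sample, e.g. one per hour.
    Time spent entirely in one place gives 0; time spread evenly
    across many places gives a higher value.
    """
    counts = Counter(visits)
    total = sum(counts.values())
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log(p)
    return entropy


# Hypothetical samples: a varied day versus a day spent mostly at home.
varied_day = ["home", "work", "work", "gym", "cafe", "home"]
home_day = ["home", "home", "home", "home", "home", "cafe"]
print(location_entropy(varied_day) > location_entropy(home_day))
```

Under this formulation, a person who stays home all day scores exactly zero, which matches the article’s description of entropy decreasing during depression.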

Depp’s team initially developed the app, called CBT2go, as a way to test the effectiveness of cognitive behavioral therapy in between therapy sessions. The app can now intervene in real time with people experiencing depressive or psychotic symptoms. This feature helps people identify when they feel lonely or agitated so they can apply coping skills they’ve learned in therapy. “When people walk out of the therapist’s office or log off, then they kind of forget all that,” Depp says.

Another passive mental-health-app developer, Ellipsis Health in San Francisco, uses software that takes voice samples collected during telehealth calls to gauge a person’s level of depression, anxiety, and stress symptoms. For each set of symptoms, deep-learning models analyze the person’s words, rhythms, and inflections to generate a score. The scores indicate the severity of the person’s mental distress, and are based on the same scales used in standard clinical evaluations, says Michael Aratow, cofounder and chief medical officer at Ellipsis.

Aratow says the software works for people of all demographics, without needing to first capture baseline measures of an individual’s voice and speech patterns. “We’ve trained the models in the most difficult use cases,” he says. The company offers its platform, including an app for collecting the voice data, through health-care providers, health systems, and employers; it’s not directly available to consumers.

In the case of BiAffect, the app can be downloaded for free by the public. Leow and her team are using the app as a research tool in clinical trials sponsored by the U.S. National Institutes of Health. These studies aim to validate whether the app can reliably monitor mood disorders, and determine whether it could also track suicide risk in menstruating women and cognition in people with multiple sclerosis.

BiAffect’s software tracks behaviors like hitting the backspace key frequently, which suggests more errors, and an increase in typing “@” symbols and hashtags, which suggest more social media use. The app combines this typing data with information from the phone’s accelerometer to determine how the user is oriented and moving—for example, whether the user is likely lying down in bed—which yields more clues about mood.
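As a rough illustration of the kind of features described above, the following Python sketch summarizes a typing session. It is not BiAffect’s code: the event format (a millisecond timestamp paired with a coarse key label), the function name, and the feature names are all assumptions made for this example.

```python
from statistics import median


def keystroke_features(events):
    """Summarize a typing session.

    `events` is a time-ordered list of (timestamp_ms, key_label) pairs,
    where key_label is a coarse category such as "char", "backspace",
    "@", or "#". This format is hypothetical. Returns None when there
    are too few events to compute an interval.
    """
    if len(events) < 2:
        return None
    times = [t for t, _ in events]
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    n = len(events)
    backspaces = sum(1 for _, key in events if key == "backspace")
    social = sum(1 for _, key in events if key in ("@", "#"))
    return {
        "median_interkey_ms": median(intervals),  # lower means faster typing
        "backspace_rate": backspaces / n,         # proxy for typing errors
        "social_symbol_rate": social / n,         # proxy for social media use
    }


# Hypothetical session of four key presses.
session = [(0, "char"), (200, "char"), (400, "backspace"), (600, "@")]
print(keystroke_features(session))
```

Features like these would typically be computed per session and then combined with accelerometer readings before any inference about mood is made.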

Screenshot of an Ellipsis Health sample patient’s case-management dashboard, with text about the patient’s health and a pop-up window showing a high risk score: Ellipsis Health analyzes audio captured during telehealth visits to assign scores for depression, anxiety, and stress. Credit: Ellipsis Health

The makers of BiAffect and Ellipsis Health don’t claim their apps can treat or diagnose disease. If app developers want to make those claims and sell their product in the United States, they would first have to get regulatory approval from the U.S. Food and Drug Administration. Getting that approval requires rigorous and large-scale clinical trials that most app makers don’t have the resources to conduct.

Digital-health software depends on quality clinical data

The sensing techniques upon which passive apps rely—measuring typing dynamics, movement, voice acoustics, and the like—are well established. But the algorithms used to analyze the data collected by the sensors are still being honed and validated. That process will require considerably more high-quality research among real patient populations.

For example, clinical studies that include control or placebo groups are crucial and have been lacking in the past. Without control groups, companies can say their technology is effective “compared to nothing,” says Torous at Beth Israel.

Torous and his team aim to build software that is backed by this kind of quality evidence. With participants’ consent, their app, called mindLAMP, passively collects data from their screen time and their phone’s GPS and accelerometer for research use. It’s also customizable for different diseases, including schizophrenia and bipolar disorder. “It’s a great starting point. But to bring it into the medical context, there’s a lot of important steps that we’re now in the middle of,” says Torous. Those steps include conducting clinical trials with control groups and testing the technology in different patient populations, he says.

How the data is collected can make a big difference in the quality of the research. For example, the rate of sampling (how often a data point is collected) matters and must be calibrated for the behavior being studied. What’s more, data pulled from real-world environments tends to be “dirty,” with inaccuracies introduced by faulty sensors or by inconsistencies in how phone sensors initially process data. It takes more work to make sense of this data, says Casey Bennett, an assistant professor and chair of health informatics at DePaul University, in Chicago, who uses BiAffect data in his research.

One approach to addressing errors is to integrate multiple sources of data to fill in the gaps—like combining accelerometer and typing data. In another approach, the BiAffect team is working to correlate real-world information with cleaner lab data collected in a controlled environment where researchers can more easily tell when errors are introduced.

Who participates in the studies matters too. If participants are limited to a particular geographic area or demographic, it’s unclear whether the results can be applied to the broader population. For example, a night-shift worker will have different activity patterns from those with nine-to-five jobs, and a city dweller may have a different lifestyle from residents of rural areas.

After the research is done, app developers must figure out a way to integrate their products into real-world medical contexts. One looming question is when and how to intervene when a change in mood is detected. These apps should always be used in concert with a professional and not as a replacement for one, says Torous. Otherwise, the app’s assessments could be dangerous and distressing to users, he says.

When mood tracking feels like surveillance

No matter how well these passive mood-tracking apps work, gaining trust from potential users may be the biggest stumbling block. Mood tracking could easily feel like surveillance. That’s particularly true for people with bipolar or psychotic disorders, where paranoia is part of the illness.

Keris Myrick, a mental-health advocate, says she finds passive mental-health apps “both cool and creepy.” Myrick, who is vice president of partnerships and innovation at the mental-health-advocacy organization Inseparable, has used a range of apps to support her mental health as a person with schizophrenia. But when she tested one passive sensing app, she opted to use a dummy phone. “I didn’t feel safe with an app company having access to all of that information on my personal phone,” Myrick says. While she was curious to see if her subjective experience matched the app’s objective measurements, the creepiness factor prevented her from using the app enough to find out.

Beyond users’ perception, maintaining true digital privacy is crucial. “Digital footprints are pretty sticky these days,” says Katie Shilton, an associate professor at the University of Maryland focused on social-data science. It’s important to be transparent about who has access to personal information and what they can do with it, she says.

“Once a diagnosis is established, once you are labeled as something, that can affect algorithms in other places in your life,” Shilton says. She cites the misuse of personal data in the Cambridge Analytica scandal, in which the consulting firm collected information from Facebook to target political advertising. Without strong privacy policies, companies producing mental-health apps could similarly sell user data—and they may be particularly motivated to do so if an app is free to use.

Conversations about regulating mental-health apps have been ongoing for over a decade, but a Wild West–style lack of regulation persists in the United States, says Bennett of DePaul University. For example, there aren’t yet protections in place to keep insurance companies or employers from penalizing users based on data collected. “If there aren’t legal protections, somebody is going to take this technology and use it for nefarious purposes,” he says.

Some of these concerns may be mitigated by confining all the analysis to a user’s phone, rather than collecting data in a central repository. But decisions about privacy policies and data structures are still up to individual app developers.

Leow and the BiAffect team are currently working on a new internal version of their app that incorporates natural-language processing and generative AI extensions to analyze users’ speech. The team is considering commercializing this new version in the future, but only following extensive work with industry partners to ensure strict privacy safeguards are in place. “I really see this as something that people could eventually use,” Leow says. But she acknowledges that researchers’ goals don’t always align with the desires of the people who might use these tools. “It is so important to think about what the users actually want.”
