Just like in a classic spy movie, someone could potentially bypass the fingerprint or voiceprint security measures on your phone by using fingerprint film or a recording of your voice. But fear not—your deepest, darkest secrets could someday be less vulnerable to hackers, thanks to a novel user verification technique for phones that relies not on a biological factor, but a behavioral one.
The new platform, LipPass, identifies users with 90.2 percent accuracy by deciphering the subtle yet distinct differences in how their mouths move when they speak, and it detects spoofers with 93.1 percent accuracy.
“To resist an attack, existing solutions either employ specialized infrastructure, such as Apple FaceID, or require users to involve extra operations, such as eye blinking, which introduces additional cost and effort and further reduces user experience,” says Jiadi Yu of Shanghai Jiao Tong University.
Instead, Yu and his team developed a new platform for user verification that relies on existing infrastructure in phones to detect the unique way in which each person moves their mouth as they speak. Their new lip-reading approach is described in a study, published 23 January in IEEE/ACM Transactions on Networking.
The researchers realized the audio components on smartphones can be exploited to depict the movement of a person’s mouth by analyzing the acoustic signals that bounce off the user’s face. Since each person exhibits unique speaking behaviors—like lip protrusion and closure, tongue stretch and constriction, as well as jaw angle changes—this creates a unique Doppler effect profile that can be detected by the phone.
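To get a feel for the size of the effect, the standard two-way Doppler approximation for a tone reflected off a slowly moving surface, a shift of roughly 2·v·f/c, can be sketched in a few lines. The 20-kilohertz carrier and the lip speeds below are illustrative assumptions, not figures from the study, and the function name is hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air near 20 degrees Celsius


def doppler_shift_hz(carrier_hz: float, reflector_speed_m_s: float) -> float:
    """Approximate frequency shift of a tone reflected off a moving surface.

    A positive speed means the surface (for example, a lip) moves toward
    the phone. Uses the two-way approximation 2 * v * f / c, which holds
    when the surface moves much more slowly than sound.
    """
    return 2.0 * reflector_speed_m_s * carrier_hz / SPEED_OF_SOUND_M_S


if __name__ == "__main__":
    carrier_hz = 20_000.0  # assumed near-inaudible tone, not taken from the study
    for speed_m_s in (-0.05, 0.01, 0.05):  # assumed lip speeds in meters per second
        shift = doppler_shift_hz(carrier_hz, speed_m_s)
        print(f"speed {speed_m_s:+.2f} m/s -> shift {shift:+.2f} Hz")
```

Under these assumptions the shifts amount to only a few hertz, which suggests why the pattern of shifts over time, rather than any single value, would carry the identifying information.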
The platform then uses a deep learning algorithm to extract distinct features from the user’s Doppler profile as he or she speaks. Next, a binary tree-based approach is applied to distinguish the new user’s profile from those of previously registered users, which also helps tell legitimate users from spoofers.
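The article does not spell out how the tree is built or trained, so the following is only a toy sketch of the general idea: a binary tree of two-way decisions that narrows a feature vector down to one registered user, followed by a distance check that rejects anything too far from that user's stored profile. The nearest-centroid rule, the alphabetical split, the `reject_distance` threshold, and all names here are assumptions made for illustration, not the researchers' method.

```python
from dataclasses import dataclass
from math import dist
from typing import Dict, Optional, Sequence, Tuple

Vector = Sequence[float]


def centroid(vectors: Sequence[Vector]) -> Tuple[float, ...]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return tuple(sum(column) / count for column in zip(*vectors))


@dataclass
class Node:
    center: Tuple[float, ...]       # mean profile of every user under this node
    user: Optional[str] = None      # set only on leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(profiles: Dict[str, Vector]) -> Node:
    """Split users in half recursively (alphabetically, an arbitrary choice)."""
    names = sorted(profiles)
    center = centroid([profiles[name] for name in names])
    if len(names) == 1:
        return Node(center=center, user=names[0])
    mid = len(names) // 2
    left = build_tree({name: profiles[name] for name in names[:mid]})
    right = build_tree({name: profiles[name] for name in names[mid:]})
    return Node(center=center, left=left, right=right)


def identify(root: Node, features: Vector, reject_distance: float) -> Optional[str]:
    """Walk down the tree by nearest group center, then accept or reject.

    Returns the matched user name, or None when the features sit farther
    than reject_distance from that user's stored profile (a likely spoofer
    or unregistered speaker).
    """
    node = root
    while node.user is None:
        go_left = dist(features, node.left.center) <= dist(features, node.right.center)
        node = node.left if go_left else node.right
    return node.user if dist(features, node.center) <= reject_distance else None


if __name__ == "__main__":
    # Made-up two-dimensional "profiles"; real features would come from the deep learning step.
    registered = {"alice": (0.0, 0.0), "bob": (10.0, 0.0), "carol": (0.0, 10.0)}
    tree = build_tree(registered)
    print(identify(tree, (0.5, 0.2), reject_distance=1.0))
    print(identify(tree, (5.0, 5.0), reject_distance=1.0))
```

In this sketch each level of the tree only has to answer a two-way question, and the final threshold is what separates a registered user from an impostor whose features merely land nearest to someone's profile.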
In a series of experiments, Yu and his colleagues tested LipPass on four different smartphones: a Nexus 6P, a Galaxy S6, a Galaxy Note 5, and a Huawei Honor 8. Volunteers used the platform in four distinct environments, ranging from a well-lit, quiet laboratory to a dark, noisy pub.
In a controlled laboratory environment, LipPass achieved an overall authentication accuracy of 95.3 percent. That is comparable to the two other platforms analyzed in the study: WeChat's voiceprint recognition reached 96.1 percent, and Alipay's face recognition reached 97.2 percent. Notably, however, LipPass's accuracy stayed relatively stable across the various environments, while WeChat's accuracy dropped as low as 21.3 percent in noisy environments and Alipay's dropped to 20.4 percent in dark environments.
The researchers also assessed LipPass’s ability to detect spoofers attempting to hack the system in three different ways: by audio replay, by mimicking the mouth movements of the legitimate user, and by replaying a recording of the acoustic signals reflected from the user’s mouth (which would be hard to obtain without the user noticing).
Across all environments and all kinds of attacks, the attackers' overall success rate was less than 10 percent, though attacks that used the third method (a recording of the user's Doppler profile) did succeed nearly 20 percent of the time under controlled laboratory conditions. Yu notes, however, that the attacker must be in very close proximity, as close as 50 centimeters, to record someone’s Doppler profile in high enough quality for hacking purposes.
Yu’s team is considering this technology for smart homes as well as for smartphones. He says, “We also plan to extend the lip reading-based user authentication for smart speakers, which serves as the core commander of a smart home, such as Amazon Echo, and Google Home.”
Michelle Hampson is a freelance writer based in Halifax. She frequently contributes to Spectrum's Journal Watch coverage, which highlights newsworthy studies published in IEEE journals.