New smart glasses use sonar to read your lips

Speakers, microphones, and AI allow you to control electronics just by mouthing commands.

Cornell researchers have created a new kind of wearable device that can read your lips, even if you’re not speaking aloud — and even though it doesn’t have a camera.

Equipped with sonar and AI, this pair of fairly normal-looking glasses can recognize lip and mouth movements and execute up to 31 commands, without the user needing to make a peep.

“We’re very excited about this system because it really pushes the field forward on performance and privacy,” Cheng Zhang, an assistant professor of information science at Cornell, said.

“It’s small, low-power and privacy-sensitive, which are all important features for deploying new, wearable technologies in the real world.”

Straight out of SciFi: Developed at Cornell’s “Smart Computer Interfaces for Future Interactions” (or SciFi) Lab, the glasses, called EchoSpeech, let the wearer control a smartphone and interface with software simply by mouthing commands.

Rather than cameras, with all their size, power, and privacy problems, the glasses use minuscule speakers to bathe the face in sonar. Microphones pick up the reflected signal, which is then fed into a SciFi-designed deep learning algorithm that detects and recognizes mouth movements.

“We noticed that facial movements, especially lip movements, are highly informative for silent speech recognition,” Ruidong Zhang, information science doctoral student and lead author of the EchoSpeech paper, said in a YouTube video.

Two speakers and two microphones are attached to the bottom of either side of the glasses frame. The inaudible sonar waves bounce off the lips in various directions to the microphones, which pick up the changes in the echoes as the mouth changes shape, for the AI to evaluate.
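To make the echo idea concrete, here is a minimal sketch of one common way to turn a sonar recording into something an algorithm can read: emit a known sweep, then cross-correlate the microphone signal with it so that each echo shows up as a peak at its delay. This is not the EchoSpeech code. The sample rate, the 18 to 21 kHz band, the delays, and every function name below are assumptions chosen for illustration, and the "microphone signal" is simulated.

```python
import math

SAMPLE_RATE = 48_000  # Hz; assumed value, not taken from the article


def make_chirp(f_start=18_000.0, f_end=21_000.0, duration=0.01):
    """Linear frequency sweep in a band most adults cannot hear (assumed)."""
    n = round(SAMPLE_RATE * duration)
    sweep_rate = (f_end - f_start) / duration
    chirp = []
    for i in range(n):
        t = i / SAMPLE_RATE
        chirp.append(math.sin(2 * math.pi * (f_start * t + 0.5 * sweep_rate * t * t)))
    return chirp


def simulate_echo(chirp, delay, gain, total_len):
    """Toy microphone signal: one delayed, attenuated copy of the chirp."""
    signal = [0.0] * total_len
    for i, c in enumerate(chirp):
        signal[delay + i] += gain * c
    return signal


def echo_profile(received, chirp):
    """Cross-correlate the recording with the emitted chirp.

    Entry k is the correlation strength at a delay of k samples.
    """
    n_lags = len(received) - len(chirp) + 1
    return [
        abs(sum(received[k + i] * c for i, c in enumerate(chirp)))
        for k in range(n_lags)
    ]


def peak_delay(profile):
    """Delay (in samples) of the strongest echo."""
    return max(range(len(profile)), key=profile.__getitem__)


chirp = make_chirp()
# Two hypothetical mouth shapes, modeled only as different echo delays.
closed = echo_profile(simulate_echo(chirp, delay=25, gain=0.3, total_len=1024), chirp)
opened = echo_profile(simulate_echo(chirp, delay=31, gain=0.3, total_len=1024), chirp)
print(peak_delay(closed), peak_delay(opened))
```

In this toy setup, moving the reflecting surface shifts the peak in the profile. A real system would see many overlapping echoes and would track how the whole profile changes over time, rather than reading off a single peak.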

According to the researchers, their algorithm was able to recognize these sonar echo patterns with 95% accuracy.

Users need to train EchoSpeech before it can work, but the glasses can pick up commands within minutes. In the YouTube video, EchoSpeech learned eight commands for a music player with less than two minutes of training; with less than five minutes of training, the glasses were capable of recognizing random strings of numbers spoken continuously, without pauses.
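The per-user training step can be pictured with a far simpler stand-in than the deep learning model the researchers describe. The sketch below uses a nearest-centroid classifier: it averages a handful of examples per command, then labels a new input by the closest average. The command names, the three-number "echo features," and all function names are made up for illustration; none of them come from the EchoSpeech paper.

```python
import math


def centroid(vectors):
    """Element-wise mean of equal-length feature vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def train(examples):
    """examples maps a command name to a list of feature vectors."""
    return {command: centroid(vectors) for command, vectors in examples.items()}


def predict(model, features):
    """Return the command whose centroid is closest to the features."""
    return min(model, key=lambda command: math.dist(model[command], features))


# Hypothetical training data: three short recordings per command.
user_examples = {
    "play": [[0.9, 0.1, 0.2], [1.0, 0.2, 0.1], [0.8, 0.0, 0.3]],
    "pause": [[0.1, 0.9, 0.7], [0.2, 1.0, 0.8], [0.0, 0.8, 0.9]],
}

model = train(user_examples)
print(predict(model, [0.85, 0.15, 0.25]))
```

The point of the sketch is only that a few labeled examples per command can be enough to separate them when the features differ clearly. How EchoSpeech actually achieves this with a neural network is not detailed in the article.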

Ditching the camera: For the Cornell team, relying on cameras for silent speech recognition poses a number of problems. Aside from the impracticality of constantly wearing one, cameras open up a whole host of privacy concerns both for their users and the people around them. 

Besides not potentially filming everyone around you, EchoSpeech generates sonar data that is considerably smaller than image and video data, the researchers say. That allows it to be processed and sent directly to a smartphone via Bluetooth in real time, co-author and professor of information science François Guimbretière said.

“And because the data is processed locally on your smartphone instead of uploaded to the cloud, privacy-sensitive information never leaves your control.”

The sonar tech is also easier on batteries than a camera, letting the glasses work for up to ten hours.

Looking ahead: The team is currently looking at how to commercialize EchoSpeech’s sonar recognition tech, and sees future uses that include helping people who have difficulty vocalizing.

“For people who cannot vocalize sound, this silent speech technology could be an excellent input for a voice synthesizer,” Ruidong Zhang said. “It could give patients their voices back.”
