EarSpy attack eavesdrops on Android phones via motion sensors

A team of researchers has developed an eavesdropping attack for Android devices that can, to varying degrees, recognize a caller’s gender and identity and even discern private speech.

Dubbed EarSpy, the side-channel attack aims to explore new eavesdropping possibilities by capturing motion sensor data readings caused by reverberation from ear speakers on mobile devices.

EarSpy is an academic effort by researchers from five American universities (Texas A&M University, New Jersey Institute of Technology, Temple University, University of Dayton and Rutgers University).

Although this type of attack has been explored on smartphone loudspeakers, ear speakers were thought to be too weak to generate enough vibration to make such a side-channel attack practical.

However, modern smartphones use more powerful stereo speakers compared to the models of a few years ago, which produce better sound quality and stronger vibrations.

Similarly, modern devices use more sensitive motion sensors and gyroscopes that can record even the tiniest resonances from speakers.

Evidence of this improvement is shown below, where the ear speaker of the 2016 OnePlus 3T barely registers on the spectrogram, while the stereo ear speakers of the 2019 OnePlus 7T produce significantly more data.

Left to right: OnePlus 3T ear speaker, OnePlus 7T ear speaker, and OnePlus 7T loudspeaker
Source: (arxiv.org)

Tests and results

The researchers used a OnePlus 7T and a OnePlus 9 in their tests, along with a variety of pre-recorded audio clips that were played only through the two devices’ ear speakers.

The team used a third-party app, ‘Physics Toolbox Sensor Suite’, to capture accelerometer data during a simulated call, then fed it to MATLAB for analysis and to extract features from the audio stream.

A machine learning (ML) algorithm was then trained to recognize speech content, caller identity, and gender.

Results vary by dataset and device, but overall they are promising for eavesdropping via ear speakers.

On the OnePlus 7T, caller gender identification accuracy ranged between 77.7% and 98.7%, caller ID classification between 63.0% and 91.2%, and speech recognition between 51.8% and 56.4%.

Test results on the OnePlus 7T (arxiv.org)

“We evaluate time and frequency domain features with classical ML algorithms, which show a maximum accuracy of 56.42%,” the researchers explain in their paper.

“Because there are ten different classes here, the accuracy still shows five times more accuracy than a random guess, which means that it can distinguish a reasonable amount of accelerometer data caused by ear speaker vibration” – EarSpy technical paper
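The “five times” comparison in the quote can be checked with a short calculation. The sketch below assumes the ten classes are equally likely, so that a random guess is right 10% of the time; the variable names are illustrative only.

```python
def chance_accuracy(num_classes):
    """Accuracy of a uniform random guess, assuming balanced classes."""
    return 1.0 / num_classes

num_classes = 10         # "ten different classes" from the paper quote
best_accuracy = 0.5642   # "maximum accuracy of 56.42%" from the paper quote

baseline = chance_accuracy(num_classes)
improvement = best_accuracy / baseline

print(f"chance: {baseline:.0%}, ratio: {improvement:.1f}x")
# prints "chance: 10%, ratio: 5.6x"
```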

On the OnePlus 9, gender identification reached 88.7%, speaker identification dropped to an average of 73.6%, and speech recognition ranged between 33.3% and 41.6%.

Test results on the OnePlus 9 (arxiv.org)

When researchers tested a similar attack in 2020 using a loudspeaker and the ‘Spearphone’ app, it achieved 99% caller gender and ID accuracy, and 80% accuracy in speech recognition.

Constraints and solutions

One thing that can reduce the effectiveness of an EarSpy attack is the volume users choose for their ear speakers. A lower volume could prevent eavesdropping through this side-channel attack and is also more comfortable for the ears.

The arrangement of the device’s hardware components and the tightness of the assembly also affect how speaker reverberation spreads through the phone.

Finally, user movement or vibrations introduced by the environment reduce the accuracy of the derived speech data.

Android 13 introduced a restriction on collecting sensor data without permission at sampling rates above 200 Hz. Although this prevents speech recognition at the default sampling rate (400 Hz – 500 Hz), accuracy drops by only about 10% if the attack is performed at 200 Hz.
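One way to see why the sampling rate matters is the general sampling rule that a signal sampled at a given rate can only faithfully represent frequencies below half that rate. The sketch below illustrates that rule and is not taken from the EarSpy paper: the 150 Hz test tone and the helper `dominant_frequency` are assumptions chosen for the example, and real sensor hardware may filter the signal before sampling.

```python
import numpy as np

def dominant_frequency(tone_hz, fs, duration_s=1.0):
    """Sample a pure tone at rate fs and return the strongest frequency
    found in the sampled data (hypothetical helper for illustration)."""
    n = int(fs * duration_s)
    t = np.arange(n) / fs
    samples = np.sin(2 * np.pi * tone_hz * t)
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return float(freqs[np.argmax(spectrum)])

tone_hz = 150  # assumed vibration frequency for the demonstration

# At 500 Hz the limit is 250 Hz, so the 150 Hz tone is captured as-is.
at_500 = dominant_frequency(tone_hz, fs=500)

# At 200 Hz the limit is 100 Hz, so the tone shows up at a false,
# lower frequency (aliasing) instead of at 150 Hz.
at_200 = dominant_frequency(tone_hz, fs=200)

print(f"sampled at 500 Hz: peak at {at_500:.0f} Hz")
print(f"sampled at 200 Hz: peak at {at_200:.0f} Hz")
```

Under these assumptions, capping the rate at 200 Hz leaves vibration content below 100 Hz intact, which is consistent with the attack losing some accuracy rather than failing outright.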

The researchers recommend that phone manufacturers keep sound pressure stable during phone calls and place motion sensors where internally generated vibrations do not affect them, or at least have minimal impact.

