Fooling Speaker Recognition Systems Is Not as Hard as It Sounds

Voice-controlled applications are becoming increasingly popular. Nearly every new smartphone ships with at least one application that works with audio input: users dictate messages, translate words and phrases, run search queries, and do much more using only their voice. That convenience explains their popularity. However, although they may appear quite secure, voice applications raise serious security concerns.
A new study conducted by researchers at the University of Eastern Finland (UEF) shows that skillful voice impersonators are able to fool even state-of-the-art speaker recognition systems.
Voice attacks can take several forms, including speech synthesis, voice conversion, and replay attacks. Although new countermeasures against technically generated voice attacks are developed regularly, countermeasures against voice modifications produced by a human are sorely lacking. In fact, even state-of-the-art speaker recognition systems are not effective at recognizing voice modifications.
The UEF study found that voice modifications such as impersonation and voice disguise can fool speaker recognition systems quite easily. Analyzing speech from two professional impersonators who mimicked eight Finnish public figures, as well as acted speech from 60 Finnish speakers, the researchers found that the impersonators were able to fool automatic systems when mimicking some of the speakers. As for acted speech, the most effective voice modification strategy was to sound like a child.

Reference:
University of Eastern Finland via ScienceDaily (https://www.sciencedaily.com/releases/2017/11/171114104831.htm)

Published by cwlee20

Active high school student attending Bergen Catholic High School.
