The NSA’s voice-recognition system raises hard questions for Echo and Google Home

23rd Jan 2018

Suppose you’re looking for a single person, somewhere in the world. (We’ll call him Waldo.) You know who he is, nearly everything about him, but you don’t know where he’s hiding. How do you find him?

The scale is just too great for anything but a computerized scan. The first chance is facial recognition — scan his face against cameras at airports or photos on social media — although you’ll be counting on Waldo walking past a friendly camera and giving it a good view. But his voice could be even better: How long could Waldo go without making a phone call on public lines? And even if he’s careful about phone calls, the world is full of microphones — how long before he gets picked up in the background while his friend talks to her Echo?

As it turns out, the NSA had roughly the same idea. In an Intercept piece on Friday, reporter Ava Kofman detailed the secret history of the NSA’s speaker recognition systems, dating back as far as 2004. One of the programs was a system known as Voice RT, which was able to match speakers to a given voiceprint (essentially solving the Waldo problem), along with generating basic transcriptions. According to classified documents, the system was deployed in 2009 to track the Pakistani army’s chief of staff, although officials expressed concern that there were too few voice clips to build a viable model. The same systems scanned voice traffic to more than 100 Iranian delegates’ phones when President Mahmoud Ahmadinejad visited New York City in 2007.

We’ve seen voice recognition systems like this before — most recently with the Coast Guard — but there’s never been one as far-reaching as the Voice RT, and it raises difficult new questions about voice recordings. The NSA has always had broad access to US phone infrastructure, something driven home by the early Snowden documents, but the last few years have seen an explosion of voice assistants like the Amazon Echo and Google Home, each of which floods more voice audio into the cloud where it could be vulnerable to NSA interception. Is home assistant data a target for the NSA’s voice scanning program? And if so, are Google and Amazon doing enough to protect users?

In previous cases, law enforcement has chiefly been interested in obtaining specific incriminating data picked up by a home assistant. In the Bentonville murder case last year, police sought recordings or transcripts from a specific Echo, hoping the device might have triggered accidentally during a pivotal moment. If that tactic worked consistently, it might be a privacy concern for Echo and Google Home owners — but it almost never does. Devices like the Echo and Google Home only retain data after hearing their wake word (“Okay Google” or “Alexa”), which means all police would get is a list of intentional commands. Security researchers have been trying to break past that wake-word safeguard for years, but so far, they can’t do it without an in-person firmware hack, at which point you might as well just install your own microphone.

But the NSA’s tool would be after a person’s voice instead of any particular words, which would make the wake-word safeguard much less of an issue. If you can get all the voice commands sent back to Google or Amazon servers, you’re guaranteed a full profile of the device owner’s voice, and you might even get an errant houseguest in the background. And because speech-to-text algorithms are still relatively new, both Google and Amazon keep audio files in the cloud as a way to catalog transcription errors. It’s a lot of data, and The Intercept is right to think that it would make a tempting target for the NSA.

When police try to collect recordings from a voice assistant, they have to play by roughly the same warrant rules as your email or Dropbox files — but the NSA might have a way to get around the warrant too. Collecting the data would still require a court order (in the NSA’s case, one approved by the FISA court), but the data wouldn’t necessarily need to be collected. In theory, the NSA could appeal to platforms to scan their own archives, arguing they would be helping to locate a dangerous terrorist. It would be similar to the scans companies already run for child abuse, terrorism or copyright-protected material on their networks, all of which are largely voluntary. If companies complied, the issue could be kept out of conventional courts entirely.



source/read more:


Leave a Reply

Your email address will not be published. Required fields are marked *

SPAM/MORON CHECK: * Time limit is exhausted. Please reload CAPTCHA.