Alexa is always listening, but it does not record constantly. Nothing is sent to the cloud until the device hears the wake word (Alexa, Echo, or Computer). However, listening for a wake word is harder than you might think.
Echo hardware is not that smart. Without an internet connection, every request or question will fail, because your commands are sent to the cloud to be interpreted and acted on. Amazon does not want to record every conversation you have in front of a smart speaker, just the commands you give it. For this reason, the company uses a wake word to get the speaker's attention. To make that work, Amazon relies on a combination of finely tuned microphones, a short memory buffer, and neural network training.
Fine-Tuned Microphones Detect Your Voice
Voice assistant speakers such as the Echo and Echo Dot typically have multiple built-in microphones. The Echo Dot, for example, has seven. This array gives the device several capabilities, from hearing commands spoken at a distance to separating voices from background noise.
The latter is especially useful for wake word detection. With its multiple microphones, the Echo can determine where you are relative to the device and listen in that direction while ignoring the rest of the room.
You can see this in action when you use the wake word. Stand next to an Echo or Echo Dot and say the word. Note that the ring turns dark blue, and then a light blue segment circles around and points in your direction. Now take a few steps to the side and say the word again. Notice that the light blue segment follows you.
Because it knows where you are, the device can focus on you and filter out noise coming from elsewhere.
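One common way a microphone array can tell which direction a sound came from is to measure how much later the sound reaches one microphone than another. The sketch below illustrates that general idea with a brute-force cross-correlation. It is not Amazon's implementation, and the function name, the sample data, and the two-microphone setup are all assumptions made for illustration.

```python
def estimate_delay(mic_a, mic_b, max_lag):
    """Return the lag (in samples) at which mic_b best lines up with mic_a.

    A positive result means the sound reached mic_b later than mic_a.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Correlation score: multiply overlapping samples and add them up.
        score = 0.0
        for i, sample in enumerate(mic_a):
            j = i + lag
            if 0 <= j < len(mic_b):
                score += sample * mic_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical data: a short click that reaches mic_b three samples after mic_a.
click = [0.0, 1.0, -0.5, 0.25, 0.0]
mic_a = click + [0.0] * 8
mic_b = [0.0] * 3 + click + [0.0] * 5

delay = estimate_delay(mic_a, mic_b, max_lag=5)
print(delay)
```

With the delay between pairs of microphones and the known spacing between them, a device could work out a rough bearing to the speaker and favor sound from that direction.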
Short Memory Prevents the Speaker From Holding Too Much
Echo devices have plenty of storage space, but they use very little of it for audio. According to Rohit Prasad, Amazon vice president and head scientist for Alexa Artificial Intelligence, an Echo can physically store only a few seconds of audio.
By limiting that capacity, Amazon not only gives you more privacy (it's one less place your audio is kept) but also keeps Echo devices from listening in on entire conversations, restricting them to finding the wake word.
Imagine that you have a tape recorder with a three-second tape. Suppose that whenever the tape reaches the end, it loops back to the beginning. While you record a conversation, anything you said four seconds ago has already been erased and recorded over. This is how an Amazon Echo works.
It records continuously, but at the same time it erases whatever it recorded a few moments earlier. This short attention span means that all it can hear is the word "Alexa" and not much else. Still, three seconds is enough to capture, examine, and process that word.
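The looping tape described above behaves like a fixed-size ring buffer. Here is a minimal sketch of that idea, assuming a hypothetical RollingAudioBuffer class and made-up sample rates. It illustrates the data structure only and says nothing about how the Echo firmware is actually written.

```python
from collections import deque

SAMPLE_RATE = 16_000   # assumed samples per second, for illustration only
BUFFER_SECONDS = 3     # the "three-second tape" from the analogy


class RollingAudioBuffer:
    """Keeps only the most recent few seconds of audio samples."""

    def __init__(self, seconds=BUFFER_SECONDS, sample_rate=SAMPLE_RATE):
        # A deque with maxlen silently drops the oldest items once it is full.
        self._samples = deque(maxlen=seconds * sample_rate)

    def write(self, chunk):
        """Append new samples, overwriting the oldest ones if needed."""
        self._samples.extend(chunk)

    def snapshot(self):
        """Return the audio currently held, oldest sample first."""
        return list(self._samples)


# Tiny sample rate so the demo stays readable: 4 samples per "second".
buffer = RollingAudioBuffer(seconds=3, sample_rate=4)
for second in range(5):
    buffer.write([second] * 4)   # label each sample with the second it came from

print(buffer.snapshot())
```

After five seconds of input, only samples from seconds 2, 3, and 4 remain. Everything older has been overwritten, just like the tape.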
Neural Network Training Helps With Pattern Matching
Finally, Amazon relies on neural network training to teach the Echo how to match patterns. As with other forms of machine learning, Amazon trains its algorithms by feeding them instance after instance of the word Alexa (or Computer or Echo, depending on which wake word is being trained).
The goal is to capture every inflection and accent, but also the context. Amazon wants your Echo to recognize the difference between when you speak to it, when you talk about it, and perhaps when you speak to a person named Alexa. The directional microphones also support this goal.
For every word the Echo hears, the audio data passes through several algorithmic layers. Each layer looks for false positives or contextual clues. If the word passes one layer's check, it moves on to the next. Once the local device decides that it has heard the wake word, it starts recording and streaming the audio to Amazon's cloud servers. Amazon uses four algorithms: one for each wake word (Alexa, Computer, Echo) and one for Alexa Guard, which treats certain sounds, such as breaking glass, like a wake word.
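The layered checking described above resembles a cascade: cheap checks run first, and audio is dropped the moment any layer rejects it. The following sketch shows that control flow only. The stage names, feature names, and thresholds are invented for illustration and are not Amazon's actual layers.

```python
def run_cascade(audio_features, stages):
    """Run each check in order and stop at the first one that fails.

    Returns (accepted, rejected_by), where rejected_by names the failing stage
    or is None when every stage passed.
    """
    for name, check in stages:
        if not check(audio_features):
            return False, name
    return True, None


# Hypothetical stages and thresholds, ordered from cheapest to most specific.
stages = [
    ("loud_enough", lambda f: f["energy"] > 0.2),
    ("sounds_like_wake_word", lambda f: f["wake_word_score"] > 0.8),
    ("addressed_to_device", lambda f: f["context_score"] > 0.5),
]

# Someone talking *about* Alexa: the word matches, but the context check fails.
mention = {"energy": 0.9, "wake_word_score": 0.95, "context_score": 0.3}
# Someone talking *to* the device: every stage passes.
command = {"energy": 0.9, "wake_word_score": 0.95, "context_score": 0.9}

print(run_cascade(mention, stages))
print(run_cascade(command, stages))
```

Only audio that survives every stage would trigger recording and streaming to the cloud. Anything rejected earlier never leaves the device.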
But even when a match occurs, Amazon still runs more sophisticated checks. Have you noticed that when someone says the word Alexa in a TV show or commercial, your Echo usually does not respond? That's because Amazon also performs a check in the cloud.
Cloud Checks Exclude Some False Positives
When companies feature Alexa in an advertisement, they can send the audio to Amazon ahead of time. Amazon runs that audio through similar pattern-matching algorithms to identify the wake word. Once that exact instance is fully cataloged, it is added to a database.
As part of the process of contacting the cloud, your Echo includes information about the wake word it heard, and Amazon checks it against that database. If a match is found, Amazon tells your Echo to ignore the wake word, go back to sleep, and discard any recorded audio.
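The database check described above amounts to a lookup: compute a fingerprint of the wake word audio and see whether it is already cataloged. The sketch below uses an exact SHA-256 hash as a stand-in fingerprint, which is a deliberate simplification. A real audio fingerprint would have to tolerate noise and room acoustics, and the function names and sample bytes here are hypothetical.

```python
import hashlib


def fingerprint(audio_bytes):
    """Stand-in fingerprint: an exact hash of the raw audio bytes (assumption)."""
    return hashlib.sha256(audio_bytes).hexdigest()


def cloud_check(audio_bytes, known_fingerprints):
    """Return "ignore" for cataloged ad audio, otherwise "process"."""
    if fingerprint(audio_bytes) in known_fingerprints:
        return "ignore"
    return "process"


# Hypothetical catalog built from audio an advertiser submitted in advance.
known_ad_fingerprints = {fingerprint(b"tv-ad-saying-alexa")}

print(cloud_check(b"tv-ad-saying-alexa", known_ad_fingerprints))
print(cloud_check(b"person-at-home-saying-alexa", known_ad_fingerprints))
```

A set lookup keeps the check fast no matter how many ads are cataloged, which matters when it runs on every wake-up.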
In addition, Amazon checks whether the same wake word is arriving from many devices simultaneously. Since not every company sends its audio to Amazon, the company developed a novel fallback. After checking for a database match, it compares the wake word's audio fingerprint with all the other instances arriving at the same time. It is unlikely that two people saying Alexa at the same moment sound exactly alike. So if there is a match, Amazon knows it is probably a commercial or TV show and ignores the request.
You can listen to what your Echo has recorded in Amazon's Privacy Hub, and you'll probably find at least one false positive in the bunch. However, the technology is constantly improving, and Amazon eventually wants it to work without a wake word at all.
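The simultaneous-arrival check described above can be sketched as grouping incoming requests by time window and fingerprint, then flagging any fingerprint reported by more than one device. The function name, the one-second window, the two-device threshold, and the sample requests are all assumptions for illustration. Fixed time buckets are also a simplification, since two requests that straddle a bucket boundary would not be grouped together.

```python
from collections import defaultdict


def find_broadcast_fingerprints(requests, window_seconds=1.0, min_devices=2):
    """Return fingerprints heard by several distinct devices in the same window.

    requests is an iterable of (timestamp, device_id, fingerprint) tuples.
    """
    devices_seen = defaultdict(set)
    for timestamp, device_id, fp in requests:
        bucket = int(timestamp // window_seconds)   # which time window this falls in
        devices_seen[(bucket, fp)].add(device_id)
    return {
        fp
        for (_, fp), devices in devices_seen.items()
        if len(devices) >= min_devices
    }


# Hypothetical traffic: two homes hear the same TV ad, one hears a real person.
requests = [
    (100.1, "kitchen-echo", "fp-ad"),
    (100.3, "office-echo", "fp-ad"),
    (100.4, "bedroom-echo", "fp-person"),
]

suspicious = find_broadcast_fingerprints(requests)
print(suspicious)
```

Requests whose fingerprint lands in the returned set would be treated as broadcast audio and ignored, while the lone "fp-person" request would be processed normally.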