It can feel at times like we live in a science fiction future. We hold the whole of human knowledge in palm-sized devices that are constantly connected to the Internet. We speak to our computers and they respond with seemingly intelligent feedback.
But while the hardware that powers our lives has advanced rapidly over the last three decades, voice assistant technology still relies heavily on the same kind of human input that traditional software has depended on for much of that time.
Amazon Alexa, for example, had 63,215 skills in the United States alone through June of this year. These are individually programmed interactions that users can have with their Alexa devices – from ordering products to checking the weather or playing a trivia game.
But for Alexa to become a true digital assistant, the platform – and those like it in other devices – needs to be more proactive. This is where behavioral intent prediction comes in, utilizing machine learning to evaluate and predict user behaviors based on thousands of inputs.
The result will be a much more human-like interaction – with a device that can predict what a consumer needs and when they need it, much the same as a human assistant.
For companies, this will lead to a boom in data insights that further enhance targeting, personalization, and the likelihood of a sale.
The ability to predict the behavior of a consumer is a holy grail to many corporations. Billions of dollars are spent annually on market research, behavioral analysis, and new technologies to deliver smarter advertising to users. So, it’s no wonder we are starting to see an increase in the sophistication of our voice-activated devices – these are consumer applications after all.
In 1980, Paul Warshaw summarized a new model for predicting consumer behavior based on intentions. While the existing models long used by marketers focused on attitude measurements developed by those marketers, the new model was designed to evaluate the subjective intent of an individual to perform a specific behavior. In short, there is a scientific basis for predicting what someone will do based on a number of variables.
Fast forward nearly forty years and developers are using a similar approach to “teach” voice user interfaces (VUIs) like Alexa and Siri to learn more about their users and respond in kind. Of course, doing this successfully comes with many challenges.
The sheer volume of data that needs to be collected, cataloged and labeled before it can be input into the system is extensive. Amazon, for example, spends a considerable amount of money having thousands of hours of audio annotated each day to help the system better understand key elements based on the content of user speech.
One of the biggest challenges with voice systems in their early iterations was how specific you needed to be. Most users have tried to trigger a command on their phone or Echo device, only to find that they did not use the right combination of words to trigger the action.
These devices have been improved substantially and now attempt to determine, from context, what the user is asking, even if the specific language that triggers a skill is not used. Colloquialisms and variations on questions allow users to ask, “What’s it like outside?” or “Should I wear my coat?” instead of specific inquiries like “What is the weather in Chicago today?”
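To make the idea of matching loosely worded questions to a skill more concrete, here is a minimal sketch in Python. It is not how Alexa or any production assistant works; real systems rely on trained language models. The intent names, example phrasings, word-overlap scoring, and the 0.3 threshold are all assumptions chosen for illustration.

```python
import re

# Hypothetical intents, each with a few example phrasings (illustrative only).
INTENT_EXAMPLES = {
    "get_weather": [
        "what is the weather in chicago today",
        "what's it like outside",
        "should i wear my coat",
    ],
    "play_music": ["play some music", "put on a song"],
    "get_time": ["what time is it", "tell me the time"],
}


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Word overlap between two sets: shared words / all words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def guess_intent(utterance, threshold=0.3):
    """Return the intent whose example best overlaps the utterance,
    or None when nothing is similar enough (threshold is an assumption)."""
    words = tokenize(utterance)
    best_intent, best_score = None, 0.0
    for intent, examples in INTENT_EXAMPLES.items():
        for example in examples:
            score = jaccard(words, tokenize(example))
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent if best_score >= threshold else None


print(guess_intent("Should I wear a coat?"))
print(guess_intent("Order a pizza"))
```

The first request never mentions the weather, yet it lands on the weather intent because it resembles a known phrasing. The second resembles nothing closely enough, so the sketch returns nothing rather than guessing.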
Behavioral Intent Prediction goes beyond the reading of context in vague questions, though. It allows these systems to start evaluating key elements of how a user interfaces with the system each day. This is most evident in “Hunches,” a form of skill that allows Alexa to try to figure out what someone means based on location, time of day, or recent activity.
For homes that have sensors installed or for more advanced applications that interface with IoT devices, this allows for some creative implementations of voice control.
For example, you might say “Alexa, play some music,” and the device would be able to intuit which connected device to play the music on, and what volume to set based on the time of day and the typical volume you choose.
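One simple way to picture this kind of prediction is a lookup over past behavior. The sketch below is only an illustration under stated assumptions: the `PlaybackEvent` record, the three time-of-day buckets, and the fallback default are all hypothetical, and a real assistant would draw on far more signals than these.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import median_low


@dataclass(frozen=True)
class PlaybackEvent:
    """One past "play music" request (hypothetical log record)."""
    hour: int     # 0-23, local time of the request
    device: str   # name of the speaker that was used
    volume: int   # volume the user settled on, 0-10


def time_bucket(hour):
    """Coarse, assumed split of the day into three periods."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 21:
        return "daytime"
    return "night"


def predict_playback(history, hour, default=("living room", 5)):
    """Guess (device, volume) for a request made at the given hour."""
    bucket = time_bucket(hour)
    similar = [e for e in history if time_bucket(e.hour) == bucket]
    if not similar:
        return default  # no comparable history, fall back to a default
    # Most frequently used device at this time of day.
    device = Counter(e.device for e in similar).most_common(1)[0][0]
    # Typical volume on that device (lower median of past choices).
    volume = median_low(e.volume for e in similar if e.device == device)
    return device, volume


history = [
    PlaybackEvent(7, "kitchen", 4),
    PlaybackEvent(8, "kitchen", 5),
    PlaybackEvent(9, "kitchen", 4),
    PlaybackEvent(22, "bedroom", 2),
    PlaybackEvent(23, "bedroom", 3),
    PlaybackEvent(22, "living room", 6),
]

print(predict_playback(history, 8))   # a morning request
print(predict_playback(history, 23))  # a late-night request
```

With this sample history, a morning request resolves to the kitchen speaker at a moderate volume, while a late-night request resolves to the bedroom speaker at a low one, without the user naming either.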
Users have to rephrase questions less and less often to get the response they expect as these interfaces get smarter and better able to evaluate intent and respond in kind.
Predictive analytics goes well beyond what a company might see in a survey or market research study and analyzes every element of a phone call, VUI interaction, or other recorded discussion. This allows developers and marketers alike to evaluate the root cause of a conversation, why someone’s mood changes during such a conversation (an invaluable resource in customer service), and much more. The result is a better user experience that is tailored to the user, and more actionable data for companies.
One of the many barriers to a predictive model in voice assistant technology was the lack of context. Devices could hear commands and respond, and to some degree evaluate the specific words being spoken, but only with a more advanced approach to the context of the words and how they are spoken can the next step be taken.
Emotion AI is capable of evaluating several elements of the user beyond their words. For example, it can take into account the regional dialect of the user and the micro-cues that indicate a specific emotion that might influence how and why they are saying something.
On a small scale, Alexa is now able to recognize when someone is whispering and whisper back – a godsend for parents trying to check the time or the weather while holding a sleeping baby.
Now imagine a system that could anticipate a mood based entirely on how something was said and respond accordingly, not only with the right content but in a way that is tailored to those emotions.
Behavioral Intent Prediction is only one example of an emerging technology that is bridging the gap between human and machine. Advancements such as these are inevitable based on the current trajectory of implementation and discovery.
Rather than be fearful or doubtful of their abilities, take this as an opportunity to explore and engage in the understanding of how technology will advance humankind.