Action! Not Words
A good virtual assistant needs to be able to do more than recognize words. As gestures become a more widespread way of interacting with devices, how those gestures are understood, and the chain of reactions they start, will become critical to a device’s commercial success.
In the very near future, Automatic Speech Recognition (ASR), currently one of the main methods of communicating with devices, will be just one of several input methods available to users. Even gestures will be nothing more than another way of communicating. It will be technology such as Natural Language Interaction (NLI) that provides the underlying intelligence needed to turn a stupid device into a smart one, and to continue “conversations” across disparate devices and apps.
The limitations of today’s simplistic devices are already visible: most can only perform single-command tasks, such as opening an application.
For example, most mobile assistants available today cannot set a route in a maps app and quickly cross-reference it with a restaurant recommendation from TripAdvisor. By the time the app has opened, the virtual assistant has forgotten where you were going. Enabling this requires something most virtual assistants don’t have: a memory.
Memory is one of the key factors in delivering intelligence and speeding up the process of reasoning and reacting. It allows a device to take action based on several pieces of information, not just one fact or request. For example, imagine you’re on your way home and get into a conversation with a friend about who a particular movie star is and which films they’ve starred in.
The most natural way to ask would be “Who is Cameron Diaz?” followed by “What movies has she been in?”. With the majority of today’s simple virtual assistants, however, each question has to carry all of the relevant information. The assistant doesn’t remember who you were talking about, so “she” could be anyone.
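To make the idea of conversational memory concrete, here is a minimal sketch in Python. It is not how any particular NLI product works; the class name ConversationMemory, the tiny KNOWN_PEOPLE table and the simple pronoun substitution are all assumptions made for illustration. It only shows that remembering the last person mentioned is enough to make the follow-up question answerable.

```python
import re

# Hypothetical toy lookup table; a real assistant would consult a far larger source.
KNOWN_PEOPLE = {"cameron diaz"}

# Only subject pronouns are handled in this sketch.
PRONOUNS = {"she", "he", "they"}


class ConversationMemory:
    """Remembers the last person mentioned so follow-up questions can refer to them."""

    def __init__(self):
        self.last_entity = None

    def resolve(self, utterance):
        lowered = utterance.lower()
        # If the utterance names a known person, remember them and return it unchanged.
        for name in KNOWN_PEOPLE:
            if name in lowered:
                self.last_entity = name.title()
                return utterance
        # With nothing remembered, a pronoun cannot be resolved.
        if self.last_entity is None:
            return utterance

        # Replace each pronoun with the remembered name; leave other words alone.
        def swap(match):
            word = match.group(0)
            return self.last_entity if word.lower() in PRONOUNS else word

        return re.sub(r"[A-Za-z']+", swap, utterance)


memory = ConversationMemory()
first = memory.resolve("Who is Cameron Diaz?")
second = memory.resolve("What movies has she been in?")
print(second)
```

Without the first question, the same follow-up is returned unchanged, which is the situation described above: “she” could be anyone.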
Place NLI into the mix and the device will not only be able to respond intelligently, understanding that you were talking about Cameron Diaz, but will also be able to show you the movie in the most appropriate format. That might be high definition because you’ve just walked through the door and switched to the home TV with access to high-speed broadband, or low resolution because you are still using your 3G tablet.
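The choice of format can be pictured as a small decision based on what the assistant knows about the current device and connection. The sketch below is purely illustrative: the DeviceContext type, the choose_video_quality function and the string labels are hypothetical names, and a real system would weigh many more signals.

```python
from dataclasses import dataclass


@dataclass
class DeviceContext:
    device: str      # hypothetical labels, e.g. "home_tv" or "tablet"
    connection: str  # hypothetical labels, e.g. "broadband" or "3g"


def choose_video_quality(context):
    """Pick a playback format from the current device and connection."""
    # High definition only when both the screen and the link can make use of it.
    if context.device == "home_tv" and context.connection == "broadband":
        return "high definition"
    # Otherwise fall back to the lighter format.
    return "low resolution"


at_home = choose_video_quality(DeviceContext("home_tv", "broadband"))
on_the_move = choose_video_quality(DeviceContext("tablet", "3g"))
print(at_home, "/", on_the_move)
```

The point is that the same remembered request can be replayed against a new context when you move from one device to another, without you having to ask again.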
How the question was conveyed to the device, whether it was spoken or typed, is immaterial. In NLI, it is the understanding of, reasoning about and reaction to a request that matter most in delivering the ultimate user experience.