Zoe meets Anna
Virtual assistants have been around for a while now and vendors such as Artificial Solutions have done a lot to popularize this technology, particularly as it relates to web-based customer service.
Artificial Solutions’ virtual assistant technology has proved its worth in multiple engagements, the best known perhaps being Anna, IKEA’s popular web-based avatar, which can have intelligent conversations about IKEA products in 21 different languages.
Virtual assistant technology is clearly “field proven”, and these days discussions with potential customers tend to focus on ROI, customization and other pragmatic aspects, rather than on explaining exactly what virtual assistants do.
Hence, I was intrigued to read about a new development in virtual assistants that focuses less on natural language interaction and more on visual interaction, a relatively undeveloped area of avatar technology.
“Zoe”, as the new avatar is named, has been developed by a group of researchers at the University of Cambridge in the UK.
Zoe seeks to replicate human emotions with “unprecedented realism” and her life-like face can display emotions such as happiness and anger. What’s more, Zoe can change her voice to suit any feeling — her speech and facial expressions are based on recordings of a TV actress.
It sounds interesting and one can see a range of potential applications where natural language technology could benefit from being complemented by visual interaction with a life-like avatar.
For example, the Cambridge researchers are currently working with a school for autistic and deaf children, where the technology could be used to help pupils to “read” emotions and lip-read. They argue the system could have multiple uses – including in gaming, in audio-visual books, as a means of delivering online lectures, and in other user interfaces.
According to the University, users can specify which emotion they want Zoe to simulate, and her face and voice adapt accordingly.
“This technology could be the start of a whole new generation of interfaces which make interacting with a computer much more like talking to another human being,” said Professor Roberto Cipolla, from the Department of Engineering, University of Cambridge.
One obvious application area is social networks. A user could type in any message and also specify the required emotion – happy, sad, tender, angry, afraid or neutral. A personalized avatar could then recite the text with the appropriate emotion, effectively acting as an emotionally-realistic “stand-in” for the user in their interactions in social networks.
According to its designers, Zoe is more expressive than comparable systems.
Given the wealth of potential applications, I suspect one of the biggest challenges facing the researchers is going to be identifying areas which will truly benefit from realistic visual interaction.
The experience of the IVR industry suggests that consumers want natural-sounding dialogues with IVR systems, but they also need to know when they are talking to a machine rather than a human agent – not least to avoid possible frustration and confusion.
The same goes for screen-based avatars. IKEA’s Anna interacts with visitors to IKEA’s website in a natural way, blinking periodically and moving her head as she tries to answer their questions. But I don’t think any visitor is under the illusion that Anna is a real call-center operator.
If Anna were too realistic-looking, I think some people (me for one) might find it slightly disconcerting to keep wondering whether she is “for real”, particularly since the topic of the dialogue, selecting furniture, is not one where emotions normally play a significant role – notwithstanding the heated discussions you sometimes hear in my local IKEA store!
I guess only time will tell whether people want to “emotionally engage” with realistic-looking virtual assistants modeled on real people or whether they prefer technology that does not try too hard to be human.
More information on the University of Cambridge research project can be found here.
See this earlier blog post for a related project focused on understanding human feelings in speech.