Digital Tower of Babel
Culturally and linguistically, Europe is a rich and diverse region. But as the online economy globalizes, less widely spoken languages run a real risk of being under-represented, or even ignored, by the digital content industry.
Natural language technology could contribute significantly to restoring the balance, as well as to removing the linguistic barriers that stifle communication between Europe’s citizens, enterprises and politicians.
Of course, the EU already does a lot to promote multilingualism – its annual bill for translators and interpreters runs to around €1bn. But this represents a drop in the ocean of information that could benefit from translation, and huge regional market opportunities remain untapped because of the language barriers.
While these barriers remain, Europe will continue to be at a disadvantage compared to trading partners such as China, Japan or the United States, each of which has a single predominant language.
So why don’t Europeans all speak English? Some companies, particularly US ones, assume they do, as their websites are only available in English, the lingua franca of the internet.
That’s a big omission as only about half of the 500m inhabitants of the EU speak English.
In addition, less than 10% of the EU’s population is willing to use online services in English, according to a recent report* by Meta-Net, a European network of 60 research centers specializing in language technologies.
Of course, there are some shining exceptions. Ask Anna, the IKEA help center chat bot, can deal with customer inquiries in 21 different languages. Anna was built using Artificial Solutions’ web-based virtual assistant technology, which is currently available in over 20 languages.
But many businesses, particularly SMEs, do not see the need to develop a multilingual presence on the web and so they risk excluding a large part of their potential market without even knowing it.
The EU is concerned not just about the lost commercial opportunities but also about the very real prospect that languages may disappear under the online onslaught of English.
According to the Meta-Net report, Icelandic, Latvian, Lithuanian and Maltese are at the highest risk of disappearing, while other languages such as Bulgarian, Greek, Hungarian and Polish are also in danger.
Of course, researchers have a vested interest in painting a bleak future as they want to ensure research funds keep flowing.
Others might wonder whether the problem is really that serious. Google Translate can handle the most obscure minority-language pairs – Basque to Georgian, for example. What’s more, it does so in real time and for free. So what is all the fuss about?
But appearances can be deceptive. Google Translate and similar services are based on statistical techniques and, unlike natural language interaction, have no real understanding of the meaning of the words typed into the text box.
That’s fine as far as it goes, but that’s not very far.
Try using an online translation service to translate a news story written in a language you do not understand. As a journalist, I can tell you that this is not a particularly taxing task for a human translator, as journalists like to keep things simple – newspaper stories are written for readers with a reading age of 14 years or less.
If it’s about a topic you have some familiarity with, you can probably make some sense of the story, even if the machine translation is fairly bad. But if it’s an unfamiliar or technically complex subject, you will struggle to extract any sense at all.
What’s more, you are unlikely to want to have an extended conversation with a machine whose linguistic skills are clearly so deficient.
In 2020, things could be different. Natural language technologies will have improved to such an extent that they will be ubiquitous in both online and offline interactions. At least that’s the hope of Meta-Net, which proposes a strategic research agenda based on developing innovative natural language interfaces to technologies and knowledge.
2020 is still some years away, but Artificial Solutions’ natural language technology is ready to deploy today. So why not get a head start on the multilingual future?
*Meta-Net’s Strategic Research Agenda for Multilingual Europe 2020 can be downloaded here.