When we hear Catalan nowadays we think of Barcelona, of Gaudí, of Miró and of the strange-sounding language with hints of Spanish that so many Barcelona dwellers speak – the one that makes us question why we bothered learning Spanish at school.
However, the breadth and depth of Catalan stretch much further than that. It is spoken across Catalonia and, in several variants, in Valencia, the Balearics, parts of France and Sardinia, and it is even the official language of Andorra.
It is a language that has had a chequered history: it has been declared socially unacceptable or illegal on numerous occasions throughout Spain’s past, yet it has continually rebounded, even enjoying its own Renaixença (renaissance) in the 19th century. Today, there are between 10 and 15 million speakers worldwide. And while it shares similarities with its cousins, the other Romance languages that evolved from Latin (chiefly Italian, French, Spanish, Portuguese and Romanian), Catalan is very much its own language, with a rich history of poetry, culture and art and an important role in industry, politics and commerce.
This became particularly clear last year when we released our Spanish automatic speech recognition system, which generated significant interest across Spain, South America and beyond. However, we also quickly realised that, whether for call centres, media monitoring, education or subtitling, potential customers needed Catalan as well. Spanish was a good start, but for true coverage across Spanish-speaking countries we simply had to have Catalan.
The Speechmatics languages team took it on as part of our traditional Christmas hackathon, and within two weeks we had a fully working system. It is now available on our website and has already attracted significant interest from users of both our public cloud-based platform and our private on-premises implementation.
Carme Calduch, a Catalan Lector at Cambridge and Queen Mary University of London, tested our cloud-based Catalan transcription service. “Speechmatics has developed a fantastic tool. Doing a manual transcription is a laborious task and using this service I was able to obtain a transcript in less than 3 minutes. It is completely automatic, and the result was impressive, because the quality of the transcription is very good. Without a doubt, I’d recommend using this service – it will be really useful to professionals across multiple industries.”
With the list of languages available on the Speechmatics speech recognition system growing rapidly, we welcome everyone to sign up for a free trial of our public speech transcription service at www.speechmatics.com.
Ricardo Herreros-Symons, Speechmatics
By now it is pretty obvious that speech recognition is taking over the world, and so long as it doesn’t go all HAL 9000 on us, the future looks very interactive. The promise of a world where lights, TVs and coffee machines can be activated without touch has already been realised. But this is a (relatively) simple process of checking what the device thinks it has heard against a known list of commands (I have completely oversimplified this – the time taken to reach this technical epoch of voice control is some indication of the complexity of the process).
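To make that oversimplification concrete, here is a minimal sketch in Python of checking recognised text against a known list of commands. It is purely illustrative and is not how any particular device works: the command list, the `match_command` name and the 0.6 similarity cutoff are all assumptions made up for this example, and it leans on the standard library's `difflib` for the fuzzy comparison.

```python
import difflib

# Hypothetical command list, invented for this example.
KNOWN_COMMANDS = ["lights on", "lights off", "tv on", "tv off", "make coffee"]


def match_command(heard, commands=KNOWN_COMMANDS, cutoff=0.6):
    """Return the known command closest to the recognised text, or None.

    `heard` is the text the recogniser thinks it heard. `cutoff` is an
    assumed similarity threshold between 0 and 1; anything scoring
    below it is treated as "not a command".
    """
    # Lower-case and collapse whitespace so "Lights  On" matches "lights on".
    normalised = " ".join(heard.lower().split())
    matches = difflib.get_close_matches(normalised, commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for phrase in ["Lights on", "make cofee", "open the pod bay doors"]:
        print(repr(phrase), "->", match_command(phrase))
```

A slightly misheard "make cofee" still lands on "make coffee", while a phrase that resembles nothing on the list comes back as None, which is where a device would ask you to repeat yourself.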
We are now seeing more companies working on tech that can delve into the complexities of our voices. Banks are now using your voice as your password, and emotion engines can detect how you are feeling when you call a call centre, giving the operative a heads-up if things are about to go south and suggesting actions they can take to resolve the issue.
The latest breakthrough comes from Canary Speech – a US start-up – which has developed a way of analysing phone conversations for signs of several neurological diseases, ranging from Parkinson’s to dementia.
A pinch of reality may be required, though. These are early days for the start-up. Don’t expect GPs to be replaced by a microphone anytime soon, but there is no reason to think that, between machine learning and voice recognition, we couldn’t start to see chatbots being used for front-line GP care – culminating in the inevitable prescription of ibuprofen, a staple of the British doctor. The tech that Canary is investigating is not yet fully mature; the company will need to rely heavily on large data sets, gathered over a sensible period of time, to teach its machine how to identify a problem effectively.
How this technology will be rolled out is a big issue to consider. At the moment, most calls to call centres are recorded for monitoring and quality purposes – that’s monitoring of the call centre operatives, not the caller. I’m not sure many people would appreciate being told by Vodafone that it has identified signs of dementia in their voice. That’s all yet to be ironed out as we get to grips with more of our data being analysed.
From Speechmatics’ point of view, the more research that goes into neural networks and machine learning, the better. In-house, we are getting better and better at finding more efficient ways of turning speech into text. We can now do on a phone what five years ago required banks of graphics processors. This has come about because the collective knowledge of computer science has advanced so much in the last five years. Research breeds research. The more uses there are for speech recognition, the more ways it can be streamlined.
Luke Berry, Speechmatics