What do AI and machine learning actually mean?

I recently read an article on how language led to the Artificial Intelligence revolution and the evolution of machine learning, and it got me thinking. To start, it’s good to understand what we are talking about.

Wikipedia says ‘Artificial Intelligence (AI) is intelligence exhibited by machines. In computer science, the field of AI research defines itself as the study of “intelligent agents”: any device that perceives its environment and takes actions that maximize its chance of success at some goal’. This is a much harder goal to achieve than machine learning, which is ‘the subfield of computer science that, according to Arthur Samuel in 1959, gives “computers the ability to learn without being explicitly programmed”’. There is much confusion around the buzzwords ‘AI’ and ‘machine learning’: many companies say they use AI when in practice they have only used machine learning, which is quite different and does not amount to an ‘intelligent agent’ in the AI sense.

Machine learning has transformed natural language processing (NLP). In fact, the whole area of computational linguistics is now that of applying machine learning to NLP. This is a different question from whether AI needs NLP. It’s perfectly possible to contemplate an AI system that we don’t communicate with in a natural language (it could be a formal language), but communicating naturally with an AI is going to need natural language.

So, what’s the story of machine learning applied to speech recognition?

The article quotes Rico Malvar, distinguished engineer and chief scientist for Microsoft Research: “speech recognition was one of our first areas of research. We have 25-plus years of experience. In the early 90s, it actually didn’t work”. I felt it was worth commenting that this could give a misleading picture of the history of speech recognition. In the early 90s, speech recognition did work for a variety of specific commercial applications, such as command and control, or personal dictation with products such as Dragon Dictate.

However, in the 90s there was an interesting dynamic between computing power and dataset size. In the DARPA evaluations we showed that we could build useful large vocabulary speech systems for a variety of natural speech tasks, using both the standard hidden Markov models and neural networks. Indeed, my team at the time pioneered the use of recurrent neural networks in speech recognition (which can be considered the first deep neural networks). The DARPA funding also resulted in extensive data collection so that we could build better speech recognition systems.

It was relatively straightforward to apply hidden Markov models to these large data sources (we just bought a lot more computers), but neural networks couldn’t be so easily scaled to more than one computer. As a result, all the good work in neural networks was put on hold until GPUs arrived and we could train everything on one computer again. Some, such as Malvar, view this as “The deep neural network guys come up and they invent the stuff. Then the speech guys come along and say, ‘what if I use that?’.” But in my opinion speech was the first big task for neural networks, with image and text coming along later (Wikipedia’s view of history).

However you view history, the use of deep neural networks combined with the progression of computing power has drastically improved speech recognition technologies, which are now easily consumable by the masses, with global reach across a multitude of applications and use cases.

Tony Robinson, Speechmatics

The language of Romance

When we hear Catalan nowadays we think of Barcelona, of Gaudí, of Miró and of the strange-sounding language with hints of Spanish that so many Barcelona dwellers speak, and this makes us question why we bothered learning Spanish at school.
However, the breadth and depth of Catalan stretch much further than that. It is spoken across Catalonia and, in several variants, in Valencia, the Balearics, parts of France and Sardinia, and it is the official language of Andorra.
It is a language that has had a chequered history, declared socially unacceptable or illegal on numerous occasions throughout Spain’s past, yet it has continually rebounded, even having its own Renaixença (renaissance) in the 19th century. Today, there are between 10 and 15 million speakers worldwide. And while it shares similarities with its cousins among the Romance languages that evolved from Latin (predominantly Italian, French, Spanish, Portuguese and Romanian), Catalan is very much its own language, with a rich history of poetry, culture and art, and an important role across industry, politics and commerce.
This became particularly apparent last year when we released our Spanish automatic speech recognition system, which generated significant interest across Spain, South America and beyond. We quickly realised that, whether for call centres or media monitoring, education or subtitling, potential customers had a need for Catalan as well. Spanish was a good start, but for true coverage across Spanish-speaking countries we simply had to have Catalan.
The Speechmatics languages team took it on as part of our traditional Christmas hackathon, and within two weeks we had a fully working system. It is available on our website and has already attracted significant interest from users of both our public cloud-based platform and our private on-premises implementation.
Carme Calduch, a Catalan Lecteur at Cambridge and Queen Mary University of London, tested our cloud-based Catalan transcription service: “Speechmatics has developed a fantastic tool, doing a manual transcription is a laborious task and using this service I was able to obtain a transcript in less than 3 minutes. It is completely automatic, and the result was impressive, because the quality of the transcription is very good. Without a doubt, I’d recommend using this service – it will be really useful to professionals across multiple industries.”
With the list of languages available on the Speechmatics speech recognition system growing rapidly, we welcome everyone to sign up for a free trial of our public speech transcription service at www.speechmatics.com.

Ricardo Herreros-Symons, Speechmatics