Oxford University AI system lip-reads more accurately than humans

The system can watch silent speech clips and get about 50% of the words correct, while humans guess only 12% of the words correctly

Scientists at Oxford University say they have developed an artificial intelligence system that is better at lip-reading than humans, the BBC reported. The system, called "Watch, Attend and Spell", was created in collaboration with Google's DeepMind AI division.

According to the BBC, the researchers taught the AI using clips from the broadcaster's programmes, such as Breakfast, Newsnight and Question Time, with subtitles aligned to the speakers' lip movements. A neural network combining state-of-the-art image and speech recognition then set to work learning how to lip-read.

After examining around 118,000 sentences in the clips, the system now has a vocabulary of 17,500 words.

Because the system was trained on news clips, it can easily guess some word sequences: "Prime" is mostly followed by "Minister", and "European" by "Union". For now, however, it is not very adept at recognizing words that newsreaders do not generally use, according to the BBC.

Joon Son Chung, a doctoral student at Oxford University's Department of Engineering, explained how difficult it was to teach the AI to guess words, because of homophones and words that produce similar mouth shapes when pronounced, such as mat, pat and bat. It is the context that helps the AI guess which word fits the sentence.

A lot more work certainly remains to be done, but the charity Action on Hearing Loss seems excited about the technology. "AI lip-reading technology would be able to enhance the accuracy and speed of speech to text," said Jesal Vishnuram, the charity's technology research manager. "This would help people with subtitles on TV, and with hearing in noisy surroundings," Vishnuram added.

According to Chung, the system could prove extremely useful for helping people dictate instructions to their smartphones in noisy environments, dubbing old silent films, and similar tasks. In many cases, the AI lip-reading system could also be used to improve the performance of other forms of speech recognition.

The researchers are working to make the system lip-read in real time, and they are confident that it will learn quickly as it keeps watching TV.

The Oxford researchers and the hearing loss charity are sure about one thing, however: the AI will not take the jobs of human lip-readers but will work alongside professional lip-readers and help them be more accurate, according to the report.