
A revolution in speech animation?

I have always been fascinated by animated characters. We know there is more to speech than simply words. Facial expression adds significantly to our understanding. As a deaf person I also know too well how precise movement of the lips and face helps in my understanding of the spoken word.

Forming speech is complex. About a hundred different muscles in the chest, neck, jaw, tongue, and lips must work together to produce it. Every word or short phrase that is physically spoken is accompanied by its own unique arrangement of muscle movements. No wonder then that animations can often appear flat and characterless.

New research from the University of East Anglia (UK) could revolutionise the way that animated characters deliver their lines.

Animating the speech of characters such as Elsa and Mowgli has been both time-consuming and costly. But now computer programmers have identified a way of creating natural-looking animated speech that can be generated in real-time as voice actors deliver their lines.

The discovery was unveiled in Los Angeles at the world’s largest computer graphics conference, SIGGRAPH 2017. The work is a collaboration which includes UEA, Caltech and Carnegie Mellon University.

Researchers show how a ‘deep learning’ approach – using artificial neural networks – can generate natural-looking real-time animated speech.

As well as automatically generating lip sync for English-speaking actors, the new software also animates singing and can be adapted for foreign languages. The online video games industry could also benefit from the research, with characters delivering their lines on-the-fly with much more realism than is currently possible, and it could also be used to animate avatars in virtual reality.

A central focus for the work has been to develop software which can be seamlessly integrated into existing production pipelines, and which is easy to edit.

Lead researcher Dr Sarah Taylor, from UEA’s School of Computing Sciences, said: “Realistic speech animation is essential for effective character animation. Done badly, it can be distracting and lead to a box office flop.

“Doing it well, however, is both time-consuming and costly as it has to be manually produced by a skilled animator. Our goal is to automatically generate production-quality animated speech for any style of character, given only audio speech as an input.”

The team’s approach involves ‘training’ a computer to take spoken words from a voice actor, predict the mouth shape needed, and animate a character to lip sync the speech.

This is done by first recording audio and video of a reference speaker reciting a collection of more than 2500 phonetically diverse sentences. Their face is tracked to create a ‘reference face’ animation model.

The audio is then transcribed into speech sounds using off-the-shelf speech recognition software.

This collected information can then be used to generate a model that is able to animate the reference face from a frame-by-frame sequence of phonemes. This animation can then be transferred to a CG character in real-time.
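
To make the idea of a 'frame-by-frame sequence of phonemes' a little more concrete, here is a minimal sketch in Python. It is not the researchers' code. It simply assumes that the speech recognition step gives us each speech sound with a start and end time, and it turns that into one label per video frame. The input format, the 30 frames-per-second default, the `sil` label for silence and the function name `phonemes_per_frame` are all my own assumptions for illustration.

```python
from typing import List, Tuple

# Assumed input format (hypothetical): (phoneme label, start in seconds, end in seconds).
Segment = Tuple[str, float, float]


def phonemes_per_frame(
    segments: List[Segment],
    fps: float = 30.0,      # assumed video frame rate
    silence: str = "sil",   # assumed label for frames with no speech sound
) -> List[str]:
    """Expand timed phoneme segments into one phoneme label per video frame."""
    if not segments:
        return []

    # Total number of frames needed to cover the last segment.
    duration = max(end for _, _, end in segments)
    n_frames = int(round(duration * fps))

    # Start with silence everywhere, then fill in each segment's frames.
    frames = [silence] * n_frames
    for label, start, end in segments:
        first = int(round(start * fps))
        last = min(int(round(end * fps)), n_frames)
        for i in range(first, last):
            frames[i] = label
    return frames


if __name__ == "__main__":
    # A made-up timing for the word "hello", purely for illustration.
    hello = [("HH", 0.0, 0.1), ("AH", 0.1, 0.2), ("L", 0.2, 0.3), ("OW", 0.3, 0.5)]
    print(phonemes_per_frame(hello, fps=10.0))
```

A sequence like this, one label per frame, is the kind of input from which a trained model would then predict the mouth shape for each frame. The prediction step itself is the part that needs the neural network and is not shown here.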

‘Training’ the model takes just a couple of hours. Dr Taylor said: “What we are doing is translating audio speech into a phonetic representation, and then into realistic animated speech.”

The method has so far been tested against sentences from a range of different speakers. The research team also undertook a subjective evaluation in which viewers rated how natural the animated speech looked.

Dr Taylor said: “Our approach only requires off-the-shelf speech recognition software, which automatically converts any spoken audio into the corresponding phonetic description. Our automatic speech animation therefore works for any input speaker, for any style of speech and can even work in other languages.

“Our results so far show that our approach achieves state-of-the-art performance in visual speech animation. The real beauty is that it is very straightforward to use, and easy to edit and stylise the animation using standard production editing software.”
