SOFTWARE from a Scottish tech business has helped computer giant Microsoft create the latest instalment of the Gears of War video game franchise.

Animation software SGX, developed by Edinburgh-based Speech Graphics, has been licensed to The Coalition, a Microsoft studio, and deals have now been struck with a further three global studios.

The SGX software generates facial animation and has been used by The Coalition to animate 35,000 lines of dialogue in the critically acclaimed Gears of War 4.

The result of five years of research and development by Speech Graphics, SGX makes it possible for video game studios to animate in-game dialogue in-house, using only audio recordings, without having to outsource to specialists.
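The general idea is that the audio signal itself carries the timing and intensity information needed to drive a face. As a purely illustrative sketch (a naive energy-based approach, not Speech Graphics’ actual SGX algorithm; the file name and frame rate here are hypothetical), the following Python maps the loudness of a recorded line to a per-frame mouth-openness weight:

import librosa

def mouth_openness_curve(audio_path, fps=30):
    """Map short-term audio energy to a 0-1 mouth-openness weight per frame."""
    samples, sr = librosa.load(audio_path, sr=None, mono=True)
    hop = int(sr / fps)  # one analysis window per animation frame
    rms = librosa.feature.rms(y=samples, hop_length=hop)[0]
    return rms / (rms.max() + 1e-8)  # normalise to 0-1 to drive a blendshape

# Hypothetical usage: one weight per animation frame for a line of dialogue.
weights = mouth_openness_curve("line_0001.wav", fps=30)

A production system does far more than this: as the developers describe below, SGX decodes phonetic detail and emotion from the voice, driving the entire face rather than just the mouth, but the sketch shows why audio alone can be a sufficient input.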

Michael Berger, chief technology officer and co-founder of Speech Graphics, said: “Automatic, accurate lip sync is one of the holy grails of computer facial animation. Our task is to create the impression that the animated face you see is the source of the sound you hear. This illusion is notoriously difficult to achieve: the movements of speech are fast, complex and subtle.”

The company has been building an international reputation for its work in audio-driven animation and motion technology in video games and has also worked with the likes of Warner Brothers and Kanye West.

Spun out of the University of Edinburgh’s School of Informatics in 2011 under the leadership of founders Mr Berger and Gregor Hofer, along with Grand Theft Auto producer Colin Macdonald, Speech Graphics is aiming to become the main provider of lip sync and facial animation to the global video game market, a sector forecast to be worth more than $500 million (£375m).

The company currently employs eight staff and has plans to recruit three more in the year ahead.

Having grown since its inception without outside investment beyond a series of grants, the company is now cash positive.

David Coleman, animation director of Gears of War 4, commented: “SGX goes beyond good lip sync. Speech contains energy and emotion, and that too can be decoded from the voice and synchronized in the face.

“Using all available acoustic information, our algorithms drive not just the mouth but the entire face from audio input, from syllables to scowls.”