- Google concerned by video and audio ‘deep fakes’
- Dataset of synthetic voices released to researchers
- ‘Spoof challenge’ to distinguish between real and fake audio
Google has released thousands of phrases spoken by its text-to-speech technology as part of a challenge to researchers from around the world to combat fake audio.
In a blog post, Google AI software engineer Daisy Stanton said there had been “an explosion of new research using neural networks to simulate a human voice”, but added that Google was concerned about the risks associated with being able to create realistic, human-like speech.
She said: “Malicious actors may synthesize speech to try to fool voice authentication systems, or they may create forged audio recordings to defame public figures.
“Perhaps equally concerning, public awareness of ‘deep fakes’ (audio or video clips generated by deep learning models) can be exploited to manipulate trust in media: as it becomes harder to distinguish real from tampered content, bad actors can more credibly claim that authentic data is fake.”
When Google launched its News Initiative in March last year, it promised to release the dataset to aid fake audio detection.
The dataset contains thousands of phrases, drawn from English newspaper articles, spoken by Google’s deep learning text-to-speech (TTS) models in 68 synthetic “voices” covering a variety of regional accents.
It has now been made available to all participants in the independent, externally run 2019 automatic speaker verification (ASV) spoof challenge.
The challenge invites researchers from across the globe to submit countermeasures against fake speech, with the goal of making ASV systems more secure.
The aim of the challenge is to train models on both real and computer-generated speech, with participants encouraged to develop systems that learn to distinguish between authentic and synthetic speech.
Earlier this year the European Broadcasting Union (EBU) president and BBC director general Tony Hall issued a rallying call to public service broadcasters urging them to “seize the opportunity to do great things,” and serve the public in the fight against fake news.
The EBU has a code of practice that attempts to address the issue of fake news.
Google reported last year that AI was used to detect 80% of the 8.28 million videos removed from YouTube in the last quarter of 2017.
Meanwhile, Facebook acted against 1.9 million pieces of content on its platform in the first quarter of 2018 that AI technology had detected as fake accounts and fake news.
Stanton said: “We take seriously our responsibility both to engage with the external research community, and to apply strong safety practices to avoid unintended results that create risks of harm.
“We’re also firmly committed to Google News Initiative’s charter to help journalism thrive in the digital age, and our support for the ASV spoof challenge is an important step along the way.”
AI to boost creativity in film and TV
AI is set to transform production and boost creativity in TV, according to Endemol Shine Creative Networks chief executive Lisa Perrin, who spoke at IBC2018 last September.
Endemol Shine is already employing AI techniques to log scenes on one of its flagship formats, Big Brother, a task previously done by “banks of people typing in scene descriptions and time codes,” Perrin said.
Last year, Endemol Shine worked with Microsoft to automate the production workflow on the Spanish edition of Big Brother and is now rolling out the new workflow to other editions of Big Brother around the world.
AI-powered algorithms are able to analyse every frame of video, making it possible for a sports production team to sift through a mountain of metadata and put together a montage of great plays in a few seconds.
The Wimbledon Championships in 2018 used IBM’s AI to automate the tagging and assembly of two-minute highlight reels for online publication. The system rates each play based on metrics such as crowd noise and player gesture, which speeds up the search for creative editors building more extensive highlights.