IBC2018: The strongest case for the use of AI production tools is in lower-cost genres of television, says BBC AI Research Engineer Craig White.

BBC AI Research Engineer Craig White has admitted that the BBC’s AI production systems, which the corporation is currently prototyping, ‘may never achieve the quality of skilled craftspeople’.

“But the key question with AI-based production systems is at what point are the algorithms good enough,” said White, one of the lead authors of the research paper from BBC Research & Development’s AI in Production team, winner of IBC’s Best Paper Award 2018.

The BBC aims to apply its AI production tool, Ed, to events such as the Edinburgh Festival, and ultimately to make its AI toolsets more widely available.

The BBC’s coverage of the Edinburgh Festival arts event is limited by resource constraints, such as the availability of crew and kit, and by the cost of covering a festival that features 50,000 events in 300 different venues.

AI tools could also be used to cover live public or political debates, said White. “We have a lot of open talks across the UK where we could use these tools to create and make edits available.”

In high-end genres of television such as drama and natural history, the skills of craftspeople would continue to be highly valued, said White, who added that the strongest case for the application of AI production tools would be in lower-cost genres of television.

To date, Ed has been used with some success to create automated footage from sports events and studio-based panel shows, using simple locked-off camera setups and automatically generating a series of wide shots and close-ups of talking heads.

“We are designing in video editing techniques so that basic filmmaking rules such as the rule of thirds are adhered to. We are also working on ways to improve facial feature detection, facial landmarking and inferring speech from lip features,” said White. The system is being built to create shots of around 2-6 seconds in duration.
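As a rough illustration of how a rule-of-thirds close-up might be cut from a locked-off wide shot, the sketch below positions a crop window so that a detected face sits on the upper-left thirds intersection. It is not the BBC's implementation: the function name, the choice of intersection and the assumption that a face centre is already known are all hypothetical.

```python
def rule_of_thirds_crop(frame_w, frame_h, face_cx, face_cy, crop_w, crop_h):
    """Return (left, top) of a crop window inside the wide frame.

    Assumption: the face centre (face_cx, face_cy) comes from some
    face detector, and we place it one third in from the left and
    one third down from the top of the crop.
    """
    left = face_cx - crop_w // 3
    top = face_cy - crop_h // 3
    # Clamp so the crop never leaves the wide frame.
    left = min(max(left, 0), frame_w - crop_w)
    top = min(max(top, 0), frame_h - crop_h)
    return left, top


# A 1200x720 close-up cut from a 3840x2160 wide shot.
print(rule_of_thirds_crop(3840, 2160, 1920, 900, 1200, 720))
```

A face near the edge of the wide frame cannot be placed exactly on the thirds line, so the clamp trades composition for staying inside the picture.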

Speaking at an IBC conference Tech Talk on AI in Production, White shared the floor with NHK Senior Manager Hiroyuki Kaneko and Media Distillery founder Joost de Wit, whose company extracts metadata from 8,000 hours of audio and video a day.

“With AI we feel there are a lot of technologies that look really nice on paper but when you apply them they fail miserably,” said Joost de Wit.

“Often it’s the difference between benchmark datasets, which are used to develop the technology, versus real datasets. Benchmark data is used to train, evaluate and compare, and is clearly labelled, whereas in reality most objects are unknown,” said de Wit.

“Also there are no time constraints with a benchmark dataset: only accuracy counts and you have as much time as you need. But in real-life commercial settings time and resources are limited. At 25 frames a second it takes 40 milliseconds to analyse a frame, so 300 concurrent streams take a lot of scalability to analyse.”
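The arithmetic behind those figures is easy to check. The sketch below is illustrative only: the function names are hypothetical, and it assumes every frame of every stream is analysed, with each worker handling one frame at a time.

```python
import math


def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, to keep up with real time."""
    return 1000.0 / fps


def workers_needed(streams, fps, analysis_ms):
    """Minimum number of one-frame-at-a-time workers to keep pace.

    Assumption: analysis_ms is the time one worker spends on one frame.
    """
    frames_per_second = streams * fps
    busy_seconds_per_second = frames_per_second * analysis_ms / 1000.0
    return math.ceil(busy_seconds_per_second)


print(frame_budget_ms(25))          # 40.0 ms per frame at 25 fps
print(300 * 25)                     # 7500 frames arriving every second
print(workers_needed(300, 25, 40))  # 300 workers if analysis uses the full 40 ms
```

Under these assumptions, 300 streams at 25 frames a second produce 7,500 frames every second, and an analysis step that uses the whole 40 millisecond budget needs one worker per stream just to stay level.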