NAB 2018: If there was a single phrase that swept the show floor in Las Vegas, it surely has to be artificial intelligence.
A remarkable number of booths sported references to AI and its near cousin, machine learning.
But, as always with the latest big thing, while everyone was talking about it few were genuinely doing it. And, truth to tell, not many people knew what they wanted it for.
Perhaps the most pertinent comment was from Nigel Booth of IPV. “When I hear people talk about AI I tell them you’re probably not making full use of the metadata you’ve already got,” he told IBC365.
Booth’s point may well be accurate, but much of the buzz around AI was nonetheless about understanding content and automatically generating metadata tags. Chris Witmayer of motorsport leader Nascar Media Group described one of the problems of dealing with archive content.
“Although we have an entire archive that goes back to the 1930s, we can’t actually find anything efficiently,” he is quoted as saying in recent research from IABM. “If you can’t find anything you can’t sell it and you can’t make money. So this is big for us.”
Alongside IBM Watson, Microsoft Azure and Google were core suppliers for many of the proposals on offer. Google acquired the machine learning specialist DeepMind to add sophisticated capabilities, and now offers machine learning APIs as part of its cloud toolkit.
Metadata tagging was a common thread, as was “sentiment interpretation”. Typically that would be interpreting social media, although so far the results seem to be no more granular than “positive” or “negative”.
One project whose use of machine learning stood out was described by Jay Batista of Tedial: work done with a major European broadcaster on its sports output.
First, Tedial implemented a new and very close integration between the EVS servers in the remote production truck and its asset management system. Every camera feed is recorded and tracked, with speech to text used to tag the content.
The system also uses visual cues to create tags: it learns to interpret signals from the referee or umpire, and knows when a red card is being shown, for example.
All this metadata is then processed by the machine learning engine, which creates clips of highlight action, automatically, on the fly. These clips are offered to the director, but they are also immediately posted to social media.
What makes this system really clever is that it then interprets the response on social media. If a clip gets a lot of Facebook likes or goes viral on YouTube, the automatic highlights creator uses this information to learn what makes a clip popular and so make more like it. The result is a new route to greater fan engagement.
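The loop described here, score clips, publish, then learn from audience response, can be sketched with a toy weighted scorer. All feature names, rates and numbers below are invented for illustration; this is not Tedial's implementation.

```python
# Toy engagement feedback loop: clips are described by 0/1 feature
# flags; weights are nudged towards features of clips that performed
# well on social media and away from features of clips that flopped.

def update_weights(weights, clip_features, engagement, rate=0.1):
    """engagement is a normalised score in [0, 1]; 0.5 is neutral."""
    for feature, present in clip_features.items():
        if present:
            weights[feature] = weights.get(feature, 0.0) + rate * (engagement - 0.5)
    return weights

def score(weights, clip_features):
    """Predicted appeal of a candidate clip under the current weights."""
    return sum(weights.get(f, 0.0) for f, present in clip_features.items() if present)

weights = {}
# A red-card clip that went viral raises the weight of that cue...
weights = update_weights(weights, {"red_card": 1, "slow_motion": 1}, engagement=0.9)
# ...while a clip that flopped lowers the weight of its cues.
weights = update_weights(weights, {"crowd_noise": 1}, engagement=0.1)

print(score(weights, {"red_card": 1}) > score(weights, {"crowd_noise": 1}))  # True
```

Over many clips the scorer drifts towards whatever the audience actually rewards, which is the essence of the feedback loop described above.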
Amagi is also working on automating the production of sports highlights. It looks for cues like an increase in crowd noise or the use of slow motion replays as indicators of heightened interest. It will then use these factors to produce a highlights package to the length required.
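The final step, cutting a package to the required length from scored segments, can be illustrated with a simple greedy selection. The interest scores are assumed to come from cues like crowd noise and replay detection; the timings are invented and this is a sketch, not Amagi's algorithm.

```python
# Greedily take the highest-interest segments until the duration
# budget is spent, then return them in chronological order.

def build_highlights(segments, target_seconds):
    """segments: list of (start, duration, interest) tuples."""
    package, used = [], 0.0
    for seg in sorted(segments, key=lambda s: s[2], reverse=True):
        if used + seg[1] <= target_seconds:
            package.append(seg)
            used += seg[1]
    return sorted(package, key=lambda s: s[0])  # back to match order

segments = [
    (120, 20, 0.9),   # big crowd-noise spike
    (340, 30, 0.7),   # slow-motion replay
    (500, 40, 0.4),
    (610, 15, 0.8),
]
package = build_highlights(segments, target_seconds=60)
print([s[0] for s in package])  # [120, 610]
```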
Another intelligent application developed by Amagi is the automatic segmentation and re-segmentation of programmes. By determining likely places for commercial breaks, the system can very quickly create new versions with different break patterns for different markets, or to create on air and online assets.
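Once candidate break points have been detected, re-versioning for a market reduces to choosing where to cut. A minimal illustration, with invented timings and no claim to match Amagi's method; the machine learning sits in finding the candidate points, not in this selection step.

```python
# Pick n break points from detected candidates (scene boundaries,
# silences, etc.), spaced as evenly as possible, and return the
# resulting programme parts as (start, end) pairs in seconds.

def split_programme(duration, candidates, n_breaks):
    ideal = [duration * (i + 1) / (n_breaks + 1) for i in range(n_breaks)]
    chosen = []
    for target in ideal:
        best = min((c for c in candidates if c not in chosen),
                   key=lambda c: abs(c - target))
        chosen.append(best)
    cuts = [0] + sorted(chosen) + [duration]
    return list(zip(cuts, cuts[1:]))

# A 60-minute show, with candidate break points found by the system:
parts_us = split_programme(3600, [880, 1790, 2710, 3200], n_breaks=3)
parts_uk = split_programme(3600, [880, 1790, 2710, 3200], n_breaks=1)
print(len(parts_us), len(parts_uk))  # 4 2
```

The same candidate list serves every market, which is why new break patterns can be produced so quickly.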
Server specialist EVS launched a powerful new microservices platform at NAB, Via, and one of its capabilities is machine learning. The first product to benefit is version two of its Xeebra referee support device.
Generating an accurate offside line, for example, has previously required careful calibration of the pitch before the game, a time-consuming and error-prone process. Spatial and image recognition in Xeebra now allows the system to recalibrate itself continually from the broadcast cameras, giving confidence in the precision of pitch overlays.
Speech to text is a common application, both to create metadata and to automate subtitling. At the Enco booth at NAB I was told that machine learning is improving the accuracy of its automated closed captions to the point that “now it is finally good enough to save money on human captioners”.
Online broadcaster Vice has extended speech recognition into automated translation. It publishes in 18 languages, and its system, based on the Google Translate API, allows its journalists to publish once and populate every language service.
Image analysis is a growing application. Object Matrix was showing a new approach to archive searching and asset management by searching faces, objects and logos.
The idea is that you start with an image and then ask for similar images. This requires the assets themselves to be processed in place, rather than relying on a comprehensive metadata library built in advance.
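A common way to implement this kind of query-by-example, and one plausible reading of the approach, is to compare fixed-length feature vectors (embeddings) extracted from each asset by a vision model. The toy three-number vectors and asset names below are invented stand-ins for real embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def most_similar(query, library, top_k=2):
    """library: {asset_id: embedding}; returns top_k ids by similarity."""
    ranked = sorted(library, key=lambda aid: cosine(query, library[aid]),
                    reverse=True)
    return ranked[:top_k]

library = {
    "pit_stop_01": [0.9, 0.1, 0.0],
    "pit_stop_02": [0.8, 0.2, 0.1],
    "podium_05":   [0.1, 0.9, 0.3],
}
# Querying with an image close to the pit-stop shots ranks them first:
print(most_similar([0.85, 0.15, 0.05], library))
```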
MultiCam has an established product line in PTZ camera systems for applications like web broadcasting of radio studios. It is now using image recognition to provide automatic framing. Systems using multiple cameras can be switched based on who is talking, with the system comparing the framing of the cameras to achieve a satisfying transition.
Ted Korte of measurement specialist Qligent spoke about a big data project recently completed for a major telco in the US. Its aim was to handle the “silent sufferers”: subscribers who don’t complain but don’t renew either.
The Qligent system deploys networks of test probes across the whole distribution chain, determining the precise signal quality at every point. This project cross-referenced those measurements with a record of user actions, such as a subscriber who selects a programme but does not stay with it.
Together, this generates several terabytes of data a day, so an intelligent, learning machine is needed to sift through it and spot where there may have been technical issues or less than perfect service, even if no one complained. That gives the operator the chance to contact subscribers proactively, acknowledging that there may have been technical issues and assuring them it is working to prevent a recurrence.
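The cross-referencing step can be sketched as follows: flag subscribers who both abandoned a programme early and sat behind a probe reporting poor quality. All thresholds, field names and figures here are invented for illustration and are not Qligent's.

```python
# Flag "silent sufferers": viewers who left early AND whose region's
# probes reported poor signal quality, even though no one complained.

def silent_sufferers(probe_quality, sessions,
                     quality_floor=0.8, abandon_ratio=0.25):
    """probe_quality: {region: score in 0..1}; sessions: dicts with
    subscriber, region, seconds watched and programme length."""
    flagged = set()
    for s in sessions:
        abandoned = s["watched"] / s["length"] < abandon_ratio
        poor_signal = probe_quality.get(s["region"], 1.0) < quality_floor
        if abandoned and poor_signal:
            flagged.add(s["subscriber"])
    return flagged

probes = {"east": 0.55, "west": 0.95}
sessions = [
    {"subscriber": "A", "region": "east", "watched": 120, "length": 3600},
    {"subscriber": "B", "region": "west", "watched": 200, "length": 3600},
    {"subscriber": "C", "region": "east", "watched": 3400, "length": 3600},
]
print(silent_sufferers(probes, sessions))  # {'A'}
```

Subscriber B also left early, but the probes show nothing wrong in that region, so the abandonment alone is not treated as a service problem.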
In the coming years, machine learning looks set to improve audience engagement and boost efficiency. And, as the Qligent case study suggests, to improve consumer perceptions.
The last word goes to Andy Quested of the BBC and a number of standards bodies. He is an enthusiast for AI, not least because he sees it as the only way we will be able to establish real measures of quality of experience, and to ensure we deliver against them.