Metadata is crucial as an enabler of automated production and targeted content. Would a standard help?
When a broadcaster shoots and distributes content, they are throwing away up to 95% of their raw material. For live broadcasts the figure is closer to 99%. Yet this massive wasted asset could be monetised if it could be accessed and shared online by anyone, or better still anything, within the media company.
Technology is arriving in the form of machine learning algorithms that will enable the automated production of video tailored to individuals on specific social media platforms, smartphones, streamed channels, and TV.
Some dub this Media 4.0 - the mass customisation and distribution of video content targeted to different channels (broadcast, digital and social media) using AI and metadata within an existing workflow.
Getting to this stage requires knowing what the content is and where it resides.
TVU Networks Chief Executive Paul Shen says: “The effort of locating content you’ve already shot is often costlier and potentially slower than going out and re-shooting material. With the increasing demand from consumers for customised video content combined with the coming 5G networks, producing more sophisticated stories faster will be critical to satisfy the market. The first step on this path has to be to index everything.”
Identifying and distributing video content so that media producers can follow their audience to whichever device they are viewing on is possible now, but the picture is fragmented.
There have been many attempts to codify standards for metadata in the past, most notably a push by the EBU to adopt a standard known as EBU Core. There are also well-established common descriptive and rights formats including TVA and ADI, and common identifiers such as EIDR (Entertainment Identifier Registry) and ISAN, and, of course, many technical metadata standards.
Each broadly supports four basic tenets: data structure, content, value and format/exchange.
Prime Focus Technologies Vice President and Global Head, Marketing & Communications, T. Shobhana, says: “Together these provide the rules for structuring content, allowing it to be reliably read, sorted, indexed, retrieved, and shared. When metadata records are formatted to a common standard, it facilitates the location and readability of the metadata by both humans and machines.”
Avid VP Platform & Solutions Tim Claman adds: “We think a standard for time-based metadata would aid in the discovery of content. It should feel similar to enabling a search on the internet. If web pages were not designed using a common language and if data were not represented in a consistent form it would be impossible to find anything online. The industry should learn from that and agree to a common language and a common structure for time-based metadata.”
However, he cautions on the practicality of achieving this. “The industry has a mixed track record of developing and implementing metadata standards. You can’t be overly prescriptive without being restrictive.”
While a global metadata standard may ease broadcaster workflows, this is not considered likely.
IPV Executive Vice President of Sales and Marketing Nigel Booth says: “A lot of different standards already exist, but asset management vendors want to differentiate themselves – and their use of metadata is one of the ways they do this. So, standardising how content is tagged isn’t likely to be popular.”
Tedial General Manager, US, Jay Batista agrees: “Different AI vendors are supplying engines with various tagging options, and they consider their logging parameters both proprietary and a competitive edge in the marketplace.”
Broadcasters would benefit the most from a unified metadata schema, he says. “Yet, many content producers believe they must maintain an internal core metadata index for their unique productions and business requirements, and often these internal data models are not shared.”
A more realistic goal, vendors suggest, is to develop a way of sharing data rather than standardising it.
“It’s more important to standardise how content can be uniquely identified,” says Paul Shen. “If it is simple and transparent enough, we may not even need a standards body.”
However, it can be challenging to do this even within one company, let alone sharing data with third party systems and external media partners.
“We have done MAM projects which have failed because it has proved hard to get all stakeholders in one organisation to agree,” says Claman. “Even when you do get agreement on the metadata governance it is often only for a period of time.”
He explains that Avid conceives of metadata in strata. “What these layers have in common is the ability to be expressed as individual frames or moments. If you can aggregate that time-based strata the more discoverable your content becomes.”
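As a rough illustration of the idea, and not a description of Avid’s own data model, the layers Claman describes can be sketched in Python as annotations that each carry a start and end time. The layer names, the field names and the sample timeline below are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    layer: str      # hypothetical layer name, e.g. "speech" or "faces"
    start: float    # seconds from the start of the asset
    end: float      # seconds from the start of the asset
    value: str      # the tag or transcript fragment


def at_moment(annotations, t):
    """Return every annotation, from any layer, that covers time t."""
    return [a for a in annotations if a.start <= t < a.end]


def search(annotations, term):
    """Return (start, end, layer) for annotations whose value mentions term."""
    term = term.lower()
    hits = [(a.start, a.end, a.layer) for a in annotations if term in a.value.lower()]
    return sorted(hits)


# Invented sample data: three layers describing the same stretch of video.
timeline = [
    Annotation("speech", 12.0, 15.5, "the striker scores from the penalty spot"),
    Annotation("objects", 11.0, 14.0, "football"),
    Annotation("faces", 13.0, 16.0, "striker"),
    Annotation("speech", 40.0, 44.0, "half-time analysis"),
]

print([a.layer for a in at_moment(timeline, 13.5)])
print(search(timeline, "striker"))
```

Because every layer shares the same time axis, a single query can cut across all of them, which is the sense in which aggregating the strata makes content more discoverable.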
Avid advocates the idea of a consortium to devise such a standard, much like the way the industry united to forge standards around carrying audio, video and ancillary data over IP.
“Some vendors go into these [standardisation efforts] looking for an opportunity to differentiate and maybe claim intellectual property and get an edge,” warns Claman. “A consortium will work best if vendors follow the lead of users. It leaves less room for proprietary technology to be introduced into the mix.”
“If MAM providers are required by large broadcasters to standardise, it’s possible that vendors will be forced to collaborate to put forward a single-solution way of working,” says Booth. “An example of where this has happened is IMF (Interoperable Master Format).”
TVU revealed it is working with a number of equipment manufacturers and major broadcasters - believed to include Disney - on the best approaches to the issues. An initial meeting is being held in June.
“We want to create a consortium which would provide guidance to both manufacturers and media companies,” says Shen. “Every part of the industry needs to come together if [automated production] is to happen faster. I don’t believe any one company can do the heavy lifting.”
Sharing, not conflicting
One aim is to address potential conflicts in working with metadata originated under different AI/asset management protocols.
Primestream Chief Operating Officer David Schleifer says: “The immediate area where I would see conflict is in assuming that the value of metadata in a file would be the same regardless of the AI-driven dataset that generated it. As the area is still maturing, I would not assume that, for example, face recognition from one system would be equal to face recognition from another. In fact, one system may focus on known famous people while the other might be a learning algorithm to build collections – therefore, different data used in different ways built on similar technology.
“AI is a perfect example of where an existing metadata schema would need to be expanded,” he adds. “With AI we do not yet know where it is going or how it will be used, so the schema needs to be extensible, allowing for growth. At a high level you can sort all types of metadata into categories like ‘tied to the entire asset’ or ‘tied to moments in time or objects in the image at specific times’, and so on. But in the end, creating the schema first will always lead to revisions later.”
Standardising the ontologies (terms used to describe media) that are used within different domains would be useful when sharing content.
“Standardisation in this area would mean less confusion across industries,” says Booth. “For example, IPV’s Curator uses controlled vocabularies to ensure consistency and accuracy. Specific terms are selected and tagged instead of having different operators selecting their own terms.”
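A controlled vocabulary can be pictured as a lookup from whatever an operator types to a single agreed term. The Python sketch below is a minimal illustration of that principle only. It is not how IPV’s Curator is implemented, and the vocabulary, the synonyms and the `normalise` function are hypothetical.

```python
# Hypothetical controlled vocabulary: preferred term -> accepted synonyms.
VOCABULARY = {
    "football": {"soccer", "footie"},
    "goalkeeper": {"keeper", "goalie"},
}

# Reverse lookup so that any accepted spelling maps to its preferred term.
_LOOKUP = {term: term for term in VOCABULARY}
for preferred, synonyms in VOCABULARY.items():
    for synonym in synonyms:
        _LOOKUP[synonym] = preferred


def normalise(tags):
    """Split free-text tags into (accepted preferred terms, rejected tags)."""
    accepted, rejected = [], []
    for tag in tags:
        key = tag.strip().lower()
        if key in _LOOKUP:
            preferred = _LOOKUP[key]
            if preferred not in accepted:
                accepted.append(preferred)
        else:
            rejected.append(tag)
    return accepted, rejected


accepted, rejected = normalise(["Soccer", "goalie", "football", "pitch invader"])
print(accepted)
print(rejected)
```

Rejected tags are returned rather than silently dropped, so an editor can decide whether the vocabulary itself needs a new term.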
An alternative is the use of technologies like XML and REST APIs, which are becoming increasingly popular as a means of exchanging data.
“The challenge with descriptive metadata is that you don’t know ahead of time what is going to be interesting after the fact,” says Claman. “For this reason, for news and sports, you want as much automation of metadata creation as possible.
“We need extensible data models if we’re going to see widespread adoption.”
Booth calls for ‘metadata fusion’, a means of bringing together data that’s saved by contrasting systems and checking where it agrees. “Doing so means that you can improve reliability. An example of this is combining speech-to-text and object recognition – if they both identify similar metadata, it’s likely correct. The key thing is to understand the provenance of the metadata - as long as you capture it you can make a decision based on it.”
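The fusion Booth describes can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not IPV’s method: the source names, the two-second window and the `fuse` function are invented for the example, and bucketing by fixed windows will miss agreements that straddle a window boundary.

```python
from collections import defaultdict


def fuse(observations, window=2.0):
    """Group labels by time bucket and report which sources agree.

    observations: iterable of (source, time_in_seconds, label) tuples.
    Returns a list of dicts; 'corroborated' is True when at least two
    different sources produced the same label in the same bucket.
    """
    buckets = defaultdict(set)
    for source, t, label in observations:
        key = (int(t // window), label.lower())
        buckets[key].add(source)  # keep provenance: which engines said it

    fused = []
    for (bucket, label), sources in sorted(buckets.items()):
        fused.append({
            "start": bucket * window,
            "label": label,
            "sources": sorted(sources),
            "corroborated": len(sources) >= 2,
        })
    return fused


# Invented sample data from two hypothetical engines.
observations = [
    ("speech_to_text", 10.4, "Helicopter"),
    ("object_recognition", 11.1, "helicopter"),
    ("object_recognition", 11.3, "tree"),
]

for item in fuse(observations):
    print(item)
```

Each fused item keeps the list of sources that produced it, so a downstream system can apply its own trust rules instead of relying on a single yes/no answer.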
Downstream, licensing and reconciliation issues need to be considered and adhered to. Additionally, some content owners have clear contractual rules which restrict platform operators from modifying their data.
Piksel Joint Managing Director Kristan Bullett says the answer lies in “providing clear traceability of the origination of metadata and also providing a mechanism to lock or restrict modification of attributes that should not be modified.”
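Both requirements, traceability and locking, can be shown with a small Python sketch. It is a hypothetical illustration and does not represent Piksel’s product: the `MetadataRecord` class, its method names and the origin labels are assumptions made for the example.

```python
class LockedAttributeError(Exception):
    """Raised when a caller tries to change a locked attribute."""


class MetadataRecord:
    def __init__(self):
        self._values = {}    # attribute name -> current value
        self._history = []   # (attribute, value, origin), oldest first
        self._locked = set()

    def set(self, attribute, value, origin):
        """Store a value and record who supplied it, unless it is locked."""
        if attribute in self._locked:
            raise LockedAttributeError(f"{attribute!r} is locked")
        self._values[attribute] = value
        self._history.append((attribute, value, origin))

    def lock(self, attribute):
        """Prevent any further change to this attribute."""
        self._locked.add(attribute)

    def get(self, attribute):
        return self._values[attribute]

    def origin_of(self, attribute):
        """Return the origin of the most recent value for attribute."""
        for name, _value, origin in reversed(self._history):
            if name == attribute:
                return origin
        raise KeyError(attribute)


record = MetadataRecord()
record.set("title", "Example Title", origin="content_owner")
record.lock("title")
record.set("synopsis", "Draft synopsis", origin="platform_editor")

try:
    record.set("title", "Edited Title", origin="platform_editor")
except LockedAttributeError as err:
    print("rejected:", err)
```

The content owner’s title survives the attempted edit, while the unlocked synopsis remains free for a platform operator to enrich.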
Piksel is initiating its own metadata group. It is joining up some disparate systems that will allow customers to purchase, ingest and manage localised metadata on a per-title basis, enabling advanced recommendations and content discovery functionalities.
The first metadata providers to join are Austrian content discovery specialist XroadMedia, Bindinc Metadata from the Netherlands, France’s Plurimedia and Mediadata TV from Spain. We don’t know at the moment whether providers like ThinkAnalytics, Rovi/TiVo and Gracenote will be ‘invited into the club’ or whether it will act as a purely competitive offer to these alternatives.
Established primarily to aid content editors in the quest to augment and enhance their existing metadata, Piksel said its ‘ecosystem’ will prove particularly useful for customers dealing with multilingual or cross-territory titles.
“Platform operators have been abstracted away from the responsibility for their metadata needs and need to work with the data that has been provided to them,” says Bullett. “Part of our vision is to bridge this gap and put that decision-making process into the hands of the people who are responsible for ensuring end customers get the best possible user experience.”
Automated production, personalised distribution
For production, TVU’s solution is MediaMind, which embeds metadata into specific video frames on ingest, in real time, using speech-to-text recognition and AI that identifies objects and people. Content can be searched with TVU’s own search engine, and an API allows broadcasters to integrate it with existing MAM systems for archival search.
“If the producer is just interested in a few frames out of a twenty-minute file, today’s manual search processes can make locating the exact frames time-consuming,” says Shen. “Using an Artificial Intelligence engine with object and voice recognition will automate the process of tailoring and distributing clips to the appropriate outlets.”
Tedial’s similar approach targets sports production. Its SMARTLIVE tool uses AI logging to automatically create highlight clips and pitch them to social media and distribution.
“Applications are being developed, especially in reality television production, where auto-sensing cameras follow motion,” says Batista. “AI tools such as facial recognition augment the media logging function for faster edit decisions as well as automatic social media deliveries.”
The current state of the art in AI only augments news and sports production: it is intended to supplement human-curated event presentation with automated storytelling.
The evolution of this suggests programmes will at some point be created entirely automatically to cater for different consumer tastes.
“With the growing capability of technology to collect every bit of data to analyse consumer behaviour, it could one day become plausible to create a formula for how content should be produced based on the target audience,” says Shen.
The BBC has been working on this, in the shape of object-based media, for nearly a decade and is expecting to deliver it within the next five years.
“The importance of metadata in this space will be crucial as an enabler of targeted content,” says Batista. “Object-based media is a hugely interesting concept and could transform the way content is consumed dramatically. There are challenges, obviously but you can easily see from a resource, storage, distribution, consumption and analytics perspective what opportunities this could bring.”
The media company of the future could be a mass producer of individually tailored content. A clue to how this would look is in music distribution. Five years ago, consumers tended to download music tracks to add to a personal collection. Today, they are more likely to prefer a Spotify-like service that curates and streams the content they want.
“Media 4.0 will see video production move from a programme-centric to a story-centric process, where the content is automatically produced, targeted and distributed to the viewer,” says Shen. “Producers create the video content, and the AI engine automates the assembly of the material and delivers it in the most effective way to the target audience.”
Whether this is desirable or not for all content is another matter.
“The risk and challenge here is not in our ability to move certain types of programming to an automated process, but rather the loss of editorial judgement,” says Schleifer. “Systems that produce content in this manner will adhere to specific rules, and as a result will produce consistent content that will never challenge us to get out of our comfort zone. We need to figure out how a system like this can continue to push the envelope and challenge us.”