Abstract

This paper discusses the prototypes built and end-user trials run in the European H2020 project MeMAD (Methods for Managing Audiovisual Data) for implementing more efficient media production based on semiautomated media enrichment tools.

The prototypes offer automated content annotation supported by machine translation, cross-language search and retrieval of material, and automated multilingual video subtitling. Alternative evaluation approaches are described for experimental and close-to-production use cases, focusing either on refining the use cases with qualitative methods or on measuring productivity with quantitative methods.

Main findings indicate that users are curious about these types of technologies, with current working practices and individual preferences affecting the results quite strongly. The productivity of subtitling and translation work can be improved by incorporating automated speech recognition (ASR), natural language processing (NLP) and machine translation into the workflows. Using large quantities of metadata raises tool UX design questions and is not fully supported by existing tools. For most of the purposes tested, users preferred having the additional metadata available, even at lower quality, rather than hiding or discarding low-quality data.

Introduction

Demonstrations of potential automated metadata extraction (AME) services, such as face recognition, automated speech recognition, machine translation, object detection and scene classification, have in the past few years focused on early technical tests or stand-alone user interfaces built to demonstrate the concept. To properly evaluate the potential of these deep-learning-based technologies, for which the shorthand term "AI" is commonly used, in media production, the next major step is to fit these services into the existing ecosystems, architectures and workflows of a media company. This shift from proof-of-concept (PoC) into production tests brings several important changes, challenges and practical considerations, most notably:

• Envisioned services are for the first time tested in end-to-end workflows instead of isolated sub-processes. The evaluated user experience also expands to cover all parts of the user's work process and how the different parts of the work tie into each other.
• On top of the technical performance metrics, a layer of more business-oriented success criteria is introduced, such as productivity and user satisfaction.

Typically, the amount of data also increases between iterations in the evolution from proof-of-concept to in-production use, as the amount of content and the number of services involved in a workflow grow. Furthermore, at the production-test stage, the question of optimizing dataflows arises: out of the large number of AI services, which ones should be combined, and which parts of their data output should be used to create an optimal work process?
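To make the dataflow-optimization question concrete, the minimal sketch below illustrates one way the output of one AME service could be combined with another: ASR segments are flagged by confidence and passed through a machine-translation step to produce draft subtitle cues. This is not a MeMAD component; the data structures, threshold and the translate stand-in are hypothetical and only serve to show how service outputs might be selected and chained in a workflow.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AsrSegment:
    """One automatic speech recognition result, collapsed to a timed segment."""
    start: float        # seconds from start of programme
    end: float
    text: str
    confidence: float   # 0.0 - 1.0, as reported by the ASR service


@dataclass
class SubtitleCue:
    start: float
    end: float
    text: str


def build_draft_subtitles(
    segments: List[AsrSegment],
    translate: Callable[[str], str],
    min_confidence: float = 0.6,
) -> List[SubtitleCue]:
    """Combine ASR output with an MT step into draft subtitle cues.

    Low-confidence segments are kept but flagged rather than discarded,
    mirroring the finding that users preferred seeing lower-quality
    metadata over having it hidden.
    """
    cues: List[SubtitleCue] = []
    for seg in segments:
        text = translate(seg.text)
        if seg.confidence < min_confidence:
            text = f"[check] {text}"  # flag uncertain passages for the subtitler
        cues.append(SubtitleCue(seg.start, seg.end, text))
    return cues


if __name__ == "__main__":
    # Dummy stand-ins for the ASR and MT services; a real workflow would call
    # actual speech recognition and machine translation back ends instead.
    example_segments = [
        AsrSegment(0.0, 2.4, "hyvää iltaa ja tervetuloa", 0.92),
        AsrSegment(2.4, 5.1, "tänään puhumme mediatuotannosta", 0.48),
    ]
    fake_translate = lambda s: f"<EN: {s}>"
    for cue in build_draft_subtitles(example_segments, fake_translate):
        print(f"{cue.start:6.2f} --> {cue.end:6.2f}  {cue.text}")
```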

The European Horizon 2020 project MeMAD researches the challenges mentioned above, with research groups developing the algorithms and other core elements of machine learning technologies, such as automated speech recognition (ASR), computer vision and machine translation (MT), for audiovisual media data. Building on these, the project pilots the use of the technologies as iterations of a project prototype, and the most promising elements are further evaluated in close-to-production use by the Finnish Broadcasting Company Yle, the French National Audiovisual Institute INA, and other interested parties.

This paper focuses on the evaluation of the MeMAD technologies from the stakeholder point of view. The project evaluation activities are presented as a case study, demonstrating the methods and issues that are relevant when fitting the project technologies into existing professional production workflows. The full project evaluation reports can be found at https://memad.eu and are summarized in this paper where needed. New evaluation results will be reported throughout the project; this paper describes the findings as of April 2020, shortly after the second of the three project evaluation rounds finished.
