Amazon Rekognition Video is a machine learning (ML) based service that can analyse videos to detect objects, people, faces, text, scenes, and activities, as well as inappropriate content.


The service can now automate four common media analysis tasks using fully managed, ML-powered APIs: detection of black frames, end credits, shot changes, and colour bars.
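As a rough illustration of how a single request might cover all four tasks, the sketch below builds the arguments for the service's segment detection call. It assumes the boto3 `rekognition` client and its `start_segment_detection` operation; the bucket name, object key and the 80% confidence threshold are hypothetical values chosen for the example.

```python
def build_segment_request(bucket: str, key: str, min_confidence: float = 80.0) -> dict:
    """Build keyword arguments for a Rekognition segment detection job.

    The bucket, key and confidence threshold are illustrative assumptions.
    """
    return {
        # The video is read directly from Amazon S3.
        "Video": {"S3Object": {"Bucket": bucket, "Name": key}},
        # TECHNICAL_CUE covers black frames, end credits and colour bars;
        # SHOT covers shot changes.
        "SegmentTypes": ["TECHNICAL_CUE", "SHOT"],
        "Filters": {
            "TechnicalCueFilter": {"MinSegmentConfidence": min_confidence},
            "ShotFilter": {"MinSegmentConfidence": min_confidence},
        },
    }


request = build_segment_request("example-bucket", "episode-01.mp4")
```

With AWS credentials configured, the dictionary could be passed as `boto3.client("rekognition").start_segment_detection(**request)`, and the results fetched later with `get_segment_detection(JobId=...)`.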

These features enable users to execute workflows such as content preparation, ad insertion, and the addition of ‘binge-markers’ to content at scale in the cloud. Videos often contain a short run of empty black frames with no audio to demarcate ad insertion slots or the end of a scene, the company notes. Using Amazon Rekognition Video, it is possible to detect such sequences to automate ad insertion, or to package content for Video-On-Demand (VOD) by removing unwanted segments.
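Removing unwanted segments can be sketched as simple interval arithmetic over the detections. The example below assumes segments shaped like the service's segment detection results (a `Type` of `TECHNICAL_CUE`, a nested `TechnicalCueSegment` type such as `BlackFrames`, and millisecond start and end timestamps); the function name and sample data are hypothetical.

```python
def content_ranges(segments, duration_ms, cue_types=("BlackFrames",)):
    """Return (start_ms, end_ms) ranges that remain after cutting the given cues."""
    cuts = sorted(
        (s["StartTimestampMillis"], s["EndTimestampMillis"])
        for s in segments
        if s.get("Type") == "TECHNICAL_CUE"
        and s.get("TechnicalCueSegment", {}).get("Type") in cue_types
    )
    kept, cursor = [], 0
    for start, end in cuts:
        if start > cursor:
            kept.append((cursor, start))  # content between two cuts
        cursor = max(cursor, end)         # tolerate overlapping detections
    if cursor < duration_ms:
        kept.append((cursor, duration_ms))
    return kept


# Hypothetical detections for a 90-second clip.
sample = [
    {"Type": "TECHNICAL_CUE", "TechnicalCueSegment": {"Type": "BlackFrames"},
     "StartTimestampMillis": 0, "EndTimestampMillis": 2000},
    {"Type": "SHOT", "StartTimestampMillis": 2000, "EndTimestampMillis": 60000},
    {"Type": "TECHNICAL_CUE", "TechnicalCueSegment": {"Type": "BlackFrames"},
     "StartTimestampMillis": 60000, "EndTimestampMillis": 62000},
]
print(content_ranges(sample, 90000))  # [(2000, 60000), (62000, 90000)]
```

The same black-frame intervals could equally be treated as candidate ad slots rather than cuts, depending on the workflow.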

To implement interactive viewer prompts such as ‘Next Episode’ in VOD applications, the exact frames where the closing credits start and end in a video can be identified. Further, Amazon Rekognition Video enables users to detect shot changes, where a scene cuts from one camera to another. Using this information, it is possible to create promotional videos from selected shots, generate high-quality preview thumbnails by choosing key frames in shots, and insert ads without disrupting the viewer experience, for example by avoiding the middle of a shot when someone is speaking. Lastly, sections of video that display SMPTE (Society of Motion Picture and Television Engineers) colour bars can be detected so that they can be removed from VOD content. Issues such as loss of broadcast signal in a recording, when colour bars may be shown continuously as the default signal, can also be detected.
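Two of these uses can be sketched in a few lines: moving a planned ad break to the nearest shot boundary, and finding where the end credits begin so that a ‘Next Episode’ prompt can be shown. The segment fields follow the shape of the service's segment detection results; the function names and sample timestamps are hypothetical.

```python
def nearest_shot_boundary(segments, target_ms):
    """Return the shot start time closest to target_ms, or None if no shots."""
    starts = [s["StartTimestampMillis"] for s in segments if s.get("Type") == "SHOT"]
    if not starts:
        return None
    return min(starts, key=lambda t: abs(t - target_ms))


def end_credits_start(segments):
    """Return the earliest start time of an end-credits segment, or None."""
    starts = [
        s["StartTimestampMillis"]
        for s in segments
        if s.get("Type") == "TECHNICAL_CUE"
        and s.get("TechnicalCueSegment", {}).get("Type") == "EndCredits"
    ]
    return min(starts) if starts else None


# Hypothetical detections: three shots followed by end credits.
detections = [
    {"Type": "SHOT", "StartTimestampMillis": 0},
    {"Type": "SHOT", "StartTimestampMillis": 4200},
    {"Type": "SHOT", "StartTimestampMillis": 9800},
    {"Type": "TECHNICAL_CUE", "TechnicalCueSegment": {"Type": "EndCredits"},
     "StartTimestampMillis": 15000},
]
print(nearest_shot_boundary(detections, 9000))  # 9800
print(end_credits_start(detections))            # 15000
```

Snapping to a shot start is only one possible policy; a production workflow might also weigh audio or dialogue before choosing a break point.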

With these APIs, Amazon says that users can easily analyse large volumes of videos stored in Amazon S3 and get SMPTE timecodes and timestamps for each detection, without requiring any machine learning experience. Returned SMPTE timecodes are frame accurate, which means that Amazon Rekognition Video provides the exact frame number when it detects a relevant segment of video. Under the hood, it also handles various video frame rate formats, such as drop frame and fractional frame rates. Using the frame accurate metadata from Amazon Rekognition Video, it is possible either to automate operational tasks completely or to significantly reduce the review workload of trained human operators. This enables the execution of media analysis workflows at scale in the cloud, with users paying only for the minutes of video they analyse. There are no minimum fees, licences, or upfront commitments.
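To show why drop-frame handling matters, the sketch below converts a frame number to a SMPTE drop-frame timecode for 29.97 fps video using the conventional rule: frame numbers 00 and 01 are skipped at the start of every minute except each tenth minute. This illustrates the timecode convention only and is not a description of how Amazon implements it; the function name is hypothetical.

```python
def frames_to_dropframe_timecode(frame_number: int) -> str:
    """Convert a zero-based frame count to 29.97 fps drop-frame timecode."""
    fps = 30                                # nominal timecode rate
    drop = 2                                # frame numbers skipped per dropped minute
    frames_per_min = fps * 60 - drop        # 1798 frames in a dropped minute
    frames_per_10min = fps * 600 - drop * 9  # 17982 frames per ten minutes

    tens, rem = divmod(frame_number, frames_per_10min)
    # Nine of every ten minutes drop two frame numbers each.
    skipped = drop * 9 * tens
    if rem >= drop:
        skipped += drop * ((rem - drop) // frames_per_min)

    n = frame_number + skipped
    ff = n % fps
    ss = (n // fps) % 60
    mm = (n // (fps * 60)) % 60
    hh = (n // (fps * 3600)) % 24
    # Drop-frame timecode uses a semicolon before the frame field.
    return f"{hh:02d}:{mm:02d}:{ss:02d};{ff:02d}"


print(frames_to_dropframe_timecode(1799))    # 00:00:59;29
print(frames_to_dropframe_timecode(1800))    # 00:01:00;02
print(frames_to_dropframe_timecode(107892))  # 01:00:00;00
```

The jump from `;29` straight to `;02` at the one-minute mark is what keeps the displayed timecode aligned with wall-clock time at 29.97 fps.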