Detecting illegal credential sharing in video subscription

ABSTRACT

Today, the video industry faces new types of piracy and threats that cannot be prevented by embedding secure hardware or software in consumer devices. Unlike legacy set-top boxes (STBs), there is no hardware identity built into the second screen consumer video devices.

As a result, subscribers can manually enter their account credentials (username/password) and share them, knowingly or unknowingly, with non-subscribers. In this paper, we present a method for overcoming this problem by detecting which accounts are sharing their credentials.

We use machine learning techniques and advanced graph analysis to model different aspects of normal subscriber behaviour: temporal, spatial and watching habits. The models allow us to find anomalous behaviour among subscribers and to set a threshold, which in turn enables service providers to impose consequences such as blacklisting devices and suspending accounts that share credentials.

INTRODUCTION

In this paper, we present our approach to detecting credential sharing on second screen devices and to helping service providers address it. Using viewing records from a service provider’s logs, we model the typical behaviour of an account and represent each account as an n-dimensional vector. We use this representation to compute a sharing score per account, which reflects the likelihood that the account's credentials are being shared.
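As an illustration of the vector representation and per-account score described above, the following minimal sketch builds a small feature vector per account and scores accounts by how far they sit above the population average. The record layout, the four features and the z-score-based scoring rule are all assumptions made for this example; they are not the features or the algorithm used in our system.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical viewing records: (account_id, device_id, city, hour_of_day).
records = [
    ("a", "d1", "London", 20), ("a", "d1", "London", 21),
    ("b", "d2", "Leeds", 19), ("b", "d3", "Leeds", 20),
    ("c", "d4", "London", 2), ("c", "d5", "Paris", 9),
    ("c", "d6", "Madrid", 14), ("c", "d7", "Rome", 23),
]

def account_vectors(records):
    """Map each account to [#devices, #cities, #distinct hours, #views]."""
    devices, cities, hours = defaultdict(set), defaultdict(set), defaultdict(set)
    views = defaultdict(int)
    for account, device, city, hour in records:
        devices[account].add(device)
        cities[account].add(city)
        hours[account].add(hour)
        views[account] += 1
    return {
        a: [len(devices[a]), len(cities[a]), len(hours[a]), views[a]]
        for a in views
    }

def sharing_scores(vectors):
    """Average positive z-score across dimensions (an illustrative stand-in)."""
    accounts = list(vectors)
    dims = len(next(iter(vectors.values())))
    totals = {a: 0.0 for a in accounts}
    for d in range(dims):
        column = [vectors[a][d] for a in accounts]
        mu, sigma = mean(column), pstdev(column)
        if sigma == 0:
            continue  # this feature does not separate accounts
        for a in accounts:
            # Only above-average activity counts towards the score.
            totals[a] += max(0.0, (vectors[a][d] - mu) / sigma)
    return {a: total / dims for a, total in totals.items()}

vectors = account_vectors(records)
scores = sharing_scores(vectors)
```

In this toy data, account `c` is seen on four devices in four cities and therefore receives the highest score. A provider would then compare each score against a chosen threshold.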

We implemented machine learning algorithms based on a complex set of statistical, spatial, temporal and behavioural features.
We performed further analysis on the viewing records using dynamic graph analysis to determine the sharing type.
We distinguished between two main types of sharing activity, legal and illegal:
- Illegal sharing covers cases where the credentials are distributed for profit.
- Legal sharing covers cases where the credentials are shared with family members or friends. Since we have no information about the actual family relations of subscribers, our algorithms incorporate the assumption that family members tend to meet occasionally (e.g., a child living in a dormitory).
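The "family members tend to meet occasionally" assumption can be sketched as a graph over the devices of a single account: two devices are linked if they were ever observed at the same place on the same day, and devices that never link to the rest stand out. The sighting format, the same-day co-location rule and the use of plain connected components are assumptions made for this example; the dynamic graph analysis in our system is not reproduced here.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sightings for ONE account: (device_id, location_id, day).
sightings = [
    ("tablet", "home", 1), ("phone", "home", 1),
    ("phone", "dorm", 5), ("laptop", "dorm", 5),
    ("stranger", "elsewhere", 3),
]

def meeting_components(sightings):
    """Group devices connected by at least one same-place, same-day meeting."""
    by_place_day = defaultdict(set)
    devices = set()
    for device, location, day in sightings:
        by_place_day[(location, day)].add(device)
        devices.add(device)

    # Build an undirected "met at least once" graph.
    adjacent = {d: set() for d in devices}
    for group in by_place_day.values():
        for u, v in combinations(group, 2):
            adjacent[u].add(v)
            adjacent[v].add(u)

    # Depth-first search to collect connected components.
    seen, components = set(), []
    for start in sorted(devices):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacent[node] - component)
        seen |= component
        components.append(component)
    return components

components = meeting_components(sightings)
```

Here the tablet, phone and laptop end up in one component because the phone meets each of the other two, which is consistent with family-style sharing. The `stranger` device never meets any of them and would merit a closer look.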

The viewing records (logs) we used in our initial trial were captured from second screen devices used by over a million customers and comprise hundreds of millions of viewing transactions, all received from a large, well-known service provider.

Since this is an unsupervised problem, we had no training data about the actual sharers. To validate our method, we therefore performed a post-hoc analysis of our results, working with the service provider to confirm which of the flagged accounts were in fact shared.