We Know What You're Watching (Even If It's Encrypted)

New methods show that encryption alone will not keep attackers from identifying your video streams


I stand firm in the opinion that it’s my basic human right to binge-watch six hours of trashy detective shows on a Friday night with a silent phone in my lap and a glass of wine in my hand. I would also argue it’s my right to do so shamefully and in private, divulging the secret of my wasted weekends to no one but Netflix.

Netflix, it seems, would agree with me. The company has been protecting video streams with HTTPS encryption since the summer of 2016. But new research indicates that this strategy is not sufficient to keep third-party service providers and motivated attackers from getting a peek at what I’m watching.

Two recent papers, one from West Point Academy and one by a collection of authors at Tel Aviv University and Cornell Tech, lay out methods for identifying videos by performing straightforward traffic analysis on encrypted data streams. One approach opens the door for snooping by any party that has direct access to the network on which a user is watching videos, such as an ISP or a VPN provider. The other could be used by any attacker who is able to deliver malicious JavaScript code to the user’s browser. Both inspect the size of the data bursts being transferred across the user’s network in order to fingerprint individual videos and compare them to a database of known, previously characterized content.

Many commercial video streaming services (Netflix is not the only offender) use a pair of techniques that make this kind of fingerprinting possible. The first, called MPEG-DASH, breaks down the content of a video into smaller parts. When you stream a video, you are actually watching a long playlist of individual chunks that vary in their quality depending on the speed of your network. DASH specifies which chunks make it to your browser.
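
As a rough illustration of how a playlist of chunks can change quality with network speed, here is a minimal Python sketch. It is not how any particular player works: the function name, the 0.8 safety margin, and the bitrate and throughput numbers are all assumptions made up for this example.

```python
def choose_rendition(bitrates_kbps, throughput_kbps, safety=0.8):
    """Pick the highest bitrate that fits within a fraction of measured throughput."""
    budget = throughput_kbps * safety
    fitting = [b for b in bitrates_kbps if b <= budget]
    # If nothing fits, fall back to the lowest available quality.
    return max(fitting) if fitting else min(bitrates_kbps)

ladder = [235, 750, 1750, 3000, 5800]      # hypothetical quality levels, in kbps
throughput = [6000, 2500, 900, 200, 4000]  # hypothetical per-chunk measurements, in kbps

# One quality choice per chunk: the "playlist" the viewer actually receives.
playlist = [choose_rendition(ladder, t) for t in throughput]
print(playlist)
```

Each chunk is chosen independently here, which is a simplification; real players typically also consider how much video is already buffered.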

The second is called variable bitrate (VBR) encoding, and it is a way of eliminating redundancy between successive video frames to reduce the size of the files that get sent to you. As a scene plays out, VBR encoders compare every new video frame with the one that came before it and eliminate the features of the content that stayed the same. This means that streaming a chaotic action scene, where everything on the screen is constantly changing, would require a series of much larger data bursts than the final credits of a movie, where nearly everything on the screen remains black.
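
A toy Python sketch can make the size difference concrete. This is not a real video codec: it assumes each frame is just a short list of pixel values and treats the number of pixels that changed since the previous frame as a stand-in for how much data must be sent. The function name and the frame data are invented for illustration.

```python
def delta_sizes(frames):
    """Return, for each frame after the first, how many pixels changed.

    A crude stand-in for encoded size: more change means more data to send.
    """
    sizes = []
    for prev, cur in zip(frames, frames[1:]):
        changed = sum(1 for a, b in zip(prev, cur) if a != b)
        sizes.append(changed)
    return sizes

# Hypothetical 8-pixel frames.
credits = [[0] * 8, [0] * 8, [0, 0, 0, 1, 0, 0, 0, 0]]  # almost entirely black
action = [
    [1, 2, 3, 4, 5, 6, 7, 8],
    [8, 7, 6, 5, 4, 3, 2, 1],
    [2, 4, 6, 8, 1, 3, 5, 7],
]

credit_sizes = delta_sizes(credits)  # [0, 1]
action_sizes = delta_sizes(action)   # [8, 6]
print(credit_sizes, action_sizes)
```

The credits barely change from frame to frame, so their "encoded" sizes stay near zero, while the action frames change almost everywhere.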

These two features of the network traffic are unique enough that they can be used as fingerprints for individual videos.

“Is two minutes of a video really completely different than every other two minutes from every other video?” asks Andrew Reed, an assistant professor of electrical engineering at West Point Academy and co-author of one of the papers. “It turns out they are extremely different.”

As a result, by simply cataloguing the timing and size of data bursts, which is information included in the first 100kb of data sent in many video streams, Reed and his colleagues were able to compile a database of fingerprints for over 300,000 Netflix videos.

In order to use that information in an attack, the group required direct access to the network delivering the stream, which allowed them to observe each time the user’s device requests a new burst of data. Using such a technique, they matched up unknown videos with their known fingerprints with 90 percent accuracy after eight minutes of observation.
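
The matching step can be sketched in a few lines of Python. This is an illustrative simplification, not the method from either paper: it assumes a fingerprint is simply a list of chunk sizes in bytes, slides the observed burst sizes along every known fingerprint, and keeps the window with the smallest average relative error. The titles, sizes, and function names are hypothetical.

```python
def window_error(observed, reference):
    """Mean relative difference between two equal-length size sequences."""
    return sum(abs(o - r) / r for o, r in zip(observed, reference)) / len(observed)

def identify(observed, database):
    """Return (title, offset, error) for the best-matching window, or None."""
    best = None
    n = len(observed)
    for title, sizes in database.items():
        # Try every position at which the observation could sit in this video.
        for offset in range(len(sizes) - n + 1):
            err = window_error(observed, sizes[offset:offset + n])
            if best is None or err < best[2]:
                best = (title, offset, err)
    return best

# Hypothetical fingerprint database: chunk sizes in bytes for two videos.
database = {
    "Show A": [510_000, 780_000, 305_000, 660_000, 120_000, 450_000],
    "Show B": [400_000, 410_000, 395_000, 900_000, 880_000, 150_000],
}

# Hypothetical burst sizes seen on the wire: a slightly noisy slice of Show B.
observed = [402_000, 905_000, 871_000]

title, offset, err = identify(observed, database)
print(title, offset, round(err, 3))
```

With only three bursts the match is already unambiguous in this tiny database; against hundreds of thousands of titles, a longer observation window is what narrows the candidates down.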

In a complementary paper by Eran Tromer, a researcher at Columbia University and head of the Laboratory for Experimental Information Security at Tel Aviv University, Vitaly Shmatikov of Cornell Tech, and their student Roei Schuster, the researchers show how an attacker can also collect information about the variable bitrate of a streaming video by running malicious JavaScript code in a browser. Such code can be sent by any website that the user visits, or by a web advertising company. The code need only be running on the same network as the user, meaning that someone could be watching a streamed TV show on their smart TV while the JavaScript code runs on their phone on the same network, and the attack would still work.

When running, the code jams up the user’s network with extraneous traffic. Whenever the TV requests another burst of traffic, the adversary’s traffic gets delayed, allowing the attacker to infer the burst’s time and size.
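
In spirit, the inference looks something like the following Python sketch. It assumes the attacker has already recorded the delay of its own probe traffic at regular intervals, and it uses a fixed threshold to decide when the network was busy. The delay values, the threshold, and the function name are hypothetical, and real measurements are far noisier than this.

```python
def detect_bursts(delays_ms, threshold_ms):
    """Group consecutive probes whose delay exceeds the threshold.

    Returns a list of (start_index, length) pairs, one per inferred burst.
    The length is a rough proxy for how much video data was transferred.
    """
    bursts = []
    start = None
    for i, delay in enumerate(delays_ms):
        if delay > threshold_ms:
            if start is None:
                start = i  # a new burst begins here
        elif start is not None:
            bursts.append((start, i - start))
            start = None
    if start is not None:
        # The recording ended while a burst was still in progress.
        bursts.append((start, len(delays_ms) - start))
    return bursts

# Hypothetical probe delays in milliseconds, one sample per time slot.
delays = [5, 6, 5, 31, 40, 38, 6, 5, 5, 29, 33, 5, 6, 35]

bursts = detect_bursts(delays, threshold_ms=20)
print(bursts)  # [(3, 3), (9, 2), (13, 1)]
```

A simple threshold like this only works on clean data, which is why a noisy real-world signal calls for something more robust.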

The signal collected by these means is much noisier than the direct measurements made by Reed and his team. But with a deep learning algorithm, Tromer’s group has shown that videos can be identified with over 90 percent accuracy.

These findings are especially relevant now, considering the recent changes made to FCC privacy rules, which give free rein to internet service providers to commoditize their customers’ browsing habits. But the problem lacks a simple fix. While there are ways to plug the information leaks these two research groups have identified, each would require some compromise in the efficiency with which streamed video gets transmitted, either by making the size of the data bursts more random or by reverting to delivering bits at a constant rate.

“All of these compromises have a cost associated that would be borne by the streaming providers and the users. Currently it is hard to see the incentive structure that would make the streaming providers pay these costs in order to protect their users,” says Tromer. “We hope that when people realize these privacy risks and their exposure to adversarial monitoring there will be increased demand for privacy-preserving streaming services.”
