NSA Can Legally Access Metadata of 25,000 Callers Based on a Single Suspect’s Phone

Edward Snowden's revelations may have tightened the NSA's leash, but the agency can still request tens of thousands of records for each case


Despite changes to the law, the U.S. National Security Agency can still request metadata from tens of thousands of private phones if they are indirectly connected to the phone number of a suspected terrorist, according to a new analysis. The study is one of the first to quantify the impact of policy changes intended to narrow the agency’s previously unfettered access to private phone records, which was first revealed by Edward Snowden in 2013. 

For years before Snowden went public, the U.S. National Security Agency legally obtained metadata not only from suspects’ phones but also from those of their contacts and their contacts’ contacts (and even their contacts’ contacts’ contacts) in order to trace terrorist networks. This metadata included information about whom a user called, when each call was placed, and how long it lasted.

Today, federal rules permit the NSA to recover metadata from phones within "two hops" of a suspect, that is, from the phone of someone who called someone who called the suspect in the past 18 months. Previously, federal regulations were more permissive, allowing recovery of metadata from "three hops" away and dating back five years.
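To make the hop rule concrete, here is a minimal sketch of how a hop-limited reach could be computed from call records. It is an illustration only, not the NSA's or the Stanford group's actual method: the function name `phones_within_hops`, the record layout, the sample numbers, and the dates are all hypothetical, and 18 months is approximated as 548 days.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def phones_within_hops(calls, seed, max_hops, since):
    """Return {number: hop distance} for numbers within `max_hops` of `seed`.

    `calls` is an iterable of (caller, callee, timestamp) tuples. Only
    records at or after `since` are considered, and a call is assumed to
    link both parties regardless of who dialed.
    """
    # Build an undirected contact graph from records inside the time window.
    graph = defaultdict(set)
    for caller, callee, when in calls:
        if when >= since:
            graph[caller].add(callee)
            graph[callee].add(caller)

    # Breadth-first search, tracking each number's hop distance from the seed.
    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        number = queue.popleft()
        if distance[number] == max_hops:
            continue  # do not expand past the hop limit
        for contact in graph[number]:
            if contact not in distance:
                distance[contact] = distance[number] + 1
                queue.append(contact)

    distance.pop(seed)  # the seed itself is not part of the reach
    return distance


# Hypothetical records: a chain of calls of increasing age.
now = datetime(2016, 1, 1)
calls = [
    ("suspect", "A", now - timedelta(days=30)),
    ("A", "B", now - timedelta(days=200)),
    ("B", "C", now - timedelta(days=400)),
    ("C", "D", now - timedelta(days=900)),  # older than 18 months
]

two_hop = phones_within_hops(calls, "suspect", 2, now - timedelta(days=548))
three_hop = phones_within_hops(calls, "suspect", 3, now - timedelta(days=5 * 365))
print(two_hop)    # {'A': 1, 'B': 2}
print(three_hop)  # {'A': 1, 'B': 2, 'C': 3}
```

In this toy chain each extra hop adds one number. In a real call graph, where each phone has many contacts, the reach grows roughly multiplicatively with every hop, which is why dropping from three hops to two shrinks it so sharply.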

A new analysis led by researchers at Stanford University’s Computer Security Laboratory quantifies what this policy change has meant. Under the old five-year, three-hop rules, the NSA could legally recover metadata from about 20 million phones per suspect, and from “the majority of the entire U.S. population” if it analyzed all its suspects. The stricter 18-month, two-hop rule now permits the agency to recover metadata from about 25,000 phones with a single request, according to the Stanford study.

“I think there could be a national debate about what is an appropriate legal range for these sorts of things,” says Patrick Mutchler, a coauthor and PhD candidate studying computer security at Stanford University. “There's a tradeoff where additional privacy protections might make it harder for the government to stop threats, but the important thing is having the correct data in order to be able to make policy that matches with what we think is right.”

The researchers assumed in each case that the agency would remove from its queries any “hub” numbers, which receive far more calls than usual. These numbers, such as the one that sends Google password verifications to millions of users, have limited value in tracking terrorists. If the NSA did not remove hub numbers from its requests, it could have accessed the phone records of the majority of Americans starting from a single suspect under the “three hop” rule.
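The effect of hub removal can be illustrated with a small sketch. The article does not say how the researchers defined a hub, so the contact-count threshold here (`max_contacts`) is an assumption, as are the function names and the toy graph.

```python
from collections import defaultdict


def remove_hubs(graph, max_contacts):
    """Return a copy of `graph` without numbers that have more than
    `max_contacts` distinct contacts (the assumed definition of a hub)."""
    hubs = {n for n, contacts in graph.items() if len(contacts) > max_contacts}
    return {
        n: contacts - hubs
        for n, contacts in graph.items()
        if n not in hubs
    }


def two_hop_reach(graph, seed):
    """Return the set of numbers within two hops of `seed`."""
    first = graph.get(seed, set())
    second = set()
    for number in first:
        second |= graph.get(number, set())
    return (first | second) - {seed}


# Hypothetical contact graph: the suspect has one friend and has also
# received a text from a verification service that contacts 1,000 users.
graph = defaultdict(set)


def link(a, b):
    graph[a].add(b)
    graph[b].add(a)


link("suspect", "friend")
link("friend", "coworker")
link("suspect", "verify-service")
for i in range(1000):
    link("verify-service", f"user{i}")

print(len(two_hop_reach(graph, "suspect")))                       # 1003
print(sorted(two_hop_reach(remove_hubs(graph, 100), "suspect")))  # ['coworker', 'friend']
```

A single hub in the first hop pulls in every one of its contacts at the second hop, so filtering hubs before counting keeps the estimate tied to the suspect's actual social contacts.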

These results reinforce concerns that the agency could at one time legally access millions of Americans’ phone records based on a single “seed” number. The NSA began using the five-year “three hop” rule to collect civilian phone records under the USA Patriot Act, which Congress passed shortly after 9/11. When former CIA employee Snowden publicly exposed this program’s reach in 2013, politicians and privacy advocates widely criticized it.

Before long, President Barack Obama proposed limiting the NSA’s access to records within two hops of a suspect, and Congress later revised the Patriot Act. However, Mutchler says it wasn’t clear until now what this policy shift accomplished in terms of the number of people whose metadata could be legally accessed.

In their analysis published Monday in the Proceedings of the National Academy of Sciences, Mutchler and his colleagues collected call and text logs from 823 volunteers through an Android app. The median collection period was 59 days. During the course of the study, participants logged a total of 251,788 calls and 1.2 million text messages.

[Infographic: A Smaller Net. The number of people from whom the NSA could legally collect phone metadata in the course of a single investigation, before and after the USA Freedom Act. A “hop” indicates a direct connection: one person called another or sent him or her a text.]

The researchers used these records to calculate how many others might be within “two hops” of the average caller in the past 18 months and therefore fair game for the NSA. They found the agency could likely access the metadata of 25,000 users under the new provision. That’s far fewer than the approximately 20 million users whose metadata the agency could access from a single number under the old rule.

It’s important to note that the volunteers who participated in this study were not a representative sample, so their social networks may not reflect those of the overall population. Ninety percent of the volunteers were male, and their median age was 33. Since they signed up for the study, they are probably more technologically savvy and possibly more concerned about privacy issues than most people.

In addition to estimating the number of phone records within the NSA’s reach, the group also wanted to see if it could easily identify the owners of anonymous phone numbers. In the face of privacy concerns, federal agencies have sometimes claimed the metadata they collect is not “personally identifiable information.”

Using both automated tools and manual searches in publicly available databases hosted by Intelius, Yelp, and Facebook, the group showed it was possible to match an owner’s name to 82 percent of anonymous phone numbers in a subset of all those collected from volunteers.

“If it's not difficult for two graduate students to figure out a fair number of owners of these phone numbers, I would suspect it’s considerably easier for a larger organization with massive resources,” Mutchler says.

Lastly, the team used the metadata they collected to make predictions about users and their behavior. For example, a user who frequently texts and calls a particular number might be in a relationship with the person on the other end. Or a user who calls three businesses near Idaho Falls, Idaho, in the course of a month might live in Idaho Falls.

The group used Facebook information shared with them by volunteers to refute or verify these theories. This information showed it was sometimes possible to accurately predict a person’s relationship status and location based solely on their phone’s metadata.

However, the researchers were not able to do so in every case. For example, the team accurately predicted the current city of just over half (57 percent) of 241 users who placed at least 10 calls to businesses that could be easily identified by public records during the course of the study.
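The location inference described above can be sketched as a simple vote over the cities of the businesses a user calls. This is only an illustration under stated assumptions: the article does not describe the researchers' actual algorithm, so the most-common-city rule and the function name `predict_city` are hypothetical, and the 10-call minimum simply mirrors the threshold mentioned in the evaluation.

```python
from collections import Counter


def predict_city(business_call_cities, min_calls=10):
    """Guess a user's home city from the cities of businesses they called.

    `business_call_cities` holds one city name per call to an identifiable
    business. Returns None when there are fewer than `min_calls` such calls,
    since a guess from so little data would be unreliable.
    """
    if len(business_call_cities) < min_calls:
        return None
    # Assumed rule: pick the city that appears most often.
    city, _count = Counter(business_call_cities).most_common(1)[0]
    return city


# Hypothetical user: seven calls to Idaho Falls businesses, three to Boise.
print(predict_city(["Idaho Falls"] * 7 + ["Boise"] * 3))  # Idaho Falls
print(predict_city(["Idaho Falls"] * 3))                  # None
```

A rule this simple will misplace people who travel often or who call businesses in another city, which is consistent with the modest accuracy the researchers report.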
