How Tech Automated the January 6 Investigations

Three years on, databases, not tipsters, are generating more criminal charges

[Photo: people standing in front of the U.S. Capitol building, holding flags and protesting.]

Trump supporters storm the U.S. Capitol following a rally with President Donald Trump on 6 January 2021 in Washington, D.C.

Samuel Corum/Getty Images

Josh Coker’s Facebook page doesn’t show any MAGA memes or Trump quotes. He wasn’t live-streaming on 6 January 2021, and no one has ever stepped forward to identify him as one of the mob that stormed the U.S. Capitol that day.

But on 10 October last year, the Department of Justice charged Coker, who lives in Oregon, Ohio, with five counts connected to the failed insurrection. The FBI was able to identify Coker and gather information about his actions using only location data from his phone and image-recognition technologies. According to the charging document, agents did not even interview him before filing a criminal complaint.


As the January 6 investigations move into their fourth year, an IEEE Spectrum analysis of court filings suggests they are evolving from sprawling, labor-intensive efforts touching on many aspects of suspects’ lives to something far more streamlined and digital. Some, like Coker’s, have even reached the point of requiring little human input at all.

Automated enforcement of lesser infractions is nothing new: think neighborhood speed cameras. However, experts say that its use for serious felony charges could set a dangerous precedent for the U.S. justice system, raising questions about bias, misidentification, and the rise of the surveillance state.

“The reality now is with little effort police can sit at their desks and search the vast stores of data we leave behind,” says Andrew Ferguson, professor of law at the American University Washington College of Law. “The January sixth cases are the test subjects for an experiment of digital surveillance that will impact us all. Few legal protections stand in the way.”

A riot like no other

The mob that attacked the U.S. Capitol three years ago injured 138 police officers, inflicted tens of millions of dollars in costs, and resulted in the deaths of at least five people. It set into motion the largest criminal investigation in history, both by the number of defendants (now over 1,200) and the sheer quantity of evidence gathered—much of it digital.

Over a year ago, Spectrum began analyzing data from hundreds of criminal indictments to examine the role technology played in tracking down those responsible for the attacks.


At that point, the key technology for identifying the rioters appeared to be social media. Thousands of Facebook and Twitter feeds provided a rich lode of data for official (and citizen) investigators to mine, and many of the 300,000 tips that the FBI received mentioned social media.

[Chart: cases in which social networks were cited. The most are Facebook, followed by Twitter, Instagram, and others.] Many accused of crimes on 6 January proudly shared details of their actions to their feeds, resulting in tens of thousands of tips to the FBI. Social networks were cited in about two-thirds of investigations, from 2021 to the end of 2022 [above]. Over the last year, social media was mentioned in far fewer new charging documents.

Spectrum found that social networks were cited in about two-thirds of investigations. Facebook appeared in almost half of all cases, and almost every major social-media app was mentioned at least once.

But the low-hanging fruit of people publicizing themselves as rioters has now mostly been plucked. Since last year’s story, just three of over 200 new suspects were identified using Facebook.

Tips have also tailed off. Until the end of 2022, 63 percent of January 6 suspects were first named by witnesses. Last year, just 13 percent came to the attention of the authorities from human sources.

Instead, investigators have increasingly relied upon two automated technologies to find those involved in the insurrection.

Where the rioters were

Two geofence warrants served on Google delivered a trove of cell-tower data, bolstered with information from nearby Wi-Fi routers and Bluetooth beacons, to locate phones to within about 10 meters. Google recorded 5,723 devices in or near the Capitol during the riots.

[Map: the Capitol building, showing crisscrossing blue dots and lines.] Geofence data surrendered by Google for cellphone location tracking has proven indispensable for authorities charging Jan. 6 rioters. U.S. Department of Justice

After narrowing the results to only those most likely to have breached the Capitol building, Google eventually delivered the FBI the names, phone numbers, and emails associated with the accounts of 1,535 devices—including an iPhone registered to Josh Coker.
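The basic filtering step behind a geofence query can be sketched in a few lines of Python. This is only an illustration of the general idea, not the method Google or the FBI used: the `Ping` record, the `devices_in_geofence` function, the rectangular fence, and the toy coordinates are all hypothetical, and real location records carry latitude, longitude, and an accuracy radius rather than tidy x/y positions in meters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ping:
    """One hypothetical location record (not Google's actual schema)."""
    device_id: str
    x_m: float   # east-west position in meters, toy coordinates
    y_m: float   # north-south position in meters, toy coordinates
    minute: int  # minutes since the start of the day, toy timestamps


def inside_polygon(x, y, polygon):
    """Ray-casting test: count how many edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def devices_in_geofence(pings, polygon, start_minute, end_minute):
    """Return the distinct device IDs seen inside the fence during the time window."""
    return {
        p.device_id
        for p in pings
        if start_minute <= p.minute <= end_minute
        and inside_polygon(p.x_m, p.y_m, polygon)
    }


# Toy example: a 100 m x 50 m rectangular fence and four made-up pings.
fence = [(0, 0), (100, 0), (100, 50), (0, 50)]
pings = [
    Ping("a", 10, 10, 5),
    Ping("a", 20, 20, 6),   # same device seen twice, counted once
    Ping("b", 150, 10, 5),  # outside the fence
    Ping("c", 50, 25, 99),  # inside the fence, outside the time window
]
matched = devices_in_geofence(pings, fence, 0, 60)
print(sorted(matched))
```

A query like this returns device identifiers, not people. Tying each device to a name, phone number, and email is a separate step.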

Along with rougher, but still useful, cell-tower location data from AT&T and Verizon, the Google geofence info was entered into a database called “Transition 2021–Capitol Devices,” according to court filings. The Department of Justice is using software from Splunk, a company that specializes in searching, monitoring, and analyzing machine-generated data.

Splunk told Spectrum that “public sector organizations around the world deploy Splunk—including all three branches of the U.S. federal government.”

The DOJ’s reliance on geofence data has soared. Around 20 percent of all January 6 cases filed in the last year cite location data as the way investigators initially identified suspects—that’s nearly five times the rate from before 2023.

But this could be the last major investigation that has such unfettered access to suspects’ movements. In December, Google announced changes to the way it records users’ location data. Soon, Google will store that data by default on users’ devices rather than in the cloud, and set it to delete after just three months. If users do choose to back up their location data online, it will be encrypted “so no one can read it, including Google.”

What the rioters looked like

The other technology that the FBI has come to rely on is facial recognition. Data for this is harder to extract from filings because investigators can be vague about the process of matching faces to names, including the source of images that identify people.


But Coker’s charging document lays out the general idea: “Coker’s driver’s license photo was provided to a cooperating witness (CW1) with access to facial recognition technology. CW1 ran a facial recognition comparison of Coker’s driver’s license photo against all publicly available images, videos, and media captured during the civil unrest at the U.S. Capitol building on January 6, 2021.”

[Images: someone identified via facial recognition in front of, and then in, the Capitol building.] These images, from the charging document of an alleged January 6 rioter, were discovered—according to the document—by facial-recognition searches using the suspect’s driver’s license photo as comparison. U.S. Department of Justice

Those images have been gathered from surveillance cameras at the Capitol, police officers’ body-worn cameras, news and social media, suspects’ cellphones, and other digital accounts. Another court filing notes that the FBI “continuously populates” a “repository of images and videos” that it can run facial recognition searches on.

Those searches compare the January 6 images to an ever-growing body of reference images of people whose identity is known. Court filings mention U.S. passport photos, state driver’s license pictures, news media, and social media posts. And a recent filing suggests that investigators are now casting an even wider net.

Using CCTV, social media, and open-source video footage, the DOJ tracked the movements of a man holding a stick as he entered the Capitol on 6 January 2021 and then broke a window shutter. But they were apparently unable to put a name to the face until Daniel Tocci was pulled over for a broken headlight in Hadley, Mass., on 24 January 2023.

[Images: side-by-side photos of a man with a mustache in a green coat with a grey fur collar. On the left, he holds a stick in front of the Capitol. On the right is a still from a video of him in his car.] A court filing for another alleged January 6 rioter compared a photo of a man holding a stick preparing to enter the Capitol with a police body-cam image from a traffic stop in Hadley, Mass., two years later. U.S. Department of Justice

The filing notes that the Hadley Police Department provided video from its officer’s body-worn camera to the FBI, which was then able to match the face to its suspect, and finally arrest Tocci in November.

The use of facial-recognition technologies seems to have accelerated even beyond that of geofence data. Over a third of cases filed in the past year cited facial recognition, compared to fewer than 5 percent prior to 2023. In 25 cases, including Tocci’s, image recognition provided the first evidence of a suspect’s involvement in the attacks.

Technology isn’t perfect

Automating investigations in this way is convenient for the authorities when they are faced with thousands of possible suspects, and little other evidence to go on. But it comes with its own problems.


Some facial-recognition technologies have proven to be less than reliable, with many media reports detailing false arrests after apparent misidentifications, particularly of Black people.

The increasing reach of cameras and sophistication of algorithms worries Jennifer Lynch, general counsel of the Electronic Frontier Foundation. “We suddenly seem to have this web of face recognition,” she says. “It’s been building for years, but it now seems to be much easier for the FBI and other police departments to hold onto images for a long time and just run these automated searches whenever they feel like it.”

There is also the possibility that someone lent their phone to a friend or family member on the day of the attacks, or that the location data falsely placed them inside the Capitol building. Google admits that its location data is only a “probabilistic estimate,” with each data point having its own margin of error. For each blue location circle plotted on a map, there is only an “estimated 68 percent chance that the user is actually within the shaded circle,” according to the company in court filings.

“That means that it may miss people who were actually there and it may identify people who were not within the geofence,” says Lynch. “It’s also hard to audit the data because it is Google’s proprietary algorithm.”
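To get a feel for what a 68 percent circle implies, here is a rough Python sketch. It rests on an assumption that does not come from Google or the court filings: that the location error follows a symmetric two-dimensional Gaussian, with the reported circle containing 68 percent of the probability. Google's actual algorithm is proprietary, so the function names and the model are illustrative only.

```python
import math


def sigma_from_r68(r68_m, coverage=0.68):
    """Per-axis standard deviation of a symmetric 2D Gaussian (assumed model)
    whose circle of radius r68_m contains `coverage` of the probability.

    For that model the distance from the true position follows a Rayleigh
    distribution, so P(distance <= r) = 1 - exp(-r**2 / (2 * sigma**2)).
    """
    return r68_m / math.sqrt(-2.0 * math.log(1.0 - coverage))


def prob_across_boundary(depth_m, r68_m):
    """Chance that the true position lies on the far side of a straight wall
    when the reported point sits depth_m meters inside it."""
    sigma = sigma_from_r68(r68_m)
    # One-sided Gaussian tail along the axis perpendicular to the wall.
    return 0.5 * math.erfc(depth_m / (sigma * math.sqrt(2.0)))


# Hypothetical case: a 10-meter circle reported 10 meters inside a wall.
p_outside = prob_across_boundary(depth_m=10.0, r68_m=10.0)
print(f"Chance the phone was actually outside: {p_outside:.1%}")
```

Under these assumptions, a point reported 10 meters inside a wall with a 10-meter circle still leaves a chance of roughly 6 to 7 percent that the phone was on the other side. A point reported exactly on the wall is a coin flip.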

In response to Google’s changes to how it will handle location data, Lynch wrote: “We are cautiously optimistic that this will effectively mean the end of geofence warrants...However, we are not yet prepared to declare total victory...Google collects additional location information as well. It remains to be seen whether law enforcement will find a way to access these other stores of location data on a mass basis in the future.”

The problems with automated justice

Relying on automated technologies alone could open the door to miscarriages of justice, say the experts, especially when multiplied by the thousands of people at events like the January 6 riots.


“This is hugely problematic,” says Lynch. “When you take a human out of the loop, there’s such a high chance of somebody being accused of a crime that they didn’t commit. Even if they are ultimately determined to be innocent, that person still could face an arrest, spend time in jail, and lose their job or their right to see their children.”

Tocci’s charging document contains no evidence of his involvement in the insurrection beyond the single facial match with his traffic-stop video, and several other cases rest heavily on the combination of the Google geofence data and facial recognition.

“We as a society are not prepared for how the aggregation of digital surveillance technologies will strengthen police power at the expense of citizens,” says Ferguson. “True, sometimes it will be used to solve crimes. Other times, of course it will be used to suppress dissent or police poverty. But it is a debate we should be having now before this becomes too commonplace.”

Although the next presidential election is already looming, the Department of Justice wrote in December: “[Our] resolve to hold accountable those who committed crimes on January 6, 2021, has not, and will not, wane.”

And regardless of the changes Google says are coming soon, many more indictments could flow from geofence data the FBI already has. Google supplied the FBI with location data for over 1,500 devices that were in the Capitol. To date, fewer than 200 people have been charged using that information.
