The Radical Scope of Tesla’s Data Hoard

Logs and records of its customers’ journeys fill out petabytes—and court case dockets

Aerial view of a Tesla charging lot
Justin Sullivan/Getty Images

You won’t see a single Tesla cruising the glamorous beachfront in Beidaihe, China, this summer. Officials banned Elon Musk’s popular electric cars from the resort for two months while it hosts the Communist Party’s annual retreat, presumably fearing what their built-in cameras might capture and feed back to the United States.

Back in Florida, Tesla recently faced a negligence lawsuit after two young men died in a fiery crash while driving a Model S belonging to the father of one of the victims. As part of its defense, the company submitted a historical speed analysis showing that the car had been driven with a daily top speed averaging over 90 miles per hour (145 kilometers per hour) in the months before the crash. This information was quietly captured by the car and uploaded to Tesla’s servers. (A jury later found Tesla just 1 percent negligent in the case.)

Meanwhile, every recent-model Tesla reportedly records a breadcrumb GPS trail of every trip it makes—and shares it with the company. While this data is supposedly anonymized, experts are skeptical.

Alongside its advances in electric propulsion, Tesla’s innovations in data collection, analysis, and usage are transforming the automotive industry, and society itself, in ways that appear genuinely revolutionary.

In a series of articles (story 2; story 3), IEEE Spectrum is examining exactly what data Tesla vehicles collect, how the company uses them to develop its automated driving systems, and whether owners or the company are in the driver’s seat when it comes to accessing and exploiting that data. There is no evidence that Tesla collects any data beyond what customers agree to in their terms of service—even though opting out of this completely appears to be very difficult.

Almost every new production vehicle has a battery of sensors, including cameras and radars, that capture data about their drivers, other road users, and their surroundings. There is now a worldwide connected car-data industry, trading in anonymized vehicle, driver, and location data aggregated from billions of journeys made in tens of millions of vehicles from all the major automotive equipment manufacturers. But none seem to store that information and send it back to the manufacturer as regularly, or in such volume, or have been doing so for as long, as those made by Tesla.

“As far as we know, Tesla vehicles collect the most amount of data,” says Francis Hoogendijk, a researcher at the Netherlands Forensic Institute who began investigating Tesla’s data systems after fatal crashes in the United States and the Netherlands in 2016.

Spectrum has used expert analyses, NTSB crash investigations, NHTSA reports, and Tesla’s own documents to build up as complete a picture as possible of the data Tesla vehicles collect and what the company does with them.

To start with, Teslas, like over 99 percent of new vehicles, have event data recorders (EDRs). These “black box” recorders are triggered by a crash and collect a scant 5 seconds of information, including speed, acceleration, brake use, steering input, and automatic brake and stability controls, to assist in crash investigations.

But Tesla also makes a permanent record of these data—and many more—on a 4-gigabyte SD or 8-GB microSD card located in the car’s Media Control Unit (MCU) Linux infotainment computer. These time-stamped “gateway log” files also include seatbelt, Autopilot, and cruise-control settings, and whether drivers had their hands on the steering wheel. They are normally recorded at a relatively low resolution, such as 5 hertz, allowing the cards to store months’ or years’ worth of data, even up to the lifetime of the vehicle.
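
To see why a low sampling rate lets a small card hold years of history, here is a back-of-envelope sketch. Only the 5-hertz rate and the 4-gigabyte card size come from the reporting above; the record size and daily driving time are illustrative assumptions, not figures published by Tesla, and the function name is hypothetical.

```python
# Back-of-envelope storage estimate for low-rate logging.
# Only the sampling rate and card size come from the article;
# the other constants are assumptions chosen for illustration.
SAMPLE_RATE_HZ = 5           # article: logs recorded at roughly 5 hertz
CARD_BYTES = 4 * 10**9       # article: 4-gigabyte SD card
BYTES_PER_SAMPLE = 64        # ASSUMPTION: size of one time-stamped record
DRIVING_HOURS_PER_DAY = 1.0  # ASSUMPTION: average daily use


def days_until_full(card_bytes, rate_hz, bytes_per_sample, hours_per_day):
    """Return how many days of driving fit on the card."""
    bytes_per_day = rate_hz * bytes_per_sample * hours_per_day * 3600
    return card_bytes / bytes_per_day


days = days_until_full(
    CARD_BYTES, SAMPLE_RATE_HZ, BYTES_PER_SAMPLE, DRIVING_HOURS_PER_DAY
)
print(f"{days:.0f} days, or about {days / 365:.1f} years")
```

With these assumed values the estimate comes to roughly 3,470 days, or about 9.5 years. Real logs carry many signals per record, so the true figure could be considerably shorter, but the scale is consistent with months or years of retention.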

Because the gateway logs use data from cars’ standard controller area network (CAN) buses, they can include the unique vehicle identification number, or VIN. However, no evidence suggests these logs could include information from the car’s GPS module, or from its cameras or (for earlier models) radars.

A bar graph labelled Maximum Vehicle Speed by Date - 2018 showing high vehicle speeds over the course of 4 months.
In a Florida court, Tesla presented detailed data about the top speeds of a Model S involved in a fatal crash.
Car Engineering/Tesla/Southern District of Florida U.S. Courts

When an owner connects a Tesla to a Wi-Fi network, for instance to download an over-the-air update that adds new features or fixes bugs, the gateway log data is periodically uploaded to Tesla. Judging by Tesla’s use of gateway log data in the Florida lawsuit, Tesla appears to link that data to its originating vehicle and store it permanently. (Tesla did not respond to requests for clarification on this and other issues.)
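
A statistic like the "daily top speed" average cited in the Florida case can in principle be derived from any time-stamped speed log. The sketch below shows one way to do it. It is not Tesla's method, the record layout is an assumption, and the sample readings are invented purely for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

MPH_TO_KMH = 1.609344  # exact conversion factor


def average_daily_top_speed(samples):
    """Average the highest speed seen on each calendar day.

    samples: iterable of (ISO-8601 timestamp string, speed in mph).
    This layout is hypothetical, not Tesla's gateway-log format.
    """
    daily_max = defaultdict(float)
    for timestamp, speed_mph in samples:
        day = datetime.fromisoformat(timestamp).date()
        daily_max[day] = max(daily_max[day], speed_mph)
    return mean(daily_max.values())


# Invented readings for illustration only; not data from any real vehicle.
samples = [
    ("2018-03-01T08:00:00", 62.0),
    ("2018-03-01T08:00:01", 95.0),
    ("2018-03-02T17:30:00", 88.0),
    ("2018-03-02T17:30:01", 71.0),
]
avg_mph = average_daily_top_speed(samples)
print(f"{avg_mph:.1f} mph = {avg_mph * MPH_TO_KMH:.1f} km/h")
```

The same conversion factor confirms the figure quoted earlier: 90 miles per hour rounds to 145 kilometers per hour.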

Teslas also have a separate Autopilot Linux computer, which takes inputs from the cars’ cameras to handle driver-assistance functions like cruise control, lane-keeping, and collision warnings. If owners plug their own USB thumb drives into the car, they can make live dashcam recordings, and set up Sentry Mode to record the vehicle’s surroundings when parked. These recordings do not appear to be uploaded to Tesla.

However, there are many occasions in which Tesla vehicles do store images and (in 2016 models onward) videos from the cameras, and then share them with the company. These Autopilot “snapshots” can span several minutes and consist of up to several hundred megabytes of data, according to one engineer and Tesla owner who has studied Tesla’s data-collection process using salvaged vehicles and components, and who tweets using the pseudonym Green.

As well as visual data, the snapshots include high-resolution log data, similar to that captured in the gateway logs but at a much higher frequency—up to 50 Hz for wheel-speed information, notes Hoogendijk.

Snapshots are triggered when the vehicle crashes—as detected by the airbag system deploying—or when certain conditions are met. These can include anything that Tesla engineers want to learn about, such as particular driving behaviors, or specific objects or situations being detected by the Autopilot system. (These matters will be covered in the second installment in our series, to be posted tomorrow.)

GPS location data is always captured for crash events, says Green, and sometimes for other snapshots. Like gateway data, snapshots are uploaded to Tesla when the car connects to Wi-Fi, although those triggered by crashes will also attempt to upload over the car’s 4G cellular connection. Then, Green says, once a snapshot has been successfully uploaded, it is deleted from the Autopilot’s onboard 32-GB storage.

In addition to the snapshots, the Autopilot computer also records a complete trip log every time a mid-2017 or later Tesla is shifted from Park to Drive, says Green. Trip logs include a GPS breadcrumb trail until the car is shifted back into Park and include speeds, road types, and when or whether Autopilot was activated. Green says that trip logs are recorded whether or not Autopilot (or Full Self-Driving) is used. Like the snapshots, trip logs are deleted from the vehicle after being uploaded to Tesla.
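
The trip-log behavior Green describes, with recording starting at the shift from Park to Drive and ending at the shift back to Park, amounts to splitting a stream of readings into segments by gear state. The following is a minimal sketch of that idea only. The event layout, the `Trip` class, and the coordinates are hypothetical and do not reflect Tesla's actual format.

```python
from dataclasses import dataclass, field


@dataclass
class Trip:
    # Each breadcrumb is (latitude, longitude, speed in mph).
    breadcrumbs: list = field(default_factory=list)


def split_into_trips(events):
    """Group readings into trips bounded by Park-to-Drive and Drive-to-Park.

    events: iterable of (gear, lat, lon, speed_mph), gear being "P" or "D".
    """
    trips, current = [], None
    for gear, lat, lon, speed in events:
        if gear == "D":
            if current is None:
                current = Trip()  # shift out of Park opens a new trip
            current.breadcrumbs.append((lat, lon, speed))
        elif gear == "P" and current is not None:
            trips.append(current)  # shift back into Park closes the trip
            current = None
    if current is not None:
        trips.append(current)  # trip still open when the data ends
    return trips


# Invented coordinates for illustration only.
events = [
    ("P", 37.000, -122.000, 0.0),
    ("D", 37.000, -122.000, 10.0),
    ("D", 37.001, -122.000, 25.0),
    ("P", 37.002, -122.000, 0.0),
    ("D", 37.002, -122.000, 5.0),
]
trips = split_into_trips(events)
print(len(trips), [len(t.breadcrumbs) for t in trips])
```

Note that nothing in this segmentation depends on whether Autopilot is engaged, which matches Green's observation that trip logs are recorded either way.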

But what happens to this treasure trove of data? Tesla has sold about three million vehicles worldwide, the majority of which are phoning home daily. They have provided the company with billions of miles of real-world driving data and GPS tracks, and many millions of photos and videos. What the world’s leading EV automaker is doing with all that data is the subject of our next installment.


Update 5 Aug. 2022: Elon Musk announced this week that Tesla has now sold about three million vehicles worldwide (not two as we had originally reported).

The Conversation (1)
Lance Stronk, 10 Aug. 2022

There was $10.5M awarded to the ‘fiery crash’ victims, according to a news article I read. So, 1% is $105,000?

Bottom line is safety. Battery energy density and associated safety is still a concern. Interesting they didn’t collect data on the ignition of the fire…or maybe they did?

Self-Driving Cars Work Better With Smart Roads

Intelligent infrastructure makes autonomous driving safer and less expensive

A photograph shows a single car headed toward the viewer on the rightmost lane of a three-lane road that is bounded by grassy parkways, one side of which is planted with trees. In the foreground a black vertical pole is topped by a crossbeam bearing various instruments. 

This test unit, in a suburb of Shanghai, detects and tracks traffic merging from a side road onto a major road, using a camera, a lidar, a radar, a communication unit, and a computer.

Shaoshan Liu

Enormous efforts have been made in the past two decades to create a car that can use sensors and artificial intelligence to model its environment and plot a safe driving path. Yet even today the technology works well only in areas like campuses, which have limited roads to map and minimal traffic to master. It still can’t manage busy, unfamiliar, or unpredictable roads. For now, at least, there is only so much sensory power and intelligence that can go into a car.

To solve this problem, we must turn it around: We must put more of the smarts into the infrastructure—we must make the road smart.
