Identifying Credit Card Users With a Few Bits of Data

Anonymizing data doesn't protect privacy as well as you might think

Anonymized credit card data can easily be used to identify credit card users, scientists now find, adding to the evidence that anonymizing data does not protect privacy as well as is often thought.

Personal information often gets anonymized by stripping it of names, home addresses, phone numbers and other obvious identifying details. Such data often get shared, and underlie popular services such as Google’s real-time traffic monitoring, which shows conditions on major thoroughfares in more than 50 different countries.

However, anonymized data can still reveal a great deal about individuals. For example, computational social scientist Yves-Alexandre de Montjoye at MIT and his colleagues recently found that anonymized cell phone data could be better at identifying users than fingerprints. At most, 11 randomly chosen interactions with cell phone networks were needed to identify a person by the routes he or she regularly traveled, while identifying someone by a fingerprint requires at least 12 reference points.

To see how well anonymized credit card data protected privacy, de Montjoye and his colleagues at MIT and Aarhus University in Denmark analyzed three months' worth of information from 1.1 million people living in an unidentified developed country in the Organization for Economic Cooperation and Development (OECD). They detailed their findings in the Jan. 30 issue of the journal Science.

The researchers found that knowing when and where four credit card transactions occurred was enough to identify 90 percent of people from this anonymized metadata. Even when the data were less specific (for instance, purchases within a certain geographic area instead of a certain shop, or within 15 days instead of one day), individuals could be re-identified with a half-dozen or so additional data points. Adding one more piece of data, the price of a transaction, could increase the chance of re-identification by 22 percent on average. Women and people in higher income brackets proved easier to identify, potentially because they have distinctive patterns in how they divide their time between the shops they visit.
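The idea behind these figures can be illustrated with a short Python sketch. It is not the researchers' code: the function names `unicity` and `coarsen`, the toy traces, and the integer shop and day identifiers are all assumptions made for illustration. The sketch estimates what share of people are singled out when an observer knows a few of their (shop, day) points, and shows how lumping shops into areas and days into longer windows reduces that share.

```python
import random


def unicity(traces, p, trials=200, seed=0):
    """Estimate the share of people uniquely matched by p known points.

    traces maps a (hypothetical) user id to a set of (shop, day) tuples.
    """
    rng = random.Random(seed)
    # Only people with at least p recorded points can be sampled.
    eligible = [u for u, t in traces.items() if len(t) >= p]
    if not eligible:
        return 0.0
    unique = 0
    for _ in range(trials):
        user = rng.choice(eligible)
        # Pretend an observer learned p of this person's transactions.
        known = set(rng.sample(sorted(traces[user]), p))
        # Everyone whose trace contains all the known points is a candidate.
        matches = [u for u, t in traces.items() if known <= t]
        if len(matches) == 1:
            unique += 1
    return unique / trials


def coarsen(traces, shops_per_area, days_per_bin):
    """Lower the resolution: group shops into areas and days into bins."""
    return {
        u: {(shop // shops_per_area, day // days_per_bin) for shop, day in t}
        for u, t in traces.items()
    }


# Toy data (made up): three people, three transactions each.
toy = {
    "a": {(1, 1), (2, 3), (5, 8)},
    "b": {(1, 1), (2, 4), (6, 8)},
    "c": {(1, 1), (2, 3), (7, 9)},
}

print(unicity(toy, 3))                   # 1.0: three points single out everyone
print(unicity(coarsen(toy, 10, 15), 1))  # 0.0: coarse data makes all traces alike
```

In this toy set, a single known point often fits more than one person, while three points always fit exactly one. Real data sets are far larger, so the numbers here say nothing about the study's actual results; they only show the shape of the calculation.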

Although data sharing can provide invaluable services, these findings suggest "we ought to rethink and reform how we approach data protection," de Montjoye said. To protect the privacy of metadata, he and his colleagues are now developing strategies known as OpenPDS and SafeAnswers, which recently won a SXSW Interactive Innovation Award.

IEEE Spectrum
