Researchers, scientists, and pollsters regularly use large anonymized data sets in their work, but a new study published in Science magazine shows that hackers could easily use this data for identity theft.
The study finds that traditional ways of anonymizing large data sets, such as stripping out names, account numbers, and addresses, don’t actually render the data anonymous.
The study analysed 1.1 million credit card records and found that knowing just four things a person purchased was enough to uniquely identify that person in the data set 90% of the time, and therefore to see all the rest of their transactions.
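To make the idea concrete, here is a minimal Python sketch of how such a uniqueness rate could be estimated. It is an illustration under stated assumptions, not the method or data used in the study: the `records` dictionary, the pseudonymous ids, and the `is_unique` and `unicity` helpers are all hypothetical, and each purchase is simplified to a (shop, day) pair.

```python
import random

def is_unique(records, person, k, rng):
    """Return True if k randomly chosen purchases of `person` match nobody else."""
    # Simulate an attacker who happens to know k of this person's purchases.
    known = set(rng.sample(sorted(records[person]), k))
    # Everyone in the data set whose history contains all k known purchases.
    matches = [p for p, purchases in records.items() if known <= purchases]
    return matches == [person]

def unicity(records, k, seed=0):
    """Estimate the share of people pinned down by k known purchases."""
    rng = random.Random(seed)
    # Only people with at least k purchases can be tested.
    eligible = [p for p, purchases in records.items() if len(purchases) >= k]
    if not eligible:
        return 0.0
    hits = sum(is_unique(records, p, k, rng) for p in eligible)
    return hits / len(eligible)

# Toy data (made up): names are already removed, only pseudonymous ids remain.
records = {
    "user_a": {("bakery", "mon"), ("gym", "tue"), ("cinema", "fri")},
    "user_b": {("bakery", "mon"), ("gym", "tue"), ("bookshop", "sat")},
    "user_c": {("bakery", "mon"), ("pharmacy", "wed"), ("cinema", "fri")},
}

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, unicity(records, k))
```

Even in this tiny example, removing names does nothing to stop the matching step: once the known purchases narrow the data set down to a single id, every other transaction under that id is exposed.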
The authors of the study also pointed out that other types of data, like browsing history or financial data, could have similar vulnerabilities.
Scientists and journalists both work with large data sets regularly, and fields such as cancer research would be unable to function without access to an immense amount of anonymous data such as medical records. If such data sets turn out to be less anonymous than previously thought, countless people’s credit card information, browsing history, and even phone records could be compromised.
Read the full paper here.