A statistical study by a computer scientist at the University of Virginia’s Predictive Technology Lab shows that it becomes easier to predict the locations of future crimes once data from Twitter is added to existing historical crime data.
What’s more, because people who tweet from phones with GPS tracking switched on reveal the precise location of each tweet, the statistical improvement in crime prediction can be narrowed down to the level of a city block. The study was done by Matthew Gerber, who told The Atlantic:
“I was surprised. In the thousands of tweets that I’ve read, you don’t see people saying things like, ‘I’m going to rob somebody tonight.’”
In his study, Gerber asked, “can we use the tweets posted by residents in a major U.S. city to predict local criminal activity? This is an important question because tweets are public information and they are easy to obtain via the official Twitter service.”
It turns out you can, at least for some types of crime. (Gerber noticed an improvement in predictive power for 19 of 25 types of crimes.) Gerber looked at crimes reported in Chicago between January and March 2013, and then overlaid GPS-tagged tweets from the same period. His statistical method then looked for significant correlations between certain words in tweets and the crimes that were reported. He produced this map showing how geo-located tweets enhance the predictive power of crime stats:
For example, certain words in local messages on Twitter enhanced the predictive power of crime records for these two types of crimes:
- Criminal damage: center, united, blackhawks, bulls
- Theft: aquarium, shedd, adler, planetarium
In the first example, tweets from around the basketball arena using those words were associated with criminal damage cases. In the second, thefts are associated with words suggesting tourist attractions. It’s not hard to imagine why that might be. Some crimes, however, produced odd correlations:
- Prostitution: lounge, studios, continental, village, ukrainian
Those words have enhanced predictive power under Gerber’s model. But presumably only the people tweeting them understand why they might be related to hookers.
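To make the idea of overlaying geo-located tweets onto crime reports more concrete, here is a minimal Python sketch. It is not Gerber’s actual model, whose details the article does not give. It simply assumes the city is cut into small grid cells and measures, for a single word, how strongly "this word was tweeted in the cell" goes together with "a crime was reported in the cell" (a phi coefficient). The cell size, the function names and the sample coordinates are all made up for illustration.

```python
from math import floor, sqrt

CELL_SIZE = 0.01  # assumed grid size in degrees; a hypothetical choice, not from the study


def cell_of(lat, lon, size=CELL_SIZE):
    """Map a latitude/longitude pair to the index of its grid cell."""
    return (floor(lat / size), floor(lon / size))


def word_crime_correlation(tweets, crimes, word):
    """Phi coefficient between 'word tweeted in cell' and 'crime reported in cell'.

    tweets: iterable of (lat, lon, text); crimes: iterable of (lat, lon).
    Only cells that contain at least one tweet or one crime are counted.
    """
    cells, has_word, has_crime = set(), set(), set()
    for lat, lon, text in tweets:
        c = cell_of(lat, lon)
        cells.add(c)
        if word in text.lower().split():
            has_word.add(c)
    for lat, lon in crimes:
        c = cell_of(lat, lon)
        cells.add(c)
        has_crime.add(c)

    # 2x2 contingency table over the grid cells
    n11 = len(has_word & has_crime)   # word and crime
    n10 = len(has_word - has_crime)   # word, no crime
    n01 = len(has_crime - has_word)   # crime, no word
    n00 = len(cells) - n11 - n10 - n01
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0


# Synthetic sample data, invented purely for this illustration
sample_tweets = [
    (41.8807, -87.6742, "bulls game tonight"),
    (41.8663, -87.6170, "at the aquarium"),
    (41.9012, -87.6534, "coffee time"),
    (41.9534, -87.7021, "bulls fans everywhere"),
]
sample_crimes = [(41.8809, -87.6745), (41.9538, -87.7025)]

bulls_score = word_crime_correlation(sample_tweets, sample_crimes, "bulls")
aquarium_score = word_crime_correlation(sample_tweets, sample_crimes, "aquarium")
print(bulls_score, aquarium_score)
```

In this toy data the word "bulls" shows up only in cells that also have a crime report, so it scores high, while "aquarium" does not. A real analysis would need far more data, a separate score for each crime type, and a check that the apparent link holds up on time periods the model has not seen.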