Researchers from the Mitre Corporation just published a paper called “Discriminating Gender On Twitter,” which details a program they built that infers a person’s gender based on what they tweet.
The program scans tweets for specific words and patterns and is correct 75.8% of the time, Fast Company reports.
Remarkably, even when only provided with one tweet, the program was correct 65.9% of the time.
The program sounds complicated, but the concept is simple. The “patterns” Mitre talks about are often just words people use to express thoughts in their daily lives.
For example, the program predicts that users who tweet words like “yoga,” “gosh,” and “bff” are more likely to be female.
Each word is assigned a positive or negative point value based on how strongly it is associated with one gender, and that value pushes the user’s score towards male or female.
If the program currently hypothesizes that a user is male and the next tweet it reads contains the words “my husband,” 44.33 points are docked from the user’s male quotient.
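To make the scoring idea concrete, here is a minimal sketch in Python. It is not Mitre's actual program: the function names, the sign convention (positive leans male, negative leans female) and every weight except the 44.33 points for “my husband” are assumptions made up for illustration.

```python
import re

# Hypothetical weights: positive leans male, negative leans female.
# Only the 44.33 magnitude for "my husband" comes from the article;
# the other values are invented for this sketch.
WEIGHTS = {
    "my husband": -44.33,
    "yoga": -5.0,
    "gosh": -3.0,
    "bff": -4.0,
}


def score_tweet(tweet):
    """Sum the weights of every known word or phrase found in one tweet."""
    text = tweet.lower()
    total = 0
    for phrase, weight in WEIGHTS.items():
        # Match whole words only, so "bff" does not fire inside a longer word.
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            total += weight
    return total


def update_score(current, tweet):
    """Adjust a user's running score with the evidence from one more tweet."""
    return current + score_tweet(tweet)


def guess_gender(score):
    """Turn the running score into a guess (assumed convention: > 0 is male)."""
    if score > 0:
        return "male"
    if score < 0:
        return "female"
    return "unknown"


# A user the sketch currently leans "male" on, with an assumed score of 10.0.
score = update_score(10.0, "Dinner with my husband tonight")
print(guess_gender(score))
```

In this sketch a single strongly weighted phrase is enough to flip the guess, while a tweet with no known words leaves the running score unchanged.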
Depending on how successful the program is proven to be, it could be used for ad-targeting, or for socio-linguistic research.