Indiana University issued this news release today:
BLOOMINGTON, Ind. -- Astroturfers, Twitter-bombers and smear campaigners should beware this election season as a group of leading Indiana University information and computer scientists today unleashed Truthy.indiana.edu, a sophisticated new Twitter-based research tool that combines data mining, social network analysis and crowdsourcing to uncover deceptive tactics and misinformation leading up to the Nov. 2 elections.
Combing through thousands of tweets per hour in search of political keywords, the team based at IU's School of Informatics and Computing will isolate patterns of interest and then submit those memes (ideas or patterns passed by imitation) to Twitter's application programming interface (API) to obtain more information about each meme's history.
"When we identify a trend we go back and examine how it was started, where the main injection points were, and any associated memes," said Filippo Menczer, an associate professor of computer science and informatics. "When we drill down we'll be able to see statistics and visualizations relating to tweets that mention the meme and basically reconstruct its history."
The team will then generate diffusion network images that visitors to Truthy.indiana.edu can view as groups of nodes and edges that identify retweets, mentions, and the extent of the epidemic. Visitors to the site will also see the output of a sentiment analysis algorithm that extracts mood-identifying words and then assesses them on a known psychometric scale. That algorithm rates the meme on scales ranging from anxious to calm, hostile to kind, unsure to sure, and confused to aware.
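As a rough illustration of how a diffusion network of this kind could be assembled, the Python sketch below turns a handful of tweets into weighted edges for retweets and mentions. It is not the Truthy team's code: the tweet fields (`user`, `text`), the "RT @name" retweet convention, the edge directions, and the function name `build_diffusion_edges` are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumption: a retweet is written as "RT @original_author ..." at the start of the text.
RETWEET = re.compile(r"^RT @(\w+)")
MENTION = re.compile(r"@(\w+)")


def build_diffusion_edges(tweets):
    """Return {(source, target, kind): count} for retweet and mention edges.

    Retweet edges point from the original author to the retweeter (the
    direction the meme travels); mention edges point from the author to
    the mentioned account. Both directions are illustrative choices.
    """
    edges = defaultdict(int)
    for tweet in tweets:
        author = tweet["user"].lower()
        text = tweet["text"]
        match = RETWEET.match(text)
        retweeted = match.group(1).lower() if match else None
        if retweeted:
            edges[(retweeted, author, "retweet")] += 1
        for name in MENTION.findall(text):
            name = name.lower()
            # Skip the retweeted account (already counted) and self-mentions.
            if name != retweeted and name != author:
                edges[(author, name, "mention")] += 1
    return dict(edges)


# Hypothetical sample data, invented for this example.
sample = [
    {"user": "bob", "text": "RT @alice: #vote matters"},
    {"user": "carol", "text": "RT @alice: #vote matters cc @dave"},
    {"user": "alice", "text": "thanks @bob"},
]
edges = build_diffusion_edges(sample)
nodes = {name for source, target, _ in edges for name in (source, target)}
```

The resulting edge dictionary can be handed to any graph-drawing tool; the number of distinct nodes gives one simple measure of how far a meme has spread.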
Menczer got the idea for the Truthy website after hearing researchers from Wellesley College speak earlier this year on their research analyzing a well-known Twitter bomb campaign conducted by the conservative group American Future Fund (AFF) against Martha Coakley, a Democrat who lost the race for the Massachusetts Senate seat formerly held by the late Edward Kennedy. Republican challenger Scott Brown won the seat after AFF set up nine Twitter accounts in the early morning hours prior to the election and then sent out 929 tweets in two hours before Twitter realized the information was spam. By then the messages had reached 60,000 people.
Streaming Twitter data acquired in real time is matched against keywords to exclude tweets unlikely to contain political discussion and to extract memes (mentions, hashtags, and URLs). Memes of interest are isolated by considering only those that have just undergone significant changes in volume, or those that account for a significant portion of the total volume. Memes are then inserted in a database and the Twitter API is used to obtain more information on each.
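The filtering steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' implementation: the keyword list, the thresholds (`growth`, `share`, `min_count`), the comparison of two time windows, and all function names are hypothetical, since the release does not say how "significant" changes in volume are measured.

```python
import re
from collections import Counter

# Hypothetical keyword list; the release does not publish the real one.
POLITICAL_KEYWORDS = {"election", "senate", "congress", "vote"}

# Memes as defined in the release: mentions, hashtags, and URLs.
MEME_PATTERN = re.compile(r"@\w+|#\w+|https?://\S+")


def is_political(text):
    """Keep a tweet only if it contains at least one political keyword."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return bool(words & POLITICAL_KEYWORDS)


def extract_memes(text):
    """Return mentions and hashtags (lowercased) and URLs (unchanged)."""
    found = MEME_PATTERN.findall(text)
    return [m if m.startswith("http") else m.lower() for m in found]


def count_memes(tweets):
    """Count how many political tweets in a window carry each meme."""
    counts = Counter()
    for text in tweets:
        if is_political(text):
            counts.update(set(extract_memes(text)))
    return counts


def memes_of_interest(previous, current, growth=3.0, share=0.2, min_count=2):
    """Select memes whose volume surged or that dominate the current window.

    The thresholds are illustrative guesses, not values from the release.
    """
    total = sum(current.values())
    selected = set()
    for meme, n in current.items():
        if n < min_count:
            continue
        surged = n >= growth * max(previous.get(meme, 0), 1)
        dominant = total > 0 and n / total >= share
        if surged or dominant:
            selected.add(meme)
    return selected


# Hypothetical sample windows, invented for this example.
previous_window = ["Vote today #election", "Weather is nice #sunny"]
current_window = [
    "Big #election news, vote now",
    "#election results http://example.com/a",
    "#election turnout in the senate race",
    "lunch was great #food",
]
interesting = memes_of_interest(count_memes(previous_window),
                                count_memes(current_window))
```

In this toy run only `#election` is selected: it appears in three political tweets in the current window against one in the previous window, while the URL appears once and the non-political tweets are discarded before counting. A production system would then store the selected memes and query the Twitter API for their history, which this sketch leaves out.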
Menczer explained that because search engines now include Twitter trends in search results, an astroturfing campaign -- where the concerted efforts of special interests are disguised as a spontaneous grassroots movement -- that includes Twitter bombs can jack up how high a result shows up on Google even if the information is false.
This is one reason Truthy.indiana.edu also relies on input from users to denote a meme as "truthy," or misinformation represented as fact. Having a crowdsourcing component will help the data mining effort and hopefully keep the loop between social media and search engines honest, researchers said.
"One of the concerns about social media is that people are being manipulated without realizing it because a meme can be given instant global popularity by a high search engine ranking, in turn perpetuating the falsehood," Menczer said.
As information scientists, the group is interested in understanding meme diffusion from various perspectives: Menczer, associate director of IU's Center for Complex Networks and Systems Research, focuses on data mining and meme burst modeling; Rudy Professor of Informatics Alessandro Vespignani's work relates to epidemic and contagion modeling; Associate Professor of Informatics Alessandro Flammini, also director of IU's Complex Systems Program, conducts complex network analysis, especially related to online text and social media; and Johan Bollen, associate professor of informatics and computing, has a background in cognitive science and specializes in sentiment and mood analysis from online text.
The website's name, Truthy, references a "stunt word" first employed by television comedian and political pundit Stephen Colbert in 2005 to satirize the use of emotional appeal as fact.