What we found in 3 million Russian troll tweets

>On July 31st, FiveThirtyEight shared 3 million Russian troll tweets with the public. With data so large and so relevant to understanding Russian interference in American democracy, crowdsourcing this kind of research could unearth breakthrough insights.

As concerned citizens and patriots, we at Luminoso wanted to help analyze this data. We saw that the tweets were all over the map (literally and figuratively). Some used hashtags; some used emojis; others opted for slang, nonsense words, and timely memes (often riddled with spelling and grammar errors).

In other words, this is the kind of data Luminoso’s “common sense” AI eats for breakfast.

(In this analysis, we only looked at English language data. But in future updates, we’ll begin exploring the remaining 800,000 pieces of data containing Arabic, Russian, and Spanish among others.)

Here are some broad things we’ve found so far:

The Internet Research Agency targeted influencers. By our count, 27,323 tweets, or 1.2% of the 2.2 million tweets our AI looked at, “@” a politician, media personality, or influencer.

They mentioned influential politicians:

They also targeted media personalities:

And sent content to grassroots influencers:

Often, these tweets were sent to several personalities, thereby increasing the odds that one of them would spread the misinformation to their respective followers.

The strategy has persisted, with Russian trolls recently sending messages to the White House Twitter handle, to the POTUS handle, and to President Trump’s communications staff. The fact that this strategy has continued suggests that it has had some efficacy in spreading misinformation.
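For readers who want to try a similar count on the public dataset, here is a minimal sketch of one way to measure how many tweets “@” a handle from a watch list, and how many target several at once. It is not the method our AI uses. The handles in `WATCHED_HANDLES`, the sample tweets, and the function names are all hypothetical placeholders.

```python
import re

# Hypothetical watch list. These handles are placeholders, not the
# politicians, media personalities, or influencers from the analysis.
WATCHED_HANDLES = {"example_senator", "example_anchor", "example_activist"}

MENTION_RE = re.compile(r"@(\w+)")

def watched_mentions(text, watched=WATCHED_HANDLES):
    """Return the watched handles that a tweet @-mentions (case-insensitive)."""
    return {handle.lower() for handle in MENTION_RE.findall(text)} & watched

def mention_stats(tweets, watched=WATCHED_HANDLES):
    """Count tweets that hit at least one watched handle, and those that hit several."""
    any_hit = 0
    multi_hit = 0
    for text in tweets:
        hits = watched_mentions(text, watched)
        if hits:
            any_hit += 1
        if len(hits) > 1:
            multi_hit += 1
    total = len(tweets)
    share = any_hit / total if total else 0.0
    return {"total": total, "any": any_hit, "multi": multi_hit, "share": share}

# Made-up sample tweets, for illustration only.
sample = [
    "@Example_Senator @example_anchor have you seen this?",
    "Great game last night",
    "@example_activist please share",
    "@random_user hello",
]
stats = mention_stats(sample)
print(stats)

# The share reported above, recomputed from the two published counts.
reported_share = 27_323 / 2_200_000
print(f"{reported_share:.1%}")
```

The same arithmetic applied to the published counts (27,323 out of 2.2 million) gives the 1.2% figure quoted earlier.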

The ‘Newsfeed’ Twitter accounts focus on reposting news on politics, incidents that increase community anxiety, and, surprisingly, sports.

Our AI connects concepts to topics. So say you post two tweets about LeBron James. In the first, you describe one of his NBA Finals performances. Our AI recognizes that tweet as being about sports. But if your second post discusses his now-famous words to President Trump, that tweet is recognized as being about politics.

We’re also able to output the most frequently mentioned topics. And while many tweets are conceptually about politics, many more focus on spreading community news that instills fear.

While there will be overlap among some of the topics, their persistent appearance points to a strategy that focused on reposting news on politics, sports, and anxiety-inducing incidents.

The Russian trolls were hyperaware of emerging conspiracy theories and helped propagate them. QAnon, for example, has 2,797 conceptual appearances in the data, frequently appearing alongside classifying hashtags like #FollowTheWhiteRabbit (0.85 correlation), #TheStorm (0.84), and #PedoWood (0.80).

Looking at all the tweets holistically, we found other interesting correlations.

Nearly all tweets that use the hashtag #MAGA also use the hashtag #PJNET (0.95)
Tweets that mention CNN also mention lies (0.35), being exposed (0.47), Clinton (0.46) and her emails (0.29), Donald Trump (0.42), and liberals (0.48)
Cops are almost never talked about in the context of Donald Trump (negative correlation), but often in the context of the FBI (0.36) and terrorists (0.25)
Muslims are almost always talked about in the context of terrorists and ISIS (0.66 and 0.63 correlation)
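The correlations above come from our AI’s conceptual matching, which also catches variants and related terms. As a much simpler stand-in for intuition, here is a sketch that computes the phi coefficient (the Pearson correlation between two yes/no indicators) for whether two literal hashtags appear in the same tweets. The sample tweets and function names are hypothetical, and this literal-match approach should not be expected to reproduce the figures above.

```python
import math
import re

HASHTAG_RE = re.compile(r"#(\w+)")

def hashtags(text):
    """Return the lowercase hashtags found in a tweet."""
    return {tag.lower() for tag in HASHTAG_RE.findall(text)}

def phi(tweets, tag_a, tag_b):
    """Phi coefficient between 'tweet contains tag_a' and 'tweet contains tag_b'."""
    a, b = tag_a.lower(), tag_b.lower()
    both = only_a = only_b = neither = 0
    for text in tweets:
        tags = hashtags(text)
        has_a, has_b = a in tags, b in tags
        if has_a and has_b:
            both += 1
        elif has_a:
            only_a += 1
        elif has_b:
            only_b += 1
        else:
            neither += 1
    denom = math.sqrt(
        (both + only_a) * (only_b + neither) * (both + only_b) * (only_a + neither)
    )
    if denom == 0:
        # One of the tags appears in every tweet or in none: correlation is undefined.
        return 0.0
    return (both * neither - only_a * only_b) / denom

# Made-up sample tweets, for illustration only.
sample = [
    "#MAGA #PJNET one",
    "#maga #pjnet two",
    "#MAGA on its own",
    "no hashtags here",
    "still nothing",
    "#other",
]
score = phi(sample, "MAGA", "PJNET")
print(round(score, 2))  # roughly 0.71 on this toy sample
```

A value near 1 means the two hashtags almost always travel together, a value near 0 means they appear independently, and a negative value means one tends to be absent when the other is present.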

The different Twitter account categories have drastically different approaches to what they talk about as well. Collectively, the top 5 concepts discussed are: Trump, Clinton, Obama, cops, and workouts. Below are 2 charts showing how the top two concepts appear among the Twitter troll categories:

We hope this first look is helpful for folks out there who want to learn more about what’s really in this massive dataset. And there’s a lot more that we can do.

Next, we’d like to look at other languages, as a lot of the content is in Arabic and Russian. We’d also like to examine trends over time, which could reveal how the Russian troll strategy has evolved. We could then dig deeper into their influencer strategy, or find patterns that reveal how they convinced other users that the information they were sending was truthful.

Want to learn more about how Luminoso can discover the elusive “unknown unknowns” in your unstructured data? See a demo today.
