Using Twitter data to study politics? Fine, but be careful!

The role of social media in shaping the new politics is undeniable. Therefore the volume of research on this topic, relying on the data that are produced by the same technologies, is ever increasing. And let’s be honest, when we say “social media” data, almost always we mean Twitter data!

Twitter is arguably the most studied and used source of data in the new field of Computational Political Science, even though in many countries Twitter is not the main player. But we all know why we use Twitter data in our studies and not for instance data mined from Facebook: Twitter data are (almost) publicly available whereas it’s (almost) impossible to collect any useful data from Facebook.

That is understandable. However, there are numerous issues with studies that are entirely relying on Twitter data.

In a mini-review paper titled “A Biased Review of Biases in Twitter Studies on Political Collective Action“, we discussed some of these issues. Only some of them and not all, and that’s why we called our paper a “biased review”.

The reason that I’m reminding you of the paper now is mostly the new surge of research on “politics and Twitter” in relation to the recent events in the UK, US, and the forthcoming elections in European countries this summer.

Here is the abstract:

In recent years researchers have gravitated to Twitter and other social media platforms as fertile ground for empirical analysis of social phenomena. Social media provides researchers access to trace data of interactions and discourse that once went unrecorded in the offline world. Researchers have sought to use these data to explain social phenomena both particular to social media and applicable to the broader social world. This paper offers a minireview of Twitter-based research on political crowd behavior. This literature offers insight into particular social phenomena on Twitter, but often fails to use standardized methods that permit interpretation beyond individual studies. Read more….

fphy-04-00034-g001

Social Media: an illustration of overestimating the relevance of social media to social events from XKCD. Available online at http://xkcd.com/1239/

Advertisements

Elections and Social Media Presence of the Candidates

Some have called the forthcoming UK general election a Social Media Election. It might be a bit of exaggeration, but there is no doubt that both candidates and voters are very active on social media these days and take them seriously. The Wikipedia-Shapps story of last week is a good example showing how important online presence is for candidates, journalists, and of course voters. We don’t know how important this presence is in terms of shaping the votes, but at least we can look into the data and gauge the presence of the candidates and the activity of the supporters. In this post and some others we present statistics of online activity of parties, candidates, and of course voters. For an example, see the previous post on the searching behaviour of citizens around the debate times.

Who is on Twitter?

Candidates and parties are very much debated by supporters on social media, particularly Facebook and Twitter. But how active are candidates themselves on these platforms? In this post we show simply how many candidates from each party and in which constituencies have a Twitter account. Some of them might be more active than others and some might tweet very rarely, and we will analyse this activity in the next posts. Here we count only who has any kind of publicly known account.

t_all_small

Geographical distribution of candidates who have Twitter account.

The figure above shows the geographical distributions of candidates for each party and whether they have a Twitter account. There are some interesting results in there. For example, Labour has the largest number of Twitter-active candidates, whereas ALL the SNP candidates tweet. While LibDem and Green parties have the same number of accounts, normalised by the overall number of constituencies that they are standing in, Green seems to be more Twitter-enthusiastic. UKIP loses the Twitter game both in absolute number and proportion.

Who is on Wikipedia?

Having a Twitter account is something of a personal decision.  A candidate decides to have one and it’s totally up to them what to tweet. The difference in the case of Wikipedia, is that ideally candidates would not create or edit one about themselves. Also the type of information that you can learn about a candidate on their Wikipedia page is very different to what you can gain by reading their tweets.

Geographical distribution of the candidates, whom Wikipedia has an article about.

Geographical distribution of the candidates, whom Wikipedia has an article about.

The figure above shows the constituencies that the candidates standing in are featured in the largest online encyclopaedia, Wikipedia. Here, Tories are the absolute winners, in terms of the number of articles. Greens are the least “famous” candidates and LibDem are well behind the big two. In the next post we will explore often voters turn to Wikipedia to learn about the parties and candidates, and I’m sure by reading that you’ll be convinced that being featured on Wikipedia is important!

Gender?

All right, so far, Labour won Twitter presence and Tories took Wikipedia (remember all the SNP’s also have a Twitter account). But how about the gender of the candidates? Is there any gender-related feature in social presence pattern of the candidates?

First let’s have a look at the gender distribution of the candidates.

Geographical distribution of the candidates colour-coded by gender.

Geographical distribution of the candidates colour-coded by gender.

As you see in the figure above, there are fewer female candidates than male ones across all the parties. Only 12% of the UKIP candidates are female while the Greens have the highest proportion at 38%. Tories sit right next to UKIP on the list of the most male oriented parties. There is also a clear pattern that most of the constituencies in the centre have male candidates.

How about social media?

Among all the candidates, 20% of male candidates are featured in Wikipedia, whereas this is about 17% for female candidates. Almost half of the Tories male candidates are in Wikipedia, whereas this goes down to 28% for their female counterparts. Only Labour female candidates have more coverage in Wikipedia compared to the males of the party, but the difference is marginal. ّIn all the other parties, males have a higher coverage rate. The tendency of Wikipedia to pay more attention to male figures is a very well known fact. 

Twitter is different. Slightly more female candidates (76%) have a Twitter account than male candidates (69%). Almost all (96%) of Labour females tweet, and Tory female candidates are more active than their male candidates. This pattern however is lost for the UKIP candidates, as 52% of their males are on Twitter compared to only 44% of their female candidates (who have the lowest rate among all the party-gender groups).

Data

The data that we used to produced the maps and figures come mainly from a very interesting crowd-sourced project called yournextmp. However, we further validated the data using the Wikipedia and Twitter API’s. If you want to have a copy, just get in touch!