#DataHack3: Where are the Tweeting classes from?

This post continues my analysis of South African election-related Twitter data. Previously I published DataHack1 and DataHack2.

Location(s), Location(s), Location(s)

A common criticism of social media analysis in South Africa is that the data is not representative of the general public. I agree, to a point. I think that as time goes on and the (mobile) internet becomes more ubiquitous in South Africa, social media, especially Twitter, will become more and more representative. So the natural question to ask is: where are the election tweets coming from?

Twitter does not make this question easy to answer. A large share of tweets carry no location information. Personally, I think this is a good thing: people are not publicly revealing their location, and so retain a little more privacy. As a data scientist, it is less great, as I have to throw away a lot of data to find the Twitter users in the election datasets who happen to be broadcasting their location. This turned out to not be that bad.
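As a rough illustration of that filtering step, here is a minimal sketch. It assumes the dumps follow the classic Twitter REST API v1.1 tweet layout, where a geotagged tweet has a `coordinates` field holding a GeoJSON point; the function names and the sample data are made up for illustration and are not taken from my notebooks.

```python
def extract_point(tweet):
    """Return (lat, lon) if the tweet carries an exact point, else None."""
    coords = tweet.get("coordinates")
    if not coords or coords.get("type") != "Point":
        return None
    # GeoJSON stores points as [longitude, latitude], so swap the order.
    lon, lat = coords["coordinates"]
    return lat, lon


def geotagged(tweets):
    """Yield (tweet id, (lat, lon)) for tweets that broadcast a location."""
    for tweet in tweets:
        point = extract_point(tweet)
        if point is not None:
            yield tweet["id_str"], point


# Hypothetical sample: one geotagged tweet, two without usable location.
sample = [
    {"id_str": "1",
     "coordinates": {"type": "Point", "coordinates": [28.0473, -26.2041]}},
    {"id_str": "2", "coordinates": None},
    {"id_str": "3"},
]

print(list(geotagged(sample)))
```

Only the first sample tweet survives the filter, which mirrors how much data gets discarded in practice.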

Tweet Locations

The visualization below shows the locations of tweets sent on April 22nd. As you can see, the map covers the country well. The metros have the highest densities, but the reported locations cover most of the populated areas of the country.

User Self Reported Locations

The second visualization shows the self-reported locations of users. This means I mined the user profiles and checked what location they said they were from. I then used the Google GeoLocation APIs to find coordinates for those locations and mapped them. Again, the user locations are spread all over the country.
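The profile lookup can be sketched roughly as follows. This is an assumption-laden outline, not the code from my notebooks: it assumes the Google Geocoding web service and its JSON response shape (`status`, then `results[0].geometry.location`), and the helper names `parse_geocode_response`, `fetch_geocode` and `geocode_profiles` are hypothetical. Profile locations are free text, so many will not resolve; those are simply dropped.

```python
import json
import urllib.parse
import urllib.request

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def parse_geocode_response(payload):
    """Return (lat, lng) from a geocoding response, or None if no match."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


def fetch_geocode(address, api_key):
    """Call the geocoding web service for one address (needs an API key)."""
    query = urllib.parse.urlencode({"address": address, "key": api_key})
    with urllib.request.urlopen(f"{GEOCODE_URL}?{query}") as response:
        return json.load(response)


def geocode_profiles(locations, lookup):
    """Geocode free-text profile locations, one lookup per distinct string.

    `lookup` is any callable that takes an address and returns a response
    dict, which keeps this function testable without network access.
    """
    cache = {}
    for raw in locations:
        key = raw.strip().lower()
        if not key:
            continue  # many profiles leave the location field empty
        if key not in cache:
            cache[key] = parse_geocode_response(lookup(key))
    # Keep only the strings that resolved to coordinates.
    return {key: point for key, point in cache.items() if point is not None}
```

With a real key this could be called as `geocode_profiles(profile_locations, lambda address: fetch_geocode(address, API_KEY))`, where `profile_locations` and `API_KEY` are placeholders. Caching by normalized string matters because many users report the same city, and each lookup counts against the API quota.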


As always, I have made the Twitter JSON/CSV dumps available on my GitHub. Also included are GeoJSON files with the tweet locations and user locations, generated on a daily basis. Grab the continuously updated data here -> github:za-2014-election-tweets
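If you want to produce similar GeoJSON files from your own points, a minimal sketch looks like the following. The `label` property and the function name are illustrative assumptions; the files in the repository may use different properties.

```python
import json


def to_feature_collection(points):
    """Build a GeoJSON FeatureCollection from (label, lat, lon) tuples."""
    features = [
        {
            "type": "Feature",
            # GeoJSON expects [longitude, latitude], in that order.
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"label": label},
        }
        for label, lat, lon in points
    ]
    return {"type": "FeatureCollection", "features": features}


# Hypothetical points for illustration only.
collection = to_feature_collection([
    ("johannesburg", -26.2041, 28.0473),
    ("cape town", -33.9249, 18.4241),
])

print(json.dumps(collection, indent=2))
```

Most web mapping tools can load a file in this shape directly.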

iPython Notebooks

1 Comment on “#DataHack3: Where are the Tweeting classes from?”

  1. Hi
    Your site is really interesting.
    I want to do a similar investigation of Chilean comments on Twitter.
    I would like to know what requirements you added to MST2ED's Python notebook, or which ones you used for this amazing investigation.
    Thanks for your help
