Finding the influentials: Part 1 - Finding Ashton

During my cooling-off period after submitting a paper, I thought it might be a good idea to finally put up one of the projects I worked on while at Rutgers. I did this project while taking a graduate class on Graph Mining and Network Analysis, given by Associate Professor Tina Eliassi-Rad, whom I also thank and acknowledge for guidance with the project. This is the first in a series of four or five posts summarizing my project. As I write these posts, I will also be looking back at my original analysis, fixing what needs to be fixed and making it more presentable for this format. So let's begin.

Social media campaigns have multifaceted goals. For some, it's just about spreading as much awareness of a brand as possible. For others, it's about getting people to engage with the campaign. Engaging here might mean clicking on links and getting exposed to a primary ad that hopefully leads to further brand exposure or, what I think is the real goal, people buying stuff. I am not a marketing student or professional, so my terminology might be off. Anyway, apparently marketing research or concepts might all be fluff.

Spreading your message on Social Media

Let's say we wanted to measure how effective a social media campaign is. Being a scientist, I would want some metric or score that I can track to quantify this. One way to think about it is to look at the effect of the campaign on the social network. Taking this approach, one can simply track how many people mention a brand while the campaign is active: how many times, in a 24-hour cycle, does the word "DietSodaX" show up on Twitter? Keep track of that number for a couple of days and you can then make inferences about how well your campaign is going. With click-tracking tools, you can further keep track of how many people click on your campaign, where they come from, etc.
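To make the daily mention count concrete, here is a minimal sketch in Python. It assumes the tweets have already been collected as (timestamp, text) pairs; the function name `daily_mentions` and the sample tweets are made up for illustration and are not part of the original project.

```python
from collections import Counter
from datetime import datetime


def daily_mentions(tweets, keyword):
    """Count, per calendar day, the tweets whose text contains `keyword`.

    `tweets` is an iterable of (datetime, text) pairs (an assumed format).
    Matching is a case-insensitive substring test, so "#DietSodaX" counts too.
    """
    needle = keyword.lower()
    counts = Counter()
    for created_at, text in tweets:
        if needle in text.lower():
            counts[created_at.date()] += 1
    # Return the days in chronological order.
    return dict(sorted(counts.items()))


# Hypothetical sample data, invented for this example.
sample_tweets = [
    (datetime(2013, 1, 1, 9, 30), "Just tried DietSodaX, not bad"),
    (datetime(2013, 1, 1, 18, 5), "dietsodax everywhere today"),
    (datetime(2013, 1, 2, 11, 0), "Lunch time"),
    (datetime(2013, 1, 2, 20, 45), "Is #DietSodaX any good?"),
]

for day, count in daily_mentions(sample_tweets, "DietSodaX").items():
    print(day, count)
```

A plain substring match is the crudest possible filter. It will also count unrelated words that happen to contain the brand name, so a real analysis would probably want something stricter.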

Moving a step further, what if we don't just want to know whether people are actually "engaging" with our campaign, but also who might be enhancing or amplifying that engagement? Simply put, we want to find the Oprah in the network: the person who is causing more and more people to see our campaign message and engage with it. So we want to look for influential people who get other people to follow our campaign. Someone could argue that we could just look at the people who mentioned our great new "DietSodaX", and then look within them for those who have the most followers. This could be sufficient for some uses, but in some cases having many followers or friends might not translate into clicks or engagement. Obviously, if Oprah did retweet or share our campaign it would lead to crazy engagement, but in some situations this is not so. Thus, we need to find these people using other means. Why find them? If we know who they are, we can specifically target them in future to get them to spread our messages more efficiently and more widely. Think of it as instant endorsement deals. This series will use a correlation-based method to quantify the influence of users on the spread of a social media campaign.
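For reference, the naive follower-count baseline described above could be sketched as follows. This is only the simple approach being argued against, not the correlation-based method of this series. The data layout ((user, text) pairs plus a follower-count dictionary), the function name, and all sample values are assumptions made up for illustration.

```python
def top_mentioners_by_followers(tweets, follower_counts, keyword, k=3):
    """Return up to `k` users who mentioned `keyword`, most followers first.

    `tweets` is an iterable of (user, text) pairs and `follower_counts`
    maps user -> number of followers (both assumed formats).
    """
    needle = keyword.lower()
    # Each user is counted once, however many times they mention the brand.
    mentioners = {user for user, text in tweets if needle in text.lower()}
    # Sort by follower count (descending), breaking ties by user name.
    ranked = sorted(mentioners, key=lambda u: (-follower_counts.get(u, 0), u))
    return ranked[:k]


# Hypothetical sample data, invented for this example.
sample_tweets = [
    ("alice", "Loving DietSodaX"),
    ("bob", "dietsodax is ok"),
    ("carol", "Nothing to see here"),
    ("dave", "DietSodaX!!"),
    ("alice", "More DietSodaX please"),
]
sample_followers = {"alice": 120, "bob": 50000, "carol": 1000000, "dave": 900}

print(top_mentioners_by_followers(sample_tweets, sample_followers, "DietSodaX", k=2))
```

Note that carol has by far the most followers but never mentioned the brand, so she is excluded. The ranking also says nothing about whether bob's followers actually clicked or engaged, which is exactly the limitation of this baseline.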

As a case study, I will be analyzing Twitter data collected during the airing of the first season of Intersexions. Intersexions is a South African edutainment TV show that aims to spread HIV awareness, specifically awareness of how sexual networks (yay, more networks) impact the spread of HIV. You can see the characters' sexual relations network below.

Intersexions Sexual Relations Network (Source)

That's all for now. Wait, there's more.

Trivia: The title Finding Ashton alludes to Ashton Kutcher, who was the first Twitter user to reach 1 million Twitter followers. News Blurb:

Ashton Kutcher Hits 1 Million Twitter Followers

Part II, explaining the dataset, is next.
