Visualization of a Twitter retweet network: art or useful data visualization?

Posted on Jul 14, 2012 in Data Visualization, Information Visualization, R, r-project, Social Networks | 6 comments

This is a Twitter retweet network. When people tweet, they may get retweeted by other people, who repeat the message for their own followers to view. Each retweet is a one-way flow of information that links the original author to each person who retweeted them (forwarded the original tweet into their own network). So, in this visualization we are looking at a network of people (white nodes) linked (orange lines) by information flows in the form of 140-character retweets. But is this kind of visualization helpful for analysis, or just a kind of computer-generated eye-candy?
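To make the structure concrete, here is a minimal sketch of how retweets become a directed network. It is plain Python, not the code behind these plots, and the sample pairs and the function name `build_retweet_network` are made up for illustration. It also assumes repeated retweets between the same two people collapse into one weighted link, which is a modeling choice rather than something stated above.

```python
from collections import Counter

# Hypothetical sample data: one (original_author, retweeter) pair per retweet.
retweets = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("alice", "bob"),   # bob retweeted alice a second time
    ("dave", "alice"),
]

def build_retweet_network(pairs):
    """Return (nodes, edge_weights) for a directed retweet network.

    Each edge points from the original author to the retweeter, which is
    the direction the information flows. Assumption: repeated retweets
    between the same two people become a single edge with a weight.
    """
    edge_weights = Counter(pairs)
    nodes = {user for pair in edge_weights for user in pair}
    return nodes, edge_weights

nodes, edges = build_retweet_network(retweets)
print(len(nodes), "people,", len(edges), "distinct links")
```

Keeping the weight around means nothing is lost by collapsing: the number of retweets along a link is still available if it turns out to matter.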

One of the criticisms this kind of plot frequently gets is that it is a ‘yarn-ball’, meaning it’s too complicated to make sense of. Indeed, this graph contains 44,783 nodes (people) and 163,329 edges (retweets). The data for the network is a subset of our 60+ million tweets related to Occupy. Specifically, this network is made up of retweets that contain the hashtags #OccupyOakland and #OO from October 30th, 2011 to November 7th, 2011.

Despite the complexity, we can still make some observations about the data and the network. For example, the ring around the outside is unconnected to the core and represents people who tweeted with the #OccupyOakland or #OO hashtag, and were retweeted, but not by anyone in the core of the network. Also, these people did not retweet anything from anyone inside the core; at least, not in the nine days of data used in this plot. From poking around in the data I know there is a fair amount of flame (derogatory or insulting tweets lobbed into the information stream). Could some of the people in the ring be Occupy’s detractors?
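One way to separate the core from the ring without relying on the picture is to split the network into connected components, ignoring link direction: the largest component plays the role of the core, and everything else is the ring. The sketch below is plain Python with made-up data; `weak_components` is a hypothetical helper, and treating "largest component" as "the core" is an assumption about how the layout maps onto the data.

```python
from collections import defaultdict, deque

def weak_components(edge_list):
    """Group nodes into components, ignoring edge direction.

    Returns a list of node sets, largest first.
    """
    neighbors = defaultdict(set)
    for src, dst in edge_list:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    seen = set()
    components = []
    for start in list(neighbors):
        if start in seen:
            continue
        # Breadth-first search collects everyone reachable from start.
        seen.add(start)
        members = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    members.add(nxt)
                    queue.append(nxt)
        components.append(members)

    components.sort(key=len, reverse=True)
    return components

# Hypothetical sample: a small connected core plus two isolated pairs.
sample_edges = [
    ("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"),
    ("x", "y"),
    ("p", "q"),
]
components = weak_components(sample_edges)
core = components[0]    # assumption: the largest component is the core
ring = components[1:]   # everything not connected to it
print(len(core), "people in the core,", len(ring), "separate groups in the ring")
```

With the ring isolated this way, its tweets could be read directly to check whether they are flame or something else.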

We can also see that the core is extremely densely connected, but despite this there are a great many hops along links between people on the left side of the graph and those on the right (that is, the network diameter is large). I haven’t calculated statistics for this, but if we decided it was important to know, it would be fairly easy to do so.

If we zoom in a bit we can see that the larger white blobs (my adviser calls them mushrooms) are actually clusters of users. Since they are all linked to the same node, we know that they all retweeted a single person. Thus, these mushrooms represent highly popular tweets that ‘went viral’ and effectively reached a broad audience. But some of these one-off large retweets do not appear to come from folks in the core. Could they be cases where someone was in the right place at the right time to report on an incident of interest?
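The mushrooms can also be found in the data rather than by eye. A minimal sketch, assuming a mushroom is an author with several retweeters who have no other link in the network: the threshold `min_leaves`, the function name `find_mushrooms`, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict

def find_mushrooms(edge_list, min_leaves=3):
    """Map each author to the number of 'leaf' retweeters they have.

    A leaf is a retweeter whose only link in the whole network is to
    this one author. Authors with fewer than min_leaves leaves are
    dropped. Edges are (original_author, retweeter) pairs.
    """
    distinct_edges = set(edge_list)

    # Count how many distinct links each person takes part in.
    degree = Counter()
    for author, retweeter in distinct_edges:
        degree[author] += 1
        degree[retweeter] += 1

    leaves = defaultdict(set)
    for author, retweeter in distinct_edges:
        if degree[retweeter] == 1:
            leaves[author].add(retweeter)

    return {a: len(r) for a, r in leaves.items() if len(r) >= min_leaves}

# Hypothetical sample: "news" is retweeted by four people who do nothing
# else in the network, while "a" and "b" retweet each other.
sample_edges = [
    ("news", "u1"), ("news", "u2"), ("news", "u3"), ("news", "u4"),
    ("a", "news"), ("a", "b"), ("b", "a"),
]
mushrooms = find_mushrooms(sample_edges)
print(mushrooms)
```

Checking which of these authors sit outside the largest component would be one way to test the right-place-right-time idea.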

I’ll zoom in one more time, and at this level we can see part of the ring structure. Looking closely, you will see a common feature of data like these: the vast majority of tweets that get retweeted are retweeted only once or twice. Very few tweets, relative to the entire set, get retweeted one hundred times or more. So some of the retweets in the ring could be flame, as I mentioned before, but they could also just be an artifact of human attention dynamics.
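This skew is easy to check by counting. A minimal sketch in plain Python, assuming each retweet record carries the id of the original tweet; the ids below are made up, and the cutoff of two retweets simply mirrors the "once or twice" wording.

```python
from collections import Counter

# Hypothetical sample: the original tweet id for each retweet event.
retweeted_ids = ["t1", "t2", "t2", "t3", "t4", "t4", "t4", "t4", "t4", "t5"]

# Tweet id -> number of times it was retweeted.
per_tweet = Counter(retweeted_ids)

# Retweet count -> number of tweets with that count.
histogram = Counter(per_tweet.values())

# Share of retweeted tweets that were retweeted only once or twice.
share_low = sum(n for count, n in histogram.items() if count <= 2) / len(per_tweet)

for count in sorted(histogram):
    print(count, "retweet(s):", histogram[count], "tweet(s)")
print("share retweeted once or twice:", share_low)
```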

So, art or fodder for analysis?

My own sense is that if I look at this image long enough, given what I have studied about social networks, communication theory and network structures, a great many questions come to mind. But I also admit to just getting a kick out of fiddling with R and plotting images I think are cool.

That’s all for now. You can contact me on Twitter @JeffHemsley. Happy to answer any questions.


  1. 7-16-2012

    Thank you for a great post :) May I ask about the software you used to generate these plots?

    • 7-19-2012

      Hi Alia,
      The data was crunched and plotted using R, an analysis and plotting tool.

  2. 7-18-2012

    I received an email question about what package in R I used for the plots: iGraph. I was a devotee of Statnet and SNA, but they don’t seem to handle large graphs as well as iGraph.

    Gephi is good too, but iGraph seemed to handle this dense network faster and allows nearly full control of vertex and edge attributes.

  3. 11-15-2012

    Jeff, it is great to see you working on such modern things. Love the graph. I would love to present this to some of my current students. What do you know about StatNet? I see it uses MCMC. As someone interested in applications of Bayesian statistics and R, I would love to know more.

    Cheers, hope to see you in the Seattle Area some day.


    • 11-15-2012

      The folks I talk to all seem to think StatNet is tops for analysis, just not for visualization of larger graphs. I have read of some mapreduce-like (?) plugins that can work with StatNet so you can do large graphs, but I didn’t feel like going that route, and iGraph is pretty easy to learn. But iGraph doesn’t seem to have implemented random walk centrality (betweenness, closeness) yet, so for unconnected graphs, if I want some of these centrality scores, I still have to load StatNet.

      iGraph also has the ability to work interactively with graphs (well, statnet does too, but, yech).

  4. 7-19-2012

    Art or Sci: yeah, it is a real question in some academic circles.

    Thanks for the comment.


