One of the things I’ve grappled with in my studies of social media is how the nodes of networked dyads relate to one another spatially. As a geographer, I’m well familiar with Tobler’s first law of geography: “Everything is related to everything else, but near things are more related than distant things.” I wanted to see if this held true for networked geographic imaginaries within our Occupy Twitter data set: do the ways in which users co-locate #occupy<city> hashtags within their tweets relate at all to the distance between the two cities mentioned?
Spoiler alert: no. But we learned some stuff along the way.
First, this is a map of every geolocated tweet within our Occupy corpus from Oct. 19 to Dec. 31. Keep in mind that a tweet’s presence in our corpus does not necessarily mean that it is “occupy related”; it means only that the tweet, or something within it, contained a keyword in which we were interested. We initially seeded our database with information from the occupytogether.org website, and we found quite a bit of geolocated activity. Geolocation is an “opt-in” technology, which means that at some point a user decided to turn on that function for their application. Leaving it on might be a powerful way of expressing solidarity, a show of “activism tourism,” or simply the result of someone turning it on previously and forgetting (or not caring) to turn it back off again. Note the geographic spread of information.
This is a map of the connections between the locational hashtags that we were collecting as keywords. Note how the geographies have changed drastically: eastern Asia disappears entirely, as do Africa and much of South America. This is illustrative of the “global north” biases within social media information that have been noted by Graham and Zook. But because these were the only hashtags we were sure we had collected completely, we opted to continue the analysis with just these components.
When that map is redrawn so that the strength of each link reflects the number of co-occurrences of a hashtag dyad, the uneven geographic distribution of these links becomes even more apparent. At this point, we ran statistical analyses to determine whether the number of co-occurrences of a hashtag pair was correlated with the distance between the two cities. Surprisingly, no correlation was uncovered! We suspected that this might be partly due to the total number of times each hashtag appeared in our corpus. So we turned to a Jaccard Index as a means of trying to eliminate the influence of the overall number of appearances of a given hashtag within the corpus.
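For readers who want to try something similar, here is a minimal sketch of the distance test described above: count how often each pair of hashtags shares a tweet, measure the great-circle distance between the two cities, and correlate the two. The post does not say which statistic we used, so the Pearson coefficient here is an assumption, and the hashtags, coordinates, and sample tweets are made up purely for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two lat/lon points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def count_dyads(tweets):
    """Count how often each unordered pair of hashtags shares a tweet."""
    counts = Counter()
    for tags in tweets:
        # sorted(set(...)) gives each pair one canonical (a, b) ordering
        counts.update(combinations(sorted(set(tags)), 2))
    return counts


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient (assumed statistic)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical hashtags with approximate city coordinates (lat, lon).
CITY_COORDS = {
    "occupyoakland": (37.80, -122.27),
    "occupynyc": (40.71, -74.01),
    "occupyboston": (42.36, -71.06),
    "occupylondon": (51.51, -0.13),
}

# Hypothetical tweets, each reduced to its list of locational hashtags.
tweets = [
    ["occupyoakland", "occupynyc"],
    ["occupyoakland", "occupynyc", "occupyboston"],
    ["occupynyc", "occupyboston"],
    ["occupylondon", "occupynyc"],
    ["occupyoakland", "occupylondon"],
]

dyads = count_dyads(tweets)
distances = [haversine_km(*CITY_COORDS[a], *CITY_COORDS[b]) for a, b in dyads]
weights = [dyads[pair] for pair in dyads]
r = pearson_r(distances, weights)
print(f"{len(dyads)} dyads, r = {r:.3f}")
```

With a real corpus, a rank-based statistic such as Spearman’s rho may be a better fit, since co-occurrence counts tend to be heavily skewed.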
A Jaccard Index is a way of normalizing the number of dyads against the number of times each of those single hashtags appears within the sub-corpus being analyzed. It divides the number of tweets containing both hashtags (the intersection, A and B) by the number containing either hashtag (the union, A or B) to determine the strength of the relationship in comparison to the other relationships being studied. When we examined just those dyads that had more than 100 co-occurrences, this map appeared:
Note that the rest of the world has seemingly disappeared. Because our analysis was bound by our data selection, we’ve introduced a strong North American bias. Compared to our initial map, the result seems a bit strange. And using the Jaccard Index as a dependent variable against distance still shows no significant correlation.
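The Jaccard normalization described above can be sketched in a few lines. This is an illustration under stated assumptions rather than our actual pipeline: the function name `jaccard_by_dyad`, its `min_cooccurrences` parameter, and the sample tweets are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def jaccard_by_dyad(tweets, min_cooccurrences=1):
    """Jaccard index for each hashtag pair: |A and B| / |A or B|.

    `tweets` is an iterable of hashtag lists, one list per tweet.
    Pairs with fewer than `min_cooccurrences` shared tweets are dropped.
    """
    tag_counts = Counter()   # tweets containing each hashtag
    dyad_counts = Counter()  # tweets containing both hashtags of a pair
    for tags in tweets:
        unique = sorted(set(tags))
        tag_counts.update(unique)
        dyad_counts.update(combinations(unique, 2))

    scores = {}
    for (a, b), both in dyad_counts.items():
        if both < min_cooccurrences:
            continue
        # Union = tweets with A + tweets with B - tweets with both
        either = tag_counts[a] + tag_counts[b] - both
        scores[(a, b)] = both / either
    return scores


# Hypothetical sample: one very common hashtag and two rarer ones.
tweets = [
    ["occupynyc", "occupyoakland"],
    ["occupynyc", "occupyoakland"],
    ["occupynyc"],
    ["occupynyc"],
    ["occupyboston", "occupynyc"],
    ["occupyboston"],
]

scores = jaccard_by_dyad(tweets)
# Keep only pairs seen together at least twice; "more than 100
# co-occurrences", as in the post, would be min_cooccurrences=101.
strong = jaccard_by_dyad(tweets, min_cooccurrences=2)
print(scores)
print(strong)
```

Because the denominator grows with each hashtag’s overall popularity, a very common hashtag no longer dominates simply by appearing everywhere. The resulting scores can then be tested against distance in place of the raw co-occurrence counts.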
This may mean that geographers will have to consider space differently; non-Cartesian conceptualizations of space are theoretically rich, but sometimes difficult to grasp empirically and/or quantitatively. Qualitative analytic techniques may be one possible response to this methodological challenge, as might quantitative blockmodeling and a treatment of scale (relational and Cartesian) as a better measure of “distance” than the classic great circles of the past.