Big Data Viz Rap: Visualize, Analyze and Realize

Posted on Apr 12, 2013 in Data Visualization

Robert Mason, Shawn Walker and Jeff Hemsley participated in the University of Washington Information School’s iAffiliates Day, “an event that fosters new partnerships and showcases the innovative work being done at the iSchool. The event is an unconference format with the theme of discovering information partnerships.” Participants give a two-minute lightning talk intended to “enlighten, inspire, educate, or otherwise engage the audience” about a given topic. Jeff chose to “otherwise engage” the audience with a two-minute rap about data visualization. Read more for the full text.

Read More

Black Boxes, Systems, and Social Media Studies

Posted on Mar 16, 2013 in Data Visualization, Information Visualization, Methods, Modeling, Research

Black box: it’s what we generally call the flight data recorder, and we all know about it…but did you know it’s actually bright orange (so it can be found easily)?

In the case of social media researchers, the situation today is that much of our research is like the flight data recorder: we collect, store, and report data and analyses, but we follow the dictum on the outside and “do not open” the box.

We’re discovering this is a mistake.

By keeping the black box closed, we can create a misleading impression when we report our research results. If we do not fully disclose the details of our processes, we inhibit others from replicating our findings or testing the limits of our results. We may also miss the chance to ask new research questions if we never test how sensitive our findings are to changes in our research procedures. There are some things we can do from the outside (approaches borrowed from systems theory and systems analysis), but all of us will improve our research as we make our methods more visible…as we open up the black box.

Let’s look at some examples. In conducting research with social media data, it’s helpful to think about the sequential ETL steps in data warehousing systems. In following these steps, we: Extract (data from streams or sources), Transform (the data by parsing it and including metadata that enable us to address our research questions), and Load (the transformed data into an accessible dataset). And these are just the first steps, before we begin our analysis. At each step, small variations in the procedures or rules we use can result in significant shifts to our later findings, to the questions we are capable of answering, and even to the questions we can imagine asking. For example, suppose we want to do an analysis of Twitter messages. In extracting Twitter data, do we use the Twitter API? If so, do we collect the data in real time (streaming API) or do we employ queries (search API), getting some retrospective tweets? If we opt not to use the API, we could use one of several developer-based or commercial services (e.g., Gnip) to get our data, but can we afford it? Each may have advantages, but the samples that result from each may be different. If the samples differ, can we be confident in our research results in each case?
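To make the point about small rule changes concrete, here is a minimal Python sketch of the three steps. Everything in it is an assumption made for illustration: the records are invented stand-ins (not real tweets or a real API response), and the names `extract`, `transform`, `load`, `daily_volume`, and the `hashtags_only` switch are hypothetical. It only shows how one extraction rule can change the dataset we end up analyzing.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw records standing in for collected messages (invented data).
RAW = [
    {"id": 1, "text": "Protest downtown #occupy", "created_at": "2011-11-18T10:00:00"},
    {"id": 2, "text": "RT @a: Protest downtown #occupy", "created_at": "2011-11-18T10:05:00"},
    {"id": 3, "text": "Lunch time", "created_at": "2011-11-18T12:00:00"},
    {"id": 4, "text": "occupy the library, finals week", "created_at": "2011-11-19T09:00:00"},
]


def extract(records, keyword, hashtags_only):
    """Extract: keep records matching the keyword, as a hashtag or as any substring."""
    needle = ("#" + keyword) if hashtags_only else keyword
    return [r for r in records if needle.lower() in r["text"].lower()]


def transform(records):
    """Transform: parse the timestamp and add metadata we may want later."""
    rows = []
    for r in records:
        rows.append({
            "id": r["id"],
            "day": datetime.fromisoformat(r["created_at"]).date().isoformat(),
            "is_retweet": r["text"].startswith("RT @"),
        })
    return rows


def load(rows):
    """Load: store the transformed rows in an accessible dataset (here, a dict by id)."""
    return {row["id"]: row for row in rows}


def daily_volume(dataset):
    """A first analysis step: number of messages per day."""
    return dict(Counter(row["day"] for row in dataset.values()))


# Same source, two extraction rules, two different samples.
strict = load(transform(extract(RAW, "occupy", hashtags_only=True)))
loose = load(transform(extract(RAW, "occupy", hashtags_only=False)))
print(daily_volume(strict))
print(daily_volume(loose))
```

In this toy case the looser rule pulls in an unrelated message on a second day, so even a simple volume plot would look different depending on a choice made before any analysis began.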

Read More

Visualizing threaded conversation volume and intensity

Posted on Jan 24, 2013 in Data Visualization, Information Visualization, R, r-project

As a researcher who studies information flows in digital environments, I’m often interested in finding patterns in social trace data. For this discussion we can think of digital social trace data as the text that people post into threaded topics on forums, such as Reddit or a Talk page on Wikipedia. One way to find patterns in this kind of data is to make visualizations based on different quantifiable dimensions in the data, for example, total topic volume per day, volume per thread per day, and, possibly, the intensity of the discussion (as interpreted by qualitative researchers). In the remainder of this post I will note what we can learn from our visualization as well as its limitations and then post the R code I used to make the plot.
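As a rough illustration of two of those dimensions, the Python sketch below counts total volume per day and volume per thread per day. It is not the R code from the post: the posts are invented, and the function names `volume_per_thread_per_day` and `total_volume_per_day` are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Invented (thread, timestamp) pairs standing in for forum posts.
POSTS = [
    ("thread-a", "2013-01-20T09:15:00"),
    ("thread-a", "2013-01-20T17:40:00"),
    ("thread-b", "2013-01-20T11:00:00"),
    ("thread-a", "2013-01-21T08:30:00"),
]


def day_of(timestamp):
    """Reduce an ISO timestamp to its calendar day (YYYY-MM-DD)."""
    return datetime.fromisoformat(timestamp).date().isoformat()


def volume_per_thread_per_day(posts):
    """Count posts for each (thread, day) pair."""
    return Counter((thread, day_of(ts)) for thread, ts in posts)


def total_volume_per_day(posts):
    """Count posts per day across all threads."""
    return Counter(day_of(ts) for _, ts in posts)


per_thread = volume_per_thread_per_day(POSTS)
per_day = total_volume_per_day(POSTS)
print(per_thread)
print(per_day)
```

Counts like these are the quantities a plot would encode; the intensity dimension would still have to come from qualitative coding.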

Read More

R Gauge Plots

Posted on Jan 17, 2013 in Data Visualization, Information Visualization, R, r-project

Gaston Sanchez’s post on R-Bloggers inspired me to waste a bit of time. He wanted to replicate the Google Charts widget to make gauges. I modified his code (below) in some minor ways and made a function out of it so you can alter the look and feel of your gauge. Feel free to pilfer and modify the R code…

Read More

Using R to visually compare the volume of different information sources

Posted on Jan 16, 2013 in Data Visualization, Information Visualization, Media, R, r-project, Research

A couple of weeks ago Bob wrote a post about a research note that was recently accepted to the iConference. In it we outline the beginnings of a research project where we look at the interaction of different media platforms (Twitter and blogs) with more traditional sources. In this post I go through the R code we used to plot, and visually compare, the volume of different information sources.

The data for this example is drawn randomly from a Pareto distribution, so anyone should be able to just open the file, run it and have plots. As I did in the last R example, I have used comments in the code to explain what I’m doing in the creation of these plots. After the code I give a brief introduction to the tool I use to select colors.
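For readers who want to see the idea of Pareto-distributed stand-in data before opening the R file, here is a small Python sketch. It is not the code from the post: the function name `simulate_volumes`, the source names, and the `alpha` and `scale` values are all assumptions chosen for illustration.

```python
import random


def simulate_volumes(sources, days, alpha=1.5, scale=10, seed=42):
    """Draw one heavy-tailed daily count per source per day.

    alpha, scale and seed are illustrative assumptions, not values from the post.
    """
    rng = random.Random(seed)  # seeded so the draws are reproducible
    volumes = {}
    for source in sources:
        # paretovariate returns values >= 1.0, so every count is at least `scale`.
        volumes[source] = [int(rng.paretovariate(alpha) * scale) for _ in range(days)]
    return volumes


volumes = simulate_volumes(["twitter", "blogs", "news"], days=14)
for source, counts in volumes.items():
    print(source, counts)
```

Most days come out near the minimum with an occasional large spike, which is the kind of shape that makes side-by-side volume plots of different sources worth comparing.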

Read More

Hockey, Basketball,…and Research?

Posted on Dec 31, 2012 in Data Visualization, Gatekeeping, Information Visualization, Media, Modeling, Research

Recognizing patterns and rhythms in social media data

Wayne Gretzky is quoted as saying that a great hockey player plays where “the puck is going to be,” not where it is. Gretzky, like the great NBA point guards (think Magic Johnson or Mark Price), was quick to detect emerging patterns in movement and flows, and then to take advantage of what was about to happen. In our research efforts, we often try to detect patterns in order to explore what these patterns may tell us about underlying processes.

The SoMe Lab is examining patterns in the movement and flows of information between and among social media platforms. We observe that traditional media news may inform or trigger information exchanges in the blogosphere or on Twitter, and vice versa. We want to look closely at these patterns to gain insights into phenomena such as virality, the birth and life cycle of interest networks, and the dynamics of a fluid cast of gatekeepers. The accompanying image illustrates the patterns that distinguish the volume of tweets, blog posts, and traditional news items following the pepper spraying incident at UC Davis on November 18, 2011.

Read More