
Storytelling with data visualization is still very much in its “Wild West” phase, with journalism outlets blazing new paths in the burgeoning craft of integrating the testimony of data with compelling narrative. Leaders such as The New York Times create impressive data-driven presentations like 512 Paths to the White House (seen above) that weave complex information into a palatable presentation. But as I look out at the kinds of meetings where data visualizers converge, like Eyeo, Tapestry, OpenVis, and the infographics summit Malofiej, I realize there’s a whole lot of inspiration out there, and some damn fine examples of great work, but I still find it hard to get a sense of direction: which way is West, which way to the promised land?
And it occurred to me: We need a science of data-visualization storytelling. We need some direction. We need to know what makes a data story “work”. And what does a data story that “works” even mean?
Examples abound, and while we have theories for color use, visual salience and perception, and graph design that suggest how to depict data efficiently, we still don’t know, with any particular scientific rigor, which are better stories. At the Tapestry conference, which I attended, journalists such as Jonathan Corum, Hannah Fairfield, and Cheryl Phillips whipped out a staggering variety of examples in their presentations. Jonathan, in his keynote, talked about “A History of the Detainee Population”, an interactive NYT graphic (partially excerpted below) depicting how Guantanamo prisoners have, over time, slowly been moved back to their country of origin. I would say that the presentation is effective. I “got” the message. But I also realize that, because the visualization is animated, it’s difficult to see the overall trend over time, that is, to compare one year to the next. There are different ways to tell this story, some of which may be more effective than others for a range of storytelling goals.

Critical blogs such as The Why Axis and Graphic Sociology have arisen to try to fill the gap of understanding what works and what doesn’t. And research on visualization rhetoric has tried to situate narrative data visualization in terms of the rhetorical techniques authors may use to convey their story. Useful as these efforts are in their thick description and critical analysis, and for increasing visual literacy, they don’t go far enough toward building predictive theories of how data-visualization stories are “read” by the audience at large.
Corum, a graphics editor at NYT, has a descriptive framework to explain his design process and decisions. It describes the tensions between interactivity and story, between oversimplification and overwhelming detail, and between exploration and decoration. Other axes of design include elements such as focus versus depth and the author versus the audience. Author and educator Alberto Cairo exhibits similar sets of design dimensions in his book, “The Functional Art”, which start to trace the features along which data-visualization stories can vary (recreated below).

Such descriptions are a great starting point, but to make further progress on interactive data storytelling we need to know which of the many experiments happening out in the wild are having their desired effect on readers. Design decisions like how and where annotations are placed on a visualization, how the story is structured across the canvas and over time, the graphical style including things like visual embellishments and novelties, as well as data mapping and aggregation can all have consequences on how the audience perceives the story. How does the effect on the audience change when modulating these various design dimensions? A science of data-visualization storytelling should seek to answer that question.
But still the question looms: What does a data story that “works” even mean? While efficiency and parsimony of visual representation may still be important in some contexts, I believe online storytelling demands something else. What effects on the audience should we measure? As data visualization researcher Robert Kosara writes in his forthcoming IEEE Computer article on the subject, “there are no clearly defined metrics or evaluation methods … Developing these will require the definition of, and agreement on, goals: what do we expect stories to achieve, and how do we measure it?”
There are some hints in recent research in information visualization for how we might evaluate visualizations that communicate or present information. We might for instance ask questions about how effectively a message is acquired by the audience: Did they learn it faster or better? Was it memorable, or did they forget it 5 minutes, 5 hours, or 5 weeks later? We might ask whether the data story spurred any personal insights or questions, and to what degree users were “engaged” with the presentation. Engaged here could mean clicks and hovers of the mouse on the visualization, how often widgets and filters for the presentation were touched, or even whether users shared or conversed around the visualization. We might ask if users felt they understood the context of the data and if they felt confident in their interpretation of the story: Did they feel they could make an informed decision on some issue based on the presentation? Credibility being an important attribute for news outlets, we might wonder whether some data story presentations are more trustworthy than others. In some contexts, the persuasiveness of a presentation is the most important factor. Finally, since some of the best stories are those that evoke emotional responses, we might ask how to do the same with data stories.
Measuring some of these factors is as straightforward as instrumenting the presentations themselves to know where users moved their mouse, clicked, or shared. There are a variety of remote usability testing services that can already help with that. Measuring other factors might require writing and attaching survey questions to ask users about their perceptions of the experience. While the best graphics departments do a fair bit of internal iteration and testing, it would be interesting to see what they could learn by setting up experiments that varied their designs minutely to see how that affected the audience along any of the dimensions delineated above. More collaboration between industry and academia could accelerate this process of building knowledge of the impact of data stories on the audience.
I’m not arguing that the creativity and boundary-pushing in data-visualization storytelling should cease. It’s inspiring looking at the range of visual stories that artists and illustrators produce. And sometimes all you really want is an amuse yeux — a little bit of visual amusement. Let’s not get rid of that. But I do think we’re at an inflection point where we know enough of the design dimensions to start building models of how to reliably know what story designs achieve certain goals for different kinds of story, audience, data, and context. We stand only to be able to further amplify the impact of such stories by studying them more systematically.


Visualization, Data, and Social Media Response
I’ve been looking into how people comment on data and visualization recently and one aspect of that has been studying the Guardian’s Datablog. The Datablog publishes stories of and about data, oftentimes including visualizations such as charts, graphs, or maps. It also has a fairly vibrant commenting community.
So I set out to gather some of my own data. I scraped 803 articles from the Datablog including all of their comments. From this data I wanted to know whether articles that contained embedded data tables or embedded visualizations produced more of a social media response. That is, do people talk more about the article if it contains data and/or visualization? The answer is yes, and the details are below.
While the number of comments could be scraped off of the Datablog site itself, I turned to Mechanical Turk to crowdsource some other elements of metadata collection: (1) the number of tweets per article, (2) whether the article has an embedded data table, and (3) whether the article has an embedded visualization. I did a spot check on 3% of the results from Turk in order to assess the Turkers’ accuracy on collecting these other pieces of metadata: it was about 96% overall, which I thought was clean enough to start doing some further analysis.
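A spot check like this can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used: the function names, the fixed random seed, and the dictionary layout (article ID mapped to a label) are all hypothetical.

```python
import random


def sample_for_review(article_ids, fraction=0.03, seed=0):
    """Pick a random subset of articles to check by hand.

    The 3% default mirrors the spot check described above; the seed is an
    assumption added only so the sample is reproducible.
    """
    rng = random.Random(seed)
    k = max(1, round(len(article_ids) * fraction))
    return rng.sample(article_ids, k)


def spot_check_accuracy(crowd_labels, manual_labels):
    """Share of hand-checked articles where the crowd label matches."""
    checked = [a for a in manual_labels if a in crowd_labels]
    matches = sum(crowd_labels[a] == manual_labels[a] for a in checked)
    return matches / len(checked)


# Hypothetical usage: 803 article IDs, 3% of them drawn for manual review.
to_review = sample_for_review(list(range(803)))
```

The hand-coded labels for the sampled articles would then be passed to `spot_check_accuracy` together with the crowdsourced labels, once per metadata field.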
So next I wanted to look at how the “has visualization” and “has table” features affect (1) tweet volume, and (2) comment volume. There are four possibilities: the article has (1) a visualization and a table, (2) a visualization and no table, (3) no visualization and a table, (4) no visualization and no table. Since neither tweet volume nor comment volume is normally distributed, I log-transformed both to bring them closer to normal (normality is an assumption of the statistical tests that follow). Moreover, there were a few outliers in the data, so anything beyond 3 standard deviations from the mean of the log-transformed variables was excluded.
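The preprocessing described here (log transform, then dropping anything beyond 3 standard deviations) might look roughly like the sketch below. The counts are made up for illustration and are not the Datablog data. Using `log1p` (the log of 1 + count) is an assumption so that articles with zero tweets or comments do not break the transform; the post does not say how zeros were handled.

```python
import math
import statistics


def log_transform(counts):
    """Return log(1 + count) for each count (the +1 is an assumption)."""
    return [math.log1p(c) for c in counts]


def drop_outliers(values, max_sd=3.0):
    """Keep values within max_sd sample standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= max_sd * sd]


# Hypothetical tweet counts with one extreme article at the end.
tweets = [3, 5, 8, 12, 7, 4, 9, 6, 10, 5, 7,
          8, 6, 4, 11, 9, 5, 7, 6, 8, 250000]

logged = log_transform(tweets)
kept = drop_outliers(logged)  # the extreme article falls outside 3 SD
```

Note that the mean and standard deviation are computed on the log-transformed values, including the outliers themselves, which matches the description above but makes the cutoff somewhat sensitive to very extreme points.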
For number of tweets per article:
I ran an ANOVA with post-hoc Bonferroni tests to see if the differences between these means were significant. Articles with both a visualization and a table (case 1) have a significantly higher number of tweets than cases 3 (p < .01) and 4 (p < .05). Articles with just the visualization and no data table have a higher number of average tweets per article, but this was not statistically significant. The takeaway is that the combination of a visualization and a data table seems to drive a significantly higher Twitter response.
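For readers unfamiliar with the test, here is a minimal sketch of the two pieces involved: the one-way ANOVA F statistic across the four cases, and the Bonferroni adjustment for the pairwise follow-up comparisons. The group values are hypothetical log-scale numbers, not the actual results, and the function names are my own. The sketch stops at the F statistic; turning it into a p-value requires the F distribution, which a statistics package such as SciPy (`scipy.stats.f_oneway`) provides.

```python
import statistics
from itertools import combinations


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of each value around its own group mean.
    ss_within = sum(
        (v - statistics.mean(g)) ** 2 for g in groups for v in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni(p_values):
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical log-transformed tweet counts for the four cases.
groups = {
    "viz + table": [2.9, 3.1, 3.4, 3.0],
    "viz only": [2.6, 2.8, 2.7, 3.0],
    "table only": [2.1, 2.4, 2.2, 2.5],
    "neither": [2.3, 2.2, 2.6, 2.4],
}

f_stat, df_between, df_within = one_way_anova_f(list(groups.values()))

# Four cases give six pairwise comparisons, so each raw p-value is
# effectively held to a threshold of 0.05 / 6 under Bonferroni.
pairs = list(combinations(groups, 2))
per_comparison_alpha = 0.05 / len(pairs)
```

Bonferroni is conservative: with six comparisons, a raw p-value has to be below roughly 0.008 to count as significant at the 0.05 level, which is one reason a visible difference in means can still fail to reach significance.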
Results for number of comments per article are similar:
Again I used an ANOVA with post-hoc Bonferroni tests to assess statistically significant differences between means. This time there was only one statistically significant difference: Articles with both a visualization and a table (case 1) have a higher number of comments than articles with neither a visualization nor a table (case 4). The p value was 0.04. Again, the combination of visualization and data table drove more of an audience response in terms of commenting behavior.
The overall take-away here is that people like to talk about articles (at least in the context of the audience of the Guardian Datablog) when both data and visualization are used to tell the story. Articles which used both had more than twice the number of tweets and about 1.5 times the number of comments versus articles which had neither. If getting people talking about your reporting is your goal, use more data and visualization, which, in retrospect, I probably also should have done for this blog post.
As a final thought I should note there are potential confounds in these results. For one, articles with data in them may stay “green” for longer thus slowly accreting a larger and larger social media response. One area to look at would be the acceleration of commenting in addition to volume. Another thing that I had no control over is whether some stories are promoted more than others: if the editors at the Guardian had a bias to promote articles with both visualizations and data then this would drive the audience response numbers up on those stories too. In other words, it’s still interesting and worthwhile to consider various explanations for these results.