Authoring Data-Driven Documents

Over the last few months I’ve been learning D3 (Data-Driven Documents), a powerful data visualization library for JavaScript. The InfoVis paper gets into the nitty-gritty of how it supports data transformations, immediate evaluation of attributes, and a native SVG representation. How helpful these features are depends on the kind of visualization you’re working on. For instance, transformations don’t really matter if you’re just building static graphs. But being able to inspect the SVG representation of your visualization (and edit it in the console) is genuinely helpful.

But for all the power that D3 affords, is programming really how we should be (want to be?) authoring visualizations?

Here’s something that I recently made with D3: a story about U.S. manufacturing productivity, employment, and automation, told across a series of panels.

Now, of course, the exploratory data analysis, storyboarding, and research needed to tell this story were time-consuming. But after all that, using D3 to render the graphs I wanted was substantially more tedious than I would have liked. I think this was because (1) my knowledge of SVG is not fantastic and I’m still learning it, but more importantly (2) D3 works in terms of very low-level operations, which makes the high-level activities of basic data storytelling time-consuming to implement. And yes, D3 does provide a number of helper modules and layouts, but these aren’t documented with clear examples using concrete data that would make it obvious how to use them. Support for the library on jsFiddle, together with some very simple examples, would go a long way towards helping noobs (like me!) ramp up.
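To give a rough sense of what “low-level” means here, this is a small dependency-free sketch in plain JavaScript (deliberately not D3 itself) of the manual steps behind even a basic bar chart: build a scale, then position every rectangle by hand. The function names `linearScale` and `barChartSvg` and the data values are made up for illustration; D3 has its own scale and selection APIs, which this does not try to reproduce.

```javascript
// Plain-JavaScript sketch (no D3): the manual steps behind a basic bar chart.
// All names and data values here are hypothetical, for illustration only.

// Map a value from a domain [d0, d1] onto a pixel range [r0, r1].
function linearScale(domain, range) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Build SVG markup for a bar chart, positioning every rectangle by hand.
function barChartSvg(values, width, height) {
  const max = Math.max(...values);
  const y = linearScale([0, max], [0, height]);
  const barWidth = width / values.length;

  const bars = values.map((v, i) => {
    const h = y(v);
    // SVG's origin is the top-left corner, so bars are offset from the bottom.
    return `<rect x="${i * barWidth}" y="${height - h}" width="${barWidth - 1}" height="${h}"/>`;
  });

  return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">${bars.join("")}</svg>`;
}

// Made-up values, not data from the story above.
const svg = barChartSvg([10, 20, 40], 300, 100);
console.log(svg);
```

Even in this toy version, the scale, the flipped y-axis, and the spacing between bars all have to be worked out before anything appears on screen, and axes, labels, and transitions are still missing.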

But, really, where’s the Flash-like authoring tool for data visualization? Such a tool could be used to interactively manipulate a D3 visualization and, when you’re done, output the HTML + CSS + D3 code that generates your graphs (including animations, transitions, etc.). The tool would also include basic graph templates that could be populated with your data and customized. It could also support basic storytelling functions, such as highlighting important aspects or comparisons of the data (e.g., through animation, color, or juxtaposition) or using text to annotate and explain the data. D3 suffers from a bit of a usability problem right now, and powerful as it is, authoring stories with visualization doesn’t need to be, nor should it be, bound up in programming.
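As a very rough sketch of what one of those templates might look like underneath, here is a hypothetical `barTemplate` function: you hand it data plus a couple of storytelling options (which bar to highlight, what annotation to show) and it handles the layout. Every name and default here is an assumption for illustration, and to keep the example self-contained it emits SVG markup directly instead of the D3 code a real authoring tool would generate.

```javascript
// Hypothetical "graph template": the caller supplies data and a few options;
// the template handles layout. None of these names come from D3.

// Escape characters that would break the SVG markup.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function barTemplate(data, options = {}) {
  const {
    width = 300,
    height = 120,
    color = "steelblue",
    highlightColor = "crimson",
    highlight = null, // label of the bar to emphasize
    annotation = "",  // text explaining the highlighted bar
  } = options;

  const labelSpace = 20; // pixels reserved at the top for the annotation
  const plotHeight = height - labelSpace;
  const max = Math.max(...data.map((d) => d.value));
  const barWidth = width / data.length;

  const bars = data.map((d, i) => {
    const h = (d.value / max) * plotHeight;
    const fill = d.label === highlight ? highlightColor : color;
    return `<rect x="${i * barWidth}" y="${height - h}" width="${barWidth - 2}" height="${h}" fill="${fill}"/>`;
  });

  const note = annotation
    ? `<text x="0" y="14">${escapeXml(annotation)}</text>`
    : "";

  return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">${note}${bars.join("")}</svg>`;
}

// Made-up data for illustration.
const chart = barTemplate(
  [
    { label: "A", value: 5 },
    { label: "B", value: 10 },
  ],
  { highlight: "B", annotation: "B is twice A" }
);
console.log(chart);
```

The point is the shape of the interface: the author decides what to emphasize and what to say about it, and the template worries about pixels.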