OpenVis is for Journalists!

Note: A version of the following also appears on the Tow Center blog.

Last week I attended the OpenVis Conference in Boston, a smorgasbord of learning dedicated to exploring the use and application of data visualization on the open web, which basically means building on open rather than proprietary standards. It was hard not to get excited, with a keynote speaker like Mike Bostock, the original creator of the popular D3 library for data visualization and now a graphics editor at the New York Times.

Given that news organizations are leading the way with online data storytelling, it was perhaps unsurprising to find a number of journalists presenting at the conference. Kennedy Elliott of the Washington Post talked about coding for the news, imploring attendees to think more like journalists. We also heard from Lisa Strausfeld and Christopher Cannon, who run the new Bloomberg Visual Data lab, and from Lena Groeger at ProPublica, who spoke about “thinking small” in visualization.

But even the less overtly journalistic talks seemed to have strong ties to, and implications for, journalism, on everything from storytelling and authoring tools to analytic methodologies. Let me pick out just a few talks that exposed some particularly relevant implications for data journalism.

First up, David Mimno, a professor at Cornell, gave a tour of his work in visualizing machine learning algorithms online to help students learn how those algorithms work. He demonstrated old classics like k-means and linear regression, but the algorithms became palpable as they came to life through animated visualizations. Another example of this comes from the machine learning demos page, which animates and presents an even greater number of algorithms. Where I think this gets really important for journalists is with the whole idea of algorithmic accountability, and the ability to use visualization as a way for journalists to be transparent about the algorithms they use in their reporting.

A good example of where this is already happening is the explanation of the NYT 4th Down Bot, where authors Brian Burke and Kevin Quealy use a visualization of a football field (shown below) to explain how their predictive model differs from what actual football coaches tend to do. To the extent that algorithms are deserving of our scrutiny, visualization methods that communicate what they are doing and make them legible to the public seem incredibly powerful and important for us to work more on.

Alexander Howard recently wrote about “the difficult, complicated process of reporting on data as a source” while being as open and transparent as possible. If there’s one thing the recent launch of 538 has taught us, it’s that there’s a need (and demand) to make the data, and even the code or models, available for data journalism projects.

People are already developing workflows and tools to make this possible online. Another great talk at OpenVis was by Dr. Jake Vanderplas, an astrophysicist working at the University of Washington, who has developed some really amazing open source technology that lets you create interactive D3 visualizations in the browser directly from IPython notebooks. Jake’s work on visualization takes us one step closer to enabling a complete end-to-end workflow for data journalists: data, analysis, and code can sit in the browser and directly render interactive visualizations for the end user. The whole stack is transparent and could potentially even enable the user to tweak, tune, or test variations. To the extent that reproducibility of data journalism projects becomes important for maintaining the trust of the audience, these sorts of platforms are certainly worth learning more about.

Because of its emphasis on openness, and because openness ties so closely to both transparency and the desire to create news content online, expect OpenVis to continue to develop next year as a key destination for journalists looking to learn more about visualization.