Making Data More Familiar with Concrete Scales

Note: A version of the following also appears on the Tow Center blog.

 

As part of their coverage of the Snowden leaks, last month the Guardian published an interactive to help explain what the NSA data collection activities mean for the public. Above is a screenshot of part of the piece. It allows the user to input the number of friends they have on Facebook and see a typical number of 1st, 2nd (friends-of-friends), and 3rd (friends-of-friends-of-friends) degree connections as compared to places where you typically find different numbers of people. So 250 friends is more than the capacity of a subway car, 40,850 friends-of-friends is more than would fit in Fenway Park, and 6.7 million 3rd degree connections is bigger than the population of Massachusetts.

When we tell stories with data it can be hard for readers to grasp units or measures that are outside of normal human experience or outside of their own personal experience. How much *is* 1 trillion dollars, or 200 calories, really? Unless you’re an economist, or a nutritionist respectively it might be hard to say. Abstract measures and units can benefit from making them more concrete. The idea behind the Guardian interactive was to take something abstract, like a big number of people, and compare it to something more spatially familiar and tangible to help drive it home and make it real.

Researchers Fanny ChevalierRomain Vuillemot, and  Guia Gali have been studying the use of such concrete scales in visualization and recently published a paper detailing some of the challenges and practical steps we can use to more effectively employ these kinds of scales in data journalism and data visualization.

In the paper they describe a few different strategies for making concrete scales, including unitization, anchoring, and analogies. Shown in the figure below, (a) unitization is the idea of re-expression one object in terms of a collection of objects that may be more familiar (e.g. the mass of Saturn is 97 times that of Earth); (b) anchoring uses a familiar object, like the size of a match head, to make the size of another unfamiliar object (e.g. a tick in this case) more concrete; and (c) analogies make parallel comparisons to familiar objects (e.g. atom is to marble, as human head is to earth).

All of these techniques are really about visual comparison to the more familiar. But the familiar isn’t necessarily exact. For instance, if I were to compare the height of the Empire State Building to a number of people stacked up, I would need to use the average height of a person, which is really an idealized approximation. So it’s important to think about the precision of the visual comparisons you might be setting up with concrete scales.

Another strategy often used with concrete scales is containment, which can be useful to communicate impalpable volumes or collections of material. For example you might want to make visible the amount of sugar in different sizes of soda bottles by filling plastic bags with different amounts of granular sugar. Again, this is an approximate comparison but also makes it more familiar and material.

So, how can you design data visualizations to effectively use concrete scales? First you should ask if it’s an unfamiliar unit or whether it has an extreme magnitude that would make it difficult to comprehend. Then you need to find a good comparison unit that is more familiar to people. Does it make sense to unitize, anchor, or use an analogy? And if you use an anchor or container, which one should you choose? The answers to these questions will depend on your particular design situation as well as the semantics of the data you’re working with. A number of examples that the researchers have tagged are available online.

The individual nature of “what is familiar” does beg the question about personalization of concrete scales too. Michael Keller’s work for Al Jazeera lets you compare the number of refugees from the Syrian conflict to a geographic extent in the US, essentially letting the user’s own familiarity with geography guide what area they want to compare as an anchor. What if this type of personalization could also be automated? Consider logging into Facebook or Twitter and the visualization adapting to use concrete scales to the places, objects, or organizations you’re most familiar with based on your profile information. This type of automated visualization adaptation could help make such visual depictions of data much more personally relevant and interesting.

Even though concrete scales are often used in data visualizations in the media it’s worth also realizing that there are some open questions too. How do we define whether an anchor or unit is “familiar” or not, and what makes one concrete unit better than another? Perhaps some scales make people feel like they understand the visualization better or help the reader remember the visualization better. There are still many open questions for empirical research.