10
Jul

What are the iconic data graphs of the past 10 years?


This article in the New York Times about the supposed death of photography got me thinking about statistics. Apparently, the death of photography has been around the corner for some time now:

For years, photographers have been bracing for this moment, warned that the last rites will be read for photography when video technology becomes good enough for anyone to record. But as this Fourth of July showed me, I think the reports of the death of photography have been greatly exaggerated.

Yet, photography has not died and, says Robin Kelsey, a professor of photography at Harvard,

The fact that we can commit a single image to memory in a way that we cannot with video is a big reason photography is still used so much today.

This got me thinking about data graphics. One long-time gripe about data graphics in R has been its horrible lack of support for dynamic or interactive graphics. This is an indisputable fact, and was especially true in the early years. Nowadays there are quite a few extensions and packages that allow R to create dynamic graphics, but it still doesn't feel like part of the "core". I still feel like when I talk to people about R, the first criticism they jump to is the poor support for dynamic/interactive graphics.

But personally, I've never thought it was a big deal. Why? Because I don't really find such graphics useful for truly thinking about data. I've definitely enjoyed viewing some of them (especially some of the D3 stuff), and it's often fun to move sliders around and see how things change (perhaps my favorite is the Baby Name Voyager or maybe this one showing rapper wealth).

But in the end, what are you supposed to walk away with? As a creator of such a graphic, how are you supposed to communicate the evidence in the data? The key element of dynamic/interactive graphics is that they allow the viewer to explore the data in their own way, not in some prescribed static way that you've explicitly set out. Ultimately, I think that aspect makes dynamic graphics useful for presenting data, but not that useful for presenting evidence. If you want to present evidence, you have to tell a story with the data; you can't just let the viewer tell their own story.

This got me thinking about what the iconic data "photos" of the past 10 years (or so) are. The NYT article mentions the famous “Raising the Flag on Iwo Jima” by AP photographer Joe Rosenthal as an image that many would recognize (and perhaps remember). What are the data graphics that are burned in your memory?

I'll give one example. I remember seeing Richard Peto give a talk here about the benefits of smoking cessation and its effect on life expectancy. He found that according to large population surveys, people who quit smoking by the age of 40 or so had more or less the same life expectancy as those who never smoked at all. The graph he showed was very similar to Figure 3 from this article. Although I already knew that smoking was bad for you, this picture really crystallized it for me in a specific way.

Of course, sometimes data graphics are memorable for other reasons, but I'd like to try and stay positive here. Which data graphics have made a big impression on you?

  • http://kbroman.wordpress.com/ Karl Broman

    The "pie I have eaten; pie I have not yet eaten" pie chart.

  • Cornelioid

    The plot of string counts for "creationis" versus "intelligent design" in sequential editions of the "Pandas" textbook is perhaps iconic of the contemporary so-called evolution wars, and probably well-remembered by anyone who followed the Kitzmiller v Dover trial.

  • jshoyer

    My favorite is Cleveland's CO2 time series plots in 'Visualizing Data'. Unfortunately I can't quite get his code to run.
    http://www.stat.purdue.edu/~wsc/visualizing.html

    It seems like the ideal is a simple series of plots that tell a story on their own, but also allow brushing/linking and identification of specific data points.

    Thanks for all the great recent posts Roger!

  • Francis Gagnon
    • Roger Peng

      Nice! Thanks.

  • Nick Schurch

    "I think that aspect makes dynamic graphics useful for presenting data, but not that useful for presenting evidence. If you want to present evidence, you have to tell a story with the data, you can't just let the viewer tell their own story."

    I don't agree. The disadvantage of static graphs is that the viewer can see the story you've set out, but can't test whether they believe your story because they can't explore the data themselves and find out how robust or cherry-picked your story is. They have to trust you on faith.

    With an interactive plot this is also true to some extent, but a bit less so, and there is nothing to stop you from setting the default starting point of the visualization to tell the story you want. In my head, interactive visualizations should start out from the static visualization with your story, but then allow people to investigate the story and data for themselves to see if/how much they agree with you!