Stats Legend Andrew Gelman Explains What Makes For Great Charts And Graphs, And How You're Doing It All Wrong


Photo: Andrew Gelman and Antony Unwin

Andrew Gelman is the Director of the Applied Statistics Center at Columbia University and is one of the most respected data scientists in the field. He’s penned a number of texts that have gone on to become classics.

He — with former Harvard Statistics Department Chair Donald Rubin — wrote the definitive academic text on Bayesian Analysis.

Gelman also wrote the political statistics book Red State, Blue State, Rich State, Poor State, another must-read.

In short, Gelman knows quite a bit about data visualisation. He and Dr. Antony Unwin from the University of Augsburg collaborated on this presentation, in which they take a hard look at what’s wrong with recent trends in data visualisation and infographics.

The takeaway is that while there have been great leaps in visualisation technology, some of the visualisations that have garnered the highest praise are actually lacking in a number of key areas.

Specifically, the pair takes down the top visualisations of 2008, as chosen by the popular statistics blog Flowing Data.

The pair wrote an in-depth article and a follow-up giving great detail and insight on the state of infographics, but this slideshow is a must-read for anyone interested in the future of data visualisation.

Good infographics balance the dual goals of communication and discovery: they have to communicate data clearly and also encourage exploration

Here's an example of a good infographic — it's compelling, enables discovery, communicates a point...

...and is much, much more visually compelling than a simple scatterplot.

The duo wanted to know why they resoundingly disliked some of the most highly acclaimed data visualisations of the year

Here's one of their main beefs. Infographics like this are becoming very popular:

It's gotten to the point where USA Today's goofy graphics are easily satirized

Still, they argue, there are a number of bright spots in the world of charts.

They critiqued this infographic: it's full of extensive detail presented in an exciting way, but it merely shows that summer is hotter than winter and that rain is occasionally above average.

There has been a lot of work on statistical graphics, both in academia and in psychological research.

The essential questions: Why do we represent data visually? What is the difference between information visualisation (Infovis) and statistical graphing?

The duo looked at the top five visualisations of the year chosen by Nathan Yau of Flowing Data, and didn't like what they saw

Readers don't get a better understanding of the data from Wordle. Because of the randomness, readers are just entertained by trying to figure out how the visualisation works.

This graphic is a visualisation of box office returns over the course of one year

Unwin and Gelman didn't like this because the colour scheme makes it difficult to interpret and it overcomplicates simple information in the interest of looking cool

This data, they argue, would have been much more compelling broken into two charts — one with total movie sales over time, the other showing individual film trajectories

The other graphics (two music videos that used data and a visualisation of air traffic over Britain) were merely eye-catching and weren't informative

The two different motivations for visualisation — the statistical and the graphic design — have a middle ground

The Flowing Data top five weren't the only visualisations Gelman and Unwin didn't like. They didn't love The Guardian's favourite visualisation, a map from David McCandless showing plane crash locations.

Basically: What's the point of both a dot and a circle? That's redundant.

One example they cite of a seamless, clear merger of statistical info and visual design is Florence Nightingale's coxcomb of Crimean War deaths by month

It's attention-grabbing: you can get a whole view of the data quickly, but you can also study it and discover intricacies

Compared to Nightingale's engaging visualisation, the basic line graph is astoundingly boring

On the other hand, the pair thinks that this visualisation doesn't convey data with clarity

While a classic scatterplot illustrates the information with excellent clarity

One enemy of clear presentation could be the software used to design graphics.

Kate Harding wanted to graph the estimations of height and weight made by respondents to this picture

Even esteemed organisations like the BBC can make a useless graph because of mediocre software — it's not their fault, according to Gelman and Unwin

Also from the database: At the beginning of the century, boys' names mostly ended with one of 10 different letters.

By the middle of the century, that declined to around six.

Here are the main takeaways from the presentation

Now, though, the graph on the right is much clearer because the comparison of interest is highlighted.

In short: Statistical graphics have clarity and are vast improvements on tables and numbers

Information visualisations have benefits, mostly in the form of engagement and piqued curiosity

The whole point Gelman and Unwin want to make is that by blending the two approaches, we can create better visualisations.

There's more to understanding statistics than just charts.
