Babies Learning Language: Don't bar barplots, but use them cautiously

Friday, November 4, 2016

Don't bar barplots, but use them cautiously

Should we outlaw the the commonest visualization in psychology? The hashtag #barbarplots has been introduced as part of a systematic campaign to promote a ban on bar graphs. The argument is simple: barplots mask the distributional form of the data, and all sorts of other visualization forms exist that are more flexible and precise, including boxplots, violin plots, and scatter plots. All of these show the distributional characteristics of a dataset more effectively than a bar plot.

Every time the issue gets discussed on twitter, I get a little bit rant-y; this post is my attempt to explain why. It's not because I fundamentally disagree with the argument. Barplots do mask important distributional facts about datasets. But there's more we have to take into account.

Here's my basic argument:

Hey #barbarplots folks: I agree with you that plotting variability is important, but the world of data is big! /1
— Michael C. Frank (@mcxfrank) August 10, 2016

Sometimes you need a summary stat because you have lots of observations, sometimes because there are many conditions to compare. /2
— Michael C. Frank (@mcxfrank) August 10, 2016

Bars are overused when neither of those apply; they shouldn't always be the default. Lines and points are usually cleaner. /3
— Michael C. Frank (@mcxfrank) August 10, 2016

ANOVA also overused and used inappropriately - would be silly to ban it. Same for bars. Move the defaults and #barwithcaution. /end
— Michael C. Frank (@mcxfrank) August 10, 2016

When I originally posted that rant, I was in transit and didn't get a chance to illustrate my point, so there was a lot of back-and-forth about what good use cases for bars would be.* The basic one that comes to mind for me is in analyzing datasets where there are many discrete independent variables (e.g., conditions, experiments) and not many observations per participant. This structure describes many experiments I've worked on.

I put together an example visualization, based on Experiment 4 of this paper. All code and data here, in the experiment 4 analysis script. Here's the plot we put in the paper:

I chose a barplot because there were a lot of planned age groups and conditions and it seemed like an easy way to represent that discrete structure in the data, along with summary means and 95% CIs. I like to visualize by-subject distributions (I was actually a bit fetishistic about it in my early papers), but the data I was plotting here had only four observations per child. As a result, simple jitter plots look crazy:

And box plots are useless.

Violin plots are useless too.

The best alternative I saw was this one, but it still looks too sparse to me:

Having posted these to twitter along with the data, TJ Mahr rose to the challenge to do better:

@annemscheel @mcxfrank hmmm didn't realize only 4 trials per cell. not much y variability to be revealed by points. pic.twitter.com/bLaRixZV4q
— tj mahr (@tjmahr) November 4, 2016

I like this representation, and with some tweaking it might be a nice alternative to put in a paper like the one I wrote.

But here's my point: these visualizations are good for different things. The barplot is simple and easy to read – and it compresses well. (This point is made by Heer & Bostock, 2010, as well). Consider what happens when we shrink the plotting space for these (using my version of TJ's so I can hold image size constant):

Or even tinier:

My sense is that the barplot holds up to compression much better, at least modulo the font size. In addition, I would never show the jitterdodge masterwork to a popular audience (or even really to a class). It's just got too much going on.

My broader point: banning particular data analyses or visualizations just doesn't seem like the right answer. Particular visualizations can be right for certain contexts, for certain audiences, and for certain data types. The world of data is broad. We can change the defaults, but we shouldn't ban something that has important uses.

---
* Everyone in the discussion agrees that bars are fine for visualizing single discrete values, e.g. as in the counts in a histogram.

4 comments:

UnknownNovember 5, 2016 at 2:53 AM
First, we all agree about changing the default, so that's a great start. Barplots have too many problems (see links at the end). It's just easier to catch attention with #barbarplots than #thinkabouttheviolins or #boxesoverbars or the previously used #showyourdistributions. I'm thrilled it reached so many people, by the way, and we're now rethinking the way we present data.

I have some specific issues with the present example:

- I think implicitly we interpret bars as signifying some roughly normally distributed data around the displayed mean with a variance indicated by the error bar. In some ways barplots are a graphical equivalent of an independent samples t-test (if it were paired, I'd expect a bar of a difference score because that's what the t-test is based on, too).
In that case, I'd call the barplot misleading in this specific case, because when looking at raw data we don't see the expected distribution at all, quite the contrary.

- Visual perception literature shows that size/surface are visually very salient, we interpret larger bars as more important. But systematic below chance performance doesn't necessarily mean kids performed worse here (they might follow strategies, something I've encountered before in a picture selection task, to my own great surprise thanks to lots of barplots). Plus, on chance performance looks "better" than below chance, which is also not always the most accurate interpretation. It's not the case that those participants knew less, for example.

- If we were to choose figures by how readable when compressed, are there any alternatives to filled boxes in general? Then all line plots need to go, which isn't good news for eye tracking and ERP papers. The same holds for scatterplots, so correlations and meta-analyses are in trouble, too. I'm not sure how robust a criterion this would be.
A filled boxplot with an emphasized median line, by the way, has many of the visual properties of barplots and might hold up to compression similarly well.

To respond to the last point:
For teaching, I'd actually prefer to use a representation of the raw data. Students and fellow scientists form other fields often have no clue what our data look like. I think it's important to communicate the variability and distribution of our data better instead of making a "neat" impression.
In outreach contexts, I'd not present as much data to begin with to stick with a simple message. But that depends on the audience, and the goal of the talk.

Some more information:
http://journals.plos.org/plosbiology/article?id=10.1371%2Fjournal.pbio.1002128

https://cogtales.wordpress.com/2016/06/06/congratulations-barbarplots/
ReplyDelete
Replies
Sho TsujiNovember 5, 2016 at 10:14 PM
Thanks for this post. It is wonderful that there is discussion about our standards of plotting, which is really the core aim of #barbarplots.

One central aspect your post touches upon is simplicity, which is I feel a point worth elaborating on. It's actually a bit ironic that we at #barbarplots favored, in the choice of our hashtag, simplicity and provocativeness over complexity and accuracy (and thus did not use #thinkabouttheviolins or #usebarplotsthoughtfully; certainly a decision not immune to criticism), and that we are nevertheless arguing against barplots in order to favor complexity and accuracy over simplicity and provocativeness. It is ironic but also telling, because there are obviously arguments for both of these extremes, and that this depends on context and audience, as you rightly point out.
I would, though, argue that the context in which we represent data to a scientific audience (i.e., in your article) is not the place for simplification. I think it is GREAT that we can see where the data is coming from in TJ's version of the graph. It seems to me even especially great for classrooms - sure, it will take students a bit longer to spot the mean differences, but that will likely lead to a better understanding of where these differences are coming from, including an understanding of sample and trial sizes in developmental studies.
As I mentioned, we chose our oversimplifying hashtag, since our primary aim was to raise awareness of a problematic default. So by analogy (even if it might be a somewhat far-fetched one), I would argue that the easy-to-read features of a barplot in the case of your data would be the method of choice e.g. in a situation where you'd present your data to policy makers that just need to understand that your intervention changes outcomes.

ReplyDelete
Replies

Add comment