Data Presentation Matters: A Partial Solution to the Reproducibility Crisis

There’s a very good, open-access article, “Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm”, making the rounds. Fig. 1 pretty much sums up the key point:

Fig 1. Many different datasets can lead to the same bar graph. The full data may suggest different conclusions from the summary statistics. The means and SEs for the four example datasets shown in Panels B–E are all within 0.5 units of the means and SEs shown in the bar graph (Panel A). p-values were calculated in R (version 3.0.3) using an unpaired t-test, an unpaired t-test with Welch’s correction for unequal variances, or a Wilcoxon rank sum test. In Panel B, the distribution in both groups appears symmetric. Although the data suggest a small difference between groups, there is substantial overlap between groups. In Panel C, the apparent difference between groups is driven by an outlier. Panel D suggests a possible bimodal distribution. Additional data are needed to confirm that the distribution is bimodal and to determine whether this effect is explained by a covariate. In Panel E, the smaller range of values in group two may simply be due to the fact that there are only three observations. Additional data for group two would be needed to determine whether the groups are actually different.

Mad Biologist here. Three things to note. First, the method used to determine significance can fundamentally change whether or not the result is ‘reportable’ (i.e., p < 0.05). Second, by using a bar graph, we potentially miss some interesting biology, as shown by panel D; there, it’s quite possible the pattern is affected by an underlying variable (which is why there are two distinct clusters in each column).
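The caption’s point about test choice is easy to demonstrate. Here’s a minimal sketch in Python (the paper used R 3.0.3), with made-up data — not the authors’ datasets — in which a single outlier, as in Panel C, makes the three tests named in the caption disagree:

```python
# Hypothetical data, loosely modeled on Panel C: the groups' bulks differ,
# but one large outlier in group 1 inflates its variance.
from scipy import stats

group1 = [2.1, 2.4, 2.6, 2.8, 3.0, 3.1, 3.3, 9.5]  # note the 9.5 outlier
group2 = [3.4, 3.6, 3.8, 4.0, 4.1, 4.3, 4.5, 4.6]

# Unpaired t-test (assumes equal variances)
t_p = stats.ttest_ind(group1, group2).pvalue

# Unpaired t-test with Welch's correction for unequal variances
welch_p = stats.ttest_ind(group1, group2, equal_var=False).pvalue

# Wilcoxon rank sum test (equivalent to the Mann-Whitney U test)
wilcoxon_p = stats.mannwhitneyu(group1, group2, alternative="two-sided").pvalue

print(f"t-test: {t_p:.3f}  Welch: {welch_p:.3f}  rank sum: {wilcoxon_p:.3f}")
```

With these numbers, both t-tests are dragged above p = 0.05 by the outlier’s effect on the variance, while the rank-based test, which only cares about ordering, comes out well below it — the same result is ‘reportable’ or not depending on which test you reach for.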

But a third point is, to me, critical: when you present the individual observations in the figure, readers can much better judge the true, biological significance of the result. I believe scientists would be less likely to make public claims about the strength of their work if the actual strength of many papers–the sample size and distribution of the data–were easy to access. As importantly, science communicators, especially journalists, would find it easier to determine if the claims of the paper–or the university press release–were backed by a lot of data.

It’s worth noting that the above figure is a hypothetical example. You might be asking if sample sizes like those shown in the graph are representative or straw men. Well, in a supplemental figure (boo! hiss!), the authors describe what they found in looking at three months’ worth of tables in the top quarter of physiology journals. In three-quarters of all papers, the maximum sample size–the largest group of points in a single column in the above figure–was fifteen or fewer; half of all papers had ten or fewer data points. In three-quarters of all papers, the minimum sample size–the smallest group of points in a single column in the above figure–was six or fewer; half of all papers had four or fewer data points.

It’s not difficult to see how this could lead to a reproducibility problem, especially with small effect sizes.

In fairness, many experiments are expensive, and many labs aren’t set up to do high throughput science, especially on a typical R01 budget, so I don’t think there’s ill intent. But these realities should emphasize the importance of providing truly informative figures.

Cited article: Weissgerber TL, Milic NM, Winham SJ, Garovic VD. Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm. PLoS Biol. 2015 Apr 22;13(4):e1002128. doi: 10.1371/journal.pbio.1002128.


1 Response to Data Presentation Matters: A Partial Solution to the Reproducibility Crisis

  1. jrkrideau says:

    When I first saw the figure my initial thought was “ah, the Anscombe dataset revisited”, but it appears that the authors are addressing a more serious data presentation issue.

    I had rather thought (well, hoped) that most researchers had stopped using dynamite plots, the (technical/derogatory) term for those histograms with ‘error’ bars of which we can see only the upper half.

    I know of one medical stats dept where, the last time I looked at their website, statisticians were explicitly told that they did not have to collaborate on projects where dynamite plots were used.

Comments are closed.