There’s a very good, open-access article, “Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm”, making the rounds. Fig. 1 (.tiff) pretty much sums up the key point:
Fig 1. Many different datasets can lead to the same bar graph. The full data may suggest different conclusions from the summary statistics. The means and SEs for the four example datasets shown in Panels B–E are all within 0.5 units of the means and SEs shown in the bar graph (Panel A). p-values were calculated in R (version 3.0.3) using an unpaired t-test, an unpaired t-test with Welch’s correction for unequal variances, or a Wilcoxon rank sum test. In Panel B, the distribution in both groups appears symmetric. Although the data suggest a small difference between groups, there is substantial overlap between groups. In Panel C, the apparent difference between groups is driven by an outlier. Panel D suggests a possible bimodal distribution. Additional data are needed to confirm that the distribution is bimodal and to determine whether this effect is explained by a covariate. In Panel E, the smaller range of values in group two may simply be due to the fact that there are only three observations. Additional data for group two would be needed to determine whether the groups are actually different.
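The caption's central claim, that very different datasets can produce the same bar graph, is easy to check numerically. Here's a minimal sketch with made-up numbers (not the authors' data): a symmetric dataset and an outlier-driven one, in the spirit of Panels B and C, that a mean-and-SEM bar would render essentially identically.

```python
import math
import statistics as st

# Hypothetical datasets (not from the paper): one symmetric,
# one where a single outlier does the work, as in Panel C.
symmetric = [3.8, 4.4, 4.8, 5.2, 5.6, 6.2]
outlier   = [4.5, 4.6, 4.6, 4.7, 4.8, 6.8]

def summarize(data):
    """Return the (mean, SEM) pair that a bar graph would display."""
    sem = math.sqrt(st.variance(data) / len(data))
    return st.mean(data), sem

for name, data in [("symmetric", symmetric), ("outlier", outlier)]:
    mean, sem = summarize(data)
    print(f"{name:>9}: mean = {mean:.2f}, SEM = {sem:.2f}")
# symmetric: mean = 5.00, SEM = 0.35
#   outlier: mean = 5.00, SEM = 0.36
```

Both bars would look the same, down to the error whiskers, even though one dataset is well behaved and the other hinges on a single point.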
Mad Biologist here. Three things to note. First, the method used to determine significance can fundamentally change whether or not the result is ‘reportable’ (i.e., p < 0.05). Second, by using a bar graph, we potentially miss some interesting biology, as shown by panel D; there, it's quite possible the pattern is driven by an underlying variable (which would explain the two distinct clusters in each column).
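On that first point, a quick sketch (hypothetical numbers, not the paper's data) shows how much the choice of test can matter when one group is small and tightly clustered, as in Panel E: the classic pooled t-test and Welch's version give very different statistics from the same data.

```python
import math
import statistics as st

# Hypothetical data, loosely modeled on Panel E: group 2 has only
# three tightly clustered observations; group 1 is larger and noisier.
group1 = [3.0, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0]
group2 = [6.2, 6.4, 6.6]

def student_t(a, b):
    """Unpaired t statistic with a pooled variance estimate."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * st.variance(a) + (nb - 1) * st.variance(b)) / (na + nb - 2)
    return (st.mean(b) - st.mean(a)) / math.sqrt(sp2 * (1 / na + 1 / nb))

def welch_t(a, b):
    """Welch's t statistic: per-group variances, no pooling."""
    se = math.sqrt(st.variance(a) / len(a) + st.variance(b) / len(b))
    return (st.mean(b) - st.mean(a)) / se

print(f"pooled t: {student_t(group1, group2):.2f}")  # ~1.52: below the ~2.26 cutoff at 9 df
print(f"Welch  t: {welch_t(group1, group2):.2f}")    # ~2.50: above the usual 0.05 cutoff
```

Same data, one test says ‘reportable’ and the other doesn't, which is exactly why the caption bothers to list three different p-values per panel.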
But a third point is, to me, critical: when you present the number of observations in the figure, readers can gauge how much credence the result actually deserves. I believe scientists would be less likely to make public claims about the strength of their work if the actual strength of many papers–the sample size and distribution of the data–were easy to access. Just as importantly, science communicators, especially journalists, would find it easier to determine if the claims of the paper–or the university press release–were backed by a lot of data.
It’s worth noting that the above figure is a hypothetical example. You might be asking if sample sizes like those shown in the graph are representative or straw men. Well, in a supplemental figure (boo! hiss!), the authors describe what they found in looking at three months’ worth of tables in the top quartile of physiology journals. In three-quarters of all papers, the maximum sample size–the largest group of points in a single column in the above figure–was fifteen or fewer; half of all papers had ten or fewer data points. In three-quarters of all papers, the minimum sample size–the smallest group of points in a single column in the above figure–was six or fewer; half of all papers had four or fewer data points.
In fairness, many experiments are expensive, and many labs aren’t set up to do high-throughput science, especially on a typical R01 budget, so I don’t think there’s ill intent. But these realities underscore the importance of providing truly informative figures.
Cited article: Weissgerber TL, Milic NM, Winham SJ, Garovic VD. Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm. PLoS Biol. 2015 Apr 22;13(4):e1002128. doi: 10.1371/journal.pbio.1002128.