The Mad Biologist, of course, is never a bad reviewer…
Every so often, there’s a spate of articles about scientific fraud (here’s a recent entry). I don’t mean to downplay the issue–it is serious. In my daily work though, I encounter far more issues with poor reviewing and editing: that is, errors*. One of the things that keeps me off the streets and out of trouble is bioinformatic curation, which might (?) sound glamorous but is really nothing more than reading scientific literature closely and then extracting useful information to be added to databases and used in various software programs. For example, a paper might describe a mutation in a gene that confers resistance to an antibiotic, so I need to figure out what the mutation is and if it’s actually worth incorporating: I might not agree with the assessment of the authors (e.g., what they’re calling ‘resistance’ probably shouldn’t be considered resistance), or there might be some other data issue.
Let me give you a few, somewhat anonymized, examples:
- One paper described four different mutations that conferred resistance to an antibiotic–this was not some ‘throwaway’ point in the supplemental materials, but the primary point of the paper. Two had incorrectly identified locations. Because they had submitted the reads to NCBI (the U.S. government’s publicly available sequence repository), I was able to reassemble the genomes these results were derived from, and identify the correct positions. Both were clear typographical errors (e.g., position 82 instead of 92), and could not be attributed to other things (e.g., for the cognoscenti, this isn’t an issue of quibbling about start sites).
- One paper described a series of ‘new’ genes (and they conferred the expected phenotypes) that were identical to existing genes except for the last twenty amino acids or so. Both visual inspection of the sequences and the reassembly of the genomes in which they were found showed these were obvious frameshift mutations, and not ‘novel’ at all. That is, this was a clear-cut case of assembly error.
- Another paper had mutations that did not align to the reference, but in a weird way: they were off by 1, 4, 5, and 6 as one progressed from the 5′ end (the ‘front’ of the gene) to the 3′ end. What (I think) happened is, when they downloaded the sequence, they did not have it in a single line (e.g., ATGCATTCCC…). Instead, there were line breaks in the standard NCBI format of 70 characters per line, and the line breaks were counted as positions: the off-by-1 mutation occurred on the second line, the off-by-4 mutation occurred on the fifth line, and so on. Again, this is a very simple set of errors.
- Yet another case involved a position on a sequence, let’s say position 3,554, and the position was referred to nearly two dozen times in the paper (it was a key point of the paper), but it was clear from the only figure in the paper that it was position 8,554–and the deposited sequence backed that up.
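The ‘novel’ genes case above is easy to see with a toy example. This is a minimal sketch–made-up sequences and a tiny subset of the genetic code, not the actual data from the paper–showing why a single dropped base in an assembly makes a gene look identical to a known one until the deletion, then completely different to the end:

```python
# Toy illustration (hypothetical sequences; a tiny codon subset, NOT the
# full genetic code) of why a one-base assembly error looks like a
# 'novel' gene: the protein matches the real one up to the frameshift,
# then diverges for the rest of its length.
CODONS = {"ATG": "M", "GCA": "A", "CAG": "Q"}

def translate(dna):
    """Translate in frame 1, dropping any incomplete trailing codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, usable, 3))

gene = "ATGGCAGCAGCAGCAGCA"            # toy 'real' gene
misassembled = gene[:12] + gene[13:]   # assembler drops one base

print(translate(gene))          # prints: MAAAAA
print(translate(misassembled))  # prints: MAAAQ -- same start, different tail
```

The prefix of the two proteins is identical; only everything downstream of the missing base changes, which is exactly the ‘new except for the last twenty amino acids’ pattern.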
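The line-break case is mechanical enough to sketch too. This is a hypothetical reconstruction (made-up gene length and positions, not the paper’s actual coordinates) of how counting the newline characters in a 70-characters-per-line file as sequence positions produces exactly that off-by-1, 4, 5, 6 pattern:

```python
# Minimal sketch (hypothetical positions) of the off-by-N error you get
# by indexing into a 70-chars-per-line wrapped sequence file and counting
# the newline characters as if they were bases.
WIDTH = 70  # NCBI's standard line width

def misreported_position(true_pos):
    """Position you'd report if each newline before the base were
    counted as a character: a base on line k picks up k-1 extra."""
    line = (true_pos - 1) // WIDTH   # 0-based line the base sits on
    return true_pos + line           # one spurious +1 per earlier newline

# A base on line 2 is off by 1, line 5 off by 4, line 6 off by 5, etc.
for true_pos in (75, 300, 420, 450):
    print(true_pos, "off by", misreported_position(true_pos) - true_pos)
# prints: 75 off by 1 / 300 off by 4 / 420 off by 5 / 450 off by 6
```

The giveaway is that the offset grows monotonically from the 5′ end: each line break crossed adds exactly one to the error.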
I’ll stop with the examples because this is depressing, but I could go on for thousands of words, if I were so inclined (life is too short however). But there are several important things to note. First, it’s a pain in the ass to extract these data, and some people just might give up, which is a failure of communication (I stick with it because I am paid to do so–it’s not the fun part of my job). Second–and this is far more important than the Mad Biologist’s suffering–how do we trust the other parts of the paper? These examples are not about some minute part of a supplemental table, but are the critical findings of the paper. So how does a reader handle this? Ignore everything else? Pick only the good parts? Third, all of these issues should have been caught. Reviewers should check the key findings of the manuscript in review (and editors should make sure they’ve done so).
I don’t think these are cases of fraud at all (I’ve seen very few cases that even make me suspect fraud), but these are errors, ones that could have and should have been caught in the review process. So I do think fraud is an obvious and documented problem, but poor reviewing and editing also is a serious problem.
I will now go outside and yell at clouds.
*I have seen the very occasional article that makes me wonder if the whole damn thing was made up however.