Note that I said cranky, not mad. Mad is reserved for moral degenerates who cut funding to assist people with cerebral palsy. But cranky? Yes. Recently, I’ve come across a couple of papers that describe interesting collections of E. coli. For example, one paper isolated a bunch of E. coli from soil and water in Hawaii to determine if there is a dominant point source of fecal contamination and if there are sustainable populations of E. coli in Hawaiian soils (which would mean we can’t use simple counts of E. coli to determine if a water source is contaminated by feces). Another paper isolated E. coli from a Californian river system to determine if these isolates were primarily of human or animal origin. As studies go, they’re fine.
Except for one thing: how they genetically characterized the E. coli. One study used rep-PCR, the other pulsed field gel electrophoresis. You basically end up running pieces of DNA through an agarose gel, and end up with something that looks like this:
Depending on how thick the gel is, how strong the illuminator is your using to backlight the gel, how much DNA is used (not to mention with rep-PCR, all the potential PCR replication issues), you can get different banding patterns with the same strain.
But that’s not what makes me cranky.
What makes me cranky is that these methods aren’t very ‘portable.’ It’s really hard to compare them between labs (and, for rep-PCR even different PCR machines in the same lab). PFGE can also have significant lab-to-lab variation. State public health labs and CDC, which inexplicably still use PFGE for outbreak surveillance, spend a lot of time ensuring quality control–it’s not a trivial issue.
In other words, I have no idea what the genotypes of these E. coli are except within the context of study. Contrast this with a sequenced-based method such as multilocus sequence typing (‘MLST’) where short segments of multiple genes are sequenced. Not only are these data portable, in that a sequence in one lab can be readily compared to a sequence generated in another lab, but MLST is independent of sequencing technology. For instance, when I determined that the German outbreak E. coli HUSEC041 was actually closely related to a strain of E. coli we had seen a decade earlier*, I used the sequence data generated by BGI and compared to sequence in a database that had been generated years earlier by a completely different technology.
Between the two studies I mentioned–and, again, they’re not bad in terms of the science–there are over 1,800 E. coli isolates in those two papers alone (and the field of water management generates many thousands of such isolates every year). Wouldn’t it be nice to know if any of them are related to HUSEC041? Or O157:H7? Or some other E. coli of interest? I think so, but I’m just a Mad Biologist….
Now, there are two points against MLST that can be made. First, is that MLST doesn’t always have the resolution that PFGE does. After, if a strain acquires a plasmid or a phage, the size of the chromosomal pieces will change (PFGE chops up the genome by using enzymes that cut the genome in specific places). I would argue that if you want that level of fine scale resolution, MLST combined with a really rapidly evolving target like hypervariable tandem repeat loci is the way to go (rep-PCR, on the other hand, doesn’t evolve that quickly, so it’s not really an improvement to begin with).
The other issue is cost. I would argue this is overblown: even if you pay a private company to do the sequencing, the total cost runs to about $90 per isolate (and that’s retail–bulk customers usually get discounts these days). But as I’ve discussed regarding sequencing technologies, there’s a difference between price and cost: the real cost of running PFGE on a bunch of strains is pretty close to MLST once labor costs and throughput times are considered.
All these strains and no idea what they are.
We are very cranky.
*From what I’ve been able to gather from multiple sources, the Karch laboratory at the University of Munster had this information a week before I ‘discovered’ it, but public health authorities went with the killer bacterium version. That was incompetence bordering on the criminal. On the other hand, it does demonstrate the power of simple sequence as a corrective to official idiocy.
Cited articles: Lindstedt, B.-A., Vardunda, T., & G. Kapperud. Multiple-Locus Variable-Number Tandem-Repeats Analysis of Escherichia coli O157 using PCR multiplexing and multi-colored capillary electrophoresis. Journal of Microbiological Methods 58: 213-222. doi:10.1016/j.mimet.2004.03.016
Ibekwe AM, Murinda SE, Graves AK, 2011 Genetic Diversity and Antimicrobial Resistance of Escherichia coli from Human and Animal Sources Uncovers Multiple Resistances from Human Sources. PLoS ONE 6(6): e20819. doi:10.1371/journal.pone.0020819
Goto, D.K., and T. Yan. 2011. Genotypic Diversity of Escherichia coli in the Water and Soil of Tropical Watersheds in Hawaii. Applied and Environmental Microbiology 77: 3988-3997. doi:10.1128/AEM.02140-10.