A while ago, I argued that the way genomics can shake its money maker is to get involved in microbial epidemiology, especially bacterial epidemiology, for the simple reason that the genome sequence itself is the clinical intervention. If you can tell the infectious disease specialist at a hospital (or at a hospital network, or a local public health office), “Last month you had two of this clone, and this month you have ten of the same clone–you might be experiencing the beginnings of an outbreak”, that is useful information. Preventive measures can be taken.
Importantly, this is already being done, but with mid-twentieth century methods (albeit automated) that have poor resolution, typically only down to the species level.
The question is: what does “the same clone” mean? This isn’t a ‘how many angels can you fit on the head of a pin?’ question (besides, the answer is 433,234. Everybody knows that). There is a technical issue and a biological one.
When we ‘call SNPs’–that is, look for individual nucleotide changes (e.g., an “A” to a “T”), which are called single nucleotide polymorphisms or ‘SNPs’–we typically, even with the best methods, have an error rate somewhere between one false call per 1 million bases and one per 2.5 million bases. And that’s under optimal conditions: we’ve decided not to look at hard-to-call regions of the genome, we have dialed in the best parameters, and we have a good reference genome for comparison.
This is pretty good until you realize that, if you sequence the same bacterial isolate multiple times, you probably will get 1-3 ‘unique’ differences each time–and these are all false. Problem. Now, there are things we can do to rule out some, maybe even most, of the false SNPs, but I don’t think they’re feasible in a high-throughput clinical setting (research projects are a different matter–we can identify and fix those).
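To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 3-million-base callable genome is an illustrative assumption (bacterial genomes vary in size, and the callable portion is smaller than the whole), and the function name is made up for this example. The calculation is simply callable bases divided by bases per error.

```python
def expected_false_snps(callable_bases, bases_per_error):
    """Expected number of false SNP calls when sequencing one genome once."""
    return callable_bases / bases_per_error


# Assumption: an illustrative callable genome length, not a measured value.
CALLABLE_BASES = 3_000_000

# The two ends of the error-rate range discussed above.
for bases_per_error in (1_000_000, 2_500_000):
    n = expected_false_snps(CALLABLE_BASES, bases_per_error)
    print(f"1 error per {bases_per_error:,} bases: about {n:.1f} false SNPs per genome")
```

Under that assumption the expected count lands between roughly one and three false SNPs per genome, which is the same order as the handful of ‘unique’ differences you see when resequencing a single isolate.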
So accuracy is one problem. But then there’s the biology too (stupid biology!).
It’s hard to know what “the same clone” means. In long-term infections (e.g., here) that last for weeks, multiple SNPs can accumulate within the same patient. Not many, mind you, but enough that it becomes difficult to tell apart three things: SNPs that evolved within a patient (if the infection lasted for a while), SNPs that evolved between patients or facilities, and SNPs found in strains that actually represent multiple origins for your outbreak.
Ultimately, for this to be useful to a clinician, we have to translate the genome data into “you have ten of this clone”, but, even with the best methods, what “this clone” means might not always be easy to determine.
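To make the difficulty concrete, here is a minimal sketch of one simple way to turn SNP differences into “clones”: link any two isolates whose SNP distance is at or below a cutoff, and call each linked group a clone (single-linkage clustering). This is a stand-in chosen for illustration, not a description of how any particular surveillance system works. The isolate names, the toy sequences, the cutoffs, and the function names are all made up.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count positions where two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def cluster_clones(isolates, threshold):
    """Group isolates by single linkage: link pairs with distance <= threshold."""
    parent = {name: name for name in isolates}

    def find(x):
        # Walk up to the group's representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(isolates, 2):
        if snp_distance(isolates[a], isolates[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in isolates:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=sorted)


# Made-up toy data: ten variable sites per isolate, not real genomes.
toy_isolates = {
    "A": "AAAAAAAAAA",
    "B": "AAAAAAAAAT",  # 1 SNP from A
    "C": "AAAAAAAATT",  # 2 SNPs from A, 1 from B
    "D": "TTTTTAAAAA",  # 5 or more SNPs from everything else
}

for threshold in (0, 1):
    clones = cluster_clones(toy_isolates, threshold)
    print(f"cutoff {threshold}: {len(clones)} clone(s)")
```

With a cutoff of 0, every isolate is its own clone; with a cutoff of 1, A, B, and C collapse into one. The clinician's “you have ten of this clone” depends on a number that has to absorb both the false SNPs and the real within-patient evolution, and there is no obviously right value for it.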
For the cognoscenti: I’m talking about routine surveillance akin to what clinical laboratories do with antibiotic resistance phenotypes. If you’re working with a small number of isolates from a single outbreak, you can usually figure this out (especially if you have some genomics researchers backing you up). But that’s not how routine ‘background’ surveillance systems work.