40,000 Large Genomes Per Year? Really?

So this story about the Smithsonian’s Global Genome Initiative makes my head all explody:

The plan is to eventually freeze embryos, seeds, and other genetic samples from as many of Earth’s life-forms as possible. The project will make use of the Smithsonian Institute’s biorepository, a 6,500-square-foot, $9 million storage facility that has space for more than 4.2 million tiny vials of cryogenically frozen tissue samples.

That’s a great thing to be doing! But then (boldface mine):

Since the first genome was sequenced (of Haemophilus influenzae, a bacteria that was mistakenly thought to cause the flu) in 1995, just 4,400 genomes have been sequenced, according to the Department of Energy’s Joint Genome Institute. “Non-charismatic forms of life, your spiders, marine invertebrates, crustaceans, flies—there’s almost nothing known about them as far as their genomes go,” said Coddington.

While it once cost six figures and took many years to fully sequence a genome, it can now be done in a matter of days for a few thousand dollars. That price is expected to fall to about $1,000 over the next several years—a cost low enough for the Smithsonian to start sequencing genomes itself, something that Coddington said was unimaginable even two years ago….

Coddington envisions a day when the process is nearly entirely automated. In the past, genome sequencing was an “artisanal” process, with specialists extracting genetic material by hand and sequencing it, he said. Now “we’re interested in using robotics and extracting whole genomic DNA from hundreds of species a day.”

Within five years, Coddington wants to have 200,000 genomes done, roughly one for every genus of life identified on Earth. That, he said, should “essentially preserve most of the diversity of the global genome.” From there, the team will decide whether it’s feasible to sequence all 1.9 million identified species. Scientists believe that less than 20 percent of Earth’s species have been described.

We’ve been through this before: it costs more than $5,000 to sequence large genomes. The price of running prepared DNA through a machine is a few thousand dollars per genome (depending on what the price covers), but that’s a small part of the overall cost. It’s also not clear that this cost will be dropping significantly any time soon (I think nanopore technologies will dramatically increase sequence quality, but I don’t think they’ll lower costs per nucleotide for large genomes).

There’s also some confusion in the way the article is written (I hope the Smithsonian isn’t convinced they can sequence 40,000 large genomes annually). I can see successfully extracting DNA from 40,000 samples per year. But sequencing 40,000 large genomes per year? It could be done, but not by one center (not even BGI…). They wouldn’t be very good genomes either–they would be very fragmented (though a bad genome is better than no genome I suppose).
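For anyone wondering where the 40,000 comes from, it is just the article's 200,000 genomes divided by five years. Here is a rough back-of-envelope sketch of that arithmetic in Python. The two per-genome prices are the article's hoped-for $1,000 and the $5,000 floor mentioned above. Spreading the work evenly over 365 days a year is my own simplifying assumption, and none of this is an actual budget from the Smithsonian.

```python
# Back-of-envelope arithmetic using only figures quoted in the post.
TARGET_GENOMES = 200_000       # the five-year goal quoted in the article
YEARS = 5
SEQUENCED_SINCE_1995 = 4_400   # JGI count quoted in the article

# Per-genome cost in USD. The $5,000 figure is a floor ("more than $5,000"),
# so the totals computed from it are lower bounds, not estimates.
COST_PER_GENOME = {
    "article_projection": 1_000,
    "post_lower_bound": 5_000,
}

genomes_per_year = TARGET_GENOMES // YEARS

# Assumption: work is spread evenly over 365 days, with no downtime.
genomes_per_day = genomes_per_year / 365

# How many times the entire 1995-to-date output would be needed each year.
multiple_of_all_prior_work = genomes_per_year / SEQUENCED_SINCE_1995

annual_cost = {
    label: cost * genomes_per_year for label, cost in COST_PER_GENOME.items()
}

print(f"Genomes per year: {genomes_per_year:,}")
print(f"Genomes per day:  {genomes_per_day:.0f}")
print(f"Multiple of all genomes sequenced so far: {multiple_of_all_prior_work:.1f}x per year")
for label, cost in annual_cost.items():
    print(f"Annual sequencing cost ({label}): ${cost:,}")
```

Under those assumptions the goal works out to roughly 110 genomes every single day, or about nine times the total number of genomes sequenced since 1995, each year. Even at the optimistic $1,000 price that is $40 million a year for sequencing alone, and at least $200 million a year at the $5,000 floor.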

I like the Global Genome Initiative, but I hope no one is selling the Smithsonian a bill of goods about what can be accomplished.

This entry was posted in Genomics, Museums etc.

3 Responses to 40,000 Large Genomes Per Year? Really?

  1. “[A] Bad genome is better than no genome I suppose.” Hmmm, depends how bad. The false conclusions drawn from the pie-chart analysis would likely be a waste of money and hinder those of us trying to secure the funds to do it properly.

  2. Gene Doctor says:

    Seems to me that sequencing a genome and analyzing a genome are two separate things. Does this price include all of the digital analyses, finding the genes buried within the genome? Or is this just exome sequencing? Interesting nonetheless.

  3. I think you can get ~80% of the basic info of interest from a fragmented genome, quickly and easily assembled from short-insert Illumina shotgun. This would already be immensely valuable, especially if paired with mRNAseq (which is frankly more useful at this point in time). But euk genomes are hard to assemble into long contigs/scaffolds with just Illumina sequencing, which is what they must be proposing to do.
