The Fragility of Scientific Infrastructure: The MLST Edition

In the aftermath of Hurricane (Tropical Storm?*) Sandy, NYU and other New York area institutions suffered significant damage to their research collections, including unique animal strains, highlighting the fragility of much of our scientific infrastructure. While popular treatments of science, such as movies and television, make it appear that scientists have these tremendous facilities, protected by massive steel vaults with multiple backups (all funded by copious amounts of money), the reality is the opposite. Often, it’s one researcher on a shoestring grant who is struggling to maintain an irreplaceable collection or resource or who has knowledge and expertise in a particular subfield. When that scientist loses her funding or retires (or dies), that bit of infrastructure vanishes for good.

Which brings us to some troubling news about MLST, which is short for MultiLocus Sequence Typing. MLST is a widely used bacterial typing system that involves sequencing small regions of bacterial genomes (usually seven regions, each about 500 nucleotides long). It allows us to group and name bacteria within a given species, so scientists can communicate with each other. During the 2011 German E. coli O104:H4 outbreak, MLST allowed researchers to realize that this was not a completely novel strain, but one that had been seen a decade earlier. It has a lot of advantages over many other methods too:

Contrast this with a sequenced-based method such as multilocus sequence typing (‘MLST’) where short segments of multiple genes are sequenced. Not only are these data portable, in that a sequence in one lab can be readily compared to a sequence generated in another lab, but MLST is independent of sequencing technology.
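To make the typing idea concrete, here is a minimal sketch in Python. In MLST, each distinct sequence at a locus gets an allele number, and each distinct combination of allele numbers gets a sequence type (ST). Everything specific below is an assumption for illustration: the sequences are toy six-letter strings rather than real ~500-nucleotide fragments, only three loci are used instead of the usual seven, and the allele and ST numbers are made up rather than taken from any curated scheme.

```python
# Hypothetical allele table: locus -> {fragment sequence: allele number}.
# Real fragments are ~500 nucleotides; these toy sequences are shortened.
ALLELES = {
    "adk": {"ATGCCG": 1, "ATGCTG": 2},
    "fumC": {"GGTACA": 1, "GGTACC": 2},
    "gyrB": {"TTGACA": 1, "TTGGCA": 2},
}
LOCI = tuple(ALLELES)  # fixed locus order used for every profile

# Hypothetical profile table: allele numbers in locus order -> sequence type.
PROFILES = {
    (1, 1, 1): 1,
    (2, 1, 2): 2,
}


def call_profile(fragments):
    """Return allele numbers in locus order; None marks an unknown allele."""
    return tuple(ALLELES[locus].get(fragments[locus]) for locus in LOCI)


def sequence_type(fragments):
    """Return the ST for a set of fragments, or None if the profile is new."""
    return PROFILES.get(call_profile(fragments))


# An isolate sequenced in any lab, on any platform, yields the same strings,
# and therefore the same profile and the same ST.
isolate = {"adk": "ATGCTG", "fumC": "GGTACA", "gyrB": "TTGGCA"}
profile = call_profile(isolate)
st = sequence_type(isolate)
```

A `None` from `sequence_type` is the case that needs a curator: either an allele or a combination of alleles has not been seen before, and someone has to verify it and assign it a number.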

There are websites that have codified these typing schemes and that confirm new MLST types. That, of course, requires money. So receiving this email from the UCC MLST center, which governs a widely used E. coli MLST scheme (as well as schemes for other organisms), is very upsetting (boldface mine):

These changes are in part stimulated by recent changes in the future prospects of the UCC MLST website….

The UCC MLST website has been funded indirectly by a grant from the Science Foundation of Ireland to myself. Curation has almost exclusively been performed by Vimal (E.coli) and myself (S. enterica). There has been little activity on Moraxella catarrhalis or Yersinia pseudotuberculosis, and this has required little effort for curation. That grant expires at the end of December, and an application for renewal has been rejected. As a result this website will no longer be hosted by UCC after some date in 2013, which is still undecided…..

I will contact alternative universities that may be willing to host these MLST websites. One possibility is to migrate them all to Oxford under the supervision of Keith Jolley and Martin Maiden. This would require volunteers who are willing to act as external curators. I am also exploring alternatives for the E. coli website, possibly affiliated with a university that is actively working on E. coli. If you wish to submit sequence traces that fulfil curator requirements for alleles that have not yet been accepted, I urge you to do so before mid-December. Otherwise, all alleles and STs that have not been accepted will be deleted at that date with no further notice. For those who wish to perform analyses of these data in the future, you would be well advised to ensure that you have downloaded the contents of the MLST websites in mid-December as well because we cannot guarantee how long the website will be active thereafter. However, I am optimistic that it will continue to be accessible from a different link in the future and will email you once more at that time.

My potential retirement endangers the future existence of the strain collections that I have established at UCC. In particular, we have a collection of 5,000 isolates in robotic friendly format from the classical Seeliger collection of Listeria monocytogenes and L. innocua (Haase et al., Environ Microbiol 13:3163-3171, 2011). MLST has been performed on over 1000 of those strains (Haase in prep.), and all of that data is publicly available on the Listeria MLST website at the Institut Pasteur, Paris. The strains will probably be maintained as frozen tubes at UCC (Colin Hill), but the robotic and IT infrastructure for accessing them and readily making subcultures will likely disappear as of 1 January….

A second collection that I have established here consists of >12,000 S. enterica subspecies enterica [MtMB: Salmonella], again all in robotically friendly format. This contains 5,000 isolates representing the diversity of serovars over all of Ireland, multiple isolates from rivers in Benin, Georgia (USA) and the river Thames in London. It includes reptile and environmental isolates from all over Europe and a large collection of isolates representing different PFGE patterns and multiple serovars from humans and reptiles in Taiwan… If others are potentially interested, and are willing to pay the costs for subculture and transport, please contact me. Again this would have to be done before mid-December.

Losing this resource would be devastating. Ironically, at Major Sequencing Center, we are incorporating MLST identification into our standard assembly process (we're working out the kinks now), and last week, I was talking to people at one of the NIAID-funded Bioinformatics Resource Centers about how to incorporate this information into their databases. A few months ago, the Center for Genomic Epidemiology released a free web service that determines the MLST type from a genome sequence (or just raw data).
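As a rough illustration of what typing from a genome sequence involves, the sketch below searches assembled contigs for known allele sequences on either strand. It is not the method used by the web service mentioned above or by any particular pipeline: it assumes exact string matches only, whereas real tools generally rely on sequence alignment, and the function names, contigs, and allele sequences are all hypothetical toy values.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string of A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def find_allele(contigs, known_alleles):
    """Return the number of the first known allele found in any contig.

    known_alleles maps allele number -> sequence. Both strands are checked.
    Returns None when no known allele matches exactly.
    """
    for number, allele in known_alleles.items():
        rc = reverse_complement(allele)
        if any(allele in contig or rc in contig for contig in contigs):
            return number
    return None


# Toy assembly: two short contigs (hypothetical).
contigs = ["TTTATGCCGAAA", "CCCTGTACCGGG"]

# Toy allele tables for two loci (hypothetical numbers and sequences).
adk_alleles = {1: "ATGCCG", 2: "ATGCTG"}
fumC_alleles = {1: "GGTACA", 2: "GGTACC"}

adk_call = find_allele(contigs, adk_alleles)    # forward-strand match
fumC_call = find_allele(contigs, fumC_alleles)  # reverse-strand match
```

Note that every call depends on the table of known alleles. Without a maintained, curated database behind it, a lookup like this has nothing to look up.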

The field of genomics is really starting to use this information–and as we continue to generate thousands of bacterial genomes every year, this will become all the more important. And we could lose a vital resource.

Storms are bad, but fiscal storms are just as devastating in their own way. And completely avoidable too.

*Last I heard, New York Senator Charles Schumer is trying to get the storm reclassified as a tropical storm, not a hurricane, so insurance companies will pay out claims.

