Incentives And Closed Data

So NEJM, which is not aware of all data sharing traditions, published a very silly opinion piece/proposal about sharing of data from clinical trials. While the entire article appears to have fallen out of the stupid tree and hit every branch on the way down, this is what the authors conclude (boldface mine):

In summary, we recommend that the ICMJE come together with trialists and other stakeholders to discuss the potential benefits, risks, burdens, and opportunity costs of its proposal and explore alternatives that will achieve the same goals efficiently. Moreover, we recommend modifying the proposal as follows. First, the timeline for providing deidentified individual patient data should allow a minimum of 2 years after the first publication of the results and an additional 6 months for every year required to complete the study, up to a maximum of 5 years. Second, to enhance readers’ confidence in published data, an independent statistician should have the opportunity to conduct confirmatory analyses before publication of an article, thereby advancing the ICMJE’s stated goal of increasing “confidence and trust in the conclusions drawn from clinical trials.” Finally, persons who were not involved in an investigator-initiated trial but want access to the data should financially compensate the original investigators for their efforts and investments in the trial and the costs of making the data available.

We’ll return to the boldface part in a bit, but that wasn’t even the most ridiculous part. This was:

We believe that once data are released for public use after the appropriate interval, the deidentified trial data should be housed either in a reliable third-party data repository or at the trialists’ center. Whoever hosts the data will need to implement mechanisms to manage data requests in a timely and fair manner, avoid duplication of efforts, and ensure that such analyses are accurate and not conducted with the aim of inappropriately undermining the original findings.

If the data were generated with federal funding (e.g., NIH), then they are not your data. I appreciate the need to let the data generators get first crack at publication. But you can’t hold on to data–and you certainly can’t prevent other groups from evaluating your results for years–simply to fill out your CV.

While this proposal is horseshit, it does raise one issue: as long as researchers are rewarded solely for manuscript publication and not for data collection and generation, this will continue to be a problem. Funders have to reconfigure incentives–which is fancy-speak for award grants–to reward researchers for the data collection and generation parts of science, especially when those data will be made public.

