David Leonhardt is all excited about the growing movement to measure the effectiveness of gummint programs!
And yet there is some good news in this area, too. The explosion of available data has made evaluating success – in the government and the private sector – easier and less expensive than it used to be. At the same time, a generation of data-savvy policy makers and researchers has entered government and begun pushing it to do better. They have built on earlier efforts by the Bush and Clinton administrations.
The result is a flowering of experiments to figure out what works and what doesn’t.
New York City, Salt Lake City, New York State and Massachusetts have all begun programs to link funding for programs to their success: The more effective they are, the more money they and their backers receive. The programs span child care, job training and juvenile recidivism. The approach is known as “pay for success,” and it’s likely to spread to Cleveland, Denver and California soon. David Cameron’s conservative government in Britain is also using it. The Obama administration likes the idea, and two House members – Todd Young, an Indiana Republican, and John Delaney, a Maryland Democrat – have introduced a modest bill to pay for a version known as “social impact bonds.”
The White House is also pushing for an expansion of randomized controlled trials to evaluate government programs. Such trials, Mr. Schuck notes, are “the gold standard” for any kind of evaluation. Using science as a model, researchers randomly select some people to enroll in a government program and others not to enroll. The researchers then study the outcomes of the two groups.
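The randomize-then-compare logic described above can be sketched in a few lines. This is a toy simulation with invented outcome numbers, not any real program's data; the point is just that random assignment lets the difference in group means estimate the program's effect:

```python
import random
import statistics

random.seed(42)

# Hypothetical pool of 200 applicants to a government program.
applicants = list(range(200))
random.shuffle(applicants)

# Random assignment: half are enrolled (treatment), half are not (control).
treatment = set(applicants[:100])

def simulated_outcome(person):
    """Invented outcome score: enrollees get a small average boost plus noise."""
    base = random.gauss(50, 10)
    return base + (3 if person in treatment else 0)

outcomes = {p: simulated_outcome(p) for p in applicants}

treat_mean = statistics.mean(outcomes[p] for p in applicants if p in treatment)
control_mean = statistics.mean(outcomes[p] for p in applicants if p not in treatment)

# Because assignment was random, the two groups differ (on average) only in
# program enrollment, so the gap in means estimates the program's effect.
print(f"treatment mean:   {treat_mean:.1f}")
print(f"control mean:     {control_mean:.1f}")
print(f"estimated effect: {treat_mean - control_mean:.1f}")
```

The estimate will bounce around the true effect (here set to 3) because of noise, which is why sample size and honest measurement both matter.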
In principle, we like TEH SCIENTISMZ! when it comes to policy assessment (we do not, however, like social impact bonds). But in practice, the outcomes have been less than stellar, whether in education, nuclear defense (?!?), or cardiology. For a recent horror story, read this piece by Rachel Aviv about the testing scandal in Atlanta.
When people’s jobs depend on these outcomes, there will be the temptation to fudge the numbers (or flat out cheat). In both the Atlanta scandal and the Air Force case, the measurements used were viewed as illegitimate, which lent legitimacy to the cheating (in the Atlanta case, some saw it as protecting children). Does this mean we shouldn’t assess these programs? Not necessarily, but we should be very cautious in interpreting the results (and I’m sure that will happen). Understanding the limitations of the data will be vital, or else a lot of damage will be done, often to those who can least afford it.