For a long time, I’ve been arguing that value-added measures (statistical claims about how much an individual teacher increases or decreases student test scores) don’t tell us much about individual teachers and should not be used. From a study so hilariously good even Matthew Yglesias has to accept its validity (quick, reposition the pundit!; boldface mine):
Estimates of teacher “value-added” suggest teachers vary substantially in their ability to promote student learning. Prompted by this finding, many states and school districts have adopted value-added measures as indicators of teacher job performance. In this paper, we conduct a new test of the validity of value-added models. Using administrative student data from New York City, we apply commonly estimated value-added models to an outcome teachers cannot plausibly affect: student height. We find the standard deviation of teacher effects on height is nearly as large as that for math and reading achievement, raising obvious questions about validity. Subsequent analysis finds these “effects” are largely spurious variation (noise), rather than bias resulting from sorting on unobserved factors related to achievement. Given the difficulty of differentiating signal from noise in real-world teacher effect estimates, this paper serves as a cautionary tale for their use in practice.
When you examine the tables, it’s really stunning how similar teacher effects on height are to their effects on student test scores. Maybe, once and for all, we’ll stop using the damn things to evaluate individual teachers, though school boards will probably be years behind the curve (SEE WHAT I DID THERE?).
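The underlying statistical point is easy to see with a toy simulation (this is my sketch, not the paper’s actual model, which fits regression-based value-added estimators to NYC administrative data): if you compute a naive “teacher effect” as each classroom’s mean outcome minus the overall mean, sampling noise alone produces a spread of apparent effects even when the true teacher effect is exactly zero.

```python
import random
import statistics

random.seed(42)

N_TEACHERS = 100
STUDENTS_PER_TEACHER = 25

# True teacher effect on height is zero by construction: every student's
# height is drawn from the same population regardless of teacher.
POP_MEAN_CM = 160.0
POP_SD_CM = 8.0

# Naive "value-added" estimate per teacher: classroom mean minus grand mean.
classroom_means = []
for _ in range(N_TEACHERS):
    heights = [random.gauss(POP_MEAN_CM, POP_SD_CM)
               for _ in range(STUDENTS_PER_TEACHER)]
    classroom_means.append(statistics.mean(heights))

grand_mean = statistics.mean(classroom_means)
effects = [m - grand_mean for m in classroom_means]

sd_effects = statistics.stdev(effects)
print(f"SD of estimated 'teacher effects' on height: {sd_effects:.2f} cm")
# Sampling noise alone gives an SD of roughly POP_SD_CM / sqrt(25) = 1.6 cm,
# even though no teacher has any effect on height.
```

With only 25 students per classroom, the noise-driven spread shrinks slowly (as one over the square root of class size), which is why real-world value-added estimates for individual teachers are so hard to separate from chance.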