Some Concerns About the LA Times’ Attempt at Teacher Evaluation

The LA Times has taken it upon itself to rate school teachers in Los Angeles. To do this, the LA Times has adopted the ‘value-added’ approach (italics mine):

Value-added analysis offers a rigorous approach. In essence, a student’s past performance on tests is used to project his or her future results. The difference between the prediction and the student’s actual performance after a year is the “value” that the teacher added or subtracted.
For example, if a third-grade student ranked in the 60th percentile among all district third-graders, he would be expected to rank similarly in fourth grade. If he fell to the 40th percentile, it would suggest that his teacher had not been very effective, at least for him. If he sprang into the 80th percentile, his teacher would appear to have been highly effective.
Any single student’s performance in a given year could be due to other factors — a child’s attention could suffer during a divorce, for example. But when the performance of dozens of a teacher’s students is averaged — often over several years — the value-added score becomes more reliable, statisticians say.
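As a rough sketch of the calculation described in that passage (the records and names below are made up for illustration, and the naive "same percentile as last year" projection is an assumption taken from the quoted example, not the paper's actual model):

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (teacher, prior-year percentile, current-year percentile).
records = [
    ("Teacher A", 60, 80),
    ("Teacher A", 40, 50),
    ("Teacher A", 70, 65),
    ("Teacher B", 60, 40),
    ("Teacher B", 30, 35),
    ("Teacher B", 85, 75),
]

def value_added(records):
    """Average (actual - predicted) percentile per teacher.

    Assumption: the prediction is simply last year's percentile.
    """
    gains = defaultdict(list)
    for teacher, prior, current in records:
        predicted = prior  # naive projection: the student stays where he was
        gains[teacher].append(current - predicted)
    return {teacher: mean(g) for teacher, g in gains.items()}

scores = value_added(records)
for teacher, score in scores.items():
    print(f"{teacher}: {score:+.1f} percentile points")
```

Each teacher's score here is just a mean of per-student differences, so everything that follows depends on how trustworthy those differences are.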

While I laud the attempt to approach this issue quantitatively, I have serious doubts about their methods (Note to LA Times: methodological issues don’t make an approach “controversial”; they can make it wrong). Let us count the ways:

1) Using percentiles. This makes everything zero sum. Take the example given of a student who moves from the sixtieth percentile to the eightieth. If the students previously ranked 61-80 remain where they are, they all get shifted down one percentile. I think the raw scores should be used instead. Which leads to…
2) Using percentiles, part deux. We really don’t know what percentiles actually mean. How much better in raw score (or some adjusted score if you want to weight questions) is the sixtieth percentile versus the fiftieth? I doubt it’s linear. A decrease from 60th to 50th could be a very small real decrease. Put another way, a supposedly whopping twenty-point increase might not be very impressive at all. We simply can’t tell without the underlying scores.
3) Using percentiles, the Lake Wobegon Edition. I’m not a fan of the “ninety percent of our students finish in the top ten percent of their class” philosophy. But the idea that a student who did relatively poorly last year should, if all teachers teach equally well (and they all could be doing a good job), continue to do poorly the next year seems defeatist, even by my standards. Talk about the soft bigotry of low expectations. With a standard curriculum, which the LA Times claims there is, there is an upper boundary effect. If teachers, regardless of where students start, are getting most of their pupils pretty close to where they need to be, then the variation, particularly at the upper end, will largely be wobble and noise, not teacher quality.
4) Brother, can I get an R² in here? The method described averages the student gains (which contain the aforementioned problems) for each teacher. Allow me to give you the short version of what I think about this approach:


The longer version is that we want to capture that variability. Ideally, for each student, there would be associated demographic information (just because students go to the same school doesn’t mean they come from identical backgrounds); if not, we would like to know something about the school the student attends (e.g., is it a ‘poor’ school?). Even without that information, we would like to know how important teacher differences are relative to the total variability (and if possible, other factors too). If teacher quality accounts for little of the variation, then, even if teacher effects are significant, that’s probably not what we should be focusing on.
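One simple way to put a number on "how important teacher differences are relative to the total variability" is the R² of a one-way breakdown: the between-teacher sum of squares divided by the total sum of squares. The sketch below uses invented gains and a hypothetical helper name; it ignores demographics and school effects entirely, and with small classes this in-sample share will tend to flatter the teacher effect.

```python
from statistics import mean

def teacher_share_of_variance(gains_by_teacher):
    """R^2 of a one-way model: between-teacher SS divided by total SS."""
    all_gains = [g for gains in gains_by_teacher.values() for g in gains]
    grand_mean = mean(all_gains)
    ss_total = sum((g - grand_mean) ** 2 for g in all_gains)
    ss_between = sum(
        len(gains) * (mean(gains) - grand_mean) ** 2
        for gains in gains_by_teacher.values()
    )
    return ss_between / ss_total

# Made-up student gains: the two teachers' averages differ (+1 vs. -1),
# but students vary far more within each classroom than between them.
gains_by_teacher = {
    "Teacher A": [4, -6, 8, -2],
    "Teacher B": [-3, 7, -9, 1],
}

r_squared = teacher_share_of_variance(gains_by_teacher)
print(f"Share of variance associated with teacher: {r_squared:.1%}")
```

In this toy data the teacher averages differ, yet the teacher label accounts for only about 3% of the variation in gains, which is the situation the averaging approach cannot reveal on its own.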

If the data, including the raw scores, are released to the public, this will be a very interesting exercise. As currently constructed, though, I’m concerned the shoddy approach could do more harm than good. Admittedly, I’m biased: if teacher quality were so obvious, we would have found it by now. The link between poverty and educational performance hits you between the eyes like a two-by-four, but teacher effects are pretty weak.
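With raw scores in hand, the percentile worry from points 1 and 2 could be checked directly. Until then, here is a minimal sketch under a loud assumption: that raw scores are roughly bell-shaped, with a made-up mean of 300 and standard deviation of 50. Real test-score distributions, especially ones with a ceiling, may look quite different.

```python
from statistics import NormalDist

# Assumption for illustration only: raw scores are roughly bell-shaped,
# with an invented mean of 300 and standard deviation of 50.
scores = NormalDist(mu=300, sigma=50)

def raw_gap(low_pct, high_pct, dist=scores):
    """Raw-score points separating two percentiles under the assumed distribution."""
    return dist.inv_cdf(high_pct / 100) - dist.inv_cdf(low_pct / 100)

gap_middle = raw_gap(50, 60)  # ten percentiles near the median
gap_upper = raw_gap(85, 95)   # ten percentiles near the top

print(f"50th -> 60th: {gap_middle:.1f} raw points")
print(f"85th -> 95th: {gap_upper:.1f} raw points")
```

Under that assumption, the same ten-percentile move takes more than twice as many raw points near the top as it does near the middle, so equal percentile changes need not mean equal changes in what students actually know.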
I suggest California just copy Massachusetts’ curriculum and funding levels and see how that works. Or we can embrace untested ideas. Because it’s not like kids matter or anything.


14 Responses to Some Concerns About the LA Times’ Attempt at Teacher Evaluation

  1. Physicalist says:

    Excellent post.

  2. Mokele says:

    While under this plan, declines due to a kid’s folks getting divorced would average out, what about something that happens to all of the students at roughly the same time: puberty. A middle-school teacher (who may teach several grade levels of a particular subject, in my experience), is going to see all of their students decline because they’re all becoming hormonally unbalanced little maniacs compared to when they were in elementary school.

  3. becca says:

    I also dislike the percentages, but I see every bit as much of a problem using them in the first place: they make the students play the zero-sum game. Why not teachers?
    I also understand your bias. But do you really think it’s easier to fix poverty than it is to get good teachers?

  4. eNeMeE says:

    But do you really think it’s easier to fix poverty than it is to get good teachers?

    Likely not, but it seems easier to me* to fix some of the aspects of poverty that affect performance to achieve a measurable impact than get better teachers that provide the same impact.
    *meaning I pulled this out of my ass, based on the fact that I have had some teachers I would qualify as excellent that other people hated (and learned little from) and vice-versa while feeding kids decently is doable and measurable.

  5. Tim Bartik says:

    Simple value-added models such as what the LA Times is using assume: (1) that value added in a given grade is substantially under the control of the teacher, and is not seriously biased by other influences, and (2) in particular that student assignment to teachers is either random or is not correlated with value added. Education economist Jesse Rothstein has done some recent research that casts doubt on these propositions. For example, he finds that the FIFTH-grade teacher you are assigned is correlated with your FOURTH-grade “value added” gains. Different teachers get assigned different types of students. Furthermore, there is a strong indication in his data of reversion to the mean in test score levels. That is, students who have high value-added gains in 4th grade tend to have lower value-added gains in 5th grade. Therefore, 5th-grade teachers who for whatever reason happen to get assigned a great many students who had high value-added gains in 4th grade will tend to have lower value-added gains. These lower value-added gains are obviously not solely due to the 5th-grade teacher. I’m not sure what your policy is on links, but if you go to Google Scholar and enter “Jesse Rothstein” and “value added”, you will find links to several of his papers, including a 2010 paper in the Quarterly Journal of Economics, one of the top econ journals. At the very least, these findings suggest that measuring teacher quality is more complicated than simply measuring average test score gains.

  6. rijkswaanvijand says:

    The major problem isn’t malnutrition though;
    Poor children mostly have undereducated parents who never had a fair chance at a decent education themselves; as a result, they lack basic educational support at home, which in turn makes them likely to end up like their parents.
    It’s a vicious circle of cultural poverty.

  7. Morgan says:

    Excellent analysis of value-added approach. What’s most concerning to me, however, is the effect posting teacher scores online will have. The LA Times qualifies their results on page A27 of the Sunday, Aug. 15 edition:
    “Value-added ratings reflect a teacher’s effectiveness at raising standardized test scores. As such, they capture only one aspect of a teacher’s work and, like any statistical analysis, they are subject to inherent error.”
    Does the LA Times really believe that the average parent looking at their child’s teacher’s score will take into account those qualifiers? In fact, according to that statement, the value-added ratings only reflect a teacher’s ability to teach to the test. Is this really how we want our children educated?

  8. eNeMeE says:

    The major problem isn’t malnutrition though;
    Poor children mostly have undereducated parents who never had a fair chance at a decent education themselves; as a result, they lack basic educational support at home, which in turn makes them likely to end up like their parents.
    It’s a vicious circle of cultural poverty.

    Yeah, but what I was getting at was the part where “teacher effects are pretty weak”. Even though malnutrition isn’t the only factor of poverty influencing results, it’s easier to identify and fix and may produce larger gains than improving teachers.
    (my ass hurts from all the stuff I’m pulling from it)
    So I’m thinking that there are things that can be addressed to lower the impact of poverty in order to more easily achieve higher gains than getting better teachers (which is, clearly, really damn hard).

  9. Mokele says:

    becca said: “But do you really think it’s easier to fix poverty than it is to get good teachers?”
    It’s not an issue of “easy”, it’s an issue of whether it’ll work at all. IMHO, better teachers won’t accomplish much, maybe raise things a few percentage points, tops. As is so often the case, the *real* solution which would lead to major improvement is also the most difficult, expensive, and time-consuming. We need to stop looking for quick fixes and band-aids, and start addressing the real issues.

  10. hibob says:

    Mike: complaint #4)
    They do mention some of that variance (well, relative, not absolute), but it looks like they didn’t have access to demographic data for this study.

    Although many parents fixate on picking the right school for their child, it matters far more which teacher the child gets. Teachers had three times as much influence on students’ academic development as the school they attend. …
    • Many of the factors commonly assumed to be important to teachers’ effectiveness were not. Although teachers are paid more for experience, education and training, none of this had much bearing on whether they improved their students’ performance.
    Other studies of the district have found that students’ race, wealth, English proficiency or previous achievement level played little role in whether their teacher was effective.

    The data analyst’s name is at the end of the article; maybe there will be a formal version of the study we could look at? I wouldn’t be surprised if her analysis was heavily dumbed down by the time it made it past the editor.
    Tim #2:

    (2) in particular that student assignment to teachers is either random or is not correlated with value added.

    in the article they note that assignment might not be random: “Some students landed in the classrooms of the poorest-performing instructors year after year”, but miss on the implications. I’d think the biggest problem with assignment is with highly disruptive students, the ones who can throw scores for an entire class for an entire year. Most teachers I’ve talked to say they can handle one per class, but that things really break down with two or more. They also say the number of “hand grenades” a teacher is assigned is often a function of the teacher’s relationship with the principal …

  11. A. Leahy says:

    The term “value added” is most commonly used as an economic term. As an English professor, I have concerns when we talk about and treat students–people–as commodities or, worse, as products of our making. While benchmarks, standards, and evaluation can be helpful, students are not merely raw materials (like rubber) that we transform into useful goods (like tires). One of my concerns is that this view inadvertently encourages institutions to leave out (or leave behind so that they stop being counted) that material that’s perceived to be poorer quality (or economically poorer, as other commenters here point out) or less easily made into tires. On a practical level for teachers, the pressure to raise test scores likely curtails flexibility, creativity, and spontaneous learning. Innovation is important for students to learn, too.
    I think DC is already using a system of value-added learning in teacher evaluation and is considering making scores public. Meanwhile, former Assistant Secretary of Education Diane Ravitch, one of the proponents of the rise of assessment and No Child Left Behind, has questioned those accountability and high-stakes testing policies she once championed and has come to the view that economic circumstances, not teachers, are the greatest predictor of academic performance.

  12. joemac53 says:

    Mike, maybe you haven’t heard that districts in Massachusetts are looking into this approach, using MCAS scores as baseline data. My district (from which I retired June 30) was selling this at the last batch of faculty meetings. The info is on the DOE website. Beware of misuse of data!
    I get sincerely riled up with administrators who try to quantify teacher quality using cookbook stats that they think they understand. (I tried to be an administrator for a few years, but I had to get back into the classroom to save my sanity. I also contend that administrative meetings gave me cancer.)

  13. Jason Felch says:

    You can find a technical paper on our methodology at
    Jason Felch
    LA Times

  14. Nelson says:

    The value-added methodology makes me think of that old quote, “For every complex problem, there is a solution that is simple, neat, and wrong.”
    I’d be interested to know if the v-a methodology takes into account that California standard curriculum introduces new material to students in the odd grades (1st, 3rd, 5th, etc) and (more or less) repeats the material in the even grades, making it easier to appear to be effective.
