And How Does a Human Make Sense of NYC’s Released Test Scores?

Yesterday, I wrote about the problems surrounding New York City’s release of teacher scores (more information here). Well, I briefly looked at the NY Times website, which to its credit included error terms (not that most people will have an idea what a standard deviation is…), and I think most people are going to be confused. I randomly chose a school, J.H.S. 143 Eleanor Roosevelt, and looked at some of the teacher data. Several teachers teach different grade levels, so it was possible to get multiple estimates of teacher performance. Here’s one teacher, who teaches sixth and eighth grade math:


The number in the graphic is the percentile ranking of the teacher based on how much their students improved over the year. The three numbers below the number of students are expressed in terms of standard deviations above or below the citywide mean. Pretty darn good, especially this! Then we look at this eighth grade class:


Uh-oh. We see the same pattern with another teacher, who also appears to be performing worse this year than other years. Sixth grade:


Eighth grade:


This isn’t confined to math either. English instruction seems to vary between grades too.

The point isn’t to call these teachers out, but to highlight just how variable these scores can be, from year to year and from cohort to cohort: it appears that most teachers at this school did worse this year than in previous years. In other words, not all entering classes are alike (and they probably differ among schools too).
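To get a feel for how much a percentile ranking can move on cohort differences alone, here is a rough simulation sketch. Everything in it is an assumption made up for illustration: 200 hypothetical teachers who are all equally effective, 25 students per class, and student gains drawn from a normal distribution with a standard deviation of 1. It is not the city's value-added model and uses none of the released data.

```python
import random
import statistics

def simulate_percentiles(n_teachers=200, class_size=25, n_cohorts=2, seed=0):
    """Percentile ranks per cohort when every teacher has the SAME true effect.

    All parameters are hypothetical; only the students differ between cohorts.
    """
    rng = random.Random(seed)
    ranks_by_cohort = []
    for _ in range(n_cohorts):
        # Each teacher's score is the mean gain of one class: pure student noise.
        means = [
            statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(class_size))
            for _ in range(n_teachers)
        ]
        # Percentile rank: position in the sorted list, scaled to 0-100.
        order = sorted(range(n_teachers), key=lambda t: means[t])
        pct = [0.0] * n_teachers
        for position, teacher in enumerate(order):
            pct[teacher] = 100.0 * position / (n_teachers - 1)
        ranks_by_cohort.append(pct)
    return ranks_by_cohort

# Treat the two cohorts as the same teachers' sixth and eighth grade classes.
sixth, eighth = simulate_percentiles()
swings = [abs(a - b) for a, b in zip(sixth, eighth)]
print(f"median swing between cohorts: {statistics.median(swings):.0f} percentile points")
```

In this toy setup any gap between a teacher's two rankings is noise by construction, since the teachers are identical. Real scores mix actual teaching differences with this kind of cohort noise, and the released numbers alone don't tell a parent how much of each they are looking at.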

And how is a parent supposed to figure out if a teacher is a ‘good teacher’ with this kind of variability?

As Bill Gates pointed out, only an idiot would use personnel evaluations this way.

This is not going to improve the quality of teaching at all.

An aside: I’ll go out on a limb and speculate that if the yearly data were released, then we would see just how variable these scores really are.

Related post: Reign Of Error: The Publication Of Teacher Data Reports In New York City


1 Response to And How Does a Human Make Sense of NYC’s Released Test Scores?

  1. bluefoot says:

    One of my siblings teaches in the NYC public schools, and she was drilling down into some of the details of the scores as she knew them. For instance, she knows teachers who had radically different scores when broken down by class period: same teacher, same lesson plan on the same days, but different students and time of day.
    Another problem is that many of the best teachers end up with the problem students (behavioral or otherwise) over the course of the school year. That is, the problem students get transferred to a new teacher. So by the February break, some classrooms are heavily weighted with students who severely disrupt the entire class, which of course affects all the students and the teacher’s efficacy.
    Another example is that one year one of the 3rd (?) grade tests was comparatively easy and the students all appeared to do well. However, when they got to 4th and 5th grade, the “value added” scores were artificially low because they had performed so well as 3rd graders.
    She said that as far as she knew, the data does not take these sorts of factors into account, with the caveat that she hasn’t had time to look too closely at it, being occupied with things like lesson plans, teaching, getting her report card scores in, etc.

    I learned yesterday that she pays for all the photocopies of information and homework sheets for her classes. Apparently her school only gets a budget for a certain number of copies every year, which comes nowhere near covering the school’s needs. Since her husband has a good job, my sister pays out of pocket so that her students get what she feels they need and other teachers with less money can use the photocopy budget. WTF?!?
