Somehow, I think many of the commentators who routinely bash teachers would become violent if subjected to assessments like these in Palm Beach, FL (boldface mine):
Five (or more) times in the school year my principal does classroom observations. There are three different kinds: one formal, forty-minute lesson observation; two 5–15 minute informal observations; and two 30-second to 2-minute walkthroughs. She evaluates me using the Robert Marzano Menu of Design Questions. There are about 60 specific behaviors within 4 different domains, each with 3-7 components, that she is looking for during those observations.
Each of the 60 behaviors is then graded on a scale of: not using, beginning, developing, applying, and innovating. The evaluator, after marking off the components, then decides how to grade the behavior. Comments, if appropriate, are also added (as in: "All of the students were actively engaged in the lesson"). At the end of each observation I get an email directing me to approve the observation.
At the end of the year, the grades for each behavior are calculated to determine if the teacher is: Highly Effective (3.2 – 4.0), Effective (2.1 – 3.1), Developing (1.2 – 2.0) or Unsatisfactory (1.0 – 1.1).
As Campbell's Law reminds us: "The more any quantitative social indicator (or even some qualitative indicator) is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor."
Moving along… the teacher describing this evaluation regime discovered that, after being rated as Highly Effective two years in a row (though it didn’t count, as it was a pilot program those years), she suddenly dropped to ‘Effective’ (boldface mine):
This year’s evaluation has come back and I am now graded as Effective with an overall score of 3.0. I was wondering how I dropped from Highly Effective to Effective, so I started looking more closely at the numbers. Here is what I saw:
I was marked as Innovating (4.0) for 12/31 behaviors
I was marked as Applying (3.0) for 19/31 behaviors.
I had no lower marks than that.
Now in my world of calculating scores, I would multiply 4 x 12 = 48 and 3 x 19 = 57, then add them together: 48 + 57 = 105, then divide by 31, which equals 3.39. 3.39 is Highly Effective, but I was graded as 3.0 – Effective. Hmm. I called my union rep and she was not sure how that could be. She also, for what it’s worth, had a similar score drop….
If 50% or more of your marks are Highly Effective, then you are Highly Effective; if 50% or more are Effective, then you are Effective, and so on….
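To make the discrepancy concrete, here's a short sketch (my own illustration, not the district's actual formula or code) comparing the teacher's straightforward weighted average against the 50%-majority rule described above:

```python
# The teacher's marks, per the post: Innovating (4.0) x 12, Applying (3.0) x 19
marks = {4.0: 12, 3.0: 19}
total_behaviors = sum(marks.values())  # 31

# Method 1: weighted average, the natural way to compute a score
weighted_avg = sum(score * n for score, n in marks.items()) / total_behaviors
print(round(weighted_avg, 2))  # 3.39 -> "Highly Effective" band (3.2-4.0)

# Method 2: 50%-majority rule -- the highest mark held by at least
# half the behaviors determines the overall rating
for score in sorted(marks, reverse=True):
    if marks[score] / total_behaviors >= 0.5:
        print(score)  # 3.0 -> "Effective"
        break
```

Same marks, two methods, two different ratings: 12/31 (39%) Innovating falls short of the 50% threshold, so the 19/31 (61%) Applying marks carry the day and the teacher lands at 3.0.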
What am I NOT doing now that I did then? It turns out that I am, in fact, doing the same things. I was marked as doing the same exact components of behaviors this year as I was last year. The difference is, last year I was rated as innovating more times. So, for example, let’s say Behavior A has 6 components. Last year, when I was checked off as meeting all 6, I was deemed innovating. This year those 6 components checked off are only earning me applying.
It appears teachers are being graded on a curve:
The principals and assistant principals were told that they were giving out too many innovatings and that they needed to mark innovating less often. In other words, the evaluation that is supposed to determine our level of teaching, which in turn determines our merit pay (no, we don’t really get merit pay. we’re supposed to, but that’s a whole different – let’s lie to the people of Florida – nightmare) is being manipulated by the powers that be in an effort to…I don’t know…make it seem like teachers aren’t as good as we are. So they can pay us less and blame us more. The powers that be are doing to the teachers what the high stakes testers are doing to our students: creating a system that is skewed to failure (or mediocrity).
Anyone who has ever written a federal grant proposal knows just how subjective (or ridiculous) assessments of ‘innovation’ can be*. Yet the august solons of the Florida Legislature have decreed that these ‘objective’ scores will be released to the public. None of the reformers or their pundit enablers would ever willingly subject themselves to this kind of evaluation at their workplaces. Given the mediocre track records of many reformers, if they were, they would be recognized as the mediocrities and failures they are.
Reformist rhetoric notwithstanding, this reality is why so many people hate testing and teacher evaluation.
*And once a teacher routinely incorporates an ‘innovation’, is it still ‘innovative’? The mind boggles.