Measuring Teacher Effectiveness
Something I've been thinking a lot about lately is the idea of linking test scores to teacher evaluation. It's a topic that's everywhere this summer:
- The EFF school report card makes the point.
- It's also the heart of Arne Duncan's remarks at the NEA Representative Assembly.
- The ASCD put out a newsletter recently on union/management collaboration; tying test scores to evaluation was a big part of it.
- And of course, there's the state-level hijinks going on vis-a-vis HB2261, the State Board of Education, and the PESB. Wheeee!
Last year, for one of my Master's classes, I dug into testing data I had on hand for the first grade team in my building. These are real numbers and real averages with real kids behind them; the test in question is the Measures of Academic Progress, from the Northwest Evaluation Association.
Teacher A: In the fall, her class had an average score of 162.5 on the MAP. In the spring the class average rose to 184.3, an average gain of 21.8 points.
Teacher B: Her fall average was 164.7; her spring average, 183.85, for an increase of 19.15 points.
Teacher C: 169.05 in the fall, 189.35 in the spring, so an average gain of 20.3 points.
Teacher D: An average score of 155.30 points in the fall and 174.85 in the spring. Her fall-to-spring gain, then, was 19.55 points.
With this data, then, you could argue the case for two different teachers as the "winners" in the group. If you look at the average gain, Teacher A is your champion:
- Teacher A: 21.8 points
- Teacher C: 20.3 points
- Teacher D: 19.55 points
- Teacher B: 19.15 points
If you look at the spring class average instead, Teacher C comes out on top:
- Teacher C: 189.35
- Teacher A: 184.3
- Teacher B: 183.85
- Teacher D: 174.85
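For anyone who wants to check my arithmetic, the two rankings above can be reproduced with a few lines of Python. This is only a sketch: the averages are copied from the figures in this post, and the variable names (`class_averages`, `gains`, `by_gain`, `by_spring`) are just labels I made up for illustration.

```python
# Fall and spring class averages on the MAP, copied from the figures in this post.
class_averages = {
    "Teacher A": (162.5, 184.3),
    "Teacher B": (164.7, 183.85),
    "Teacher C": (169.05, 189.35),
    "Teacher D": (155.30, 174.85),
}

# Average gain = spring average minus fall average.
# Rounding to two places hides floating-point noise in the subtraction.
gains = {
    teacher: round(spring - fall, 2)
    for teacher, (fall, spring) in class_averages.items()
}

# Ranking 1: largest fall-to-spring gain first.
by_gain = sorted(gains, key=gains.get, reverse=True)

# Ranking 2: highest spring class average first.
by_spring = sorted(
    class_averages,
    key=lambda teacher: class_averages[teacher][1],
    reverse=True,
)

print("By average gain:")
for teacher in by_gain:
    print(f"  {teacher}: {gains[teacher]} points")

print("By spring average:")
for teacher in by_spring:
    print(f"  {teacher}: {class_averages[teacher][1]}")
```

Same four classrooms, same eight numbers, and the order changes depending on which column you sort by.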
But we have to dig even deeper before making a statement about teacher quality, because here the raw numbers aren't telling the whole story.
In the fall, the average score for this test is 164 points. In the spring, the average score is 178. Knowing that, here's some new data to chew on.
In Teacher A's room in the fall, 10 kids scored in the below average range. In the spring, 6 kids scored below average.
In Teacher B's room, 7 kids were below average in the fall, while 3 were below average in the spring.
In Teacher C's room, 6 kids were below average in the fall, and 3 in the spring.
In Teacher D's room, 16 kids were below average in the fall, and 6 tested below average in the spring.
With this new information, you can make two new arguments. First, Teacher B is your best teacher because she had more of her kids cross the finish line (the goal score, 178) than the other teachers did. You could also argue that Teacher D is your best teacher because she lowered her percentage of kids who were below standard more than any of the other teachers did.
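Here is the same kind of quick check for the below-average counts. A caveat: class sizes aren't listed in this post, so this sketch works only with the raw counts. It can't compute what percentage of each class was below average, or how many kids in each room actually reached 178. The names (`below_average`, `reductions`, `share_moved`) are made up for illustration, and `share_moved` is just one possible way to read "turned around."

```python
# Kids scoring below average in (fall, spring), copied from the counts in this post.
below_average = {
    "Teacher A": (10, 6),
    "Teacher B": (7, 3),
    "Teacher C": (6, 3),
    "Teacher D": (16, 6),
}

# Net drop in the number of below-average kids from fall to spring.
reductions = {
    teacher: fall - spring
    for teacher, (fall, spring) in below_average.items()
}

# The same drop expressed as a share of the fall below-average group.
# This is a net change in counts; it does not track individual children.
share_moved = {
    teacher: (fall - spring) / fall
    for teacher, (fall, spring) in below_average.items()
}

# Which room(s) ended the year with the fewest below-average kids?
fewest = min(spring for _, spring in below_average.values())
fewest_in_spring = [
    teacher
    for teacher, (_, spring) in below_average.items()
    if spring == fewest
]

for teacher in below_average:
    print(
        f"{teacher}: {reductions[teacher]} fewer below average "
        f"({share_moved[teacher]:.1%} of the fall group)"
    )
print("Fewest below average in spring:", ", ".join(fewest_in_spring))
```

On raw counts alone, Teacher D shows the biggest drop, while Teachers B and C end the year with the same number of kids below average. Separating those two would take the class sizes.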
So, who is your Most Valuable Teacher?
Is it Teacher A, who added the most value to her class over the course of the year?
Is it Teacher B, who had more of her kids meet the year-end goal?
Is it Teacher C, whose class scored the highest in the spring?
Is it Teacher D, who turned around more failing kids than any of the others?
"Value" has a double meaning here: there's the value signified by the numbers, but there are also the values of the school, the district, and the state, which have to be superimposed atop any effort to link the data to the teacher. If the incentive pay/merit pay/whatever pay in this case goes to only one of the four teachers, you're making a statement about the value of the work the other three did. It's a pretty lousy thing to tell three teachers who also made progress that their success didn't matter as much.
Similarly, can we countenance a system where every one of these teachers is given the bonus money, indicating that they all did a good job? In the eyes of some reformers, I could see that being too close to what we do now, where every teacher is assumed to be a good teacher. If a merit pay system is intended to have winners and losers, and to inspire the "less-capable" teachers to emulate the "better" teachers, can we really have a four-way tie?
These are the questions that have to be answered going forward.
If you'd like to see the raw scores presented in a spreadsheet, you can find them here.