The debate over the efficacy of Value-Added Models as a measure of teacher effectiveness continues, and the well-documented reservations remain unresolved, but the game changed last year when the Gates Foundation turned its attention from “small schools” to teacher quality. One of the key Gates initiatives is the so-called Measures of Effective Teaching (MET) Project, which issued a “preliminary findings” report in December. The foundation has committed $248 million to this effort.
The project has two components: the first involves videotaping lessons from over 3,000 teachers who volunteered to be taped under a confidentiality agreement; the second involves teacher assessment based on value-added test-score results and on student ratings of teachers along seven dimensions.
My guess is that the rush to get out preliminary findings was intended to influence policy discussions related to the possible reauthorization of ESEA. Although the preliminary results are truly preliminary and highly speculative, they are not presented that way. The casual reader of the policy brief (a slick document published separately from the actual report and given wide circulation) is led to believe that value-added scores are more predictive than the evidence presented at this stage of the project supports.
One critique of the Learning About Teaching report, a review recently released by the National Education Policy Center and written by Jesse Rothstein of the University of California, Berkeley, states the following:
…the preliminary MET results contain important warning signs about the use of value-added scores for high-stakes teacher evaluations. These warnings, however, are not heeded in the preliminary report, which interprets all of the results as support for the use of value-added models in teacher evaluation. Moreover, the report’s key conclusions prejudge the results of the unfinished components of the MET study. This limits the report’s value and undermines the MET Project’s credibility.
The concern is that independent research commissioned by a wealthy foundation may be unduly influenced by a predetermined action agenda. Such is the nature of the current policy/research climate, in which the tail often wags the dog, and it requires increased due diligence on the part of teachers and their organizations.
As Rothstein’s Review points out, the MET Project has two aims or premises: “First, a teacher’s evaluation should depend to a significant extent on his/her students’ achievement gains; second, any additional components of the evaluation (e.g. observations) should be valid predictors of student achievement gains.” While increasing student achievement is obviously important, it certainly should not be the only lens through which we judge a teacher’s performance.
There should be no doubt that powerful forces are coming to bear on the notion of teacher effectiveness. One message the inner circle seems to have adopted is that “value-added measures are flawed, but not enough that they should not be used.” In the policy brief released with the preliminary findings, when discussing the volatility of results from class to class and year to year, the authors state that “… our analysis shows that volatility is not so large as to undercut the usefulness of value-added as an indicator of future performance.” It seems more than mere coincidence that just a few days after the release, a commentary entitled “Value-Added: It’s Not Perfect, But It Makes Sense” was published in Education Week, in which the authors state the following:
The common thread in technical critiques of value-added evaluation is that teachers subjected to it will often be misclassified, e.g., a teacher who is identified as “ineffective” is, in fact, “average.” Given the typical reliability of value-added measures, there is no doubt that such misclassifications will occur with some frequency. However, we must recognize that all decision making systems generate classification errors, including those used today. Moreover, different types of errors have different consequences.
In the case of teacher value-added, the focus has been almost entirely on so-called false-negative errors, i.e., teachers who are falsely classified as ineffective because the measures are not perfectly reliable. But framing the problem in terms of false negatives places the focus almost entirely on the interests of the teacher who is being evaluated rather than the students who are being served.
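To see why imperfect reliability produces the misclassifications both sides are arguing about, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration, not a figure from the MET report: the 0.35 reliability is in the range often cited for single-year value-added estimates, and the “bottom 20% is ineffective” rule is a hypothetical policy cutoff.

```python
import numpy as np

# Illustrative simulation (assumed parameters, not MET data):
# how often does a noisy value-added score mislabel a teacher?
rng = np.random.default_rng(0)
n_teachers = 100_000
reliability = 0.35  # assumed share of score variance due to the true teacher effect

# True effectiveness (standardized) plus noise sized so that
# var(true) / var(observed) equals the assumed reliability.
true_effect = rng.standard_normal(n_teachers)
noise_sd = np.sqrt((1 - reliability) / reliability)
observed_score = true_effect + noise_sd * rng.standard_normal(n_teachers)

# Hypothetical policy rule: label the bottom 20% of observed scores "ineffective".
cutoff = np.quantile(observed_score, 0.20)
labeled_ineffective = observed_score <= cutoff

# Of those labeled ineffective, how many are truly average or better?
mislabeled = labeled_ineffective & (true_effect >= np.median(true_effect))
rate = mislabeled.sum() / labeled_ineffective.sum()
print(f"Share of 'ineffective' labels given to average-or-better teachers: {rate:.0%}")
```

Under these assumed parameters, a nontrivial share of “ineffective” labels land on teachers whose true effect is at or above the median, which is exactly the misclassification problem the commentary acknowledges; the policy disagreement is over whose interests bear the cost of those errors.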
If you are interested in the policy discussion related to teacher evaluations, add these to your reading list:
Secretary Duncan on Reauthorization, Washington Post, Jan. 3, 2011 (see comments on teacher effectiveness)
The Widget Effect (The New Teacher Project)
Fact Sheet on the Measures of Effective Teaching Project
Learning About Teaching: Initial Findings from the Measures of Effective Teaching Project
Preliminary Findings Policy Brief
Education Week Commentary: Value-Added: It’s Not Perfect, But It Makes Sense
I fear that the educational “blame game” will continue indefinitely as “experts” justify their salaries by pinning the decline of educational effectiveness on the one thing they have control over: TEACHERS. At the end of the day, teachers can only teach students who come to school, don’t skip class, actually participate in lessons, and understand there are enforceable consequences for bad behavior, backed by parents and administrators alike. There are underlying reasons for weak student performance that have grown since 1970 along with costs: the number of students with non-English-speaking parents has increased, the number of students from homes below the poverty line has increased, and the number of students who have a single parent they see only on weekends has increased. These factors have all contributed significantly to the number of students who simply aren’t in school, or who show up hungry, tired, and not ready to learn. Let’s all be careful that we aren’t focusing on teacher effectiveness and evaluations without considering ALL the factors affecting student performance.