Policy Implications of the New Teacher Value-Added Research

cross-posted at The Nation

Last month, economists at Harvard and Columbia released the largest-ever study of teachers’ “value-added” ratings—a controversial mathematical technique that measures a teacher’s effectiveness by looking at the change in his students’ standardized test scores from one year to the next, while controlling for student demographic traits like poverty and race.
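
For readers who want to see the mechanics, here is a deliberately crude sketch in Python of the idea behind a value-added score. It is not the model used in the study, which is far more elaborate. The teachers, student groups, and scores are invented for illustration, and the hypothetical `value_added` function adjusts each student's year-over-year gain only by the average gain of students in the same group.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, invented for illustration:
# (teacher, student group, last year's score, this year's score)
records = [
    ("Teacher A", "low_income", 40, 52),
    ("Teacher A", "higher_income", 60, 70),
    ("Teacher B", "low_income", 42, 46),
    ("Teacher B", "higher_income", 62, 68),
]

def value_added(rows):
    """Per teacher, average each student's gain minus the typical gain for that student's group."""
    # Step 1: the typical year-over-year gain within each student group.
    gains_by_group = defaultdict(list)
    for _, group, prior, current in rows:
        gains_by_group[group].append(current - prior)
    expected_gain = {group: mean(gains) for group, gains in gains_by_group.items()}

    # Step 2: how far each student's gain sits above or below that expectation.
    residuals_by_teacher = defaultdict(list)
    for teacher, group, prior, current in rows:
        residuals_by_teacher[teacher].append((current - prior) - expected_gain[group])

    # Step 3: a teacher's score is the average of those differences.
    return {teacher: mean(res) for teacher, res in residuals_by_teacher.items()}

scores = value_added(records)
```

In this toy data, Teacher A's students gain 3 points more than expected on average and Teacher B's gain 3 points less, so `scores` rates Teacher A at 3 and Teacher B at -3.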

Raj Chetty, John Friedman, and Jonah Rockoff analyzed the test scores and family tax returns of 2.5 million Americans over a twenty-year period, from 1989 to 2009. The team concluded that students who have teachers with high value-added ratings are more likely to attend college and earn higher incomes, and are less likely to become pregnant teens.

In a rare instance of edu-wonk consensus, both friends and skeptics of standardized tests are praising the study as reliable and groundbreaking. Indeed, these findings raise several interesting questions about how to evaluate and pay teachers—one of the most controversial topics in American urban politics. In his annual State of the City speech last Wednesday, New York Mayor Mike Bloomberg cited the new research as he promised annual bonuses of up to $20,000 for teachers rated “highly effective,” based partially on value-added measures and partially on principals’ judgments. In a move that befuddled many casual observers of the education debate, the New York City teachers’ union, the United Federation of Teachers, immediately opposed the proposal.

If we now know teacher effectiveness has a real, measurable impact on both student academic achievement and life outcomes like teen pregnancy, why aren’t teachers’ unions supporting plans to pay teachers with high value-added ratings more money? Pundits like Nick Kristof and the Daily News editorial page have jumped in to claim that the new research justifies merit pay plans like Bloomberg’s, and the one instituted by former chancellor Michelle Rhee in Washington, DC.

The policy implications of the Chetty, Friedman, and Rockoff paper are, however, far from clear. As the researchers note in their conclusion, their study was conducted in a low-stakes setting, one in which student test scores were used neither to evaluate nor to pay teachers. In a little-noticed footnote (#64) on page 50, the economists write:

even in the low-stakes regime we study, some teachers in the upper tail of the VA [value-added] distribution have test score impacts consistent with test manipulation. If such behavior becomes more prevalent when VA is actually used to evaluate teachers, the predictive content of VA as a measure of true teacher quality could be compromised. [Emphasis added.]

The importance of this caveat cannot be overstated. As I’ve written in the past, there is evidence of increased teaching-to-the-test, curriculum-narrowing and outright cheating nationwide since the implementation of No Child Left Behind, which put an unprecedented focus on the test scores of disadvantaged children.

Despite these concerns about testing, the United Federation of Teachers has agreed in principle to a new evaluation system that depends in part on value-added; a similar system, after all, is already in place for determining whether teachers earn tenure. Negotiations between the union and the city are stalled not because, in the words of the Daily News, the union has “placed protecting the jobs of incompetents over the future financial well-being of children,” but because the union would like teachers who receive an “unsatisfactory” rating under the new system to have the right to file an appeal to a neutral arbitrator. Currently, the city Department of Education determines whether to hear appeals of teacher evaluations, and it rejects 99.5 percent of the appeals filed.

Given the widespread, non-ideological worries about the reliability of standardized test scores when they are used in high-stakes ways, it makes good sense for reform-minded teachers’ unions to embrace value-added as one measure of teacher effectiveness, while simultaneously pushing for teachers’ rights to a fair-minded appeals process. What’s more, just because we know that teachers with high value-added ratings are better for children, it doesn’t necessarily follow that we should pay such teachers more for good evaluation scores alone. Why not use value-added to help identify the most effective teachers, but then require these professionals to mentor their peers in order to earn higher pay? That’s the sort of teacher “career ladder” that has been so successful in high-performing nations like South Korea and Finland, and that would guarantee that excellent teachers aren’t just reaching twenty-five students per year but are truly sharing their expertise in a way that transforms entire schools and districts.

2 thoughts on “Policy Implications of the New Teacher Value-Added Research”

  1. norm scott

    Why not use value-added to help identify the most effective teachers that bring shy children out of their shells, or teach children with difficulty getting along with others to flourish in a social environment or organize amazing enrichment activities, etc — the kinds of things that don’t necessarily show up in test scores and are especially important in the elementary school grades? The focus on the test will drive teachers away from doing the kinds of things with children that are as important as test results. I speak from 35 years experience and even I am surprised when former students with children of their own recall the things that really made a difference for them.

  2. NashvilleJefferson

    While I completely agree with all the problems inherent to the value-added method, especially when used at an individual teacher level (where the data becomes much more incoherent than at the district or even school level), isn’t this, in some ways, an indictment of the tests we use? If we had more authentic assessments, would we care as much if teachers were “teaching to the test”?

    It’s pretty easy to cheat on a multiple choice test; it becomes much, much harder, the further away you move from that model.

    Better assessments, of course, aren’t nearly a complete fix, but I find it interesting to take a look at the other end of the whole test-prep/NCLB/value-added saga, even if only as a mental exercise.
