What We Talk About When We Talk About Teacher Evaluations in the New Yorker

Rebecca Mead, a staff writer at the New Yorker, has a piece on teacher evaluations in this month’s issue.

As I’ve written before, I have very mixed feelings on legislatively mandated teacher evaluations.

Good journalism could go far in curbing some of the excesses of this policy initiative. Unfortunately, in this piece, Mead fails to rigorously analyze the issue at hand.

Instead, she falls into the two traps of education reporting: (1) focusing too much on raising or lowering the status of individuals and (2) not having a good enough grasp of the research data.

Additionally, she succumbs to the Nirvana Fallacy.

Status Games

Mead’s method of raising or lowering the status of individuals is to quote them selectively. Cuomo’s quotations are strident and simple-minded, while Farina’s and de Blasio’s are warm-hearted and nuanced. None of the three are angels; each could have been quoted in a manner that lowers their status, but, in this case, only Cuomo receives the treatment.

In this piece and a previous piece, Mead also mentions that she sends her son to a public school where over 70% of the students opted out of the state test (in the earlier piece she insinuates that her son opted out), which seems to put her squarely in Farina and de Blasio’s camp.

By raising the status of those she agrees with and lowering the status of those she disagrees with, Mead’s writing comes off as biased.

Weak Grasp of Research 

On the data side, Mead’s discussion of the reliability of value-added ratings consists of three words: she calls it “a contested science.”

These three words are hyperlinked to a Valerie Strauss (not a blogger known for nuance) post that highlights a single piece of research against value-added teacher evaluations.

Mead should have mentioned other studies, which, together, present a more complicated picture.

Despite it being core to her argument, she does not mention the research comparing the reliability of evaluations conducted by principals, students, and outside observers. Nor does she cover the research demonstrating that testing is a key driver of learning.

But, most surprisingly, she does not mention Chetty’s value-added study. To quote from the study:

Students assigned to high-VA teachers are more likely to attend college, attend higher-ranked colleges, earn higher salaries, live in higher SES neighborhoods, and save more for retirement. They are also less likely to have children as teenagers. Teachers have large impacts in all grades from 4 to 8. On average, a one standard deviation improvement in teacher VA in a single grade raises earnings by about 1% at age 28. Replacing a teacher whose VA is in the bottom 5% with an average teacher would increase the present value of students’ lifetime income by more than $250,000 for the average classroom in our sample. We conclude that good teachers create substantial economic value and that test score impacts are helpful in identifying such teachers.

Chetty is not a corporate reformer hack. He’s a John Bates Clark Medal winner who teaches economics at Harvard.

This is not to say that the Chetty study proves that Cuomo’s proposal is right on the merits. But it is worth mentioning.

If I were a parent trying to gauge whether we should evaluate teachers by their value-added scores, I’d want to be aware of a major longitudinal study that links high value-added scores with major positive outcomes in students’ lives.

Instead of reviewing this study, however, Mead provides us with a hundred-word summary of de Blasio’s testimony to the budget committee in Albany.

Nirvana Fallacy

Mead asserts that “no reasonable person” denies that teachers should be evaluated; for her, the question is in the “how.”

The current system, which weights student achievement growth at 20% of the overall evaluation score, has resulted in 98.7% of teachers being rated effective.

Cuomo believes that increasing the weight of student achievement growth will deliver more accurate ratings. This may or may not be true. But Cuomo has put forth a proposal that can be evaluated.

Mead does no such thing, nor do her protagonists.

Cuomo’s proposal might not be perfect, but what we should consider is: (1) is it better than the current system? and (2) is it better than any proposed alternatives?

Mead criticizes Cuomo’s proposal without turning this same discerning eye onto the status quo policy or other alternatives.

That Cuomo’s proposed policy is not Nirvana doesn’t mean it’s not the best option available.

In Sum 

Mead argues that we overuse testing in public schools. To make her case, she: (1) raises the status of those who agree with her; (2) lowers the status of those who don’t; (3) overlooks important research that provides evidence against her thesis; (4) criticizes a proposal that might improve teacher evaluations; and (5) provides no alternative solution.

Lastly, she does not even address the elephant in the room: the reason we’re even having this debate is the dysfunctional relationship between government-operated school systems and public labor unions.

As I’ve argued before, I think we should let non-profit organizations operate schools, hold these schools accountable for results, and get out of the business of passing one-size-fits-all evaluation laws.

On this last point, Mead and I might be in agreement.

2 thoughts on “What We Talk About When We Talk About Teacher Evaluations in the New Yorker”

  1. Peter

    Chetty is, in fact, a corporate reform hack, and his research fails to distinguish between correlation and causation. There is no way to present a “balanced” view of VAM because there is no research to support it. Even the Gates-sponsored AERA study stated “Value-added performance measures do not reflect the content or quality of teachers’ instruction.”

    The “well, what’s your better choice” argument is also flawed. No alternative to Cuomo’s 50% proposal is required to prove that his idea is deeply flawed: bad data from bad tests plugged into bad VAM formulas will not yield useful results. That is true whether we’re considering an alternative or not.

    1. nkingsl

      Peter, the research community is not united on your view. From Goldhaber’s new paper:

      “Using value added to inform high-stakes decisions is certainly controversial, and as the quotes at the beginning of this article indicate, there is not currently a consensus, or anything close to one, in the research community on the use of value-added measures for evaluation and decision making.”

      I think it’s reasonable to not support VAM-based evaluations. But your description of the research community does not seem accurate.


