The Mathbabe Pushes Back Against Her Critics on VAM, But Overturning VAM Will Be a Daunting Challenge
In a blog post a few weeks ago that Diane Ravitch linked to yesterday, Cathy O’Neil (a.k.a. the Mathbabe) offered some counterarguments to critics who pushed back when she slammed VAM (the Value Added Model) in her recent book “Weapons of Math Destruction”. As someone who was looking for a way to use the test scores generated by the NCLB mandates of the early 2000s, I was drawn to the ideas that William Sanders proposed regarding “value added” testing. But I quickly saw that the rigorous methods he initially advocated were being oversimplified, and that in virtually all cases the tests that many “reformers” wanted to use to measure “value added” were NOT designed for that purpose. Moreover, as statisticians like Ms. O’Neil noted, VAM was a wrongheaded approach to begin with. Nevertheless, despite all its flaws, VAM gained traction among politicians who saw it as a means of “weeding out” bad teachers and who saw its critics as either union apologists or ethereal intellectuals. Consequently, when President Obama was elected and passed an overly modest stimulus package for public education, he made VAM the centerpiece of his Race to the Top (RTTT) grant program, effectively requiring States to use it as the basis for teacher evaluations in order to receive any of the funding. The two States I was working in at the time, NH and VT, were among the last to seek RTTT funds, in large measure because the leadership in each State got pushback from either State Boards or Superintendents.
In her recent post, Ms. O’Neil responds to one of the frequent rebuttals she has received to her criticism of VAM, with my emphasis added:
Here’s an example of an argument I’ve seen consistently when it comes to the defense of the teacher value-added model (VAM) scores… Namely, that the teacher’s VAM scores were “one of many considerations” taken to establish an overall teacher’s score. The use of something that is unfair is less unfair, in other words, if you also use other things which balance it out and are fair.
Ms. O’Neil offers a straightforward logical rebuttal to this “one of many considerations” argument, with my emphasis added:
The obvious irony of the “one of many” argument is, besides the mathematical one I will make below, that the VAM was supposed to actually have a real effect on teachers assessments, and that effect was meant to be valuable and objective. So any argument about it which basically implies that it’s okay to use it because it has very little power seems odd and self-defeating.
While the “one of many” argument IS “odd and self-defeating”, it also has intuitive appeal, and it enables the use of a supposedly “valuable and objective” tool that is also, conveniently, cheap, easy, and seemingly exacting. But what if the exactitude is pointless and meaningless? As Ms. O’Neil notes, when the other components of a teacher evaluation yield very little variance, as is typically the case, the pointless and meaningless but exact measures can ultimately be the determining factor:
The VAM was brought in precisely to introduce variance to the overall mix. You introduce numeric VAM scores so that there’s more “spread” between teachers, so you can rank them and you’ll be sure to get teachers at the bottom.
But if those VAM scores are actually meaningless, or at least extremely noisy, then what you have is “spread” without accuracy. And it doesn’t help to mix in the other scores.
In a statistical sense, even if you allow 50% or more of a given teacher’s score to consist of non-VAM information, the VAM score will still dominate the variance of a teacher’s score. Which is to say, the VAM score will comprise much more than 50% of the information that goes into the score.
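To make the arithmetic behind that last point concrete, here is a minimal sketch of my own, not anything from Ms. O’Neil’s post. It assumes a teacher’s composite score is a simple weighted sum of a VAM score and everything else, that the two parts are uncorrelated, and that the standard deviations are the hypothetical values shown (a noisy VAM score with a wide spread, and observation-style ratings that barely differ from teacher to teacher). The function name and all the numbers are illustrative assumptions.

```python
def vam_variance_share(w_vam, sd_vam, sd_other):
    """Fraction of the composite score's variance that comes from VAM.

    Assumes composite = w_vam * vam + (1 - w_vam) * other, with the two
    components uncorrelated (an illustrative assumption, not a claim
    about any real evaluation system).
    """
    var_from_vam = (w_vam * sd_vam) ** 2
    var_from_other = ((1 - w_vam) * sd_other) ** 2
    return var_from_vam / (var_from_vam + var_from_other)


# Hypothetical spreads: VAM scores vary a lot (sd = 10 points),
# while the other measures cluster tightly (sd = 2 points).
for weight in (0.5, 0.3):
    share = vam_variance_share(weight, sd_vam=10.0, sd_other=2.0)
    print(f"VAM weight {weight:.0%} -> {share:.1%} of the variance")
# VAM weight 50% -> 96.2% of the variance
# VAM weight 30% -> 82.1% of the variance
```

Under these assumed numbers, a VAM score that nominally counts for half of the evaluation accounts for about 96% of the differences between teachers, and even at a 30% weight it still accounts for roughly 82%. The exact figures depend entirely on the assumed spreads, but the pattern holds whenever the non-VAM measures vary much less than the VAM score does.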
In the end, I have to believe that some statistician at the USDOE knew this whole concept was flawed but supported it anyway because VAM is easy to implement, relatively inexpensive, and intuitively appealing. The shame is that once a concept like this takes hold, correcting it is extremely difficult, as is replacing it with something new. And with ESSA now in place, undoing it will require a change of heart in 50 State capitols, since virtually every State in the union embraced the VAM precepts when it accepted the RTTT funds. The “Weapon of Math Destruction” will be the Obama-Duncan legacy…