The bottom line of two maddening NYTimes articles is captured in the title of this blog post… and the sooner the newspaper of record understands the limitations of testing, the effect of testing on the curriculum, and the need to emphasize funding equity, the sooner we will improve schooling for all children.
As noted in an earlier post, the opt out movement had a real impact in New York State, where 20% of the students did not take the examination. The title of Elizabeth Harris’ article in today’s paper, “Test Refusal Movement’s Success Hampers Analysis of New York State Exam Results”, indicates that state officials acknowledge that the opt out movement had its intended effect… and one of its leaders summed up the desired impact concisely:
“We always said that compliance just means more of the same,” said Jeanette Deutermann, a central figure in Long Island’s test-refusal movement. “The hope was to disrupt it to the point where it cannot be used,” she continued, to where “there are not enough children taking the test to close a school, or not enough data to fire a teacher.”
The Times, like most mass media, emphasizes the second half of Ms. Deutermann’s statement while overlooking the first point entirely: the relentless emphasis on testing reinforces the factory school model that has failed and continues to fail children in all public schools.
“Opting Out of Standardized Testing Is Not The Answer”, the Times editorial today, proves that point. It touches all the talking points of the “reform” movement and casts the opt out movement as a group of parents who “… say the tests are too difficult or do not track with classroom instruction”, effectively echoing Arne Duncan, Andrew Cuomo, and all the neo-liberal reformers who believe that failing to use tests will only hurt those who are most disadvantaged. The only reliable data NYS gets is the same data states have been getting for decades: children raised in poverty do worse on standardized tests than children raised in affluence… and children in affluent districts with high per pupil spending do far better than students in less affluent districts with lower per pupil spending.
Elizabeth Harris writes in today’s NYTimes that NYC is creating “…a task force to root out cheating by teachers and administrators in city schools” in response to recent reports of an increase in cheating on standardized tests. While the headline will grab people’s attention and may lead some to conclude that this is a new phenomenon, those who read further will come to these paragraphs:
In 2011, Richard J. Condon, the special commissioner of investigations, said allegations of test-tampering and grade-changing had more than tripled since Mr. Bloomberg took office; he attributed that rise to the growth in the number of schools and the growing link between student performance and the evaluation of teachers and schools.
Nearly four years ago, responding to a suspiciously high number of high school students just barely passing state tests, the New York State Board of Regents said teachers could no longer grade the exams of their own students.
Alas, Bill de Blasio may come out on the short end of the stick by dealing with this problem… and pay a political price in addition to the actual price for investigating the cheating incidents, which is reportedly $5,000,000 per year.
And compounding his woes is the fact that he is being pressed to expand the number of buildings and/or classrooms he makes available to for-profit charter schools, space that taxpayers heat, light, and clean while investors pocket the profits of the freeloading charters that occupy the classrooms.
In the end, the “reformers” get the best of both worlds: they impose a testing regimen that creates an environment where cheating incidents increase among adults, which undercuts the credibility of public schools, and then they insist on free space for their deregulated for-profit charters, whose test results are no better and whose incidents of cheating are unreported.
“Giving Doctors Grades”, an op-ed article in today’s NYTimes by Sandeep Jauhar, describes the consequences of using simplistic metrics to determine the effectiveness of a complex operation: heart surgery. In the early 1990s, NYS decided to issue “Report Cards” to surgeons in an effort to provide easy-to-understand information on the ability of various medical practitioners. The result?
(T)he report cards backfired. They often penalized surgeons, like the senior surgeon at my hospital, who were aggressive about treating very sick patients and thus incurred higher mortality rates. When the statistics were publicized, some talented surgeons with higher-than-expected mortality statistics lost their operating privileges, while others, whose risk aversion had earned them lower-than-predicted rates, used the report cards to promote their services in advertisements.
This was an insult that the senior surgeon at my hospital could no longer countenance. “The so-called best surgeons are only doing the most straightforward cases,” he said disdainfully.
This sounded VERY familiar to me… and I left the following comment:
This wrongheaded method of measuring the performance of surgeons is analogous to the “Value Added” evaluation methods promoted by “school reformers” and adopted by Arne Duncan, Andrew Cuomo, the Regents, and a host of other governors and State Boards. The standardized test scores used to “measure” teacher performance mirror the economic standing of the parents. Consequently, teachers who choose to work with the most challenging students, like the surgeons who tackle the riskiest cases, could lose their jobs. Grading schools using test scores only serves to humiliate the entire faculty who choose to work with children raised in poverty. Both of these failed metrics have one thing in common: they are attempts to bring mathematical precision to fields of endeavor that are crafts more than sciences.
The notion that service organizations should be run like businesses leads to the need for “precise” metrics like mortality rates and VAM to be used in lieu of “the bottom line” so revered by businessmen. But service enterprises do not provide neat and tidy outcomes: they defy the kinds of measures that can be used to develop “stack ratings” or “grades” because they serve individuals who have different backgrounds, temperaments, and physical compositions. The desire to reduce everything to a single number to rank employees using some kind of “objective criteria” is ultimately a means of replacing the judgement of human managers with algorithms. It has not worked in the past and is unlikely to work in the future, unless the future is led by robots.
Today’s NYTimes column by Paul Krugman describes the flawed thinking that led political leaders in Europe to embrace the Euro, an idea that a host of economists considered deeply flawed. Because of my personal background and biases, I immediately saw a connection between the flawed thinking regarding the Euro and the flawed thinking that led to VAM sweeping the country. VAM, like the Euro, seems like a logical and efficient solution to a complicated problem. But, as H.L. Mencken wrote, every complicated problem has a solution that is simple, clean, and wrong… and VAM is as simplistic and intuitively appealing as the Euro. After reading Krugman’s essay I left this comment:
This same kind of magical thinking by “…self-indulgent politicians” who “…ignore arithmetic and the lessons of history” is happening right now in public education. There is no mathematical or statistical basis for rating schools or teachers based on the standardized test scores of children yet governors, state legislators, and “reformers” all champion the idea. Why? To paraphrase Mr. Krugman: To people who didn’t know much about statistics, or chose to ignore awkward questions regarding the repeatedly demonstrable evidence that test scores correlate with income, using tests to measure performance sounded like a great idea…. and it is an idea that appears to be persisting in the bills before Congress today.
To paraphrase Mr. Krugman once more: The only big mistake of the “school reformers” was underestimating just how much damage the emphasis on standardized test scores would do to public education… or MAYBE that’s not a bug of the legislation— it’s a feature.
One of the main reasons I hope the ECAA bill in Congress never passes is that it insists that States use standardized tests as the primary metric for performance and allows States to continue using VAM if they choose to do so. We’ve had six years of VAM: another six years will make it even harder to rid the schools of this flawed idea, and all of the schools serving children raised in poverty will be treated like Greece is being treated today.
Last week NYTimes columnist Paul Krugman wrote an op-ed piece titled “Fighting the Derp”. What is Derp?
“Derp” is a term borrowed from the cartoon “South Park” that has achieved wide currency among people I talk to, because it’s useful shorthand for an all-too-obvious feature of the modern intellectual landscape: people who keep saying the same thing no matter how much evidence accumulates that it’s completely wrong.
Over the past few days I’ve accumulated examples of Derp in public education. The most obvious example came to me in an instant: the belief that standardized testing will improve schools is the derp of public education “reform”. When I started my career as a school administrator in the mid-1970s, Pennsylvania administered state standardized tests to determine which schools were doing best. Since that time I have witnessed the advent of state tests in NH, MD, and NY and, to no one’s surprise, the results never changed: schools in affluent communities and neighborhoods ALWAYS outperformed schools serving children raised in poverty. Now we have derp on steroids: a Secretary of Education who, despite evidence to the contrary presented by a national professional association of statisticians, believes standardized tests can be used to measure teacher performance. After decades of testing that has not improved the results, one would think another idea might be tried… but that would upset the “school reformers” who want to be reassured in their beliefs.
Other examples of Derp in public education include:
- The belief that developing grit and resilience in children, not additional funds for schools, is the secret sauce that children raised in poverty need in order to succeed in school.
- The belief that there is some way to scale up successful charter programs that are supplemented with grant funds without increasing the public funding needed for the replicated schools.
- The notion that if parents could select schools the way they select appliances, there would be more equity in education… despite the fact that affluent districts typically do not accept out-of-district students unless they pay full-price tuition and charter schools have admissions standards that limit their enrollments.
- The idea that since “Government is the problem” and public schools are government schools they are blocking innovations and advances that would be possible if they were run like businesses.
- The notion that unions are the primary problem with school performance, a notion that persists despite the fact that the lowest performing schools on NAEP are in the south where unions are weakest.
- The mental model of grouping children in age cohorts, which is pervasive and unshakeable and, as noted frequently in this blog, drives many of the misguided “reforms”.
I am confident that this list is incomplete and welcome additions and corrections….
Elizabeth Harris’ article on a judge’s decision regarding racial bias in an employment test for New York teachers could have an impact on all graduation or grade-level promotion examinations across the country if someone took this case to its logical conclusion. According to Harris’ summary of the case, Judge Kimba Wood’s decision that the Liberal Arts and Sciences Test 2 (or LAST-2) was discriminatory turned on the fact that the test’s “content objectives” were irrelevant and unimportant to teaching.
The judge found that National Evaluation Systems, now called Evaluation Systems, part of Pearson Education, went about the process backward.
“Instead of beginning with ascertaining the job tasks of New York teachers, the two LAST examinations began with the premise that all New York teachers should be required to demonstrate an understanding of the liberal arts,” Judge Wood wrote.
As Judge Wood’s quote indicates, this isn’t the first employment test that failed to pass muster because it was discriminatory. Its predecessor, the LAST-1, was found wanting because it, too, unfairly discriminated based on race… and the test to replace the LAST-2, which is no longer in use, is currently under review by courts.
“Reformers” AND policy makers should take heed of these findings when they advocate the use of “high stakes tests”, for two reasons. First, not all skills can be assessed using pencil and paper tests… and the job tasks associated with teaching have more to do with relating to children and peers than demonstrating an understanding of liberal arts and science. Second, before using any “high stakes” tests that determine the long-term fate of students or teachers, policy makers should be certain that the tests measure requisite skills and not cultural or ethnic background.
So while courts are reviewing the LAST and finding it wanting, and school districts are consequently paying damages to roughly 3900 people who “failed” the test and took substitute jobs as a result, schools and teachers are being evaluated on tests designed by Pearson that supposedly measure students’ mastery of the Common Core, which supposedly indicates a student’s readiness for work or college… And the results of those tests, like the results of LAST, seem to have a racial bias. In the case of LAST,
…the pass rate for African-American and Latino candidates was between 54 percent and 75 percent of the pass rate for white candidates. Once it was established that minority applicants were failing at a disproportionately high rate, the burden shifted to education officials to prove that the skills being tested were necessary to do the job; otherwise, the test would be ruled discriminatory.
Given the results of the standardized tests administered over the past several decades, it is evident that they discriminate against children raised in poverty. Here’s an interesting question for the Regents: could they prove that the new Common Core tests assess the skills necessary to enter college or the workforce? If not, the tests they are using to evaluate students, schools, and teachers would be ruled discriminatory.