Diane Ravitch wrote a post on Monday posing a set of questions raised in a Washington Post op-ed essay on benchmarking by Boston College professor Andy Hargreaves. In the essay Hargreaves describes his perspective on the rationale behind benchmarking, which came to the public’s attention in the 1990s as part of the response to the fallout from A Nation at Risk. Hargreaves rightfully points out that benchmarking, especially international benchmarking, has been used by “reformers” as a means of “proving” that US schools are deficient and, therefore, should be overhauled. But, as I noted in a comment to both posts, benchmarking is nothing new in public education.
For decades individual student performance has been based on benchmarks. Teacher-made tests served as the de facto benchmark for determining whether a student passed or failed. The aggregated set of grades a student earned (i.e., their transcript) served as a benchmark for determining whether a student gained entry to particular colleges. Students were disciplined based on standards set forth in student handbooks and/or standards set by a classroom teacher.
In most cases these standards were normative, not criterion-based: a student was not measured against a fixed standard but rather compared to his or her cohorts. One of the reasons for setting benchmarks was to devise standardized tests like the SAT that provided a means for colleges to determine if a student with all A’s at East Podunk HS was as prepared as a student from an elite private school. Another reason to move away from this normative comparison of cohort groups was to avoid using it as a basis for homogeneous grouping that identified some students as “high performing” and others as “slow”. An important reason was to establish a means of implementing a mastery learning model whereby students progressed individually instead of as a cohort.
Before decrying benchmarking, I think it is important to realize it has long been in place, and not necessarily to good effect.
An anecdote from my experience as an HS Principal in rural ME illustrates two approaches to the “benchmarking” teachers used to grade students.
In November of the first year I was Principal, I reviewed the computer print-out listing the grades each teacher assigned to students and discovered that every student in one of the science teachers’ classes received an “A”. I asked my secretary (this was 1977; we didn’t have “administrative assistants” at that time) to schedule an appointment with this teacher after school. My intention was to make certain he understood that we wanted to have higher standards in the school and that “giving all A’s” was unacceptable. When I asked the teacher to explain why he had “given” all of his students an A, he replied that he hadn’t “given” them anything; they had earned it. He believed it was imperative that all his students master the information presented in order to understand the information he would present in the coming units, and so he insisted that they re-take tests until they earned an “A”. That meeting in my office stayed with me for years to come… and was on my mind later that year.
At the end of every school year, there is invariably a student who falls short of a passing grade… and invariably a case where a teacher can decide whether a 64.5 is an “F” or a “D”. One young woman had started the year off badly because of issues she was dealing with at home and had done very poorly academically as a result. As the year progressed, a combination of her emerging maturity and the amelioration of her problems at home resulted in an upward trajectory in her grades. Several of her teachers were sympathetic to her problems and recognized that the improvement was genuine. Her social studies teacher, however, who was skeptical of my “higher standards” mantra, threw it back in my face when the student fell .75 short of his “high standard”.
Both teachers had benchmarks, but each was using them for different ends. As readers of this blog know, I’ve come to realize that the science teacher’s benchmarks are the ones we SHOULD be using when we grade schools and students. Unfortunately, it’s the social studies teacher’s standard that is in place thanks to NCLB, RTTT, and “education reform”.
When NCLB passed, I remember reading what I thought at the time was an especially cynical column suggesting that the intent of the bill, from the conservative perspective, was to undermine the public’s support for public education by devising a rating system that would demonstrate how poorly American schools were doing. I thought that was cynical until I saw the rating system itself, which WAS clearly designed to make virtually all schools look deficient by defining a school as “failing” if it failed to meet unrealistically high growth goals for any sub-group of students. Thus, a high performing school that had a single grade-level cohort of, say, 10 special education students who failed to “grow” as measured by standardized test results was deemed to be a “failing” school. It was no surprise, then, that as time went on more and more schools were defined as “failing”, and it was even less of a surprise that public education critics used these results to repeatedly bludgeon public schools. Nor is it at all surprising that while NCLB has not resulted in ANY substantial improvement in NAEP scores, it has succeeded in one area: the erosion of public support for schools.
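The perverse mechanics of that rating rule can be sketched in a few lines of code. This is a hypothetical simplification for illustration, not the statutory AYP formula; the function name, scores, and target are invented:

```python
# Hypothetical sketch of NCLB-style accountability logic: a school is
# labeled "failing" if ANY subgroup misses its target, regardless of
# how well the school performs overall.

def school_labeled_failing(subgroup_scores, target):
    """Return True if any subgroup falls below the target."""
    return any(score < target for score in subgroup_scores.values())

school = {
    "all_students": 92,        # strong overall performance
    "special_education": 58,   # e.g., a single cohort of 10 students
}

print(school_labeled_failing(school, target=60))  # True: one small subgroup sinks the school
```

Under this logic, the more subgroups a school serves, the more chances it has to be labeled a failure, which is exactly the dynamic described above.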
Diane Ravitch’s post on the Phi Delta Kappa annual poll on public education included these tidbits:
Local public schools get high marks from public school parents at the same time that American public education gets low marks. This seeming paradox shows the success of the privatizers’ relentless attacks on public education over the past decade. For years, the public has heard Arne Duncan, Bill Gates, Michelle Rhee, Jeb Bush, and other supporters of privatization decry American public education as “broken,” “obsolete,” “failing.” Their message has gotten through. Only 17% of the public gives American education an A or a B.
At the same time, however, 67% of public school parents give an A or B to the public school their oldest child attends.
The parents have always given higher grades to their schools than the general public has, but the erosion of support from the general public was made clear after I googled PDK surveys and found an article from the North Carolina DOE providing an overview of the results of the 2000 survey, the last survey before the advent of NCLB. Here are its findings on public support for schools:
Public support for public schools is at an all-time high. For the first time in the 33-year history of the Phi Delta Kappa/Gallup Poll, a majority of respondents gave their schools either an A or B. Fifty-one percent of all those surveyed rated their schools an A or B with the figure climbing to 62 percent for public school parents and to 68 percent when these same parents were asked to grade the school their oldest child attends. On the 2000 Carolina Poll, 52 percent of North Carolinians said they would give the public schools in their communities a grade of A or B.
To drive the point home: there has been NO change whatsoever in parents’ assessments of their child’s school but a precipitous decline in the public’s assessment of public education. The cynics were right: Edward Kennedy and the Democrats who signed on to NCLB were duped, and the public’s support for “government schools” is at an all-time low 13 years after it was at an all-time high… and nothing has changed in terms of the results. Mission accomplished.
David Kirp’s op-ed essay, “Teaching is Not a Business”, echoes many posts on this blog. In addition to the pithy aphorism that serves as its title, Kirp’s essay touches on a host of topics that I’ve blogged on in detail, including:
- the need for teachers to be champions for their students
- the failed idea of using standardized tests as the ultimate measure of education, teacher performance, and school performance
- the demonstrable failure of the “turnaround” idea
- the shortcomings and pitfalls of merit pay plans
- the lack of evidence that charter schools are any better than public schools
- the reality that organizational change is superior to the quick fix inherent in “disruption” and the application of traditional business practices
- the reality that organizational change takes time
- the inherent messiness of any enterprise that provides human services
- the failed promise of technology
A look back at blog posts will show that the number of Times articles championing market-based solutions to education, the use of business practices in public education, charters, vouchers, disruptive technology, and “turnaround schools” FAR outnumbers the articles like Kirp’s that are based on practical, realistic solutions. I’m glad the Times is giving its readers “the rest of the story”… but I expect to see several counterarguments in letters to the editor characterizing Kirp as a defender of the status quo, a union apologist, and an academic promoting failed ideas. I hope I’m wrong.
I have held college rankings in disdain for years. They are reductionist to an extreme, measuring the easy-to-measure elements that differentiate one college from another and, because the metrics are mathematical, yielding seemingly exact numeric differentials among colleges and universities that are, upon close inspection, inconsequential. For example, the difference between the top-ranked and second-ranked college in one category (say, engineering schools) may not be the same as in another category (say, art schools), and the numeric difference between the third- and fourth-ranked colleges may be no different than the difference between the fourth- and sixteenth-ranked colleges within a category.
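The point about inconsequential differentials can be made with invented numbers. The scores below are hypothetical, but they show how evenly spaced ranks can mask wildly uneven, and trivially small, gaps in the underlying metric:

```python
# Hypothetical rating scores keyed by rank: the ordinal ranks suggest
# evenly spaced quality, but the underlying score gaps do not.
scores = {1: 98.0, 2: 97.9, 3: 91.2, 4: 91.1, 16: 90.9}

gap_3_to_4 = round(scores[3] - scores[4], 1)    # one rank apart: 0.1 points
gap_4_to_16 = round(scores[4] - scores[16], 1)  # twelve ranks apart: 0.2 points

# A 1-place gap and a 12-place gap reflect nearly identical
# differences in the metric the ranking is built on.
print(gap_3_to_4, gap_4_to_16)
```

Reported as ranks alone, #3 vs. #4 looks exactly as meaningful as #4 vs. #5, even though almost nothing separates a dozen schools in the middle of the table.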
Despite my misgivings about rankings, it is evident from the sales US News and World Report experienced when it published its annual ratings that most Americans love them… and based on the way colleges respond to the rankings it is evident they are valued by prospective students and their parents. I do believe, however, that the US News and World Report rankings are seldom deal-breakers or deal-makers when it comes to students making their final decision. They may save a cross-country trip to visit a campus, but I doubt that a student with acceptances to two or three schools refers to them to make his or her final decision… but they DO sell magazines, they DO generate lots of faculty-room and coffee-klatch conversation and, if a college is highly rated, they generate lots of calls and mailings to alumni.
Given my misgivings, I was dismayed when I read that President Obama is advocating a rating system for colleges that incorporates some kind of cost/benefit analysis… and, in response to this emerging trend from the US government and the profits realized by US News and World Report, more and more media outlets are jumping on the rating bandwagon, including Money Magazine. Kevin Carey reported on this development in an article that appeared in Tuesday’s NYTimes Upshot section… and the title of the article tells you all you need to know about the rating system: “Building a Better College Rating System. Wait! Babson Beat Harvard!” With an undergraduate degree from Drexel University and two graduate degrees from the University of Pennsylvania, I can testify that there is NOT that much difference in rigor between a “middle tier” university and an “elite” Ivy League school… and I can also testify that what separates the two institutions defies a simplistic mathematical metric like “earnings after 10 years”, no matter how sophisticated the weighting of various exogenous factors…
Measuring quality is a difficult proposition and, in my judgment, not worthwhile. But it is relatively easy to identify and regulate institutions that mislead prospective entrants, fail to support enrolled students, employ unqualified and/or underpaid staff that turn over frequently, and have abysmal graduation rates. The time and money spent developing arcane statistical calculations to create gradations between good and excellent schools would be better spent aggressively monitoring those institutions that are profiteering at the expense of gullible students.
Today’s NYTimes Magazine features an article by Elizabeth Green titled “Why Do Americans Stink At Math?”, an article well worth reading because it provides a good description of what it would take to make Americans perform at a higher level, but one that underemphasizes or overlooks some of the subtle reasons for our deficiencies.
Ms. Green contrasts the Japanese methods of teaching mathematics with those used in the US, focusing on Akihiko Takahashi, an education reformer from Japan, and Takeshi Matsuyama, an elementary teacher affiliated with a university-based lab school who was his mentor. Together, they transformed mathematics instruction in Japan. Like Deming before them, Takahashi and Matsuyama implemented the recommendations of US experts, recommendations that our country rejected because they did not fit the hierarchical “factory model” of management that blinds us to new and different ways of thinking. Surprisingly, Ms. Green overlooked the parallel to Deming’s experience, which mirrored that of Takahashi and Matsuyama and which continues to limit our ability to innovate.
Ms. Green also contrasts the Japanese method of teacher training, which is ongoing and organic, with the virtual absence of training in our country. Instead of stand-alone workshops or the accumulation of graduate credits, Japanese teachers engage in “lesson study”: time set aside for teachers to meet, discuss their teaching methods, and observe each other’s instruction. But she fails to emphasize the funding that would be required to give teachers the time for lesson study, nor does she note the shift in thinking that would be required to move away from our credential-based method of measuring teacher learning, a method that is often based on seat time.
As one who led school districts from 1980 through 2011 I saw two other factors that Ms. Green overlooked or underemphasized: our country’s obsession with standardized tests and the unwillingness of parents and school boards to accept “non-traditional ways” of teaching mathematics and scheduling teacher time.
Ms. Green described how the emphasis on standardized tests reinforces “traditional” methods of teaching when she noted that while “…lesson study (in Japan) is pervasive in elementary and middle school, it is less so in high school where the emphasis is on cramming for college entrance exams”. In our country, the emphasis is on cramming for examinations from the very outset… and that emphasis is deleterious, especially since, to date, standardized tests have NOT measured the kinds of mathematics instruction valued by NCTM: they have focused on the “skills” that were taught when today’s parents and school board members were students, skills that are easy to test (see yesterday’s post for evidence of this).
Ms. Green made no mention of how any effort to introduce “non-traditional” methods of mathematics instruction meets with resistance from parents who complain that “they can’t help their children with homework” because they “don’t understand” the work assigned. And when that attitude is combined with our obsession with test scores, if the scores don’t jump immediately the “new math” books are soon abandoned in favor of the worksheets that match the tested curriculum, and the meme about the “failure of new mathematics” is reinforced.
School boards not only face resistance from parents, they also face budget challenges, which can pose the biggest obstacle to introducing innovation. When administrators contemplate the implementation of something akin to “lesson study” they need to hire additional staff to provide release time for teachers to engage in such a program. One way to provide more release time is to increase class sizes (Japan has much larger class sizes than the US), a recommendation that flies in the face of conventional wisdom in the US and meets resistance from teachers as well as parents.
Finally, as noted repeatedly in this blog, we need to stop thinking of our schools as factories that pour information into students who progress along an assembly line in lockstep based on their age and whose progress is measured by standardized tests and hours spent in the classroom. The bottom line: until we stop thinking of our schools as factories we will see no meaningful change or improvement.
Here’s a list of Pearson’s errors in administering standardized high stakes tests compiled by FairTest and blogged by Diane Ravitch. The list is as unsurprising as it is long.
The first course I took as a graduate student in educational administration in 1970 was on test construction. To get us started on understanding the flaws in standardized tests, the instructor distributed copies of the Stanford Achievement Test and asked us to find five errors in the construction of the questions after reading Chapter One of the assigned text. At the time the local Philadelphia newspapers used the Stanford Achievement Test results to “rank” schools in the city, adding credence to the test’s validity and precision. In all, the test had roughly 75 questions… 13 of which were poorly constructed based on flaws described in Chapter One. In some cases there were two correct answers and in other cases there was no clear correct answer. Needless to say, I’ve been a skeptic of the “precision” of standardized testing ever since.