The NY Times describes how one of the promises of the Common Core, the ability to compare educational outcomes among states, is failing, since states are defining student proficiency in different ways (boldface mine):
Ohio seems to have taken a page from Lake Wobegon, where all the children are above average.
Last month, state officials releasing an early batch of test scores declared that two-thirds of students at most grade levels were proficient on reading and math tests given last spring under the new Common Core requirements.
Yet similar scores on the same tests meant something quite different in Illinois, where education officials said only about a third of students were on track. And in Massachusetts, typically one of the strongest academic performers, the state said about half of the students who took the same tests as Ohio’s children met expectations.
It all came down to the different labels each state used to describe the exact same scores on the same tests.
What the Times neglects to mention is that the Common Core models its difficulty levels after those of the NAEP, which uses very high cut scores. That context is critical. Massachusetts, which, by all accounts, didn't monkey about much with the definition of proficiency, had only about half of its students score 'proficient' or better. Yet Massachusetts is not only one of the best school systems in the U.S. year in and year out, but one of the best in the world. Put another way, every European country would have fewer than half of its students score as 'proficient', based on how they perform relative to Massachusetts.
It’s worth noting what Common Core and the NAEP mean by proficient (boldface mine):
I served on the NAEP governing board for seven years. I understood that “proficient” was a very high standard. There are four NAEP achievement levels: Advanced (typically reached by 5-8% of students); Proficient (typically reached by about 35-40% of students); Basic (typically reached by about 75% of students); and Below Basic (very poor performance, about 20-25% of students). Thus, by aligning its “pass” mark with NAEP proficient, the PARCC and SBAC (the two testing groups) were choosing a level that most students will not reach. Only in Massachusetts have as many as 50% of students reached NAEP proficient. Nearly half have not….
So, if these consortia intend to align with the very rigorous standards of NAEP, most students will fail the tests. They will fail them every year…
It is time to ask whether NAEP proficient is the right “cut score” (passing mark). I think it is not. To me, given my knowledge of NAEP achievement levels, proficient represents solid academic performance, a high level of achievement. I think of it as an A. Advanced, to me, is A+. Anyone who expects the majority of students to score an A on their state exams is being, I think, wildly unrealistic. Please remember that NAEP proficient represents a high level of achievement, not a grade level mark or a pass-fail mark. NAEP basic would be a proper benchmark as a passing grade, not NAEP proficient.
Furthermore, the NAEP achievement levels have been controversial ever since they were first promulgated in the early 1990s when Checker Finn was chairman of the NAEP governing board. Checker was subsequently president of the Thomas B. Fordham Foundation/Institute, and he has long believed that American students are slackers and need rigorous standards (as a member of his board for many years, I agreed with him then, not now). He believed that the NAEP scale scores (0-500) did not show the public how American students were doing, and he was a strong proponent of the achievement levels, which were set very high.
James Harvey, a former superintendent who runs the National Superintendents’ Roundtable, wrote an article in 2011 that explains just how controversial the NAEP achievement levels are.
He wrote then:
…What about NAEP? Oddly, NAEP’s proficient standard has little to do with grade-level performance or even proficiency as most people understand the term. NAEP officials like to think of the assessment standard as “aspirational.” In 2001, long before the current contretemps around state assessments, two experts associated with the National Assessment Governing Board—Mary Lynne Bourque, staff member to the governing board, and Susan Loomis, a member of the board—made it clear that “the proficient achievement level does not refer to ‘at grade’ performance. Nor is performance at the proficient level synonymous with ‘proficiency’ in the subject. That is, students who may be considered proficient in a subject, given the common usage of the term, might not satisfy the requirements for performance at the NAEP achievement level.”
It is hardly surprising, then, that most state assessments aimed at establishing proficiency as “at grade” produce results different from a NAEP standard in which proficiency does not refer to “at grade” performance or even describe students that most would think of as proficient. Far from supporting the NAEP proficient level as an appropriate benchmark for state assessments, many analysts endorse the NAEP basic level as the more appropriate standard because NAEP’s current standard sets an unreasonably high bar.
…In 1993, the National Academy of Education argued that NAEP’s achievement-setting processes were “fundamentally flawed” and “indefensible.” That same year, the General Accounting Office concluded that “the standard-setting approach was procedurally flawed, and that the interpretations of the resulting NAEP scores were of doubtful validity.” The National Assessment Governing Board, or NAGB, which oversees NAEP, was so incensed by an unfavorable report it received from Western Michigan University in 1991 that it looked into firing the contractor before hiring other experts to take issue with the university researchers’ conclusions that counseled against releasing NAEP scores without warning about NAEP’s “conceptual and technical shortcomings.”
…Those benchmarks might be more convincing if most students outside the United States could meet them. That’s a hard case to make, judging by a 2007 analysis from Gary Phillips, a former acting commissioner of the National Center for Education Statistics. Phillips set out to map NAEP benchmarks onto international assessments in science and mathematics and found that only Taipei (or Taiwan) and Singapore have a significantly higher percentage of proficient students in 8th grade science than the United States does. In math, the average performance of 8th grade students in six jurisdictions could be classified as proficient: Singapore, South Korea, Taipei, Hong Kong, Japan, and Flemish Belgium. Judging by Phillips’ results, it seems that when average results, by jurisdiction, place typical students at the NAEP proficient level, the jurisdictions involved are typically wealthy—many with “tiger mothers” or histories of excluding low-income students or those with disabilities.
…First, NAEP’s achievement levels, far from being engraved on stone tablets, are administered, as Congress has insisted, on a “trial basis.” Second, NAEP achievement levels are based on judgment and educated guesses, not science. Third, the proficiency benchmark seems reachable by most students in only a handful of wealthy or Asian jurisdictions.
It is important to know this history when looking at the results of the Common Core tests (PARCC and SBAC). The fact that they have chosen NAEP proficient as their cut score guarantees that most students will “fail” and will continue to “fail.” Exactly what is the point? It is a good thing to have high standards, but they should be reasonable and attainable. NAEP proficient is not attainable by most students. Not because they are dumb, but because it is the wrong cut score for a state examination. It is “aspirational,” like running a four-minute mile. Some runners will be able to run a four-minute mile, but most cannot and never will. Virtually every major league pitcher aspires to pitch a no-hitter, but very few will do it. The rest will not, and they are not failures.
It's disturbing that someone at NAEP didn't think normalized scores were good indicators–they are an excellent way to compare student populations (all hail the standard deviation). But we'll let that slide.
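To make the point about normalized scores concrete, here's a minimal sketch (using made-up scores, not real NAEP or state data) of how standardizing scores lets you compare students across two tests with completely different scales:

```python
from statistics import mean, stdev

# Hypothetical raw scores from two state tests scored on different scales.
# (Illustrative numbers only -- not real NAEP or state data.)
state_a = [210, 225, 240, 255, 270, 285, 300]
state_b = [410, 430, 450, 470, 490, 510, 530]

def z_score(score, scores):
    """Express a raw score as standard deviations from its population mean."""
    return (score - mean(scores)) / stdev(scores)

# A student scoring 270 in state A and one scoring 490 in state B sit at
# the same relative position in their populations, even though the raw
# numbers look nothing alike -- the z-scores match.
print(round(z_score(270, state_a), 2))
print(round(z_score(490, state_b), 2))
```

No arbitrary "proficient" label is needed for this comparison: the standard deviation does the work of putting both populations on a common footing.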
It's all the rage to decry the "soft bigotry of low expectations": who doesn't want to believe every student can reach high levels of achievement? That said, we also have to be aware of the cruel bigotry of unrealistic expectations. Over the short term, not all students will show tremendous gains. That doesn't mean we–or they–are failing.