Imagine you are a judge on a cooking show. Every contestant prepares three different dishes, and you must choose the best cook. But different cooks are good at different things, so what measure can you use to judge them all?
That’s the question California lawmakers are grappling with in trying to rate schools. Historically, we’ve thrown all the things that schools do into a blender and judged the “soup” that comes out.
But that “soup” score misses a lot. It conceals what we know about schools’ true performance and gives them no guidance on how to improve. A few schools in California are bad at nearly everything, but most are good at some things even as they fall short in other areas. The blender approach mixes everything together – from average test scores to performance growth to graduation rates – and fails to distinguish one from another.
California has been working to develop a smarter accountability system. The new system would judge schools the way teachers judge students, using a “dashboard” that gives parents information on how each school is doing on eight state priorities. Teachers don’t even try to judge which of their students is “best.” Instead, they provide feedback in multiple areas, from English to art to effort. The state’s dashboard would tell parents where their schools are doing well and where they need to do better, on measures including improvement in student test scores, school safety and parental engagement.
Unfortunately, the federal government is demanding that California continue to rank schools on the “soup” standard. The draft regulations for the new “Every Student Succeeds Act” require states to develop a single index, which could be a numerical score, such as the Academic Performance Index that California has abandoned, or even a letter grade.
This is a serious error. Relying on a single number to represent school quality appears to support precise judgments about the performance of schools. But, as a recent report from Policy Analysis for California Education has shown, the numbers that come out of the blender are fundamentally arbitrary and often seriously misleading.
For example, in the PACE study, the correlation between average test scores and growth in student performance over time is only 14 percent. In some schools where average test scores are low, student performance is improving rapidly. In other schools, test scores are high, but students show little or no improvement. Adding more ingredients to the soup (graduation rates, English learner reclassification) makes the problem worse.
Some advocates for children are supporting the federal government in its effort to require a single overall score. They argue that judging schools on multiple priorities will overwhelm parents, who need a simple and transparent measure of school quality.
This underestimates parents, who have no trouble reading their children’s report cards. Fake precision is not the same as transparency.
Calculating a single performance index is a bad way to evaluate students and cooks, and it is also a bad way to evaluate schools. The federal government should allow California and other states to design accountability systems that give parents full and accurate information about the performance of their schools.
David N. Plank is executive director of Policy Analysis for California Education and a professor in the Graduate School of Education at Stanford University. He can be contacted at firstname.lastname@example.org.