Should states use test score based accountability systems? If so, how? If not, why?

Over the past decade, I’ve deepened my belief in the power of letting educators form non-profits to run public schools. Both experience (walking into amazing public schools) and research (a track record of reading and math gains) have shown me that non-profits are an incredibly valuable tool in making public education better.

I’ve also deepened my belief in unified enrollment systems. They can give families a lot of information about public schools and make enrolling in public schools much easier.

I do not have deep confidence in my views on accountability. I often find myself moving up and down a spectrum: from no accountability (just let parents choose), to accountability-lite (require testing, share this information, but don’t intervene), to accountability-heavy (require testing, give schools letter grades, intervene in the lowest-performing schools).

I think reasonable arguments can be made for all three approaches.

Recent NWEA Research

NWEA just published a new report using a national data set from the tests they license to schools. Many schools we work with use these tests. I’m not expert enough in statistics to evaluate the reliability of their findings, but the report raised some important issues.

Absolute test scores are highly correlated with poverty. The chart below shows that test scores rise as income increases. This is not new information.

[Chart from the NWEA report: test scores by income]

Student growth is not tightly correlated with poverty. Unlike absolute achievement, individual student growth does not rise significantly with income. Many high-poverty students achieve growth that mirrors that of their wealthier peers.

[Chart from the NWEA report: student growth by income]

Schools with similar levels of poverty perform very differently on growth. The red line in the chart below represents how schools with high poverty perform on academic growth. It is a fairly wide curve. Many schools achieve low growth, while others achieve very high growth. To the extent you believe that growth is a pretty good measure of school performance (the researchers do), this performance spread might increase a policymaker’s willingness to intervene in low-performing schools and expand high-performing schools.

[Chart from the NWEA report: distribution of academic growth across schools, with high-poverty schools shown in red]

Focusing on absolute test scores will cause you to misidentify many, many schools. The graph below is tricky to read, but it’s very important. The red line represents all schools that are in the bottom 5% for absolute test scores. And it shows that 77% of these schools (the bottom 5% on absolute) are close to the average or better on growth. In other words, if you just closed the bottom 5% of schools based on absolute achievement, nearly 80% of the schools you’d close probably would be mistakenly closed (given their growth scores). This is pretty damning evidence against those who want to focus mostly on absolute achievement in accountability measures.

[Graph from the NWEA report: growth of schools in the bottom 5% for absolute test scores, shown in red]

When Does a Good Policy Idea Become Indefensible Because of Bad Practice?

Over the past few years, most states reworked their accountability systems during the reauthorization of No Child Left Behind.

Unfortunately, this report found that only 18 states weighted growth for at least 50% of the total accountability score, with another 23 states weighting growth for at least 33%.

On one hand, this is an improvement over old accountability systems. On the other hand, this means a lot of states are unfairly rating high poverty schools that have decent growth but low absolute scores.
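To make the weighting issue concrete, here is a rough sketch of how the growth weight changes a school's overall score. It assumes a composite that is a simple weighted average of just two components, growth and absolute achievement, each on a 0-100 scale. Real state formulas include other indicators, and the function name and the example school's numbers are hypothetical.

```python
def composite_score(growth: float, achievement: float, growth_weight: float) -> float:
    """Weighted average of growth and absolute achievement (both 0-100).

    Assumption: a two-component composite. Actual state systems are more complex.
    """
    if not 0.0 <= growth_weight <= 1.0:
        raise ValueError("growth_weight must be between 0 and 1")
    return growth_weight * growth + (1.0 - growth_weight) * achievement


# Hypothetical high-poverty school: decent growth, low absolute scores.
growth, achievement = 60.0, 10.0

for weight in (0.33, 0.50, 1.00):
    score = composite_score(growth, achievement, weight)
    print(f"growth weight {weight:.0%}: composite = {score:.1f}")
```

Under these assumptions, the same school scores roughly 26.5 when growth counts for 33% and 35 when it counts for 50%, even though its growth score of 60 is above the midpoint of the scale.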

I think a fair critique of test based accountability is that it’s a reasonable idea that has very little hope of being reasonably implemented.

My Own Thoughts

Again, I do believe deeply in letting non-profit organizations operate public schools. And I do believe deeply in enrollment systems that make it easier for families to find a great school for their children.

I’m uncertain about accountability, but here’s what I think I’d do if I were superintendent of a school district:

  1. Calculate one letter grade for growth and one letter grade for absolute achievement.
  2. Publish the higher of these grades as the letter grade that appears most prominently on the online enrollment system. I would also include the lower letter grade, as well as a bunch of information about school programs and curriculum, on the school’s online profile.
  3. Allow for government intervention in schools that are in the bottom 5-10% for both growth and absolute achievement (a school would need to perform badly on both).
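The three steps above could be sketched roughly as follows. This is only an illustration: the percentile inputs, the equal-width grade bands, the 10% cutoff (taken from the upper end of the 5-10% range), and all of the names are assumptions, not part of any real system.

```python
from dataclasses import dataclass

GRADE_ORDER = "FDCBA"  # worst to best, used to compare letter grades


@dataclass
class School:
    name: str
    growth_pct: float       # percentile rank on growth, 0-100 (hypothetical input)
    achievement_pct: float  # percentile rank on absolute achievement, 0-100


def letter_grade(percentile: float) -> str:
    """Map a percentile rank to a letter grade using assumed 20-point bands."""
    if percentile >= 80:
        return "A"
    if percentile >= 60:
        return "B"
    if percentile >= 40:
        return "C"
    if percentile >= 20:
        return "D"
    return "F"


def rate(school: School, cutoff_pct: float = 10.0) -> dict:
    """Apply the three steps: grade both measures, headline the higher grade,
    and flag for intervention only when both measures fall below the cutoff."""
    growth = letter_grade(school.growth_pct)
    achievement = letter_grade(school.achievement_pct)
    headline = max(growth, achievement, key=GRADE_ORDER.index)
    secondary = min(growth, achievement, key=GRADE_ORDER.index)
    eligible = school.growth_pct < cutoff_pct and school.achievement_pct < cutoff_pct
    return {
        "school": school.name,
        "headline_grade": headline,
        "secondary_grade": secondary,
        "intervention_eligible": eligible,
    }


# Hypothetical schools illustrating the three cases discussed in the post.
for s in (
    School("Low scores, strong growth", growth_pct=70, achievement_pct=8),
    School("Low on both", growth_pct=5, achievement_pct=4),
    School("High scores, weak growth", growth_pct=15, achievement_pct=90),
):
    print(rate(s))
```

In this sketch, only the school that is low on both measures is eligible for intervention. The other two get a headline grade from their stronger measure, with the weaker grade still shown on the profile.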

This type of accountability system gives parents good information, avoids the political war of giving low letter grades to schools with high absolute scores, and avoids the error of intervening in schools that have low absolute scores and higher growth scores.

It does give an accountability pass to schools with high absolute scores and low growth, but I view this as OK, in that it’s politically useful and it reflects the fact that parents really want to get into these schools.

It also still uses test scores as the primary way to evaluate schools. This sits uneasily with me, as I think schooling is about much more than tests, but I haven’t seen any other way to measure schools that feels more reliable. I hope this changes.

I’m not very confident that this is the best system, but I think it’s the best of a bunch of options that all have reasonable drawbacks.

Another hard question would be what to do if local politics did not allow for the creation of a system like this. At some point, if the drumbeat for absolute scores was too loud, I’d probably walk away from accountability as a superintendent.

But I’m not sure. If you scan this blog’s history, I’m sure you can find me saying conflicting things about accountability. I’m conflicted about it. But the above reflects my current thinking on what makes for a good accountability system.

Lastly, if you want to hear a good version of the argument against test based accountability, see here.

3 thoughts on “Should states use test score based accountability systems? If so, how? If not, why?”

  1. eanelson2014

    Before you can decide on test-based accountability, I think we need to know what we should be testing for. For example,
    take a look at the state test scores in the graph at the top of this PDF:
    and answer this question: Are these test scores good news or bad news?
    In math, what skill should we be measuring? Reasoning? Or Computation?
    Does it make a difference?
    The article below the graph discusses what cognitive science says on the issue.
    — Eric (rick) Nelson

  2. Brenda Montaine, EdD

    “Sometimes, the most brilliant and intelligent minds do not shine in standardized tests because they do not have standardized minds.” ― Diane Ravitch
    Therefore, how can there be standardized assessment, as indeed, what is being tested toward reward and glory or punishment and hopelessness?
    I believe a growth model that drives instruction and celebrates learning is an improvement over publishing low scores without context of acknowledging diligent work, by students and staff, and major milestones met.
    I think assessment disconnect and dissonance continue. I would love to help solve this conundrum. Thank you for sharing your sentiments.

  3. Rob Kremer

    The relationship between absolute test scores and poverty is well established. It is a bit of a surprise, however, that the relationship between academic growth and poverty is weak. It seems logical that the lower the absolute score the easier it would be to achieve gains – which would suggest a positive relationship between growth and poverty. So why is that not the case?

    Work we have done at Pearson suggests a reason: Schools that have high levels of student mobility will show lower academic growth. Why? Because a student in their first couple of years at a new school shows negative growth, due to the well-researched “school switching effect.”

    The more first- and second-year students a school has, the lower its growth will be. Poverty and student mobility are positively correlated.

    So, because of mobility, we don’t see a positive relationship between poverty and academic growth.

    This is testable: if your data were adjusted for student mobility (defined as the % of tested students who are in their first or second year in the school), I’d expect to see a much stronger positive correlation between poverty and growth.

