Schools vs. Standards

An important research study just came out on Newark charter schools.

And over at Education Next, there are three commentaries assessing the impact of the common core standards.

Taken together, these pieces offer a useful jumping-off point for reflecting on what’s working in education reform.

Newark Charter Schools

Marcus Winters used Newark’s unified enrollment system to try to figure out the effect of enrolling in the charter sector. This is an innovative methodological approach that was pioneered at MIT (through Arnold Ventures funding).

Winters found large effects: a +.25 standard deviation increase in a student’s score in math and ELA. The effects were even larger for two prominent non-profit charter operators, KIPP and Uncommon.

To put this effect into context, Winters notes the results are larger than “80% of other educational interventions that have been recently studied using an experimental design.”
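One rough way to read a +.25 standard deviation effect is as a shift in percentile rank. The sketch below assumes test scores are approximately normally distributed; that is a simplifying assumption for illustration, not something the study states, and the function name is hypothetical.

```python
from statistics import NormalDist


def percentile_after_gain(start_percentile: float, effect_sd: float) -> float:
    """Percentile rank after a gain of `effect_sd` standard deviations.

    Assumes scores follow a normal distribution (an illustrative assumption).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Convert the starting percentile to a z-score, add the effect, convert back.
    z_start = std_normal.inv_cdf(start_percentile / 100)
    return 100 * std_normal.cdf(z_start + effect_sd)


# A student starting at the median who gains 0.25 SD
print(round(percentile_after_gain(50, 0.25)))  # roughly the 60th percentile
```

Under that assumption, a +.25 effect moves a median student up by about ten percentile points.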

Of course, test scores aren’t everything. It’s also important to note that these non-profit schools can only exist if parents choose to enroll their children in them.

Uncommon and KIPP are two of the most in-demand school operators in the city.

These schools are passing both the parent test and the academic results test. Hopefully, over time, they will also pass the life outcomes test by helping their students succeed in building a meaningful and financially secure life.

State Standards

Common core state standards are now in their 10th year. All three commentators in Education Next agree that research has found no positive achievement effects at scale.

Mike Petrilli, however, argues that states should stay the course, and that we should see positive results over the coming five years.

On one hand, I’ve always been a bit ambivalent about Common Core. Ultimately, I think school operator quality is much more important than standards and assessments.

However, many educators I’m close with, including many of the nation’s best charter leaders, have said that the new standards are more rigorous and have pushed them to be better.

Certain states and cities, such as Tennessee, Louisiana, and Washington D.C., have used higher standards as a key part of their successful reform efforts.

To the extent we’re going to have state standards and assessment, better they be good rather than mediocre. Highly effective leaders can use them as part of their improvement strategy.

But I think the sobering results of the first ten years of Common Core should put their cost / benefit in perspective, especially since so few states have been able to use the new standards to jumpstart improvement efforts.

At the very least, no one should fool themselves into thinking that better standards and assessments are going to be a major cure for our nation’s most struggling schools.

Progress One School at a Time

Zuckerberg, Oprah, Booker…. the Newark story has had no shortage of drama.

But through it all the steady growth of great non-profit schools has radically increased educational opportunity for families in Newark, particularly for African-American and Hispanic students.

Newark is also an all boats rising story: the traditional school system has gotten better as well.

Yes, it would be wonderful if we could pass laws that made all public schools great in a short period of time. But we can’t. You can’t legislate institutional effectiveness.

Newark shows the best path forward for most cities: grow great schools.

Better to take a decade to get something right rather than layer on sweeping reform after sweeping reform after sweeping reform.

Reflections on International Relations

Sometimes my reading list matches up with current events. In this case, unfortunately, with the escalating U.S. / Iran conflict.

Books read over the past two months:

  1. Escape from Rome: The Failure of Empire and the Road to Prosperity 
  2. 1491 (Second Edition): New Revelations of the Americas Before Columbus
  3. The Third Wave: Democratization in the Late 20th Century 
  4. The Clash of Civilizations and the Remaking of World Order
  5. Presidents of War: The Epic Story, from 1807 to Modern Times
  6. World Order

All of the below is a layperson’s reflections. I’m clearly not an expert on international relations or the history of civilizations.

Taken as a whole, the books were very good, even if I disagreed with major parts of them (particularly Huntington’s lack of nuance in writing on American multiculturalism and Kissinger’s analysis of Vietnam). They were also a little more conservative leaning. I was pretty steeped in a liberal human rights perspective from working at the international court in Sierra Leone and conducting on-the-ground research on the Tibetan government-in-exile in India. These books broadened my perspective.

Picking up frameworks 

These books deepened my understanding of some very useful international relations frameworks, such as:

The geography framework

  • Why did Europe, and not other previously more advanced civilizations, take off with the enlightenment and the industrial revolution? One potential cause is Europe’s geography: jagged coast lines, large islands, and mountain ranges made it difficult for empires to consolidate power. Only the Romans conquered the vast majority of Europe. Most often, numerous powers had to compete with each other.  When it came to seeding the modern world, this competitive landscape was an advantage. In Europe, unlike in China and Japan, one authoritarian government couldn’t stifle an entire civilization’s scientific progress.
  • Why did the previous Latin American civilizations (Mayans, Aztecs, Incas) develop extremely sophisticated cultures but not the scientific breakthroughs that could lead to steel, gunpowder, and other powerful technologies? One cause was that they were separated from each other by large mountain ranges and narrow land bridges. This allowed for less trade and less stealing of other civilizations’ innovations, and thus less rapid technological progress (compared to Europeans stealing Arabic numerals and Chinese gunpowder, for example).
  • Why were Latin American civilizations so susceptible to European diseases? Potentially because they came from a smaller gene pool: only a couple of waves of people made it across the Bering Strait into the Americas. This led to less genetic diversity. When a disease affected one person, it was more likely to affect most people. Smallpox wiped out a higher percentage (50-80%) of the native population than the Bubonic Plague did in Europe (30-50%).

The civilization framework

  • Understanding civilizational philosophies and histories can help you understand international relations. Confucian philosophy is different than Islamic philosophy, for example, and this shapes how their current nation states view the world. Understanding other civilizations can also temper your arrogance about the universality of your own civilization’s values.
  • Major, large civilizations (and their religions) that still exist today in some form include: Sino / China (Confucian); Middle East (Islam); Russia (Orthodox Christian); America / Europe (Judeo-Christian); India (Hindu); Japan (Shinto / Buddhist); Latin America (Christian after Europeans wiped out much of the indigenous population); African (very under-discussed in the books I read; need to read more to understand historical roots).
  • During the cold war period, civilization history played second fiddle to whether a country aligned to America or the Soviet Union. With the end of the cold war, civilization affiliation is playing a stronger role in international relations.

The realism framework

  • National interest remains a powerful force that can transcend geography and historical affiliations.
  • Why do Taiwan, Vietnam, and South Korea all have strong ties to the United States despite having more civilization commonality with China? Because of national interest. For now, they perceive the United States to be a check on Chinese power.
  • Numerous other cross-civilization relationships, such as the United States and Saudi Arabia, can be explained better through a realism framework rather than a civilizational framework.

A starting point for getting smarter 

To be clear, geography, civilization history, and nation state realism don’t explain all of the world, nor are they full determinants of the future, but all of these frameworks have helped me think through current international affairs.

Understanding both the United States and Iran’s geography, civilizational history, and national interests is a good place to start if you want to be an informed citizen on the issue.






Sobering Results

We talk a lot about test scores in education reform. But the goal of the work is not to increase test scores. The goal is to prepare kids to lead good lives.

Good work is part of the good life.

Earning a skilled credential, or a two-year or four-year degree, is a decent (though imperfect) proxy for whether a student will end up with good work.

The results coming out of the best charter organizations, and cities as a whole, are sobering.

Right now, the better charter networks achieve a 30%-50% postsecondary success rate with students living in poverty. There are some outliers who are achieving +70% success rates. But this has not been replicated at any scale.

In New Orleans, the very early results show an estimated citywide jump from 10% to 15% postsecondary success rate. Most of these students did not experience much of the reforms, so my guess is that this rate will drift up, perhaps to ~20-30% in the coming years.

The national postsecondary success rate for students living in poverty is 10-15%.

The best charter schools, and New Orleans as a whole, will likely double this rate. Perhaps they will triple it.

Doubling or tripling the postsecondary success rate for low income students is no small feat. This could help change the lives of millions of students.
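The doubling and tripling arithmetic can be checked directly against the ranges quoted above. This is a minimal sketch using only the 10-15% national figure from this post; the function and variable names are illustrative.

```python
# National postsecondary success rate for students living in poverty,
# as a (low, high) range in percent, taken from the figures in this post.
NATIONAL_RATE = (10, 15)


def scale_range(rate_range, factor):
    """Multiply both ends of a (low, high) percentage range by `factor`."""
    low, high = rate_range
    return (low * factor, high * factor)


doubled = scale_range(NATIONAL_RATE, 2)
tripled = scale_range(NATIONAL_RATE, 3)

print(doubled)  # (20, 30)
print(tripled)  # (30, 45)
```

Doubling lands on the ~20-30% range guessed for New Orleans, and tripling overlaps most of the 30%-50% range reported for the better charter networks.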

But this is far below the results I hoped for when I got into this work. I imagine many of my peers feel the same. And I am sure families want more for their children.

Our current results fall far below what I hoped we would accomplish.

My expectations were naive. I did not have a full understanding of how our country’s history of racism – coupled with current dysfunctions in criminal justice, housing, and healthcare – put up so many barriers to success.

I, like many others, also overemphasized a four-year degree as the primary pathway for success.

But the one thing I think we did get right is the strategy: I’ve grown deeper in my belief that non-profit schools are the best chance we have of making things better. They are already doing better than the existing system, even if the results aren’t as high as we’d hoped for.

And so much of the innovation happening in job preparation is coming out of the non-profit sector. The postsecondary work happening in New Orleans has the chance to be as groundbreaking as the K12 work.

The best non-profit organizations learn quickly. When they see that their students are not succeeding, they change their approach. And because they are not subject to the turnover of elected boards, they can keep making improvements year after year.

Our little corner of the world has been doubly stupid and one part wise.

Stupid because we underestimated how stacked the world is against kids growing up in poverty; stupid because we thought increases in test scores would quickly translate into postsecondary success… wise because we understood how much good can happen when amazing educators are allowed to start their own non-profit schools.

My hope is that we don’t run away from the sobering data; that we don’t try to spin doubling the low income postsecondary rate as mission accomplished.

We should admit that we failed to live up to our expectations; that we’re not good enough yet; and that we need to get smarter as quickly as possible so kids lead secure, meaningful lives.

My Parents

My mom is 74 years old. Next year will be her last year as a professor at Valparaiso University.

She just gave a speech to much of the freshman class at Valparaiso University. It was an honor that she was asked. And it was a great opportunity to share decades of earned wisdom.

Her speech was on cultivating empathy through the narrative arts.

I agree with her that globalization and technology require us to expand our circle of empathy beyond the family, the tribe, and the state – to people who live in far away lands, speak different languages, and see the world in completely different ways.

My favorite part of her speech was when she described studying British literature as an undergraduate student in India.

At the time, studying British literature was higher status than studying Indian literature. So she read the British canon, which is dominated by white men.

From My Mother’s Speech 

“I grew up in post-colonial India.  The British left India in 1947, but they continued to control the minds of many of us in formidable ways.  The term for this is mental colonization. Not a good thing.

But I must admit that I have very mixed feelings about this mental colonization that I experienced, chiefly because this mental colonization helped me to cultivate my empathy through reading narratives of the other.  Let me explain.

So, here I was, reading about dancing daffodils that fill the landscape in Wordsworth’s poem of that name, without having seen any daffodils.  To feel the joy of Wordsworth at the dancing of daffodils in spring I had to exercise my imagination. Of course, this went beyond flowers and leaves.  I had to learn to experience the reality of the characters in Shakespeare, in Dickens and Thackeray, and whatever I studied in my courses … this helped me to develop empathy.”

There is so much nuance and complexity in this reflection. To have mixed feelings about your oppressor means you have the power to see them as humans; that you are not consumed by outrage.

If you feel their poetry about their native flowers, you have kept your own humanity in seeing theirs.

My Father

My mother also talked about my deceased father in her speech:

“One of my favorite courses I have taught here was one team taught with my late husband who was an Africanist. It focused on African politics and literature. We called it the African Experience, by adding literature, it qualified as an experience.  Something similar remains my endeavor in all I teach. I want to push my students to walk in the shoes of Indian, Chinese, African American, and Caribbean characters.”

My father played a unique role in the mostly white town where we grew up. He was a black academic who was connected to the university elites in our city. In Valparaiso, Indiana, most African-Americans in the town were middle class or poor.

Through my father, others were able to access power. He was the connection between Valparaiso’s mostly white university and its only black church.

In 2006, my dad received a Martin Luther King, Jr. Day award for his work on race relations at Valparaiso University.

One of his colleagues noted: “Much of what he did early on laid the groundwork for where we are with diversity issues today.”


As I think about the next couple decades of my life, I hope I can live out both of these lessons.

It is so important to strive to see the humanity in others and to expand who has access to power.

In their own ways, both my parents were able to do this in the town where we grew up. And at the university they called home for decades.

Do all boats rise when charter schools expand?

I. A Hypothesis 

All organizations are founded on a hypothesis. Deliberate organizations are explicit about their hypothesis.

The City Fund’s hypothesis is that educational opportunity in cities will increase if:

(1) Non-profit schools enroll more students.

(2) Cities adopt a unified enrollment system to increase equitable access to all public schools.

(3) Elected officials encourage the best schools to expand and selectively transform struggling schools with new non-profit operators.

We don’t yet know if this is true, though early signs are promising.

Cities such as New Orleans, Denver, Newark, and Washington D.C. have seen strong gains using these strategies (as well as a focus on instruction and talent in district schools).

II. Fordham’s New Study on Charter Enrollment 

Fordham just put out a study that attempts to measure whether increases in charter enrollment in a city lead to all students learning more, including children in district schools.

Fordham found that, in urban areas, higher charter school enrollment is associated with achievement gains for all black and Hispanic students in the city.

If it holds, this is an important finding on the benefit of expanding non-profit schools.

So how much weight should we give to the study?

On the positive side, the authors’ methodology is reasonable: they track a set of cities that are home to increasing charter enrollment, and then use a set of controls to try to determine if this increase in enrollment is associated with positive citywide results for minority students.

There are some clear limitations to this approach, most of which the authors acknowledge. The trickiest issue is causation: it’s hard to know if charter enrollment itself is causing the gains. For example, perhaps cities that see increasing charter enrollment also tend to be home to strong economic growth, and it’s the city’s economic gains that are driving better student performance.

Another major limitation is how much we can extrapolate from the cities in the data set.

Given that very few cities rapidly grew charters (i.e., went from 10% enrollment to 50% enrollment), it’s hard to know how much we can draw from the study.

Perhaps citywide gains spike when charters increase from 10% to 30% (due to increased competition) but then reverse when charters go from 30% to 60% (due to financial pressures on the district). We won’t know until more cities reach higher charter enrollment.

III. What Can We Learn From the Study? 

The Fordham study should nudge us a bit toward the idea that increasing charter enrollment can increase learning for all students.

But, perhaps more importantly, it should cast serious doubt on the claim that the current rate of increased charter enrollment is significantly harming traditional public schools.

We can’t know if increased charter enrollment is causing citywide gains, but we can clearly observe that current charter enrollment is not causing major drops in district performance.

This is a very important finding. It refutes the major argument made by charter detractors.

This result mirrors some of what we’ve seen in CREDO’s recent analysis of city performance.

CREDO found an all boats rising effect in three of the most mature choice cities in the country. In Denver, Camden, and Washington D.C., district schools improved as charter enrollment increased.

It’s notable, though not dispositive, to us that these cities all have unified enrollment systems and transparent school performance information.

IV. How Can We Learn More?

Doug Harris and his team at Tulane are going to attempt a similar study but use a quasi-experimental approach. This should shed some more light on the issue.

We will also keep working with CREDO to hold the mirror up on the cities The City Fund is working most deeply with.

Lastly, it’s worth mentioning that understanding citywide impacts is important but not the only way to understand charter growth.

We should care a lot about the fact that charter school enrollment is increasing in the first place: it’s a clear sign that families are hungry for a better public education for their children, and that they view charter schools as a way to meet their children’s needs.

Large scale correlational studies are not a substitute for simply observing that millions of parents are choosing charter schools in hopes of finding a great school for their children.

Why did New Orleans public schools improve so much?

Tulane researchers have a new paper that attempts to determine the causal mechanism for New Orleans school improvements.

A similar paper was written by Harvard researchers on the Newark reforms.

Both papers tried to answer the question: did things get better because schools opened and closed, or because existing schools improved?

Both papers come to the same conclusion: opening and closing schools is driving the gains in student learning (as measured by test scores).

The Tulane report came to a particularly strong conclusion. The authors write:

“The average school improved from the first to the second year after it opened, but school performance remained mostly flat afterwards… aside from the improvement when schools first opened, essentially all of the improvement in New Orleans’ average test scores has been due to the state regularly closing or taking over low-performing schools and opening new higher performing charters.”

The below graphic captures this finding in visual form:

[Graphic: sources of New Orleans’ test score improvement]

The authors end their study with a strategic recommendation and warning:

“The fact that newly opened schools continue to be better than those closed and taken over also suggests that the extreme measure of replacing school operators also still has some potential to generate further gains. At some point, the benefits from this strategy are likely to run out, but it does not appear that we have reached that limit yet.”

New Orleans has had strong government regulation over the past decade. For the most part, the best schools expanded and the government closed or transformed the worst schools.

It is an open question whether this good regulation can persist in New Orleans, or if it can be consistently scaled to other cities.

Of course, strong regulation is not the only way to shift enrollment to higher-performing schools.

A city could also simply let families choose amongst all schools and wait for lower-performing schools to fold under enrollment pressures. This process would be parent-driven and likely slower.

Every city will need to figure out its own path when it comes to balancing top-down accountability and bottom-up family choice.

Personally, I favor a combination of both. Let government have the ability to selectively transform the lowest performing schools in a city, and let families choose from a wide array of schools.

All boats rising in Denver public schools

Last year, Arnold Ventures commissioned CREDO (out of Stanford University) to study the effects of charter, innovation, and traditional schools in select cities across the country.

Most of the cities included in the study were cities where Arnold Ventures (and now The City Fund) have partnered with local leaders to expand high-quality schools.

CREDO’s analysis measures how much a school helps a student grow over the course of a year. They do this by comparing students in the city to similar students across the state.

The results just came in for Denver.

Denver Reform History

Over the past fifteen years, the locally elected school board partnered with superintendents Michael Bennet (now Democratic Senator for CO) and Tom Boasberg.

These leaders gave educators more freedom to tailor their school programs to the students they served. And they gave families more access to a diverse array of public schools. The district also made heavy investments in teacher and leader talent.

This effort greatly expanded the number of public schools operated by non-profit organizations. Non-profit schools now serve around 30% of students in the city. These non-profit organizations are a mix of charter schools and innovation schools (district schools that operate with more freedoms under a non-profit board).

Superintendent Cordova just took the reins last year.

CREDO Results: Every Sector in Denver is Outperforming Similar Schools Across the State

In Denver, traditional schools, charter schools, and innovation schools are all outperforming similar schools across the state.

The study’s author noted: “The pattern of performance here is consistent… it’s an incredibly strong advantage for students in Denver no matter what school they go to.”

A common critique of charter schools is they hurt traditional school performance. This critique has no grounding in evidence. And it does not seem to be true in Denver. All sectors in Denver are helping students grow.

Together, the sectors combine to achieve annual +.1 standard deviation effects in reading and math. These are large annual citywide effects.

Where will Denver Head From Here? 

The Denver reforms have led to more than increases in test scores. High school graduation and college enrollment rates are also up.

Hopefully this will translate to more Denver students benefiting from Denver’s booming economy.

But for continued gains to occur, Denver should not abandon its most successful strategies. For so many kids in Denver, they have been a lifeline for increased academic learning.

More specifically: given that large academic gaps remain across racial lines, Denver would do well to expand those schools that are doing the most for kids of color.

Looking at 2018 data, these schools were rated highest in the city for closing academic gaps:

  • Polaris Elementary School
  • Slavens K-8 School
  • KIPP Northeast Denver Leadership Academy
  • Cory Elementary School
  • Steck Elementary School
  • DSST: Byers MS
  • Denver Green School
  • Stephen Knight Center for Early Education
  • Creativity Challenge Community
  • Holm Elementary School
  • DSST: Green Valley Ranch
  • Escalante-Biggs Academy

Hopefully these schools can grow to serve more students and help close persistent gaps across this city.

Good news for Camden’s children

Last year, Arnold Ventures commissioned CREDO (out of Stanford University) to study the effects of charter, innovation, and traditional schools in select cities across the country.

Most of the cities included in the study were cities where Arnold Ventures (and now The City Fund) have partnered with local leaders to expand high-quality schools.

CREDO’s analysis measures how much a school helps a student grow over the course of a year. They do this by comparing students in the city to similar students across the state.

CREDO presents its findings in standard deviations. A useful way to understand these impacts is to translate them into extra days of learning, based on a 180 day school year.

[Chart: standard deviation effects translated into extra days of learning, based on a 180-day school year]

As the chart above shows, a .15 standard deviation impact equates to about an extra half year of learning.
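The conversion can be sketched as a simple proportion. The only anchor taken from this post is that .15 standard deviations is about half of a 180-day school year; treating the relationship as linear for other effect sizes is an assumption made here for illustration, and CREDO’s own conversion table may differ. The names below are hypothetical.

```python
SCHOOL_YEAR_DAYS = 180

# Anchor from the post: a 0.15 SD effect is about half a school year (90 days).
# Assumption: the conversion is linear, so one SD corresponds to about 600 days.
DAYS_PER_SD = (SCHOOL_YEAR_DAYS / 2) / 0.15


def extra_days_of_learning(effect_sd: float) -> float:
    """Approximate extra days of learning for an effect measured in SDs."""
    return effect_sd * DAYS_PER_SD


for effect in (0.05, 0.10, 0.15):
    print(f"{effect:.2f} SD is roughly {extra_days_of_learning(effect):.0f} extra days")
```

Under this linear assumption, a .05 effect works out to roughly 30 extra days and a .1 effect to roughly 60.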

The results just came in for Camden.

Camden Reform History

The state intervened in Camden schools in 2013. You can read more about the effort in this good New York Times piece.

One of the major innovations in the takeover was the creation and expansion of Renaissance schools. Renaissance schools are governed by non-profit organizations but must serve all students in the neighborhood. They are sort of a hybrid between charter schools and traditional schools.

The Renaissance reform effort was also coupled with improvements to traditional schools. Schools became safer and academic improvements were implemented across the city.

The city also created an online unified enrollment system to help families find the best public schools for their children.

Large Citywide Improvements

Camden’s city level effects are large.

In just two years, scores are up ~.15 standard deviations in math and ~.05 standard deviations in reading (compared to similar schools across the state).

To put this in context, over five years, New Orleans achieved a .4 standard deviation effect. These city effects were the largest the researchers had seen. Camden may achieve similar results. The math results are on track to mirror the gains seen in New Orleans.
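A quick way to see why the math results look “on track” is to compare average gains per year. The sketch below uses only the figures quoted above and assumes gains accumulate at a constant annual pace, which is a simplification (real gains may speed up or level off). The function names are illustrative.

```python
def annual_pace(total_effect_sd: float, years: float) -> float:
    """Average gain per year, in standard deviations."""
    return total_effect_sd / years


def projected_effect(total_effect_sd: float, years: float, horizon_years: float) -> float:
    """Effect at `horizon_years` if the average annual pace simply continued."""
    return annual_pace(total_effect_sd, years) * horizon_years


# Figures from the post: Camden after two years, New Orleans after five.
camden_math_5yr = projected_effect(0.15, 2, 5)
camden_reading_5yr = projected_effect(0.05, 2, 5)
new_orleans_5yr = 0.4

print(f"Camden math, projected to five years:    about {camden_math_5yr:.3f} SD")
print(f"Camden reading, projected to five years: about {camden_reading_5yr:.3f} SD")
print(f"New Orleans, observed over five years:   {new_orleans_5yr} SD")
```

At a constant pace, Camden’s math gains would reach about .375 standard deviations in five years, close to New Orleans’ .4, while reading would reach only about .125.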

It’s pretty incredible to see students learning so much more so quickly. Effects this large are a good signal that students are getting smarter in literacy and numeracy.


Renaissance and Traditional Schools are Improving the Fastest

Renaissance schools are the highest performing sector in Camden, outperforming similar schools across the state in both reading and math. They also improved by over +.1 standard deviations in both subjects over the last year of the study.

The Camden traditional sector, though lower-performing, has improved. District schools have seen large improvements in math (+.2 standard deviations) and modest gains in reading (+.06 standard deviations).

The charter sector continues to outperform the district, though it has seen a decline in its learning gains relative to the state over the past few years.

Will Learning Improvements Lead to Better Life Outcomes for Children?

In New Orleans, we began to worry that gains in test scores, while important, would not translate into better life outcomes for students. Unfortunately, there were not enough post-secondary programs in the city that could help high school graduates prepare for meaningful careers.

Many cities across the country also struggle with this issue.

Recently, the former superintendents of New Orleans and Camden announced they were launching a new organization, Propel, to help high school graduates transition to good careers.

This promising effort, if it works, will help students capitalize on their increased numeracy and literacy skills.

Mission Not Accomplished

The Camden reforms are barely past their fifth year. The city is still home to struggling schools. Absolute achievement remains low. And the district remains under state takeover.

Hopefully, over the next five years the city’s schools will return to local control and continue to improve. And all of the work will translate to better life outcomes for students.

All public schools (traditional and charter) rising in Newark

When more students enroll in non-profit charter schools, what happens to the students who remain enrolled in traditional schools?

This is one of the most contentious questions in public education right now.

Past research has shown that increased public charter school growth does not negatively affect the academic performance of traditional public schools.

But much of this research covers geographies that don’t have that many charter schools.

An open question is whether the effects of charter growth on traditional public schools will change as charters serve more and more kids in a district.

In Newark, nearly 40% of students attend charter schools.

At this scale, non-profit schools have given families a lot more choices to find a good fit for their children.

But they have also put real academic and financial pressure on the traditional system.

So what’s been the effect?

New Jersey ranks all of its school districts based on academic performance. The state runs a couple of different types of analysis: ranking all districts statewide, and also ranking districts against those with similar levels of poverty.

See below for the results for Newark citywide, Newark traditional system, and the Newark charter sector.

Newark’s Overall City Rank is Rising 

Newark has shot up from the 39th percentile to the 78th percentile amongst the thirty-seven highest poverty districts in New Jersey.

In the 100 highest poverty districts, Newark has moved from the 18th percentile to the 50th percentile.

When it comes to all districts, Newark performs poorly, though it has made major progress in the past five years. This progress comes after a fairly long period of stagnation.

Newark Traditional Public Schools are Improving at a Healthy Rate

Newark’s traditional schools have made major improvements over the past five years, after being fairly flat in previous years.

When it comes to the highest poverty cities, the district’s traditional public schools have moved from the 20th percentile to at or above the 50th percentile.

They have also seen gains in the other performance rankings, though overall gains when compared to all New Jersey districts are fairly modest.

[Image: Newark traditional public school rankings among New Jersey districts]

Newark Charter Schools have Nearly Caught the State Average 

Newark charters are achieving at very high levels.

Taken together, they are the top performing high-poverty district in the state.

Even more impressive, Newark’s charters have risen to nearly the 50th percentile in the entire state of New Jersey.

New Jersey is one of the wealthiest states in the nation.

Students in Newark charter schools, who are mostly children of color living in poverty, are performing nearly as well as their much more affluent and privileged peers.

[Image: Newark charter sector rankings among New Jersey districts]

Still an Open National Question

Newark is just one of many cities that are now home to large non-profit public school sectors.

We are working with researchers to study city level and sector level effects across many of these cities.

In the fall, I hope to have a more comprehensive write-up on these results.

But, for now, it’s good to see all schools rising in Newark.

Hopefully these results will hold true across many more cities.

Bloom, New Orleans, and Effect Sizes

Six months ago Matthew Kraft published an excellent article on effect sizes.

I worked in education for five years before I had any understanding of research design and reporting. I wish Matt’s piece had been around a decade ago.

His article is a bit dense if you’re trying to just wrap your head around the issue, so consider this post a lay person’s intro to Matt’s piece and the subject itself.

If you catch any mistakes, please do let me know. I’m still learning.

Why are effect sizes useful?

Consider currencies. Currencies are useful because they allow you to easily compare prices across various goods. Instead of having to constantly refer to one set of goods in relation to another set (i.e., three apples are worth the same as four oranges, which are worth the same as three paperclips), we can use the same unit (dollars) to compare a bunch of different goods.

Effect sizes serve the same function. They help us easily compare the magnitude of the impact of a bunch of different interventions. We can do research on graduation rates, test scores, suspension rates, or whatever we want, and then we can convert our results into an effect size to help us compare how big of an impact we had.

Effect sizes are the unit of currency for measuring impact.

What is an effect size?

Many effect size calculations in education research are expressed in standard deviations.

A common formula to determine the effect size is:

(mean of experimental group – mean of control group) / standard deviation

Let’s say we are trying to find the effect size of a new math curriculum on test scores. We might give half the population the new curriculum, half the old curriculum, and then see what the difference is.

Let’s say the difference is +5 pts out of 100 for the students using the new curriculum. The curriculum “worked.”

But what does that mean?

We now want to know if +5 pts is a big deal. This is where the standard deviation comes in.

A low standard deviation means there is very little difference in the population (everyone is scoring about the same score). A large standard deviation means there is a wide spread in scores.

Because the standard deviation is the denominator in the formula, the smaller it is, the larger the effect will be for any given difference between two groups.

In other words, if everyone is scoring between 62 and 65 out of a hundred, and you jump five points, you could go from the bottom 1% of test takers to the top 1% of test takers.

Because the standard deviation is low (small spread), a modest jump leads to a big effect.
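Here is a minimal Python sketch of the formula above, using made-up scores. The function name and the sample data are hypothetical, and the choice of a pooled standard deviation as the denominator is my assumption (the formula as written does not say which standard deviation to use). Both examples have the same +5 point difference, but the spread of scores changes the effect size a lot.

```python
from statistics import mean, variance


def effect_size(treatment_scores, control_scores):
    """(mean of experimental group - mean of control group) / standard deviation.

    Assumption: the denominator is the pooled standard deviation of the
    two groups. Other choices (e.g. the control group's SD) are also used.
    """
    n_t, n_c = len(treatment_scores), len(control_scores)
    pooled_variance = (
        (n_t - 1) * variance(treatment_scores)
        + (n_c - 1) * variance(control_scores)
    ) / (n_t + n_c - 2)
    return (mean(treatment_scores) - mean(control_scores)) / pooled_variance ** 0.5


# Hypothetical data: wide spread of scores, +5 point difference in means.
wide_control = [45, 55, 65, 75, 85]
wide_treatment = [50, 60, 70, 80, 90]

# Hypothetical data: everyone scores about the same, +5 point difference in means.
narrow_control = [62, 63, 64, 65]
narrow_treatment = [67, 68, 69, 70]

wide_effect = effect_size(wide_treatment, wide_control)        # roughly 0.32
narrow_effect = effect_size(narrow_treatment, narrow_control)  # roughly 3.87

print(f"Wide spread:   {wide_effect:.2f} standard deviations")
print(f"Narrow spread: {narrow_effect:.2f} standard deviations")
```

Same five points, very different effect sizes. That is the denominator doing its work.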

What is a large effect?

This is where Matt’s paper is particularly useful.

Much of the previous literature on effect sizes made many mistakes:

  1. Sample sizes were ignored.
  2. Duration of treatment was ignored.
  3. Time elapsed until measurement was ignored.
  4. Cost was ignored.

Taken together, scalability of interventions was ignored. This had the unintended consequence of setting the bar too high for what should be considered a large effect size.

Bloom’s 2 standard deviation effect 

You may have heard of Bloom’s 2 sigma tutoring intervention. This result is taken to show that 1-1 tutoring can have a 2 standard deviation (very large!) effect.

But Bloom’s study design was the following: take dozens of 4th, 5th, and 8th graders; give them 1-1 tutoring in discrete subjects like cartography or probability; and then test them on what they learned after 3-4 weeks!

It’s much easier to squeeze out a big effect under these conditions.

These types of small sample studies led to a research norm where an effect size had to be .8 standard deviations for it to be considered large.

New Orleans’ .4 standard deviation effect 

Contrast Bloom’s study with Doug Harris’ study on the New Orleans education reforms.

The New Orleans study covered tens of thousands of students. Students received the treatment across all major subjects, including math, reading, science, and social studies. The treatment lasted multiple years. And students were tested once every year in each subject.

It’s a lot harder to make large gains under these conditions, especially when the intervention costs under 20% of 1-1 tutoring.

Doug’s study found .4 standard deviation effects for New Orleans students over a five year period.

In his paper he wrote that he was “not aware of any other districts that have made such large improvements in such a short time.”

To summarize:

  1. The standard bar for a large effect was .8 standard deviations. This was regardless of sample size, length of treatment, measurement proximity, or cost. The bar was poorly constructed.
  2. New Orleans achieved a +.4 standard deviation effect on test scores.
  3. Researchers had never seen a citywide effect this large before.

There are two ways to interpret this.

  1. The previous .8 standard deviation bar was way too high for large samples.
  2. The New Orleans effect, despite being relatively large for district improvement, is still so absolutely small that we should not be too impressed.

Was the New Orleans effect too small?

The +.4 standard deviation effect equates to the average New Orleans student moving from the 22nd to 37th percentile in performance.

For any individual, this might or might not be life changing. But in the aggregate this means the average New Orleans student roughly went from a borderline high school dropout (bottom 20% of performance) to a student who has a real chance to enter a two year or four year college (modestly below average performance).

Across a large population, this is a pretty big deal.
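If you want to see how a standard deviation gain translates into percentiles, here is a rough Python sketch. It assumes test scores are normally distributed, which is my simplifying assumption and not something taken from Doug’s paper, and the function name is hypothetical. Under that assumption, a +.4 gain starting from the 22nd percentile lands in the mid-30s, in the same ballpark as the 37th percentile reported above. The gap presumably comes from the actual score distribution not being perfectly normal.

```python
from statistics import NormalDist


def shifted_percentile(start_percentile, effect_size_sd):
    """Percentile reached after a gain of `effect_size_sd` standard deviations.

    Assumption: scores follow a standard normal distribution.
    """
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    # Convert the starting percentile to a z-score.
    start_z = standard_normal.inv_cdf(start_percentile / 100)
    # Add the effect size, then convert the new z-score back to a percentile.
    return standard_normal.cdf(start_z + effect_size_sd) * 100


new_percentile = shifted_percentile(22, 0.4)
print(f"22nd percentile plus .4 standard deviations: about the {new_percentile:.0f}th percentile")
```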

We should pay attention to a city level +.4 standard deviation increase in test scores. If this effect (or even one somewhat lower) can be scaled, kids across the country will have a better chance at leading a good life.

Of course, academics and test scores are just one piece of the puzzle of economic mobility, but they are an important piece. Schools with negative effects on test scores tend not to deliver great long-term life outcomes for kids.

Matt Kraft’s proposed effect size scale

When it comes to large interventions, Matt argues we should get rid of the .8 standard deviation benchmark.

I agree.

Matt proposes the following rough scale:

Small effect: less than .05

Medium effect: .05 to .2

Large effect: .2 or larger
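The scale is simple enough to write down as a few lines of Python. This is just a sketch: the function name is hypothetical, and two details are my assumptions rather than Matt’s. An effect of exactly .05 is treated as medium, and negative effects are classified by their magnitude.

```python
def classify_effect(effect_size_sd):
    """Label an effect size (in standard deviations) using the rough scale above.

    Assumptions: exactly .05 counts as medium, and the sign is ignored
    so that negative effects are labeled by their magnitude.
    """
    magnitude = abs(effect_size_sd)
    if magnitude < 0.05:
        return "small"
    if magnitude < 0.2:
        return "medium"
    return "large"


# Effect sizes mentioned in these posts.
print(classify_effect(0.4))   # New Orleans reforms
print(classify_effect(0.25))  # Newark charter enrollment
```

Both the New Orleans and Newark charter results come out as large on this scale.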

Matt reviews a bunch of educational studies to help come up with this table. While I don’t love that it averages a bunch of very different studies, at the very least it sets conservative estimates on effects and cost (given that averages include studies that don’t meet the highest bar for sample / duration / etc.).

[Image: Kraft’s table of effect size and cost benchmarks]

Take a look at where .4 standard deviations shows up. The New Orleans reforms are in the 90th percentile of magnitude but the 60th percentile of costs. New Orleans increased its per-pupil spending by $1,400 in the years following Katrina, though it’s not clear to me that the money is what really drove the effect. But even if you assume it did, the results pass an ROI test.

Again, the New Orleans impacts are pretty remarkable.

In considering impact, cost, and scale, Matt also provides the following matrix:

[Image: Kraft’s matrix of impact, cost, and scalability]

New Orleans does well.

In Sum: Toyotas > Ferraris 

When it comes to effect sizes, be very careful to review sample sizes, treatment duration, measurement proximity, and cost.

Holding out for .8 standard deviation effects is foolish. These effects will rarely occur and when they do they tend to be very hard to scale.

When it comes to large scale interventions across medium term time frames, effects above .2 standard deviations warrant our attention.

The most realistic path for broad academic gains is to look for meaningful jumps in student performance that are caused by an intervention that has a real chance of scaling over time. And then testing and scaling and testing and scaling.

In other words: Toyotas > Ferraris.