Is Roland Fryer Right? Or has the RCT Fallacy Reared its Ugly Head?

Roland Fryer just published a compilation guide to 196 RCTs in education. HT to my colleague Stuart Buck for passing it along.

The compilation is a good review of a bunch of interesting studies. Roland’s contributions always make me think. He also won the John Bates Clark Medal, which is basically the Nobel prize for economics for people under 40.

Yet, while this RCT compilation is informative, I’d be very, very, very hesitant to pass a bunch of laws and regulations based on this type of meta-research.

___

Increasingly, policy makers and pundits are using RCT evidence to make policy. This is generally a step in the right direction, and it’s great to see evidence playing a bigger role in policy making.

Yet, sometimes RCTs are more about Rigorously Contorted Tales than Randomized Controlled Trials.

Call it the RCT Fallacy.

In statistical terms, the RCT Fallacy is pretty close to the concept of external validity, but I think the RCT Fallacy has a little more psychology to it.

So here goes:

The RCT Fallacy occurs when thought leaders propose adoption of policies based on the results of RCTs so as to avoid the messiness of politics, ideology, history, psychology, and evolution.

Fryer is more balanced than most, but, in this case, I think he still succumbs to the fallacy.

___

The RCT Fallacy is grounded in the following:

There is an inverse correlation between the external validity of an RCT and the operational complexity of an industry.
If you have an RCT on your side, it’s much easier to defend yourself against the charge of being unreasonable, even if the RCT has very questionable external validity.
If you don’t have an RCT on your side, you can be called an ideologue even if you’re making a very well thought out case.
This leads to the perverse incentive of thought leaders being in a safer place trumpeting policies with modest RCT support rather than proposing solutions that are grounded in a deep understanding of systems, organizations, and humans – but which are difficult to measure with RCTs.
RCTs overvalue what can be measured quantitatively.
RCTs overvalue the worth of understanding existing best practices and testing pilots over the creation of entire systems that accelerate new best practices.
In complex systems with complex organizations, evolution is a better change mechanism than running RCTs and implementing best practice adoption, especially in policy areas where some type of accountability (user choice, output measurement, etc.) can “kill off” bad ideas.
Quasi-experimental studies are often a better way to capture the impact of complex systems, as it is very difficult to conduct large scale RCTs on system level policy adoption.

___

In other words, RCTs will never tell us:

Whether democracies are better than dictatorships.
How to invent an iPhone.
Whether capitalism is better than Communism.
Whether single payer health systems are better than market based health systems.
Whether or not a start-up will be successful.

Yes, well designed RCTs can inform our decisions on the above issues, but RCTs will not provide definitive evidence on these issues.

___

Fryer’s paper ends with his summary of the RCT evidence in education.

He argues that RCTs have demonstrated that four interventions work: pre-k, high dosage tutoring, managed teacher PD, and charter schools.

The paper ends with the following rallying cry:

[Image: screenshot of the paper’s closing passage, not reproduced here]

I’m not sure courage is what we need:

Pre-K: There is pretty mixed evidence on our ability to scale effective pre-k. Fryer himself notes: “of the 64 treatment effects recorded in these randomized studies [on pre-k], 21 were statistically positive; zero were statistically negative and 43 were statistically indistinguishable from zero.”

Again, I’m not sure “courage” is the term I’d use to describe scaling an intervention that shows zero effect 67% of the time.
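For anyone who wants to check the 67% figure, here is a minimal sketch of the arithmetic using only the counts in the Fryer quote above. The variable and function names are mine, purely for illustration. Note that Fryer's wording is "statistically indistinguishable from zero"; treating those estimates as "zero effect" is this post's shorthand, not a result the code establishes.

```python
# Counts taken from the Fryer quote on pre-K treatment effects.
positive = 21   # statistically positive
negative = 0    # statistically negative
null = 43       # statistically indistinguishable from zero

total = positive + negative + null  # should match the 64 effects Fryer cites


def share(count, total):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


print(f"total effects: {total}")
print(f"indistinguishable from zero: {share(null, total)}%")
print(f"statistically positive: {share(positive, total)}%")
```

So roughly two thirds of the recorded estimates could not be distinguished from zero, and roughly one third were positive.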

Tutoring: Fryer covers some high-dosage tutoring studies that show strong effects. However, the costs of these programs are sometimes upwards of 20% of total per-student spending. Moreover, there would likely be severe human capital limitations if we tried to give high dosage tutoring to all the students who needed it.

Managed Teacher PD: Fryer covers studies that show success for Success For All and Reading Recovery programs. The data seems robust and schools should surely consider adopting these programs. But here’s the thing: nothing is preventing districts from adopting these programs right now!

Either districts know something that these RCTs aren’t picking up, or districts are so poorly run that it takes a dramatic intervention to get them to adopt effective programs that have been around for 10+ years.

Charter Schools: While I clearly support charter expansion, charter RCTs often rely on lottery data, which limits trials to schools that are oversubscribed (and thus creates positive bias). As such, I generally view CREDO’s far-reaching urban quasi-experimental studies to be of more use.

___

Again, I don’t mean to pick on Fryer. I’ve learned a ton from reading his research, and children would be better off if universities were filled with thinkers like him. His work on “looking under the hood” of high-performing charters greatly influenced my thinking on schools, as has his research on tutoring.

Moreover, it’s much better to try and build a policy regime from RCTs than from the weak theory that comes out of many education departments.

But, ultimately, I don’t think that (a) the RCTs covered in his study make a strong case for the scaling of his preferred interventions or (b) that RCTs can ever really tell us how to best design our public education systems.

I do think we should utilize RCTs to help schools make choices about which practices to adopt, but, ultimately, we should utilize theory and quasi-experimental evidence to handle the major public policy questions concerning education, which in my mind have more to do with system structure than educational practice.

“We” (researchers, thought leaders, policy makers, etc.) shouldn’t be operationally scaling much; rather, we should be running experiments that give empowered educators and families more information to make great choices.

5 thoughts on “Is Roland Fryer Right? Or has the RCT Fallacy Reared its Ugly Head?”

kevin denny April 12, 2016 at 9:13 am

John Bates Clark as a Nobel Prize for the under 40s. Definitely. Provided you live in the US that is.

hlempel April 12, 2016 at 10:24 am

“Again, I’m not sure “courage” is the term I’d use to describe scaling an intervention that shows zero effect 67% of the time.”

Did this 67% of the studies show an effect when analyzed together? I.E. Can we say anything about whether it’s more likely to be the case that 67% of studies had an actual effect of zero v. 67% of studies had too little power to detect an effect?

Paul April 13, 2016 at 11:56 am

Critical thinking about education RCTs that isn’t foaming at the mouth? What internet did I end up in?

The 67% zero description of pre-k is a bit misleading. Inconclusive is different from zero; a better approach would be to look at the standard errors and only count as zero the ones that reject moderate effect sizes.

Yellow Hanks April 13, 2016 at 2:47 pm

This article misses the point. Field experiments in economics are used for marginal analysis and testing structural primitives, not to necessarily test whole systems.

1. Rick Hull April 14, 2016 at 7:22 pm
  
  Yellow Hanks:
  
  Do you mean that Fryer’s paper misses the point? If not, isn’t Fryer’s paper committing the same mistake in your eyes?
  