Correlation is causation
One of the most common tropes about research is that “correlation isn’t causation.” This is generally used in opposition to the research at hand. Don’t like the findings? You can say that correlation isn’t causation and dismiss them. Academics also like to use this phrase defensively about their own research. Suppose they do a correlational study and find something they don’t like (say, that X policy they support is related to more crime). In that case, they can say it’s only a correlational study, so policy changes should wait until further research. This is also, I believe, a defense mechanism among academics against having to take responsibility for their study’s findings. If they write a correlational paper and someone makes a policy from its conclusions, then the policymaker has gone beyond the study’s findings, and the author bears no responsibility for that outcome. This is a very foolish point of view.
Research does not exist in a silo. For it to be useful we need people outside of academia to read and use it. That means people will take correlational research and treat it as if it were causal. That’s partly out of a misunderstanding of the stats - and academic papers do not make it easy to understand the data/methods - and partly because, for political topics especially, people will latch onto convenient research no matter if it’s causal or not. But the outcome is the same. Many people will treat correlational research as causal. To pretend otherwise, to pretend that your research will stay among the academics and never move into the real world where people cannot spend hours thoroughly reading your paper (and where they probably don’t have the skills to do so anyways), is beyond foolish. It is dangerous and will lead to bad policy.
One extremely prominent example is the paper “Presence of Armed School Officials and Fatal and Nonfatal Gunshot Injuries During Mass School Shootings, United States, 1980-2019,” published in JAMA last year by Drs. Peterson, Densley, and Erickson, who are criminologists (available to read here). Following the standard of public health papers, this paper is four pages long and looks at a bunch of variables in school shooting cases (N=133) and sees what’s related to the number of people killed or injured. The finding that’s getting a ton of attention in light of the shooting in Uvalde, TX, is that schools with armed police officers have a statistically significant increase in the number of people killed, though no effect on the number injured. There are no experimental or quasi-experimental changes here, so you cannot determine causality; it is purely a correlational study. There are data/methods issues in this study, such as only looking at schools with shootings (so there is no way to look at any deterrent effect, and the analysis implicitly assumes officers are assigned randomly), using data that the authors refuse to release publicly, and imputing (somehow) missing data for key variables, but the critical issue is that it’s correlational. It’s a cross-sectional study, meaning you can’t say that any variable causes changes in the outcome. To the authors’ credit, they never make such a claim. Their discussion section is rightly free from any causal language.
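To make the "assigned randomly" problem concrete, here is a minimal simulation sketch. It uses entirely made-up data, not the JAMA paper's data or methods. The `risk` variable, the casualty formula, and every number in it are hypothetical assumptions chosen only for illustration. In this toy world officers have exactly zero effect on casualties, but riskier schools are more likely to have an officer.

```python
import random
import statistics


def simulate(n_schools=5000, seed=0):
    """Generate synthetic (risk, has_officer, casualties) rows.

    Hypothetical setup: 'risk' is an unobserved school-level factor.
    Officers are placed more often at high-risk schools, and casualties
    depend ONLY on risk, so the true causal effect of an officer is zero.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_schools):
        risk = rng.random()                 # unobserved risk in [0, 1)
        has_officer = rng.random() < risk   # non-random assignment
        casualties = 10 * risk + rng.gauss(0, 1)  # no officer term at all
        rows.append((risk, has_officer, casualties))
    return rows


def mean_gap(rows):
    """Mean casualties with an officer minus mean casualties without."""
    with_officer = [c for _, officer, c in rows if officer]
    without_officer = [c for _, officer, c in rows if not officer]
    return statistics.mean(with_officer) - statistics.mean(without_officer)


def mean_gap_within_risk_band(rows, low, high):
    """Same comparison, restricted to schools with similar risk."""
    band = [row for row in rows if low <= row[0] < high]
    return mean_gap(band)


rows = simulate()
raw_gap = mean_gap(rows)
banded_gap = mean_gap_within_risk_band(rows, 0.45, 0.55)

print(f"Raw gap (officer minus no officer): {raw_gap:.2f}")
print(f"Gap among schools with similar risk: {banded_gap:.2f}")
```

The raw comparison shows schools with officers doing clearly worse, while the comparison among schools with similar risk shows a gap close to zero. None of this says what the real effect of school police is. It only shows that a cross-sectional correlation like this can appear even when the causal effect is nil.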
That’s not how the paper is being used. In the media and in discussions of responses to school shootings, it is treated as causal - having police at schools makes school shootings deadlier. The first author of that paper, Dr. Peterson, was a guest on Jake Tapper’s show on CNN, and that clip (and the paper’s findings) was also included in John Oliver’s show Last Week Tonight. This means that the JAMA paper was talked about in front of millions of people, putting it in the top 0.01% (if not higher) of papers in terms of attention and policy relevance. Oliver said this in relation to the study: “If school cops can make shootings worse, why then are we still pitching them as a solution? [emphasis added]” This is clearly causal language: school police cause worse shootings, and therefore it is bad policy to have them. This may be true - but this paper cannot make that claim.
In this post, I use the example of the JAMA paper, but it’s certainly not the only one. I use it merely as an example of a recent and extremely popular paper that’s influencing policy - or at least attention on the topic - today. Many, if not most, criminology papers are correlational and have similar issues. And these papers tend to go much further than the JAMA paper in their discussion section. Whereas the JAMA paper stuck purely to discussing its findings, many papers talk about the relevant policy interventions that could lead to changes in the outcome by affecting the correlated variable… which is treating that correlation as if it were causal. If the paper uses causal language, why should a non-researcher think otherwise?
Here’s an example from a random recent paper that is representative of these issues but is by no means particularly bad about it. A cross-sectional study finds a positive relationship between energy drink consumption among teenagers and using drugs. How does the conclusion section start? By treating the results as if they were causal and arguing for policy change: “The findings from this study suggest that policies that permit children and young adolescents to consume energy drinks may need to be reconsidered…Our findings also buttress the notion that adolescent energy drink consumption represents a significant public health concern.” Even the first sentence in the paper, in the “Highlights” section, says the relationship is causal: “This study demonstrated that energy drinks may begin the drug use cycle. [emphasis added]” This is ubiquitous in criminology research. Correlational results section, causal discussion/conclusion section.
So what’s the solution to these issues - non-academics treating correlation as causation and academics doing the same? The second part is a more straightforward fix. Authors should stop using causal language - including recommending policy changes - in correlational papers. Editors should enforce this. Some of this is also caused by the journals themselves, as even in correlational papers, reviewers and editors will want to see policy recommendations. This forces authors to use causal language pushing for policy changes even in these correlational papers, even if they know it is inappropriate. Here again, editors have the power and responsibility to stop this.
We also need far more actually causal papers and fewer correlational papers. Criminology has a massive quantity over quality problem. The more papers you push out, largely regardless of quality, the more successful you will be. Causal papers are a lot harder than correlational papers. You usually can’t just download data and plug in a regression for causal research. Given the increased difficulty - and reduced rewards of fewer better papers over many worse papers - it makes sense people will gravitate towards the easy work.
I doubt this will happen because editors need papers to fill journals, and researchers need papers to fill CVs. But having fewer papers that are higher quality and actually get at the causal question, which is crucial for policy, would not only help the issue of people misidentifying papers as causal when they aren’t (since in this world, more papers would be causal), it would also make the overall body of research better. It would also make our work as researchers more practical, since we could say what the effect of X or Y policy is on crime - you know, the entire point of criminology. Non-academics are going to misuse (sometimes accidentally, sometimes intentionally) research. Researchers should push back on this lousy usage by explaining publicly what we can and cannot learn from it. But we’ll still have these issues. There’s no solution, only ways to mitigate the problem.
People also like to do this even when the research is causal, such as through an experiment or natural experiment, though most research is only correlational.
And academics love to be foolish when it benefits them.
And likely many other fields though I am most familiar with criminology.