May 1, 2024
Quarterly Opinion
Elizabeth A. Stuart
Joshua M. Sharfstein
There is growing recognition that the caustic scientific debates about the impact of masks, social distancing, and remote schooling during the COVID-19 pandemic are less about the specific findings of individual studies and more about which kinds of studies and arguments should take precedence over others. The fundamental question is causality, the determination that an intervention leads to changes in an outcome. There are many types of evidence, from logic and biological data to observational studies and experimental trials. A better understanding of the strengths and weaknesses of different forms of evidence can reframe contentious discussions and ultimately lead to better decisions.
Today, the public debate over evidence can seem formulaic. In one corner is the archetypal true believer in the randomized controlled trial, who trumpets the fact that randomization eliminates many forms of bias and yet who does not seem to appreciate that, for many questions of policy, randomization is not desirable, ethical, or possible. In the other corner is the archetypal data scavenger, who works to integrate logic, biology, and observational data into a more comprehensive picture, and who seemingly has never met an uninformative study.
Both the true believer and the data scavenger, however, often struggle to provide reliable answers to critical, urgent policy questions. Neither fully appreciates the importance of well-designed observational studies that can be rapidly deployed in the real world. Such studies are essential to developing meaningful answers to questions such as whether masks, restaurant closures, and remote schooling “worked.”
The widespread use of randomized controlled trials in medicine gives such studies a head start on primacy in policy debates. But the further a question moves from a clinical intervention, the less effective this design becomes. That’s because, unlike a medication, which is often expected to have the same biological effect across people and settings, a policy can be implemented and received differently in different environments. The same mask-wearing program can be received well in one society and lead to civil unrest in another. Similarly, a recommendation to stay at home is more feasible, and thus more easily implemented, for some populations than for others.
Outside the pandemic context, effective programs such as high-impact tutoring for students can have variable effects depending on implementation, tutoring quality, and fit with the rest of a student’s activities. And when the effects of an intervention do vary, even well-conducted randomized trials may not be very informative about the effects in individual locations.
Recognizing the limitations of such “gold standard” studies does not mean that anything goes. One weakness of the “all of the above” approach to evidence is that there are often conflicting signals in the broad pool of data, leaving plenty of room for picking and choosing. It is not uncommon for people to sample the wide world of evidence and reach very different conclusions, an outcome that becomes increasingly likely as science becomes more politicized. Constructing arguments with poor-quality evidence is one reason that hydroxychloroquine became popular in the early days of the pandemic, even leading the Food and Drug Administration to provide a (later revoked) emergency use authorization for the medication.
Another danger in data scavenging is that some widely used and published study designs are quite weak. Pre-post studies, in particular, may be the easiest to conduct, but they can also mislead, because many contemporaneous factors can influence the results. Similarly, simply comparing outcomes in states with and without a particular policy of interest (e.g., stay-at-home orders, vaccine mandates, or remote schooling) accounts neither for the many other ways in which those states likely differ nor for the different ways those policies may have been implemented.
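To make the concern concrete, here is a rough sketch, with notation that is ours to illustrate rather than drawn from any particular study: under simple additive assumptions, a single state’s pre-post change bundles the policy’s effect together with everything else that shifted over the same period.

```latex
% Illustrative decomposition (hypothetical notation): \bar{Y} is the average
% outcome in one state; \tau is the policy effect; \gamma collects all
% contemporaneous changes (new variants, seasonality, behavior shifts).
\[
\bar{Y}_{\text{post}} - \bar{Y}_{\text{pre}}
  \;=\; \underbrace{\tau}_{\text{policy effect}}
  \;+\; \underbrace{\gamma}_{\text{everything else that changed}}
\]
```

Without a comparison group, there is no way to separate the policy effect from the contemporaneous changes, and the analogous confounding afflicts naive comparisons across states.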
Effective causal inference requires more than what the archetypal true believer and the data scavenger have to offer. There are non-randomized, controlled study designs that should be considered of higher quality than many other observational studies. In fact, for pressing questions of policy, these designs may even be preferable to randomized controlled studies.
One valuable approach is to compare changes over time, before and after a policy went into effect, in places with and without the policy. These are known as “difference-in-differences” designs. Similarly, to understand whether interventions that seem promising in randomized controlled trials will remain effective in real-world use, it is helpful to take advantage of naturally occurring variation in implementation. Thoughtful and thorough comparison of test scores in places with different levels of in-person schooling during the pandemic, combined with large-scale data and advanced statistical methods to adjust for a large set of factors, helps to disentangle the relationship between in-person schooling and learning loss, and how it varies across places and groups.
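As a rough illustration, using hypothetical notation rather than anything from a specific study, the difference-in-differences logic can be written as a simple contrast of average outcomes, where T indexes places that adopted the policy and C indexes comparison places without it:

```latex
% Hedged sketch of the basic two-group, two-period estimator.
% \bar{Y} denotes an average outcome; "pre" and "post" index the periods
% before and after the policy took effect.
\[
\widehat{\mathrm{DiD}}
  \;=\; \bigl(\bar{Y}_{T,\text{post}} - \bar{Y}_{T,\text{pre}}\bigr)
  \;-\; \bigl(\bar{Y}_{C,\text{post}} - \bar{Y}_{C,\text{pre}}\bigr)
\]
```

The second parenthetical subtracts out the change the treated places would plausibly have experienced anyway, under the assumption that, absent the policy, both groups would have followed parallel trends.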
A key aspect of this type of causality assessment is in-depth knowledge of what is happening “on the ground.” During the pandemic, county or school district policies around in-person schooling did not always correspond to an individual family’s experiences. In state policy evaluations, simply using whether a policy is “on the books” does not account for whether it is being implemented. It is crucial that researchers not analyze data blindly, without knowledge of where the data come from, what they mean, and how accurate they are for the questions at hand.
Finding the sweet spot between the true believer and the data scavenger is crucial for evaluating causality in complex real-world situations. Unfortunately, doing so is especially difficult in the heat of the moment when agendas, arguments, and egos collide. Three steps can help to lower the temperature while also raising the level of understanding.
First, journalists, policymakers, and the general public can receive additional training on what evidence to trust. Schools of public health can create opportunities for free courses and lectures that explain the basics of causality assessment as well as common pitfalls.
Second, trustworthy data intermediaries can summarize studies and their contributions to causality assessment. One example is the Novel Coronavirus Research Compendium, which provided timely reviews of emerging evidence during the pandemic from a multidisciplinary set of experts, including physicians, epidemiologists, and statisticians.
Third, the fields of epidemiology and biostatistics can pay more attention to urgent questions in causality assessment. Evidence specialists can master how to design non-randomized studies that address possible biases—and thereby generate confidence in the results.
There will rarely be one study or study type that provides a simple answer regarding causality; instead, policymakers will need to make use of a causal crossword created by multiple sources of evidence. Put another way, it may not be possible to resolve all debates over evidence, but with greater understanding of the strengths and weaknesses of different types of studies, better debates can lead to better decisions.
Joshua M. Sharfstein is associate dean for public health practice and training at the Johns Hopkins Bloomberg School of Public Health. He served as secretary of the Maryland Department of Health and Mental Hygiene from 2011 to 2014, as principal deputy commissioner of the US Food and Drug Administration from 2009 to 2011, and as the commissioner of health in Baltimore, Maryland, from December 2005 to March 2009. From July 2001 to December 2005, Sharfstein served on the minority staff of the Committee on Government Reform of the US House of Representatives, working for Congressman Henry A. Waxman. He serves on the Board on Population Health and Public Health Practice of the Institute of Medicine and the editorial board of JAMA. He is a 1991 graduate of Harvard College, a 1996 graduate of Harvard Medical School, a 1999 graduate of the combined residency program in pediatrics at Boston Medical Center and Boston Children’s Hospital, and a 2001 graduate of the fellowship program in general pediatrics at the Boston University School of Medicine.