Can We Trust Observational Studies Using Propensity Scores in the Critical Care Literature? A Systematic Comparison With Randomized Clinical Trials

Georgios D Kitsios; Issa J Dahabreh; Sean Callahan; Jessica K Paulus; Anthony C Campagna; James M Dargin

doi:10.1097/CCM.0000000000001135

Crit Care Med. 2015 Sep;43(9):1870-9. doi: 10.1097/CCM.0000000000001135.

Authors

Georgios D Kitsios¹, Issa J Dahabreh, Sean Callahan, Jessica K Paulus, Anthony C Campagna, James M Dargin

Affiliation

1. Division of Pulmonary, Allergy, and Critical Care Medicine, University of Pittsburgh Medical Center, Pittsburgh, PA.
2. Center for Evidence-Based Medicine, School of Public Health, Brown University, Providence, RI.
3. Department of Health Services, Policy & Practice, School of Public Health, Brown University, Providence, RI.
4. Division Pulmonary, Critical Care, Allergy, and Sleep, Medical University of South Carolina, Charleston, SC.
5. Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, MA.
6. Department of Pulmonary and Critical Care Medicine, Lahey Hospital and Medical Center, Burlington, MA.

PMID: 26086943

Abstract

Objective: To assess the degree of agreement between propensity score studies and randomized clinical trials in critical care research.

Data sources: Propensity score studies published in highly cited critical care or general medicine journals or included in a previous systematic review; corresponding randomized clinical trials included in Cochrane Systematic Reviews or published in PubMed.

Study selection: We identified propensity score studies of the effects of therapeutic interventions on short- or long-term mortality. We systematically matched propensity score studies to randomized clinical trials based on patient selection criteria, interventions, and outcomes.

Data extraction: We appraised the methods of included studies and extracted treatment effect estimates to compare the results of propensity score studies and randomized clinical trials. When multiple studies were identified for the same topic, we performed meta-analyses to obtain summary treatment effect estimates.
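Illustrative note: the abstract does not state which meta-analysis model was used, so the following is only a minimal sketch of one common approach, fixed-effect inverse-variance pooling on the log scale. The function name `pool_fixed_effect` and its inputs are hypothetical and are not taken from the study.

```python
import math


def pool_fixed_effect(log_estimates, standard_errors):
    """Pool study-level log effect estimates (e.g. log odds ratios).

    Assumption: a fixed-effect inverse-variance model; the abstract
    does not say whether fixed or random effects were used.
    Returns the pooled log estimate and its standard error.
    """
    if len(log_estimates) != len(standard_errors) or not log_estimates:
        raise ValueError("need one standard error per estimate")
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se
```

Under this sketch, a topic with a single study simply returns that study's estimate and standard error, while topics with several studies yield one summary estimate per design.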

Data synthesis: We matched 21 propensity score studies with 58 randomized clinical trials in 18 distinct comparisons (median, one propensity score study and two randomized clinical trials per comparison), for short- and long-term mortality. We found one statistically significant difference between designs (hyperoncotic albumin vs crystalloid fluids) among these 18 comparisons. Propensity score studies did not produce systematically higher (or lower) treatment effect estimates compared with randomized clinical trials, but estimates from the two designs differed by more than 30% in one third of the comparisons examined. Observational studies in critical care met widely accepted methodological standards for propensity score analyses.
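Illustrative note: one way to compare a propensity score estimate with a randomized trial estimate for the same topic is to take the ratio of the two relative effects and test whether it differs from 1. The sketch below assumes independent estimates on the log scale and a normal approximation. The function `compare_designs` is hypothetical, and reading "differed by more than 30%" as a ratio above 1.30 in either direction is an assumption made here, not a definition given in the abstract.

```python
import math


def compare_designs(log_ps, se_ps, log_rct, se_rct, threshold=1.30):
    """Compare a propensity score (PS) estimate with an RCT estimate.

    Inputs are log relative effects (e.g. log odds ratios) and their
    standard errors. Assumes the two estimates are independent.
    """
    diff = log_ps - log_rct
    se_diff = math.sqrt(se_ps ** 2 + se_rct ** 2)
    z = diff / se_diff
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    ratio = math.exp(diff)  # ratio of relative effects, PS over RCT
    # Assumed reading of "more than 30%": ratio beyond 1.30 either way.
    large_difference = max(ratio, 1.0 / ratio) > threshold
    return {
        "ratio": ratio,
        "z": z,
        "p_value": p_value,
        "large_difference": large_difference,
    }
```

A comparison can show a large ratio without reaching statistical significance when either estimate is imprecise, which is consistent with the abstract reporting one significant difference alongside differences above 30% in one third of comparisons.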

Conclusions: Across diverse critical care topics, propensity score studies published in high-impact journals produced results that were generally consistent with the findings of randomized clinical trials. However, caution is needed when interpreting propensity score studies because occasionally their results contradict those of randomized clinical trials and there is no reliable way to predict disagreements.

MeSH terms

Critical Care*
Critical Illness / mortality*
Humans
Observational Studies as Topic / statistics & numerical data*
Propensity Score
Randomized Controlled Trials as Topic / statistics & numerical data*