Abstract
BACKGROUND: Multiple scoring systems have been developed for both the intensive care unit (ICU) and the emergency department (ED) to risk-stratify patients and predict mortality. However, it remains unclear whether the additional data needed to compute ICU scores improve mortality prediction for critically ill patients compared with the simpler ED scores. METHODS: We studied a prospective observational cohort of 227 critically ill patients admitted to the ICU directly from the ED at an academic, tertiary care medical center. We compared the Acute Physiology and Chronic Health Evaluation (APACHE) II, APACHE III, Simplified Acute Physiology Score II, Modified Early Warning Score, Rapid Emergency Medicine Score, Prince of Wales Emergency Department Score, and a pre-hospital critical illness prediction score developed by Seymour et al. The primary endpoint was 60-day mortality. We compared the receiver operating characteristic curves of the different scores and assessed their calibration using the Hosmer-Lemeshow goodness-of-fit test and visual assessment. RESULTS: The ICU scores outperformed the ED scores, with higher area under the curve (AUC) values. There were no differences in discrimination among the ED-based scoring systems or among the ICU-based scoring systems. With the exception of the Seymour score, the ED-based scoring systems did not discriminate as well as the best-performing ICU-based scoring system, APACHE III. The Seymour score had a higher AUC than the other ED scores and, although its AUC was lower than those of all the ICU scores, it was not significantly different from that of APACHE III. When data from the first 24 h in the ICU were used to calculate the ED scores, the AUCs for the ED scores improved numerically, but this improvement was not statistically significant. All scores had acceptable calibration. CONCLUSIONS: In contrast to prior studies of patients based in the emergency department, ICU scores outperformed ED scores in critically ill patients admitted from the emergency department. This difference in performance seemed to be primarily due to the complexity of the scores rather than the time window from which the data were derived.
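The two quantities used to compare the scores, discrimination (AUC) and calibration (the Hosmer-Lemeshow statistic), can be illustrated with a minimal Python sketch. It is not the study's analysis code. The function names `auc` and `hosmer_lemeshow`, the choice of ten risk groups, and the toy data are assumptions made for illustration only.

```python
def auc(scores, died):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen non-survivor has a higher score than a randomly chosen survivor
    (ties count as one half)."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow(pred, died, groups=10):
    """Hosmer-Lemeshow statistic over risk-ordered groups.

    `pred` holds predicted probabilities of death, `died` holds 0/1 outcomes.
    Returns (statistic, degrees_of_freedom); the statistic is conventionally
    referred to a chi-square distribution with groups - 2 degrees of freedom.
    The number of groups (10) is an assumption, not taken from the study.
    """
    pairs = sorted(zip(pred, died))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(d for _, d in chunk)   # observed deaths in the group
        expected = sum(p for p, _ in chunk)   # expected deaths in the group
        if 0 < expected < size:
            # Contribution from deaths plus contribution from survivors.
            stat += (observed - expected) ** 2 / expected
            stat += (observed - expected) ** 2 / (size - expected)
    return stat, groups - 2


# Made-up toy data: four patients, two of whom died.
toy_scores = [1, 2, 3, 4]
toy_died = [0, 1, 0, 1]
print(auc(toy_scores, toy_died))  # 0.75
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination. A small Hosmer-Lemeshow statistic (large p-value) indicates that predicted and observed deaths agree across risk groups. Testing whether two AUCs differ requires a separate procedure that this sketch does not cover.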