Diagnostic errors represent a largely unaddressed concern in all healthcare settings, including the Emergency Room (ER) [1, 2]. According to current estimates, one in every 10 diagnoses is likely to be wrong, and one in every 1000 ambulatory encounters engenders the risk of harm from a diagnostic error [3, 4]. In aggregate, diagnostic errors are estimated to account for 40,000–80,000 deaths annually in the US. In the ER setting, studies of malpractice claims identify multiple breakdowns in the diagnostic process, and cognitive factors were involved in 96% of these instances.
Recent reviews have summarized both the system-related and cognitive interventions that have been proposed to reduce diagnostic error [7–11]. A wide range of well-justified interventions has been considered, including personnel-focused programs, educational interventions, structured process systems, second reviews, and technology-based solutions. Relatively few of these have been evaluated rigorously in practice, although second review programs are now increasingly common in both pathology and radiology, and decision support tools to help with differential diagnosis [12, 13], along with online access to medical knowledge resources, are increasingly being used in direct patient care.
Although the diagnostic process is clearly influenced by many factors inherent in healthcare systems, diagnosis is fundamentally a cognitive process, often based on intuition or gestalt. Successful diagnosis depends on clinical reasoning to synthesize all of the available information and relevant knowledge to arrive at plausible diagnostic possibilities. With an ever-expanding number of diseases to consider (8000–12,000 by some estimates), variability in how disease manifests and is described and perceived, the substantial uncertainty attending every step of the process, and ultimately the inherent limitations of human memory and cognitive processing, diagnosis is a challenge. The typical ER, with its pressures of workload and time, repeated distractions, and constant multitasking, compounds this challenge.
Checklists have been developed to address complex and error-prone processes, especially for situations where the outcome is too critical to rely on human memory. Although checklist use in medical settings is a relatively new endeavor, checklists have already produced impressive reductions in infection rates from central line placement [17, 18] and in surgical complications [19, 20]. However, the process of diagnosis is not as well defined, and it is not known if checklists will be as effective in this application. The goal of this project was to develop and pilot test a checklist to assist with diagnosis and ultimately help reduce the risk of diagnostic error. To complement symptom-specific checklists that have been previously developed and trialed [21, 22], we sought to develop a more general checklist, and performed initial evaluations of both of these products in an ER setting.
Rapid cycle design and development was used to draft, develop, evaluate, and refine a general checklist for use in ER medical diagnostic evaluations. We solicited input from subject matter experts and physician users on content and design elements. We used qualitative methods to evaluate usability, physician satisfaction, and the impact of checklist usage. We also quantitatively assessed test and consult ordering pre- and post-checklist use. The protocols and products were reviewed and approved by a central Institutional Review Board (IRB) at RTI International and separately by the IRBs at the two participating medical centers.
We recruited 16 staff-level ER physicians from two academic centers who participated with informed consent. One physician dropped out due to an acute illness. Physicians who completed the project and participated in all of the evaluations were offered a $500 incentive reward.
Development of a symptom-specific set of checklists for primary care has been previously described [21, 22]. We focused on developing a usable “general” diagnostic process checklist:
Version 1: To derive a novel, general checklist for diagnosis, we used published checklist design principles and recommendations [23–25] and input from a subject matter expert advisory panel (see Acknowledgements) to modify a draft prototype we had previously developed as a starting point (Version 1, see Appendix) [22, 26]. We conducted 45–60 min structured individual telephone discussions with the seven subject matter experts to obtain input on the ideal components of a general checklist, feedback on Version 1, suggestions on the proposed process for refining and piloting the checklist, and key outcomes to consider. Interviews were transcribed, and common themes and major suggestions were identified.
Version 2a and 2b: Based on input from the subject matter experts and the participating ER physicians, two different general checklists were developed, designated 2a and 2b (see Appendix, Checklists 2a and 2b). Experts recommended we include a complete set of symptom-specific checklists along with the general checklist. Both were then published as a spiral-bound 4×6 inch booklet. Physicians at the two ER sites used the booklet during two work shifts. We then performed cognitive testing to obtain feedback through interviews in person or by telephone. Based on this input and further consultation with the subject matter experts, a revised final “Version 3” checklist was developed.
The Version 3 checklist, along with reformatted and updated versions of the symptom-specific checklists (Figure 2A, B), was again published as a spiral-bound booklet sized to fit in a lab coat pocket. Over the next 2 months, the 15 participating physicians were encouraged to use the booklet containing Version 3 twice per shift, focusing on patients they believed had high risk symptoms or “don’t miss” diagnoses. Usage was specifically encouraged for patients with chest or abdominal pain. Physicians were also encouraged to review the appropriate symptom-specific checklist with the patient, and were shown a video demonstrating this approach (http://tiny.cc/ElyDemo or http://www.youtube.com/watch?v=uHpieuyP1w0).
Physicians completed a post-shift questionnaire (Supplemental Data, which accompanies the article at http://www.degruyter.com/view/j/dx.2014.1.issue-3/issue-files/dx.2014.1.issue-3.xml – Checklist Impact Evaluation) examining how and why they used the checklist and whether this had any impact on their clinical decisions.
Outcome assessment and data analysis
Upon completing development of Versions 1 and 2 (2a and 2b), we conducted cognitive interviews with participating physicians to elicit their impressions of relative advantages, feasibility, usability, satisfaction, usefulness, and the fit of checklist usage within their clinical workflow. Interviews were performed using a structured format (see Supplemental Data – Cognitive Testing of Checklist Content and Format), and responses were recorded and coded by two team members (AS, NL) using NVivo 9 software (QSR International) to identify emergent themes. Themes were identified when at least three individual respondents introduced a topic [27, 28]. A final round of structured interviews was conducted and analyzed similarly after the physicians had used Version 3 for 2 months, to evaluate checklist appropriateness, usability, and impact.
At one of the test sites, chart reviews were performed to evaluate any major trends in resource utilization as a result of using the checklists. We randomly selected and reviewed a total of 186 medical records relating to ER visits for patients with chest pain or abdominal pain seen by physicians participating in the study. We recorded the number of laboratory tests, imaging exams, and consultations ordered. A total of 104 charts were reviewed of patients seen before the physicians were exposed to any of the checklists, and 82 charts were reviewed of patients seen after they had used the Version 3 checklist for 2 months. These data were compared by χ2 analysis.
Feedback during checklist development
The ER user group considered the content of Versions 2a and 2b to be appropriate and helpful in the initial testing, but users suggested replacing some of the items they did not find helpful (such as, “take your own history”) with items identifying specific high risk situations and actions to take in those situations. Combined with other suggestions, this resulted in the revised content used in Version 3.
Quantitative data from post-shift responses
The responses physicians provided after using the checklists 348 times during their last shift are shown in Table 1. The major findings were: 1) both the general and the specific checklists were used, with the specific checklists used somewhat more commonly than the general one; 2) checklist usage was prompted by a variety of factors, not just diagnostic uncertainty; and 3) in the majority of cases, using the checklist helped confirm the original considerations and had no major impact on the final diagnosis rendered or the management plan chosen. However, using the checklists commonly helped in considering novel diagnostic possibilities (approximately one-third of usages), and in a small but important fraction of cases, using the checklists changed the working diagnosis (37 instances, roughly 10% of usages). In terms of resource utilization, both positive and negative effects were identified: in 21 instances using the checklist was judged to have slowed the physician down, and in 77 instances additional diagnostic tests had to be ordered. However, some planned tests were cancelled, there was no substantial increase in plans to obtain subspecialty consultation, and in some instances physicians planned not to refer, to refer more appropriately, or to ask a colleague. Thus, although use of the checklists did not affect management or referrals in most cases, in some cases it led to resource optimization and in others to additional resource utilization. We did not evaluate whether checklist usage was ultimately beneficial to the patient’s diagnostic process, or whether any new diagnostic considerations were in fact correct.
Of the 15 physicians, 10 had more than 3 years of experience in the ER (“senior clinicians”). The more junior clinicians were just as likely to use the general and specific checklists (51 and 52 instances, respectively), whereas the more senior clinicians favored the specific checklists (65 and 115 instances, respectively; χ²=5.1; Yates p=0.02; Fisher’s exact test p<0.0001). Experience was not a significant factor in any of the other response categories.
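As a hedged illustration of the comparison above, the junior/senior usage counts form a 2×2 table (junior: 51 general, 52 specific; senior: 65 general, 115 specific) that can be tested with a short, standard-library-only Python sketch. This is not the authors' analysis code; the values it produces (χ²≈4.9 uncorrected, ≈4.3 with the Yates correction) differ modestly from the published χ²=5.1 and p=0.02, which may reflect a different correction, rounding, or grouping.

```python
import math

def chi2_2x2(a, b, c, d, yates=False):
    """Pearson chi-square test for a 2x2 table [[a, b], [c, d]].

    Returns (chi2, p) with 1 degree of freedom, optionally applying
    the Yates continuity correction. The p-value uses the chi2(1)
    survival function, which equals erfc(sqrt(x/2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = 0.0
    for obs, row, col in ((a, row1, col1), (b, row1, col2),
                          (c, row2, col1), (d, row2, col2)):
        expected = row * col / n
        diff = abs(obs - expected)
        if yates:
            diff = max(diff - 0.5, 0.0)
        stat += diff * diff / expected
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# Usage counts reported in the text (general vs. specific checklists)
chi2, p = chi2_2x2(51, 52, 65, 115)                   # uncorrected
chi2_y, p_y = chi2_2x2(51, 52, 65, 115, yates=True)   # Yates-corrected
print(f"uncorrected: chi2={chi2:.2f}, p={p:.3f}")
print(f"Yates:       chi2={chi2_y:.2f}, p={p_y:.3f}")
```

Either way, the association between seniority and a preference for the symptom-specific checklists is significant at the conventional 0.05 level.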
Qualitative data from final interviews
Several common themes emerged from interviews of the ER physicians after they had used the Version 3 general checklist and the set of symptom-specific checklists for 2 months (see Table 2). Primarily, the comments corroborated the quantitative findings from the post-shift questionnaires, that the checklists were generally helpful and could help prompt consideration of additional diagnostic possibilities. Several providers noted specific patients where a critical diagnosis would have been missed had the checklists not been used. None of the physicians had read the instructions provided, and for a variety of reasons none had used the toolkit in front of the patient, or with the patient’s participation. Several users commented on the usefulness of the general checklist as a way to introduce concepts relating to diagnostic error to trainees. A comprehensive listing of impressions, including comments on barriers and facilitators for usage and suggestions for future improvements is included in the Supplemental Data.
Quantitative data from chart reviews pre- and post-checklist use
Comparing the actions of the ER physicians before and after using the Version 3 and symptom-specific checklists on patients with chest or abdominal pain, there was a tendency post-exposure to list more items in the differential diagnosis (1.60 items/patient vs. 1.43, p=0.08), and to order more laboratory tests (7.05 tests/patient vs. 6.10, p=0.17), imaging tests (1.40 tests/patient vs. 1.05, p=0.03), and consults (0.22 consults/patient vs. 0.14, p=0.24), although the significance of these comparisons was limited by the small sample size.
The majority of diagnostic errors in medicine reflect cognitive errors, and thus the goal of this project was to develop a checklist that would help optimize the clinical reasoning process and help physicians recognize situations that predispose to these errors. To develop a general checklist, we began with some of the major shortcomings identified in the medical literature, then used an iterative process of evaluation and refinement based on input from subject matter experts and end-users. Rather than serving as a how-to guide, the resulting general checklist may cue physicians to pause and reflect during the process of formulating a diagnosis in the ER setting, and to recognize situations prone to cognitive biases.
We found that using the resulting general checklist along with a set of previously developed symptom-specific checklists was generally well accepted by practicing emergency room physicians. Using the checklists was reported to be helpful in confirming initial impressions, and in almost a third of instances the checklist helped the physician consider important diagnoses they had not previously considered. Using the checklists was identified as having prevented several important diagnostic errors, and there were no reports that usage induced new errors. The checklist also encouraged physicians to engage patients in ways that help prevent diagnostic errors or help catch these errors at an early stage, to avoid or minimize harm (“Make sure the patient knows when and how to get back to you if necessary”).
Although both the general and the specific checklists were judged to be valuable, users preferred the specific checklists for most situations. Shimizu and colleagues have evaluated the same symptom-specific checklists that we used, along with a general debiasing checklist (our Version 1 checklist), in 188 Japanese medical students asked to evaluate five cases of increasing clinical complexity. Using the general checklist did not increase the proportion of students choosing the correct diagnosis. In contrast, using the symptom-specific checklist was beneficial, at least for the more difficult cases. A similar result was found in a study using checklists to reduce nursing errors in programming IV infusion pumps: a highly specific, step-by-step checklist outperformed a general checklist that encouraged nurses to think critically and consider the ‘five rights’ of medication administration. In our study, physician users were not receptive to a step-by-step approach that emphasized basic elements of the diagnostic process, viewing this as too simplistic. A general checklist that instead focused on high-risk situations therefore evolved, and the results confirmed that high-risk situations were one of the most common reasons to review the checklists. This acknowledges the reality that the synthesis phase of diagnosis, where diagnostic hypotheses are generated from integrating all the available information, is simply not amenable to the step-by-step approach that most checklists employ. Thus, the use of general and symptom-specific checklists in the ER setting needs to be explored further, because they appear to serve different purposes and each may play a role in unique situations. We also believe that the specific checklists used in this study, designed for patients seen in primary care, could be improved by customizing them for patients with urgent and emergency conditions, which often invoke a different set of differential diagnoses.
Finally, symptom-specific checklists in general are not ideal for patients presenting with multiple problems, or patients whose differential diagnosis is influenced by contextual factors. For example, the differential diagnosis for a healthy patient with weakness would be prioritized differently than the same differential for a patient on chemotherapy for metastatic disease.
A critical issue regarding interventions to improve diagnostic reliability is the extent to which reductions in errors and error-related harm are offset by unintended consequences. In the case of checklists for diagnosis, the goal of prompting consideration of a wider differential diagnosis could potentially lead to additional diagnostic testing and consultation, and thus additional costs in terms of time, money, and possibly complications of these investigations. Although our study was not powered to investigate this definitively, there did appear to be a tendency for providers to order additional diagnostic evaluations post-exposure, although the effect appeared small and was to some extent offset by cases where tests and consults were cancelled after checklist review. Other negatives were occasionally noted by users, including a sense in some cases that checklist usage ‘slowed me down’ or was not worth the time invested.
Although the success of checklists in improving surgical safety and reducing central line infections seems adequately established, the impact of checklists in other medical settings has been mixed. The possibility that checklists could improve diagnostic performance is suggested by the findings of Shimizu discussed above, and by recent studies by Sibbald et al., who evaluated the impact of a simple checklist for interpreting electrocardiograms. Even though this checklist included simply the standard steps of ECG interpretation, using the checklist improved accuracy in a group of expert interpreters (senior Cardiology fellows) without increasing the cognitive load or causing expert reversal. Similarly, use of a checklist improved the reliability of identifying key findings on the cardiovascular examination.
Our study had several limitations. It was not designed to evaluate whether checklist usage reduced the incidence of diagnostic error or error-related harm, the major outcomes of interest. We also make no assertion that the final general checklist that evolved from our process is the definitive tool or is superior to any others; several other general checklists and ‘tips sheets’ for diagnosis exist. A key issue regarding checklist usage in ERs is whether checklists should be used in every case or only when use is judged to be “appropriate”. We did not mandate checklist use for every patient, even though many diagnostic errors arise from situations where the diagnosis is established quickly and automatically using ‘System 1’ cognitive processing, and physicians generally seem to be unaware of which of their diagnoses are incorrect [34–36]. We would predict that usage in every case would decrease diagnostic errors and might also increase diagnostic testing, hypotheses that would need to be evaluated in subsequent studies. Generalizability of our findings is limited by the small number of study sites (2) and participating physicians (15).
The ultimate success of checklists to improve clinical performance will reflect the complex interaction of many different socio-adaptive influences beyond simply ticking off boxes [37–39]. Improving communication amongst team members, education, providing feedback after usage, using clinical champions, endorsement by local leaders, and establishing a cultural expectation of improved safety are among the other elements relevant to whether such checklists can decrease diagnostic error and harm. Stakeholder buy-in is also critically important, and the checklists must be appropriately customized to local needs.
In conclusion, the general checklist derived in this project has been extensively evaluated and refined, and in pilot studies was perceived as being helpful in the diagnostic process. Whether checklists can reduce diagnostic error will require further study. Given the complex nature of the diagnostic process, improving diagnostic reliability will no doubt require a set of multi-faceted interventions. Our pilot study suggests that checklists for diagnosis should be considered for further development, implementation and evaluation, perhaps with additional complementary strategies to reduce diagnostic error.
We sincerely appreciate advice provided by the subject matter experts: Pat Croskerry, John Ely, Robert Wears, Peter Pronovost, Atul Gawande, Key Dismukes, and Dan Boorman. We also thank John Ely for providing his set of symptom-specific checklists.
Version 2a and 2b
Version 3 General Checklist and examples of specific checklists.
Expert Interview Guide
Cognitive Testing of Checklist Content and Format
Checklist Impact Evaluation
Additional results: Post Checklist Interview Responses.
Graber M. The incidence of diagnostic error in medicine. BMJ Qual Saf 2013;22(Suppl 2):ii21–ii27.
Leape L, Berwick D, Bates D. Counting deaths from medical errors. JAMA 2002;288:2405.
Kachalia A, Gandhi T, Puopolo A, Yoon C, Thomas EJ, Griffey R, et al. Missed and delayed diagnoses in the emergency department: a study of closed malpractice claims from 4 liability insurers. Ann Emerg Med 2007;49:196–205.
Graber M, Kissam S, Payne V, Meyer AN, Sorenson A, Lenfestey N, et al. Cognitive interventions to reduce diagnostic error: a narrative review. BMJ Qual Saf 2012;21:535–57.
Singh H, Graber M, Kissam S, Sorensen AV, Lenfestey NF, Tant EM, et al. System-related interventions to reduce diagnostic errors: a narrative review. BMJ Qual Saf 2012;21:160–70.
McDonald K, Matesic B, Contopoulos-Ioannidis D, Lonhart J, Schmidt E, Pineda N, et al. Patient safety strategies targeted at diagnostic errors – a systematic review. Ann Intern Med 2013;158:381–9.
El-Kareh R, Hasan O, Schiff G. Use of health information technology to reduce diagnostic error. BMJ Qual Saf 2013;22(Suppl 2):ii40–ii51.
Thomas N, Ramnarayan P, Bell M, Maheshwari P, Wilson S, Nazarian EB, et al. An international assessment of a web-based diagnostic tool in critically ill children. Technol Health Care 2008;16:103–10.
Ramnarayan P, Cronje N, Brown R, Negus R, Coode B, Moss P, et al. Validation of a diagnostic reminder system in emergency medicine: a multi-centre study. Emerg Med J 2007;24:619–24.
Cervellin G, Borghi L, Lippi G. Do clinicians decide relying primarily on Bayesian principles or Gestalt perception? Some pearls and pitfalls of Gestalt perception in medicine. Intern Emerg Med 2014. DOI: 10.1007/s11739-014-1049-8.
Gawande A. The checklist manifesto: how to get things right, 1st ed. New York: Metropolitan Books, 2010.
Ko H, Turner T, Finnigan M. Systematic review of safety checklists for use by medical care teams in acute care hospitals – limited evidence of effectiveness. BMC Health Serv Res 2011;11:211.
The Matching Michigan Collaboration and Writing Team. Matching Michigan: a 2-year stepped intervention programme to minimise central venous catheter-blood stream infections in intensive care units in England. BMJ Qual Saf 2013;22:110–23.
Pronovost P, Needham D, Berenholtz S, Sinopoli D, Chu H, Cosgrove S, et al. An intervention to decrease catheter-related bloodstream infections in the ICU. N Engl J Med 2006;355:2725–32.
Treadwell J, Lucas S, Tsou A. Surgical checklists: a systematic review of impacts and implementation. BMJ Qual Saf 2014;23:299–318.
Haynes AB, Weiser TG, Berry WR, Lipsitz SR, Breizat AH, Dellinger EP, et al. A surgical safety checklist to reduce morbidity and mortality in a global population. N Engl J Med 2009;360:491–9.
Ely JW, Osheroff JA. Checklists to prevent diagnostic errors. Abstract, presented at: Diagnostic Error in Medicine Annual Conference, Phoenix, AZ, 2008.
Thomassen O, Espeland A, Softeland E, Lossius H, Heltne J, Brattebo G. Implementation of checklists in health care; learning from high-reliability organisations. Scand J Trauma Resusc Emerg Med 2011;19:53.
Yin RK. Case study research: design and methods. Thousand Oaks, CA: Sage Publications, 1994.
Miles M, Huberman A. Qualitative data analysis: an expanded sourcebook, 2nd ed. Thousand Oaks, CA: Sage Publications, 1994.
Graber ML, Franklin N, Gordon RR. Diagnostic error in internal medicine. Arch Intern Med 2005;165:1493–9.
Shimizu T, Matsumoto K, Tokuda Y. Effects of the use of differential diagnosis checklist and general de-biasing checklist on diagnostic performance in comparison to intuitive diagnosis. Med Teach 2013;35:e1218–29.
White R, Trbovich PL, Easty AC, Savage P, Trip K, Hyland S. Checking it twice: an evaluation of checklists for detecting medication errors at the bedside using a chemotherapy model. Qual Saf Health Care 2010;19:562–7.
Sibbald M, de Bruin A, van Merrienboer J. Checklists improve experts’ diagnostic decisions. Med Educ 2013;47:301–8.
Sibbald M, de Bruin ABH, Cavalcanti RB, van Merrienboer JJG. Do you have to re-examine your diagnosis? Checklists and cardiac exam. BMJ Qual Saf 2013;22:333–8.
Podbregar M, Voga G, Krivec B, Skale R, Pareznik R, Gabrscek L. Should we confirm our clinical diagnostic certainty by autopsies? Intensive Care Med 2001;27:1750–5.
Friedman CP, Gatti GG, Franz TM, Murphy GC, Wolf FM, Heckerling PS, et al. Do physicians know when their diagnoses are correct? Implications for decision support and error reduction. J Gen Intern Med 2005;20:334–9.
Chopra V, Shojania K. Recipe for checklists and bundles – one part active ingredient and one part measurement. BMJ Qual Saf 2013;22:93–6.
The online version of this article (DOI: 10.1515/dx-2014-0019) offers supplementary material, available to authorized users.
Published Online: 2014-06-19
Published in Print: 2014-09-01
Conflict of interest statement
Authors’ conflict of interest disclosure: The authors stated that there are no conflicts of interest regarding the publication of this article. Research funding played no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the report for publication.
Research funding: This work was supported by an ACTION II contract award from the Agency for Healthcare Research and Quality to RTI International (Contract Number HHSA290200600001I, Prism No. HHSA29032005T, Task Order #8). Drs. Meyer and Singh and Ms. Modi are supported in part by the Houston VA Center for Innovations in Quality, Effectiveness and Safety (CIN 13-413).
Employment or leadership: None declared.
Honorarium: None declared.
Citation Information: Diagnosis, Volume 1, Issue 3, Pages 223–231, ISSN (Online) 2194-802X, ISSN (Print) 2194-8011, DOI: https://doi.org/10.1515/dx-2014-0019.
©2014, Mark L. Graber et al., published by De Gruyter. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License. BY-NC-ND 3.0