Published by De Gruyter, March 13, 2019

Use of data mining techniques to classify length of stay of emergency department patients

  • Görkem Sariyer, Ceren Öcal Taşar and Gizem Ersoy Cepe


Emergency departments (EDs) are the largest departments of hospitals, and they encounter a high variety of cases as well as high patient volumes. Thus, efficiently classifying these patients at the time of registration is very important for operations planning and management. Using secondary data from the ED of an urban hospital, we examine the significance of factors in classifying patients according to their length of stay. Random Forest, Classification and Regression Tree, Logistic Regression (LR), and Multilayer Perceptron (MLP) algorithms were applied to the data set of July 2016 and tested on the data set of August 2016. In addition to applying and testing the algorithms on the whole data set, patients were grouped into 21 subgroups based on similarities in their diagnoses, and the algorithms were also run on these subgroups. The performance of the classifiers was evaluated in terms of sensitivity, specificity, and accuracy. The sensitivity, specificity, and accuracy values of the classifiers were similar, with LR and MLP achieving somewhat higher values. In addition, for each classifier, the average performance when classifying patients within the subgroups exceeded the performance when classifying on the whole data set.
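The three evaluation measures named in the abstract can be illustrated with a minimal sketch. The abstract does not state how the length-of-stay classes were coded, so the binary coding below (1 for a long stay, 0 for a short stay), the function name `classification_metrics`, and the example labels are all assumptions made for illustration, not the authors' implementation or data.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute sensitivity, specificity, and accuracy for binary labels.

    Assumption (not from the paper): 1 marks the positive class
    (e.g. a long stay) and 0 marks the negative class (a short stay).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    nan = float("nan")
    total = tp + tn + fp + fn
    return {
        # Share of actual positives that were predicted positive.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Share of actual negatives that were predicted negative.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        # Share of all cases that were predicted correctly.
        "accuracy": (tp + tn) / total if total else nan,
    }


# Hypothetical test-month labels and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Under this coding, the same function could be applied once to predictions on the whole test set and once per diagnosis subgroup, with the subgroup values averaged to compare the two settings described above.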

  1. Ethical Approval: The conducted research is not related to either human or animal use.

  2. Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  3. Research funding: None declared.

  4. Employment or leadership: None declared.

  5. Honorarium: None declared.

  6. Competing interests: The funding organization(s) played no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the report for publication.

  7. Conflict of interests: The authors declare no conflict of interest.



Received: 2018-12-11
Accepted: 2019-02-07
Published Online: 2019-03-13

©2019 Walter de Gruyter GmbH, Berlin/Boston
