Surgical data science has recently emerged as a new interdisciplinary research field between medicine, computer science, and engineering. Some of the major components of this field, referred to as surgical workflow analysis or surgical process modeling, have been the focus of research conducted by several international groups for more than a decade now. Here, we provide an overview of the research done in this field by our groups and of some target applications we are currently working on. The focus is on the integration of mechatronic platforms into the overall workflow as an integral part of the cooperative surgical environment.
The term “surgical data science” covers many goals and applications based on the definition developed at a joint workshop, including intraoperative decision support and surgeon training in a very general sense. Our focus in this field, however, lies in the monitoring, analysis, and recognition of the surgical workflow to provide context-aware assistance by various support systems, including mechatronic platforms such as surgical robots.
Various groups are working on developing and formalizing surgery models automatically from individual, recorded procedures. Many try to detect surgical gestures, activities, or phases directly from available data without using models. A review of this topic is available from Lalys and Jannin.
In this article, we first briefly present the different techniques proposed by our own groups and a short summary of their respective experimental results and possible applications. Based on these, we describe in detail an additional new application field of workflow analysis and prediction in the surgical operating room (OR). Beyond automatic situation understanding, objective performance evaluation, and context-aware intraoperative visualization, the integration of robotic assistance systems is delineated. Finally, we discuss the possibilities of a fully workflow-enabled, context-aware OR of the future.
Workflow detection techniques
As one of the first groups to do so, we avoided predefined surgical models and instead aimed at modeling and monitoring the surgical workflow based only on acquired sensor data. Several methods have been applied to this end, from dynamic programming more than a decade ago to advanced machine learning in recent years, as explained in the following sections.
Dynamic time warping (DTW)
DTW can be used to synchronize two separate but similar time series and was originally developed for speech recognition. To compare different laparoscopic cholecystectomies, we recorded simple binary usage data of surgical tools. For every time step (in this case with a resolution of 1 min), the status of 17 different laparoscopic tools was stored in a multidimensional instrument vector as 1 if the tool was in use and as 0 otherwise (see Figures 1 and 2). The resulting signals included some with a very low rate of change, such as the one representing the presence of trocar ports placed at the beginning of the surgery and removed only at the end, as well as frequently changing ones, such as the status of the monopolar cutting or coagulation current.
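The pairwise alignment of two such recordings can be illustrated with a minimal sketch. It assumes a Hamming distance between binary instrument vectors and the classic three-step DTW recursion; the function names and the two-tool toy recordings are hypothetical and not taken from the cited work.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[int]  # one binary entry per tool: 1 = in use, 0 = not in use


def hamming(u: Vector, v: Vector) -> int:
    """Number of tools whose usage status differs between two time steps."""
    return sum(a != b for a, b in zip(u, v))


def dtw(a: List[Vector], b: List[Vector]) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the total alignment cost and the warp path between a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i steps of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = hamming(a[i - 1], b[j - 1]) + step
    # Backtrack from the end to recover which time steps were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


# Toy recordings with two tools; the second surgery is a slower version of the first.
reference = [(1, 0), (1, 0), (1, 1), (0, 1)]
slower = [(1, 0), (1, 0), (1, 0), (1, 1), (1, 1), (0, 1)]
total_cost, path = dtw(reference, slower)
```

In this toy case the slower recording differs only in timing, so the alignment cost is zero and the warp path pairs each reference step with one or more steps of the slower recording.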
These recorded surgeries can now be synchronized pairwise to each other using DTW. By choosing a single surgery arbitrarily as a temporary reference, we can synchronize all other surgeries to the reference recording, exploiting the transitivity of DTW. The obtained warp paths can be interpolated and averaged to create a warp path between the reference and a virtual, average surgery. By first applying the inverse of this common warp path to the timing of the synchronized surgeries, each recorded surgery can be warped to match the timing of the virtual surgery. Then, the average surgical model can be calculated by averaging the warped instrument vectors. Through the averaging, the binary usage data of each tool become an approximation of the probability that the tool is in use at that point in time during the surgery.
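The final averaging step can be sketched as follows. This is only an illustration that assumes the surgeries have already been warped to a common timeline of equal length; the function name and the toy data are hypothetical.

```python
from typing import List, Sequence


def average_surgery(warped: List[List[Sequence[int]]]) -> List[List[float]]:
    """Average binary instrument vectors of surgeries warped to one timeline.

    warped[s][t][k] is 1 if tool k is in use at time step t of surgery s.
    All surgeries are assumed to have the same length after warping.
    """
    n_surgeries = len(warped)
    n_steps = len(warped[0])
    n_tools = len(warped[0][0])
    return [
        [sum(s[t][k] for s in warped) / n_surgeries for k in range(n_tools)]
        for t in range(n_steps)
    ]


# Two warped toy surgeries with two tools and three time steps each.
warped = [
    [(1, 0), (1, 1), (0, 1)],
    [(1, 0), (1, 0), (0, 1)],
]
model = average_surgery(warped)
```

Each entry of `model` is the fraction of surgeries in which a tool was in use at that time step, which is the usage probability estimate described above.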
Labeling the average surgery provides a baseline for workflow detection, much like an anatomical atlas does for organ and structure segmentation. Newly recorded surgeries can be mapped to the average surgery, after which the labels of the average surgery can be applied to the corresponding times of the tested surgery. When comparing the mapped phase boundaries to known ground-truth annotations, 92% of all events were correctly identified with a tolerance of 5 s and 83% even with a tolerance of 1 s.
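The atlas-style label transfer can be sketched under the assumption that a DTW warp path between the average surgery and the new recording is already available. The phase names, the warp path, and the function name below are hypothetical examples.

```python
from typing import List, Tuple


def transfer_labels(
    path: List[Tuple[int, int]], atlas_labels: List[str], n_new: int
) -> List[str]:
    """Map phase labels from the average surgery onto a new recording.

    path holds (atlas_index, new_index) pairs from a DTW alignment.
    If several atlas steps map to one new step, the last one wins.
    """
    labels = [""] * n_new
    for atlas_idx, new_idx in path:
        labels[new_idx] = atlas_labels[atlas_idx]
    return labels


# Hypothetical labels of the average surgery and a warp path to a longer recording.
atlas_labels = ["preparation", "preparation", "dissection", "closure"]
path = [(0, 0), (1, 1), (1, 2), (2, 3), (2, 4), (3, 5)]
new_labels = transfer_labels(path, atlas_labels, n_new=6)
```

Phase boundaries in the new recording then follow directly from the positions where consecutive entries of `new_labels` differ.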
Hidden Markov models (HMM)
HMMs also originated from the field of speech recognition and can automatically model sequences of observations as a traversal of a graph of hidden states. For the application in surgical workflow recognition, the observations in each time step correspond to the recorded instrument vector. Often, the model is trained on the premise of using a single state per expected phase, so that a one-to-one relationship between hidden states and phases can be established, although other approaches exist. HMMs additionally have the advantage over DTW that they can work on partial sequences, so HMM-based methods have the potential to be applied to a running surgery in real time.
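The ability to work on partial sequences can be illustrated with the standard forward (filtering) update of an HMM, which revises the phase estimate after every new observation. The sketch assumes one hidden state per phase and instrument vectors that have been quantized into discrete observation symbols; all probabilities are made-up illustration values.

```python
from typing import List


def forward_step(
    belief: List[float], obs: int, trans: List[List[float]], emit: List[List[float]]
) -> List[float]:
    """One online update of the phase belief after a new observation."""
    n = len(belief)
    # Predict: propagate the current belief through the transition model.
    predicted = [sum(belief[i] * trans[i][j] for i in range(n)) for j in range(n)]
    # Correct: weight each phase by how well it explains the observation.
    weighted = [predicted[j] * emit[j][obs] for j in range(n)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Two phases with left-to-right transitions and two observation symbols.
trans = [[0.9, 0.1], [0.0, 1.0]]
emit = [[0.8, 0.2], [0.2, 0.8]]
belief = [1.0, 0.0]  # the surgery is assumed to start in the first phase
decoded = []
for obs in [0, 0, 1, 1]:
    belief = forward_step(belief, obs, trans, emit)
    decoded.append(belief.index(max(belief)))
```

Because each update only needs the previous belief and the newest observation, the estimate is available while the surgery is still running. In this toy run a single deviating observation is not yet enough to switch the estimated phase.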
In the work by Padoy et al., a separate HMM was trained for each surgical phase, so that the correct recognition of the overall phases is not negatively influenced by the variability of the performed actions in each phase. These individual models were built with a number of hidden states correlated to the square root of the corresponding mean phase duration. The results of this experiment were also compared to HMMs built with only one or two states per phase. The overall detection accuracy over all phases was 92.4% for the dynamic model, whereas the static models with one or two states per phase reached 84.2% and 87.4%, respectively.
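One way to read the square-root rule is sketched below. The text only states that the number of states is correlated to the square root of the mean phase duration, so the direct proportionality, the `scale` factor, and the phase durations used here are assumptions for illustration.

```python
import math


def states_for_phase(mean_duration_s: float, scale: float = 1.0) -> int:
    """Number of hidden states for one phase, growing with sqrt(duration).

    scale is a hypothetical tuning factor; at least one state is always used.
    """
    return max(1, round(scale * math.sqrt(mean_duration_s)))


# Hypothetical mean phase durations in seconds.
mean_durations = {"preparation": 100.0, "dissection": 900.0}
n_states = {
    phase: states_for_phase(duration, scale=0.5)
    for phase, duration in mean_durations.items()
}
```

With such a rule, long phases receive more states to capture their internal variability, while the total model size grows much more slowly than the recording length.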
Another approach by Blum et al. uses adaptive model merging to reduce the number of states from a single state per sample down to only a single state for the entire surgery. By merging the recordings of multiple different surgeries of the same type, this method can be used to visualize and analyze all variations and bottlenecks encountered in this surgery type. Using a specialized UI, it is possible to freely navigate the model by merging a state with the calculated best merging candidate via clicking or by splitting a merged state into the corresponding source states again. Thereby, a user can focus on specific parts of the surgery while accessing as much detail about phases, steps, and gestures as necessary.
Canonical correlation analysis (CCA)
In contrast to instrument usage data, the laparoscopic video is by definition always available during minimally invasive surgeries and usually very easy to obtain. This has been exploited in earlier work, where many image features were extracted from the laparoscopic video. This high-dimensional feature space was mapped to a common, low-dimensional feature space together with manually annotated instrument usage data. The obtained mapping is then used to reduce the number of image features to fewer, semantically more meaningful features, which in turn serve as features or observations in DTW or HMM, respectively. The best result of 76.8% accuracy was achieved using DTW on the transformed features.
Random forests
In an approach to detect instrument usage automatically during a surgery (instead of relying on manual annotations), our clinical partners attached radiofrequency identification (RFID) tags to each instrument and an appropriate antenna to the instrument table. The data obtained in this way, together with data from other sensors detecting intraoperative light status, HF modes, table inclination, intra-abdominal pressure, and the weight of the irrigation and suction bags, formed the basis for subsequent work on phase detection. To robustly handle the heavy noise in the recorded data, the machine learning technique of random forests was used. A forest of 50 randomized decision trees with a maximal node depth of 4 was trained to classify each sample into one of the seven possible surgical phases. An average accuracy of 68.8% was achieved directly on the unfiltered sensor data.
A later improvement of the method used the classification output of the random forest as observations for a subsequent HMM. The HMM was trained to represent each surgical phase with a single hidden state. This sequential usage combines the mostly reliable classification of the random forest with the modeling structure of the HMM to produce a relatively stable and smooth classification output. This achieved an accuracy of 80.8% on the unfiltered sensor data and 82.4% when additionally preprocessing the sensor data with noise-reducing filters.
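How an HMM can smooth noisy per-sample classifier output may be sketched with Viterbi decoding. This is not the cited implementation: the per-sample phase labels stand in for random forest predictions, the HMM has one hidden state per phase, and all probabilities are hypothetical values rather than trained parameters.

```python
import math
from typing import List


def viterbi(
    obs: List[int],
    start: List[float],
    trans: List[List[float]],
    emit: List[List[float]],
) -> List[int]:
    """Most likely hidden phase sequence for a list of observed class labels."""
    n = len(start)

    def log(p: float) -> float:
        return math.log(p) if p > 0 else float("-inf")

    score = [log(start[s]) + log(emit[s][obs[0]]) for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + log(trans[r][s]))
            pointers.append(best)
            score.append(prev[best] + log(trans[best][s]) + log(emit[s][o]))
        back.append(pointers)
    # Follow the stored pointers backwards from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Three phases in fixed order; the classifier is assumed correct 80% of the time.
start = [1.0, 0.0, 0.0]
trans = [[0.9, 0.1, 0.0], [0.0, 0.9, 0.1], [0.0, 0.0, 1.0]]
emit = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
noisy = [0, 0, 2, 0, 1, 1, 0, 1, 2, 2]  # stand-in for per-sample classifier output
smoothed = viterbi(noisy, start, trans, emit)
```

In this toy example the two isolated misclassifications are overruled by the transition structure, and the decoded sequence passes through the phases in order.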
Convolutional neural networks (CNN)
Due to the increasing availability of inexpensive high-performance hardware, so-called deep neural networks have been widely adopted. CNNs in particular have become a powerful tool for image understanding, as they can be trained without manual parameter adjustment on very large data sets to obtain highly robust results. Recent work has so far focused on detecting straightforward surgical events, such as the presence of blood or smoke (see Figure 3), on detecting and locating surgical tools or anatomical structures in the surgical video, or on more abstract workflow information.
Future surgical applications
With our background in recognizing and modeling the surgical workflow from simple and easily obtainable sensor data, our vision for surgical data processing in the near future consists of the following four major aspects.
Automatic understanding of the surgical situation
Surgical workflow recognition provides the basis for a broader recognition of the surgical context, including the detection of anomalies and emergencies. This constitutes the required infrastructure for any context-sensitive system in the OR of the future. Immediate applications of scene understanding are also possible, for example, in the form of the automatic generation of surgical reports or report templates, in which relevant key events are mentioned and possibly weighted based on the duration spent on related activities.
Objective evaluation and comparison
A long-term goal of surgical data science is also to be able to create completely novel surgical processes enabled by advances in medical devices, imaging, and robotics. Such new processes are challenging to compare manually to existing procedures in a short time frame, so support by workflow analysis is strongly required. The same methods can often also be applied to evaluate the dexterity of young surgeons during their training and help with personalized training support.
An early approach was proposed in previous work, where several laparoscopic surgeries were recorded, synchronized through DTW (see “Dynamic time warping” section), and their laparoscopic videos played back in parallel (see Figure 4). This method provided a strong distinction between young and expert surgeons. The activities in which the young surgeons had little to no practical experience could be clearly identified in comparison to the experts, whose synchronized videos were frozen during that time, as they completed the task in a significantly shorter time.
Context-aware intraoperative visualization and control
The goal of many modern integrated OR suites is to provide all needed information to the surgeon. They often accomplish this goal by displaying as much data as possible on large wall- or ceiling-mounted monitors. These setups offload the cognitive task of filtering the data for useful information completely to the surgeon. They also tend to be rather large, which prevents them from being close to the patient and surgeons, so it is impossible to use them interactively. More practical, small, single-display solutions have to filter the displayed information. This filtering must happen automatically in order not to burden surgeons with additional user interactions, which requires knowledge of the surgical context. Thus, the detection and prediction of the workflow is imperative for the next generation of intraoperative unified user interfaces (UI).
A prototypical implementation of such a UI has been presented in earlier work. A tablet PC was wrapped in sterile foil and placed next to the situs within direct reach of the surgeon. Throughout a simulated surgery, the display switched between the available views based on the detected phase, showing various sources such as preoperative planning data or intraoperative imaging data. Because the tablet was immediately accessible, it was also possible to provide interactive touch elements to the surgeon, such as buttons or sliders, to trigger events or adjust device parameters.
Medical augmented reality (AR) provides a more direct way of visualizing supportive data, although it is not yet common due to its high technical complexity and hardware requirements. The goal of medical AR is to display medical imaging data directly in the surgeon’s field of view either through AR displays such as head-mounted displays or by augmenting common displays with matched overlays. Challenges of this technology are not only the required tracking and alignment precision but also a convincing visualization to avoid problems with depth perception. The advantages of this technology are a more intuitive presentation of medical data and easier correlation between the data and the patient. Additionally, the visualization can be enhanced through simulated tools such as virtual windows or virtual mirrors. Prior knowledge of the surgical workflow is also required for such applications to choose an appropriate visualization style highlighting the most relevant information at each moment of the surgical procedure.
Workflow-based control of robotic assistance systems
Robot-assisted surgery (RAS) is gaining popularity with the increased availability and safety of surgical robots (Figure 5). Currently, all robotic systems in surgery have a strict master-slave relationship, so their movement is controlled exclusively and directly by a surgeon-controlled input device. Although fully autonomous robots are still far away, it is now at least possible to support the controlling surgeon through gesture completion or responsive controls.
It would be sensible for many reasons to break up the encapsulated systems and to make them communicate with the comprehensive surgical environment. In a master-slave system, a wealth of information is generated while it is being used. The range and quality of this information vary with the type of robot, but all of it is highly valuable for the purpose of workflow analysis (Table 1). With the vast amount of additionally available real-time data, a robotic system can act as a “super sensor”, which could be particularly helpful for the machine learning approaches mentioned in the “Workflow detection techniques” section. Whereas some of these data can also be acquired by dedicated sensors (e.g. the type of instruments in use), many others can be provided exclusively by the mechatronic support system.
Even more interesting than the role as an additional source of data is to use the robot as an active tool at the service of the workflow analysis and prediction process as shown in Figure 6.
The aim is clear: as soon as the assistance system has become cognitive and context-aware and is able to foresee the next required actions, it should be capable of reducing the surgeon’s workload by taking over some of the tasks autonomously. From the surgical point of view, this would be highly attractive, offering an opportunity to relieve the surgeon of tedious tasks such as camera readjustment or knot tying. On the other hand, autonomous actions could become extremely hazardous if the machine decides to perform the wrong action, which is far worse than just remaining passive. Due to the specific conditions, the difficulties in “automated surgery” are orders of magnitude greater than those in autonomous car driving, and there is a very low probability of overcoming them in the near future.
Nonetheless, it appears conceivable that at least some segments of the surgical workflow are suitable for automation (Table 2). These steps or tasks should not be too sensitive to misinterpretation or erroneous cognitive decisions but should rather depend on mechanical and precise manual tasks. The last given example of anastomotic closure is certainly the most elaborate option, as it requires the highest quality in image understanding and force sensing. After the initial steps by the surgeon, however, this task does not involve further decision-making, in contrast to other aspects of surgery, such as the positions and margins required for cutting around a tumor.
Suturing and knot tying are well-defined sequences of rather uniform instrument motions that require considerable manual skill and are tedious and time consuming. A robotic support system should be ideally suited to carry out this procedure, with the surgeon just giving “start” and “stop” commands. In a similar way, the surgeon can define a larger tissue area where cancerous material is expected. A robotic assistance system can then automatically take biopsy samples in a regular grid pattern within that area. For the first time, this makes such meticulous and precise sampling possible.
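The regular grid sampling within a surgeon-defined area can be sketched in two dimensions. The example assumes that the area is given as a polygon in a planar coordinate system in millimeters and uses the standard ray-casting test for point-in-polygon checks; the coordinates, the spacing, and the function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def inside(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: is the point inside the polygon?"""
    x, y = point
    result = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        if (y1 > y) != (y2 > y):  # the edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                result = not result
    return result


def biopsy_grid(polygon: List[Point], spacing: float) -> List[Point]:
    """Regularly spaced sample targets that lie inside the polygon."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    nx = int((max(xs) - min(xs)) / spacing)
    ny = int((max(ys) - min(ys)) / spacing)
    targets = []
    for j in range(ny + 1):
        for i in range(nx + 1):
            # Offset by half a spacing so that no sample sits on the bounding box.
            p = (min(xs) + (i + 0.5) * spacing, min(ys) + (j + 0.5) * spacing)
            if inside(p, polygon):
                targets.append(p)
    return targets


# Hypothetical triangular target area (mm) sampled every 10 mm.
area = [(0.0, 0.0), (42.0, 0.0), (0.0, 42.0)]
targets = biopsy_grid(area, spacing=10.0)
```

A real system would additionally have to project these planar targets onto the tissue surface and plan a safe approach for each one, which is outside the scope of this sketch.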
If a mechatronic support system provides stereoscopic imaging (as many of them do), continuous real-time distance measurements are a feasible technical challenge for the near future. This could then be used to define virtual working spaces to avoid unwanted collision with sensitive tissue, other instruments, or other artificial conditions of the environment.
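A virtual working space check based on distance measurements can be sketched as follows. The sketch assumes that the stereoscopic system provides reconstructed 3D points of the sensitive tissue and the tracked instrument tip position in the same coordinate frame in millimeters; the point values, the margin, and the function names are hypothetical.

```python
import math
from typing import List, Tuple

Point3 = Tuple[float, float, float]


def min_distance(tip: Point3, surface: List[Point3]) -> float:
    """Smallest Euclidean distance between the tip and any surface point."""
    return min(math.dist(tip, p) for p in surface)


def violates_safety_margin(tip: Point3, surface: List[Point3], margin_mm: float) -> bool:
    """True if the instrument tip is closer to the tissue than allowed."""
    return min_distance(tip, surface) < margin_mm


# Hypothetical reconstructed points of a sensitive structure (mm).
surface = [(0.0, 0.0, 50.0), (10.0, 0.0, 50.0), (0.0, 10.0, 50.0)]
safe = violates_safety_margin((0.0, 0.0, 40.0), surface, margin_mm=5.0)
too_close = violates_safety_margin((0.0, 0.0, 47.0), surface, margin_mm=5.0)
```

The brute-force search over all surface points is only meant to show the principle; a real-time system would need a spatial index and would have to account for measurement uncertainty.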
A further aspect of surgery that may be performed by robotic systems in the future is the preparation and delivery of tools to the surgeon. This task still has many unsolved challenges, such as the correct identification of tools and safe handover procedures between robot and humans. Knowledge of the current workflow phase can also be used here to predict and prepare the most likely next tool, which can greatly aid the recognition of spoken requests and reduce prohibitive delays. A subtle understanding of what is actually going on could even prepare the system for immediate reaction in case of an emergency, that is, conversion to open surgery.
In the more distant future, robots can also support surgeons by operating other intraoperative imaging sensors, such as ultrasound (US) or SPECT imaging. A lightweight robot with force-sensing joints is capable of moving a US probe with perfectly constant speed and pressure over the patient’s skin, whereas a robot with gamma cameras is able to scan from the locations that provide the highest gain in tomographic reconstruction based on calculated areas of uncertainty. This by itself already increases the repeatability and reliability of the scans and can replace other, usually rather bulky equipment in the OR. More important, however, is the potential of supportive cooperation, as surgeons cannot operate multiple robotic systems at once, especially while performing their own surgical actions. Esposito et al. show that the robot and surgeon can collaborate to perform multimodal US- and gamma-guided needle biopsy. Workflow-aware control and assistance between the different systems and the surgeon is mandatory for this complex level of support.
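The phase-based prediction of the next tool can be illustrated with a simple frequency model. This is only a sketch: it assumes a log of past (phase, requested tool) pairs is available, and the phase names, tool names, and function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def learn_tool_requests(log: List[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count how often each tool was requested in each phase."""
    counts: Dict[str, Counter] = {}
    for phase, tool in log:
        counts.setdefault(phase, Counter())[tool] += 1
    return counts


def likely_tools(counts: Dict[str, Counter], phase: str, top: int = 2) -> List[str]:
    """Tools most often requested in the given phase, most frequent first."""
    return [tool for tool, _ in counts.get(phase, Counter()).most_common(top)]


# Hypothetical request log from earlier surgeries.
log = [
    ("dissection", "grasper"),
    ("dissection", "hook"),
    ("dissection", "hook"),
    ("dissection", "hook"),
    ("dissection", "grasper"),
    ("dissection", "scissors"),
    ("clipping", "clip applier"),
]
counts = learn_tool_requests(log)
candidates = likely_tools(counts, "dissection")
```

Such a ranked candidate list could be used both to pre-stage tools and to narrow down the vocabulary when interpreting a spoken request.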
The amount of information available to the surgeon during an intervention has always been growing, and this trend is most likely going to continue in the foreseeable future. However, to prevent disruptive cognitive overload on surgeons in the coming years, the presentation and especially the utilization of relevant data must be improved.
Recently, robotic support systems have also been widely adopted in many hospitals. In the coming decade, the presence of robotic systems is expected to increase, as more procedures adopt this technology and novel, truly hybrid surgical procedures can be developed based on robotic support. Such systems cannot be deployed efficiently without complete digital and automatic monitoring of the surgical workflow, which provides the robotic system with additional intelligence in its actions and its interaction with the surgical crew. Surgical robots have a special relevance for surgical data processing, as they both benefit from workflow knowledge and, owing to their accurate kinematic tracking, provide additional sensor data that improve the detection and monitoring of the surgical workflow.
In summary, the management of intraoperative data flow is mandatory in modern ORs, with surgical data processing and reliable workflow analysis leading to a prospective judgment on the further workflow. This would enable robots to become cognitive, cooperative assistants with a major positive impact on the development of surgery. Yet, further research and development on workflow analysis and surgical data processing, such as the work presented in this paper, is absolutely necessary to achieve this goal.
References
Maier-Hein L, Vedula S, Speidel S, et al. Surgical data science: enabling next-generation surgery. arXiv:1701.06482. 2017.
Liebmann P, Neumuth T. Model driven design of workflow schemata for the operating room of the future. Inform 2010 Serv Sci-Neue Perspekt Inform 2010;175:415–9.
Neumann J, Rockstroh M, Franke S, Neumuth T. BPMN SIX – a BPMN 2.0 surgical intervention extension concept and design of a BPMN extension for intraoperative workflow modeling and execution in the integrated operating room. In: 7th Workshop on Modeling and Monitoring of Computer Assisted Interventions (M2CAI), Athens, Greece; 2016.
Haro BB, Zappella L, Vidal R. Surgical gesture classification from video data. In: International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI). Berlin/Heidelberg: Springer; 2012:34–41.
Despinoy F, Bouget D, Forestier G, et al. Unsupervised trajectory segmentation for surgical gesture recognition in robotic training. IEEE Trans Biomed Eng 2016;63:1280–91.
Bouarfa L. Recognizing Surgical Patterns. PhD diss., TU Delft: Delft University of Technology; 2012.
Lalys F, Riffaud L, Morandi X, Jannin P. Surgical phases detection from microscope videos by combining SVM and HMM. In: Menze B, Langs G, Tu Z, Criminisi A, editors. Medical Computer Vision Recognition Techniques and Applications in Medical Imaging (Lecture Notes in Computer Science; vol. 6533). Berlin/Heidelberg: Springer Berlin/Heidelberg; 2011:54–62.
Padoy N, Blum T, Feußner H, Berger M-O, Navab N. On-line recognition of surgical activity for monitoring in the operating room. In: Proceedings of the 20th Conference on Innovative Applications of Artificial Intelligence, Chicago, IL, USA, 2008:1718–24.
Kranzfelder M, Schneider A, Fiolka A, et al. Real-time instrument detection in minimally invasive surgery using radiofrequency identification technology. J Surg Res 2013;185:1–7.
Stauder R, Okur A, Peter L, et al. Random forests for phase detection in surgical workflow analysis. In: 5th International Conference on Information Processing in Computer-Assisted Interventions (IPCAI), Fukuoka, Japan; 2014.
Stauder R, Kayis E, Navab N. Learning-based surgical workflow detection from intra-operative signals. arXiv:1706.00587 [cs.LG]. 2017.
Twinanda AP, Shehata S, Mutter D, Marescaux J, de Mathelin M, Padoy N. EndoNet: a deep architecture for recognition tasks on laparoscopic videos. IEEE Trans Med Imaging 2017;36:86–97.
Sielhorst T, Stauder R, Horn M, et al. Simultaneous replay of automatically synchronized videos of surgeries for feedback and visual assessment. Int J Comput Assist Radiol Surg 2007;2:433–4.
Stauder R, Belagiannis V, Schwarz LA, et al. A user-centered and workflow-aware unified display for the operating room. In: MICCAI Workshop on Modeling and Monitoring of Computer Assisted Interventions (M2CAI), Nice, France; 2012.
Navab N, Traub J, Sielhorst T, Feuerstein M, Bichlmeier C. Action- and workflow-driven augmented reality for computer-aided medical procedures. IEEE Comput Graph Appl 2007;27:10–4.
Padoy N, Hager GD. Human-machine collaborative surgery using learned models. In: Robotics and Automation (ICRA), IEEE International Conference on, Shanghai, China; 2011:5285–92.
Stauder R, Okur A, Navab N. Detecting and analyzing the surgical workflow to aid human and robotic scrub nurses. In: 7th Hamlyn Symposium on Medical Robotics. London; 2014.
Hennersperger C, Fuerst B, Virga S, Zettinig O, Frisch B, Neff T. Towards MRI-based autonomous robotic US acquisitions: a first feasibility study. IEEE Trans Med Imaging 2016;62:1–11.
Virga S, Zettinig O, Esposito M, et al. Automatic force-compliant robotic ultrasound screening of abdominal aortic aneurysms. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, South Korea; 2016.
Lasser T, Gardiazabal J, Wieczorek M, et al. Towards 3D thyroid imaging using robotic mini gamma cameras. In: Handels H, Deserno Th.M, Meinzer H-P, Tolxdorff Th., editors. Bildverarbeitung für die Medizin. Berlin Heidelberg: Springer-Verlag; 2015:498–503.
Esposito M, Busam B, Hennersperger C, Rackerseder J, Navab N, Frisch B. Multimodal US-gamma imaging using collaborative robotics for cancer staging biopsies. Int J Comput Assist Radiol Surg 2016;11:1561–71.
The article (https://doi.org/10.1515/iss-2017-0035) offers reviewer assessments as supplementary material.
About the article
Published Online: 2017-09-09
Research funding: Partly funded by DFG grants FE 585/6-2 and NA 620/33-2 and BFS grant AZ-1093-13. Conflict of interest: Authors state no conflict of interest. Informed consent: Informed consent is not applicable. Ethical approval: The conducted research is not related to either human or animal use.
Ralf Stauder: Conceptualization; Funding acquisition; Methodology; Software; Writing – original draft. Daniel Ostler: Data curation; Resources. Thomas Vogel: Data curation; Investigation; Validation. Dirk Wilhelm: Data curation; Investigation; Validation. Sebastian Koller: Conceptualization; Funding acquisition; Project administration; Visualization; Writing – review and editing. Michael Kranzfelder: Formal analysis; Investigation; Resources. Nassir Navab: Funding acquisition; Supervision; Writing – review and editing.
The German Society of Surgery funded the article processing charges of this article.
Citation Information: Innovative Surgical Sciences, Volume 2, Issue 3, Pages 145–152, ISSN (Online) 2364-7485, DOI: https://doi.org/10.1515/iss-2017-0035.
©2017 Stauder R. et al., published by De Gruyter, Berlin/Boston. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License (CC BY-NC-ND 4.0).