Current Directions in Biomedical Engineering

Respiratory motion tracking using Microsoft’s Kinect v2 camera

Floris Ernst / Philipp Saß
Published Online: 2015-09-12 | DOI: https://doi.org/10.1515/cdbme-2015-0048

Abstract

In image-guided radiotherapy, monitoring and compensating for respiratory motion is of high importance. We have analysed the possibility of using Microsoft’s Kinect v2 sensor as a low-cost tracking camera. In our experiment, eleven circular markers were printed onto a Lycra shirt and tracked in the camera’s color image using cross correlation-based template matching. The 3D position of each marker was determined from this information and the mean distance of all template pixels from the sensor. In an experiment with four volunteers (male and female) we demonstrated that real-time position tracking is possible in 3D. By averaging over the depth values inside the template, it was possible to increase the Kinect’s depth resolution from 1 mm to 0.1 mm. The noise level was reduced to a standard deviation of 0.4 mm. Temperature sensitivity of the measured depth values was observed for about 10–15 minutes after system start.

Keywords: radiotherapy; motion compensation; respiratory tracking; template matching

1 Introduction

In many clinical applications, detecting and tracking respiratory motion is required. As an example, image-guided radiotherapy (IGRT) of the chest and abdomen relies heavily on this principle: some kind of marker is placed on or attached to the patient’s chest and is monitored using a non-invasive localisation device. The trajectory of the marker is then analysed and used to either dynamically activate the treatment beam (called gating [3]) or to guide the radiation source [6]. Especially in the second scenario, tracking one marker may not be sufficient: the actual target of the treatment beam – the tumour – is typically not observed directly. Although this could be done (either using continuous X-ray localisation [7] or 3D ultrasound tracking [1]), the current method in clinical use relies on a mathematical model linking the motion on the patient’s chest to the motion of the actual target.

It has been shown that the accuracy of these correlation algorithms can be improved by incorporating multiple markers [2]. In this work, we demonstrate how consumer hardware (Microsoft’s Kinect v2 depth sensor) can be used to accurately track the 3D position of multiple markers using a special marker shirt.

2 Methods and materials

To acquire respiratory motion traces, a special marker shirt was developed. Eleven marker templates were printed onto a Lycra shirt, which ensures a tight fit on the volunteers; the positions of the markers correspond to areas relevant for the measurement. Each marker consists of a black circle surrounded by a black ring. Details and numbering of the markers are shown in Figure 1.

Figure 1: Marker shirt for motion tracking.

Tracking the position of the markers is done using Microsoft’s Kinect v2 camera (see Figure 2) and the corresponding software development kit (SDK) [5]. The camera is able to simultaneously capture three different types of images at a frame rate of up to 30 Hz: a color image (1920 × 1080 pixels), an infrared-illuminated grayscale image (512 × 424 pixels), and a depth image (512 × 424 pixels, depth resolution of 1 mm). Typical images are shown in Figure 3. Details about the technology behind the sensor are given in [4].

Figure 2: Kinect v2 sensor. Photograph courtesy of Microsoft Corp.

Figure 3: Typical frames acquired with the Kinect v2 sensor. (A): color image, (B): IR-illuminated scene, (C): depth image, (D): overlay of color and depth images.

Using these images and the known intrinsics and extrinsics of the color and IR cameras inside the Kinect sensor, it is possible to determine the 3D position of each pixel in the depth image. We have developed an application that allows selecting and tracking up to 15 markers in real time. The general process is as follows:

  1. During setup, the user is shown a camera image of the subject and is asked to select the initial position of the markers and the template to use for tracking.

  2. The position of the template within the given region of interest (ROI) is determined using template matching.

  3. The distance of the center point of the matched template is determined from the depth image.

  4. The matching ROI is re-centered around the position of the last match (see Figure 4).

To reduce the noise in the measured depth value and to increase the depth resolution, the z coordinate of the template was computed using the average depth of all pixels in the template (20 × 20 pixels per template).
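
The Kinect SDK provides coordinate-mapping functionality for converting depth pixels into 3D camera coordinates; the following Python sketch merely illustrates the principle, combining the depth averaging over the 20 × 20 template window with a generic pinhole back-projection. The function names and the intrinsic parameters in the example are illustrative assumptions, not the authors’ implementation or the Kinect’s calibration values.

```python
import numpy as np

def template_depth_mm(depth_frame, center_x, center_y, size=20):
    """Average the depth values inside the 20x20 template window.

    depth_frame: 2D array of depth values in millimetres (512x424 for the Kinect v2).
    Averaging over ~400 pixels reduces the 1 mm quantisation to sub-millimetre level.
    """
    half = size // 2
    window = depth_frame[center_y - half:center_y + half,
                         center_x - half:center_x + half]
    valid = window[window > 0]          # zero marks invalid depth readings
    return float(valid.mean()) if valid.size else float("nan")

def pixel_to_camera_space(u, v, z_mm, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth z_mm to a 3D point (pinhole model).

    fx, fy, cx, cy are the intrinsics of the depth/IR camera; the Kinect SDK
    exposes equivalent functionality through its coordinate mapper.
    """
    x = (u - cx) * z_mm / fx
    y = (v - cy) * z_mm / fy
    return np.array([x, y, z_mm])

# Example with placeholder intrinsics (assumed values, not calibration data):
# z = template_depth_mm(depth_frame, center_x=256, center_y=212)
# p = pixel_to_camera_space(256, 212, z, fx=365.0, fy=365.0, cx=256.0, cy=212.0)
```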

Template matching is done using cross correlation. It is implemented in C# using a wrapper library (EmguCV) around the OpenCV computer vision library.
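
As a rough illustration of this matching step – the actual implementation is in C# with EmguCV – the following Python/OpenCV sketch locates a template inside an ROI using normalised cross correlation and returns the match center in full-image coordinates; function and variable names are illustrative.

```python
import cv2

def match_in_roi(frame_gray, template, roi_x, roi_y, roi_w, roi_h):
    """Locate the template inside a rectangular ROI via normalised cross correlation.

    frame_gray and template are grayscale images. Returns the center of the best
    match in full-image coordinates and the correlation score; the ROI can then
    be re-centered on this position for the next frame.
    """
    roi = frame_gray[roi_y:roi_y + roi_h, roi_x:roi_x + roi_w]
    result = cv2.matchTemplate(roi, template, cv2.TM_CCORR_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    th, tw = template.shape[:2]
    center_x = roi_x + max_loc[0] + tw // 2
    center_y = roi_y + max_loc[1] + th // 2
    return (center_x, center_y), max_val

# Usage sketch: ROI of twice the template size, centered on the last match (cx, cy):
# (cx, cy), score = match_in_roi(gray_frame, template, cx - tw, cy - th, 2 * tw, 2 * th)
```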

2.1 Volunteer study

The Kinect sensor was attached to an industrial robot (Adept Viper s850) to allow accurate and stable placement. The setup is shown schematically in Figure 5. In a small volunteer study (four participants, one female, three male), we evaluated the feasibility of marker tracking. Our volunteers were asked to lie down in a supine position and breathe normally for three to four minutes.

Figure 4: Process of region-of-interest-based template matching. (A) – template found inside ROI (black). (B) – template moved. (C) – template found in old ROI. (D) – old ROI (gray) and new ROI (black).

Figure 5: Schematic setup of the experiment.

2.2 Accuracy measurements

Finally, the stability and accuracy of the Kinect sensor were evaluated in another experiment. First, the robot shown in Figure 5 was programmed to follow a sinusoidal motion (thus similar to respiratory motion) along the z-axis while the distance to the patient couch was computed for each camera frame. Second, the distance to the patient couch was measured repeatedly for about twelve minutes to determine the amount of noise and possible drift.

3 Results

Using our multi-threaded implementation in C#, tracking eleven markers in the color camera image – using ROIs of twice the size of the marker template – was possible in real time on a MacBook Pro Retina (2.3 GHz Core i7, four cores, 16 GiB RAM, SSD). In general, the runtime of one template matching iteration was around 80 ms.
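
A minimal sketch of such a per-marker, multi-threaded matching loop is shown below (Python with a thread pool; the authors’ implementation is in C#, and the helper match_in_roi refers to the matching sketch given earlier).

```python
from concurrent.futures import ThreadPoolExecutor

def track_all_markers(frame_gray, markers, executor):
    """Run one template-matching iteration for all markers in parallel.

    markers is a list of dicts holding each marker's template and current ROI
    as (x, y, w, h); boundary clamping is omitted for brevity.
    """
    def track_one(m):
        (cx, cy), score = match_in_roi(frame_gray, m["template"], *m["roi"])
        th, tw = m["template"].shape[:2]
        m["roi"] = (cx - tw, cy - th, 2 * tw, 2 * th)   # re-center ROI on the match
        return (cx, cy), score

    return list(executor.map(track_one, markers))

# executor = ThreadPoolExecutor(max_workers=4)  # e.g. one worker per CPU core
```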

3.1 Volunteer study

Recording motion traces of the markers worked for all four volunteers (three male, one female), although markers one and three were difficult to track due to stretching of the fabric. Figure 6 shows the distances measured for all eleven templates. Note the large differences in amplitude between the individual markers.

Figure 6: Anterior/posterior motion traces of the markers for subject one (female).

The depth motion trace of a second volunteer (subject four) is shown in Figure 7. Note the much larger amplitude for markers 5–8 and 11 and the sudden motion around t = 95 s due to the volunteer sneezing. Additionally, the values from markers one and three (red and blue, respectively) show that tracking them is difficult due to deformation.

Figure 7: Anterior/posterior motion traces of the markers for subject four (male). Note the sudden peak around 95 s, which is due to the volunteer sneezing, and the low quality of markers one and three.

The in-image motion of the template was also evaluated. It is shown exemplarily for one marker (marker eight of subject one) in Figure 8. Here, it is clear that there is very little motion in the left/right direction, as would be expected. In the superior/inferior direction, however, some motion is present (one pixel corresponds to approximately 1–1.5 mm in our setup, depending on the exact distance from the sensor), albeit not as strong as in the anterior/posterior direction.
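
The conversion from pixels to millimetres follows directly from the pinhole model: the lateral size of one pixel is approximately the depth divided by the focal length in pixels. The sketch below assumes a focal length of roughly 365 px for the Kinect v2 depth/IR camera, a commonly reported value used here purely as an illustrative assumption.

```python
def mm_per_pixel(depth_mm, focal_length_px=365.0):
    """Approximate lateral size of one pixel at a given depth (pinhole model).

    With an assumed focal length of ~365 px, working distances of roughly
    400-550 mm give about 1.1-1.5 mm per pixel, matching the range above.
    """
    return depth_mm / focal_length_px

# Example: mm_per_pixel(500.0) -> approx. 1.37 mm per pixel
```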

Figure 8: Inferior/superior and left/right motion traces for marker eight of subject one.

3.2 Accuracy measurements

Using the same setup as described before, we determined the absolute accuracy of the depth measurements. The trajectory of the robot – overlaid with the measured distance to the template – is given in Figure 9. Clearly, the distance measured by the Kinect sensor deviates substantially from the true motion of the robot: the maximum deviation is 3.7 mm and the root mean square error (RMSE) is 2.0 mm at a working distance on the order of 50 cm.
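
For reference, the maximum deviation and RMSE can be computed from the measured trace and the commanded robot trajectory as sketched below (illustrative Python, not the authors’ evaluation code).

```python
import numpy as np

def accuracy_metrics(measured_mm, reference_mm):
    """Maximum absolute deviation and RMSE between measured and reference traces.

    Both inputs are 1D arrays sampled at the same time points, e.g. the distance
    estimated per camera frame vs. the commanded robot position.
    """
    error = np.asarray(measured_mm, dtype=float) - np.asarray(reference_mm, dtype=float)
    max_dev = float(np.max(np.abs(error)))
    rmse = float(np.sqrt(np.mean(error ** 2)))
    return max_dev, rmse
```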

Figure 9: Sinusoidal motion trace performed by the robot (red) and as measured from template matching and averaging the depth values (blue).

The results of the static measurement evaluation are shown in Figure 10. The measurement was taken directly after turning on the Kinect sensor, and a time-dependent drift is visible. We believe that this is due to the changing temperature of the sensor PCB. The depth value is determined – as outlined above – by averaging over all pixels in the template, resulting in sub-millimeter resolution. The noise level, however, is still considerable: we observe a standard deviation of 0.4 mm.
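
The noise level and the drift can be quantified, for example, as the standard deviation of the static depth trace and a moving average over it; the sketch below is illustrative, and the window length is an assumption.

```python
import numpy as np

def noise_and_drift(depth_trace_mm, window=30):
    """Standard deviation of a static depth trace and a running average to show drift.

    window is the number of samples in the moving-average filter
    (e.g. roughly one second of data at 30 Hz).
    """
    trace = np.asarray(depth_trace_mm, dtype=float)
    noise_std = float(trace.std())
    kernel = np.ones(window) / window
    running_avg = np.convolve(trace, kernel, mode="valid")
    return noise_std, running_avg
```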

Figure 10: Measurement noise from a static target, recorded over twelve minutes (blue), and running average (red).

4 Discussion

We have demonstrated that the Kinect v2 sensor’s data streams – color image and depth image – can be used to track multiple markers on the human chest in 3D and in real time using standard hardware. Additionally, by averaging the depth values inside the marker template, it is possible to substantially reduce the measurement noise, down to a standard deviation of 0.4 mm. However, we also observed that the depth values measured using the robotic setup and the sinusoidal motion pattern deviate strongly from the actual data: the motion amplitude of the sine was 20 mm, whereas the amplitude recovered from template matching was more than 25 mm – at least 25 % more. We believe that this is caused by multiple factors:

  1. Inaccurate alignment of the depth axis of the Kinect sensor with the robot’s z-axis and the template center.

  2. Errors in the sensor’s calibration (the Kinect sensor stores its intrinsics and extrinsics in firmware, and we did not perform an additional camera calibration).

As next steps, we plan to perform sub-pixel template matching to increase the resolution along the L/R and S/I axes and to further analyze the accuracy of the setup by tracking the marker with a dedicated tracking device (such as NDI’s Polaris Spectra system). The operating speed of the system (currently about 15 fps) could also be increased through more extensive code parallelization, so that every frame from the Kinect v2 is used. We need to make sure, however, that the light emitted by the Kinect v2 sensor does not interfere with the IR light used by the Spectra system; both operate in the near-infrared range around 850 to 860 nm.
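
One common way to obtain sub-pixel matches is to fit a parabola through the correlation peak and its neighbours; the sketch below illustrates this approach, which is not necessarily the method the authors will adopt.

```python
def subpixel_offset(corr, peak_x, peak_y):
    """Refine an integer correlation peak to sub-pixel precision.

    Fits a parabola through the peak and its horizontal/vertical neighbours and
    returns fractional offsets (dx, dy) in (-0.5, 0.5). corr is the 2D correlation
    surface, (peak_x, peak_y) the integer maximum, which must not lie on the border.
    """
    def parabolic(left, center, right):
        denom = left - 2.0 * center + right
        return 0.0 if denom == 0 else 0.5 * (left - right) / denom

    dx = parabolic(corr[peak_y, peak_x - 1], corr[peak_y, peak_x], corr[peak_y, peak_x + 1])
    dy = parabolic(corr[peak_y - 1, peak_x], corr[peak_y, peak_x], corr[peak_y + 1, peak_x])
    return dx, dy
```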

References

[1] O. Blanck, P. Jauer, F. Ernst, R. Bruder, and A. Schweikard. Pilot-Phantomtest zur ultraschall-geführten robotergestützten Radiochirurgie. In H. Treuer, editor, 44. Jahrestagung der DGMP, pages 122–123, Cologne, Germany, 2013. DGMP.

[2] R. Dürichen, M. A. F. Pimentel, L. Clifton, A. Schweikard, and D. A. Clifton. Multi-task Gaussian processes for multivariate physiological time-series analysis. IEEE Transactions on Biomedical Engineering, 62(1):314–322, 2014.

[3] J. Hanley, M. M. Debois, D. Mah, G. S. Mageras, A. Raben, K. Rosenzweig, B. Mychalczak, L. H. Schwartz, P. J. Gloeggler, W. Lutz, C. C. Ling, S. A. Leibel, Z. Fuks, and G. J. Kutcher. Deep inspiration breath-hold technique for lung tumors: the potential value of target immobilization and reduced lung density in dose escalation. International Journal of Radiation Oncology, Biology, Physics, 45(3):603–611, 1999.

[4] D. Lau. The science behind Kinects or Kinect 1.0 versus 2.0. http://www.gamasutra.com/blogs/DanielLau/20131127/205820/The_Science_Behind_Kinects_or_Kinect_10_versus_20.php, November 2013. Online, last visited 2015-03-24.

[5] Microsoft Corporation. Kinect for Windows SDK 2.0. http://www.microsoft.com/en-us/download/details.aspx?id=44561, October 2014. Online, last visited 2015-03-24.

[6] A. Schweikard, H. Shiomi, and J. R. Adler, Jr. Respiration tracking in radiosurgery. Medical Physics, 31(10):2738–2741, 2004.

[7] H. Shirato, S. Shimizu, K. Kitamura, T. Nishioka, K. Kagei, S. Hashimoto, H. Aoyama, T. Kunieda, N. Shinohara, H. Dosaka-Akita, and K. Miyasaka. Four-dimensional treatment planning and fluoroscopic real-time tumor tracking radiotherapy for moving tumor. International Journal of Radiation Oncology, Biology, Physics, 48(2):435–442, 2000.

About the article

Published Online: 2015-09-12

Published in Print: 2015-09-01


Author's Statement

Conflict of interest: Authors state no conflict of interest. Material and Methods: Informed consent: Informed consent has been obtained from all individuals included in this study. Ethical approval: The research related to human use complied with all relevant national regulations and institutional policies, was performed in accordance with the tenets of the Helsinki Declaration, and has been approved by the authors’ institutional review board or equivalent committee.


Citation Information: Current Directions in Biomedical Engineering, ISSN (Online) 2364-5504, DOI: https://doi.org/10.1515/cdbme-2015-0048.

© 2015 by Walter de Gruyter GmbH, Berlin/Boston.

