Skip to content
BY 4.0 license Open Access Published by Oldenbourg Wissenschaftsverlag June 24, 2021

Development of a millimetre wave based SAR real-time imaging system for three-dimensional non-destructive testing

Entwicklung eines millimeterwellenbasierten SAR-Echtzeitbildgebungssystems für dreidimensionales zerstörungsfreies Prüfen
Christopher Schwäbig

Christopher Schwäbig obtained a B. Sc. degree (measurement engineering and sensor technology) from the Koblenz University of Applied Sciences in 2014 and a M. Eng. degree (electrical systems development) from the Bonn-Rhein-Sieg University of Applied Sciences in 2018. Since 2018 he has been working as a technical employee at the Fraunhofer Institute for High Frequency Physics and Radar Techniques FHR in Wachtberg.

EMAIL logo
, Siying Wang

Siying Wang received the Dipl.Ing. degree in Electrical Engineering and Information Technology from the RWTH University, Germany in 2014. Since then she has been working in the Department of Integrated Circuits and Sensor Systems at Fraunhofer FHR. Her main research interests include SAR and MIMO imaging, as well as various aspects of radar signal processing and tracking.

and Sabine Gütgemann

Sabine Gütgemann obtained a B. Sc. degree (biomathematics) from the Koblenz University of Applied Sciences in 2012. Since 2012 she has been working as a technical employee at the Fraunhofer Institute for High Frequency Physics and Radar Techniques FHR in Wachtberg.

From the journal tm - Technisches Messen

Abstract

The following article describes the development of a millimetre wave based real-time imaging system for three dimensional non-destructive testing of goods. For this purpose a rotating antenna is used which is fed from an FMCW radar. The received measuring data is processed with a SAR algorithm. Due to the fact that a reflexive measurement method is used, the integration of the system into existing systems is simplified. To make the computing power-intensive SAR image processing possible, the complete signal processing chain of the image processing is executed on the graphics card. The article elucidates the concept for calculating the measurement parameters which have to be elaborated for the implementation of the image processing of the whole system.

Zusammenfassung

Dieser Artikel beschreibt die Entwicklung eines millimeterwellenbasierten Echtzeitbildgebungssystems für das dreidimensionale zerstörungsfreie Prüfen von Werkstücken. Zum Einsatz kommt eine rotierende Antenne, die von einem FMCW-Radar gespeist wird. Die empfangenen Messdaten werden mit einem SAR-Algorithmus prozessiert. Da hier Reflexionsmessungen verwendet werden, kann die Integration des Systems in bereits vorhandene Systeme erleichtert werden. Um die rechenleistungsintensivere SAR-Bildgebung in Echtzeit realisieren zu können, wird in diesem Ansatz die komplette Signalverarbeitungskette zur Bildgenerierung auf der Grafikkarte durchgeführt. Der Artikel erläutert das Konzept zur Berechnung der unterschiedlichen Messparameter, die für die Auslegung der Bildgebung eines solchen Systems berücksichtigt werden müssen.

1 Introduction

Reliable and efficient quality controls are essential in the production of a huge variety of goods. Over the past years different techniques of non-destructive testing (NDT) have been established for this purpose. Well known procedures are X-ray, ultrasound imaging, infrared thermography, optical methods and a large variety of others.

This article describes a scanning system which makes use of millimetre waves for the purpose of NDT. Millimetre waves depict the range from 30 GHz to 300 GHz of the electromagnetic spectrum and can be used for various inspection tasks of different goods (see [1]). They are able to penetrate plastics, wood, glass and other materials with a low relative permittivity ( ε r ). An advantage of millimetre waves in comparison to X-ray is the non-ionising radiation. In this case no special protection for the user is necessary during the operation. In comparison to optical methods millimetre waves can also penetrate and examine non-transparent materials.

The wavelength range is close to terahertz radiation (from 300 GHz to 3 THz). A comparison between terahertz and other NDT systems can be found in [2].

The following sections present the considerations of the measurement parameters and hardware requirements for modifying a millimetre wave scanning system for three-dimensional real-time imaging. In the previous millimetre wave scanning system only 2D imaging was possible (see [3]). The scrutiny is based on a previous article describing the real-time implementation of the signal processing chain for a millimetre wave scanning system (see [4]).

2 Hardware concept

The hardware and the mechanical concept of the scanning system are based on the millimetre wave scanning system “SAMMI” (see [3]). In order to enable three-dimensional imaging and to simplify the mechanical concept, the whole setup of the “SAMMI” scanning system is modified.

2.1 Previous system

The standalone millimetre wave imager “SAMMI” works with two rotating antennas for transmitting and two rotating antennas for receiving the electromagnetic wave. The transmitting antennas are placed underneath and the receiving antennas above a conveyor belt (Fig. 1).

Figure 1 
Mechanical concept of the current SAMMI version.
Figure 1

Mechanical concept of the current SAMMI version.

A continuous wave signal at a frequency of 90 GHz is generated by a chain of several frequency mixers, filters and amplifiers. While the antenna pairs are moving along the conveyor belt (detected by a light barrier) an analogue digital converter samples the referenced raw data of the received signal and transfers it to the computer, where the transmission images are calculated. These images are a 2D amplitude and a 2D phase image which represent the attenuation and the runtime of the electromagnetic wave within the object.

2.2 Development of a SAR based SAMMI

A new SAMMI system (see [4]) with a higher integration level, less mechanical expense and a frequency modulated continuous wave radar (FMCW) is being developed (Fig. 2). The concept of the new system provides additional depth information for real 3D imaging. In addition, the distance between the conveyor belt and the antenna should be made adjustable so that the scanning system can be customised for the intended use.

Figure 2 
Mechanical concept of the SAR-SAMMI.
Figure 2

Mechanical concept of the SAR-SAMMI.

The signal processing is based on synthetic-aperture radar (SAR) which is a retro-reflexive measurement method. Therefore the image processing algorithm becomes more complex compared to the transmissive measurement method of the previous SAMMI system. Unlike the previous system only one antenna of the double antenna arm is used. As the second antenna arm underneath the conveyor belt is not required any longer, a rotary shaft which connects both antenna arms for synchronisation is not required either. Therefore the whole mechanical setup can be reduced in its dimensions. Additionally the simplified mechanical concept allows an easier integration of the scanning system into existing conveyor belt systems.

For the scanning system an FMCW radar based on a highly integrated SiGe radar chip (see [5]) is used (Fig. 3).

Within the phase-locked loop (PLL) the phase comparator compares the phase position of a reference signal with the phase position of the output signal of the voltage controlled oscillator (VCO). The output signal of the phase comparator and loop filter is used to control the phase position of the output signal of the VCO.

The output signal of the VCO is the transmitted signal and is mixed with the reflected signal in order to achieve an indirect time of flight measurement. For short range applications the very high bandwidth of the radar chip allows ultra high resolution SAR imaging. The received signals are collected, digitised and transferred to the computer. For real-time applications the recorded data is transferred to the computer during the measurement process.

The FMCW radar sensors can measure distances and detect objects at different ranges and can be used for various applications for non-destructive testing (see [6]), e. g. thickness measurement of multilayer samples (see [7]) or determination of electromagnetic material properties.

Figure 3 
The hardware signal processing chain.
Figure 3

The hardware signal processing chain.

2.3 Measurement parameters

The antenna rotates with a rotation frequency f rot of 10 Hz so that the duration per cycle (360°) amounts to 100 ms ( T rot ). On the left and right hand side of the semicircle (Fig. 4), the movement path of the antenna is virtually in the direction of movement of the conveyor belt. Since the movement of the antenna arm should be ideally perpendicular to the movement of the conveyor belt, the measuring range along the semicircle is restricted to the area marked in light blue. While the antenna is within this measuring range which starts at 12° and ends at 168° ( β = 156°) of the semicircle, a defined number ( N s w ) of FMCW sweeps is recorded (Fig. 5) and transferred to the computer. The time per semicircle results in 43.3 ms ( T β ).

Figure 4 
The movement path of the antenna (yellow), the measuring range of the semicircle (
β
=
156\beta =156°, light blue area) and the side margins of the semicircle where no measurements are captured (dark blue area).
Figure 4

The movement path of the antenna (yellow), the measuring range of the semicircle ( β = 156°, light blue area) and the side margins of the semicircle where no measurements are captured (dark blue area).

Due to the fixed number of 196 sweeps per semicircle ( N sw ) each sweep can have a maximum duration t sweep , max of 0.22 ms (see Equation 1).

(1) t sweep , max = T β N sw = β 360 · T rot N sw = β 360 · 1 f rot N sw

With a sweep duration t sweep of 0.22 ms (which corresponds to a frequency resolution f res of 4545 kHz, see Equation 2) and an ADC sampling frequency f s of 1 MHz the number of sampling points N p amounts to 220 (see Equation 3).

(2) f res = 1 t sweep
(3) N p = f s f res = f s · t sweep

Within an FMCW sweep the transmit frequency is increased (sawtooth modulation, up-chirp) or decreased (sawtooth modulation, down-chirp) linearly (starting at the start frequency and up/down to the stop frequency). The bandwidth f BW is the frequency range between both frequencies. The runtime ( t meas ) of the electromagnetic wave and the distance ( r meas ) from the sensor to the target can be calculated with the help of the measured frequency f meas and the general FMCW formula (see Equation 4 with c 0 = 299792458 m / s [8]).

(4) r meas = c 0 · t meas 2 = c 0 · f meas 2 · f BW t sweep

In order to calculate the range resolution r res , the measured frequency f meas (in Equation 4) is replaced by the frequency resolution f res (see Equation 5). With a bandwidth f BW of 20 GHz (for example from 70 GHz up to 90 GHz) the range resolution r res is 7.5 mm.

(5) r res = c 0 · f res 2 · f BW t sweep = c 0 2 · f BW

In order to calculate the maximum measurable distance r max the measured frequency f meas (in Equation 4) is replaced by the frequency resolution f res multiplied with half of the total number of sampling points (considering the Nyquist–Shannon sampling theorem, see Equation 6). In this example the maximum measurable distance r max is 820 mm.

(6) r max = c 0 · N p 2 · f res 2 · f BW t sweep = c 0 · N p 4 · f BW = c 0 · f s · t sweep 4 · f BW

Figure 5 
An example of different sweeps captured at different antenna positions along the semicircle.
Figure 5

An example of different sweeps captured at different antenna positions along the semicircle.

The fast Fourier transform (FFT) is used to calculate a frequency intensity spectrum of each sweep whereby each frequency represents a distance (see Equation 4). Due to the fact that the FFT requires 2 n sampling points, the non-existent sampling points of the raw data are padded with zeros. This process, called zero padding, corresponds to an interpolation in the frequency domain. With the help of this interpolation method, a higher range accuracy can be achieved. With 220 sampling points and a 16384 points FFT, the FFT interpolation factor results in 74.5. In this case and according to a good signal-to-noise ratio (SNR) the frequency accuracy ( f res , see Equation 2) can be improved. Therefore the range accuracy ( r res , see Equation 5) can be improved from 7.5 mm to 0.1 mm.

According to the necessary processing time between the sweeps, the maximum sweep duration t sweep , max should be shorter than 0.22 ms. In addition to this the antenna covers a distance ( d sweep , see Equation 7) of 2.0 mm per sweep (which is above the final image resolution in the high resolution mode d res , high of 1.0 mm) during its rotation with a circular velocity v of 9.11 m/s (with f rot = 10 Hz and a radius of the antenna arm of r = 145 mm). As a result of both facts the sweep duration has to be reduced. For the low resolution mode with a resolution d res , low of 2.5 mm a value for d sweep of 2.0 mm would be acceptable.

(7) d sweep = v · t sweep = ω rot · r · t sweep = 2 · π · f rot · r · t sweep

With a sweep duration t sweep of 0.05 ms the antenna covers a distance d sweep of 0.46 mm during its rotation. This is acceptable as it is below the final image resolution d res , high of 1.0 mm in the high resolution mode so that the movement of the antenna during the measurement can be ignored. The reduction of the sweep duration to 0.05 ms results in 50 sampling points ( N p ) per sweep and a maximum measurable distance r max of 0.19 m. With 50 sampling points and a 16384 points FFT, the FFT interpolation factor results in 327.7 so that the range accuracy r res can be improved from 7.5 mm to 0.02 mm.

A doubling of the bandwidth would improve the smallest measurable distance by a factor of two and halve the maximum measurable distance. This idea could improve the measurement results but the bandwidth parameters are restricted due to the hardware (20 GHz max.) used here. A doubling of the sampling frequency would double the amount of sampling points and double the maximum measurable distance. This is optional for this intended use because the maximum measurable distance is already higher than necessary (the distance between the conveyor belt and the antenna has a default height of 50 mm).

In addition to this a doubling of the sampling frequency allows to halve the sweep duration in order to achieve the same values for r res and r max .

In order to enable the correct aspect ratio when building the image the correct speed of the conveyor belt has to be considered. With an antenna rotation frequency of f rot = 10 Hz a belt speed v belt , high of 10 mm/s has to be set when using the high resolution mode with an image resolution d res , high of 1.0 mm. In the low resolution mode with a resolution d res , low of 2.5 mm a belt speed v belt , low of 25 mm/s has to be set.

3 Image reconstruction and mathematical background

The whole image processing chain is based on a highly efficient SAR algorithm.

3.1 Precalculations for the initialisation of the scanner

In this approach the antenna positions during the circular antenna movement are presumed as constant due to the fact that the antenna covers a distance per sweep during the rotation which is less than the final image resolution (see section 2.3). At this point it is important to mention that a SAR-based image reconstruction is also possible if the distances between the individual measurements are larger than the final image resolution. But due to the fact that the antenna is moving during the measurement the time per one single measurement should be reduced to an appropriate time.

In this case a precalculation of all distances between the antenna positions on the semicircle and the voxels of a temporary 3D volume (Fig. 8) is possible.

Two look up tables are calculated for each sweep of the semicircle. The first look up table is a precalculated mask which reveals whether or not a voxel of the temporary 3D volume can receive information of the current sweep. The second look up table contains the distance for each voxel which has to be analysed in the sweeps spectrum (see [4]). The distance values in the look up table already take into account the semi circular movement path of the antenna. Therefore an x- (see Equation 8) and y-coordinate (see Equation 9) for each distance is calculated depending on the radius r and the number i of the semicircle. Both equations include the offset of the measurement area ( 180 β 2 ). The x-coordinates include an additional offset of the radius r in order to prevent negative coordinates. In this case the x-coordinates start from 0. The same applies to the y-coordinates (offset: r · sin ( 180 β 2 )) so that the top of the first semicircle starts at coordinate 0.

(8) x = r · cos ( 180 β 2 + i N sw · β ) + r
(9) y = r · sin ( 180 β 2 + i N sw · β ) r · sin ( 180 β 2 )

With the help of these precalculations a real-time implementation of the whole SAR algorithm is possible. The precalculations only have to be done once before the actual measurement process is started (Fig. 6).

Figure 6 
Precalculations for the initialisation of the scanner.
Figure 6

Precalculations for the initialisation of the scanner.

3.2 Analysis of the raw data

In the first step the DC offset of each sweep of the semicircle is subtracted. In the second step each sweep is filtered with a Hamming window in order to reduce the effect of spectral leakage. The frequency spectrum of each sweep is calculated with an interpolated FFT in order to achieve a higher range accuracy. The first look up table is used to see whether or not a voxel can acquire information of the current sweep. If a voxel of the temporary 3D volume can receive information of the current sweep, the second look up table and the nearest neighbour method are used to extract the image information from the sweeps spectrum (see [4]). Based on the conventional backprojection algorithm (see [9]) the extracted values are weighted with a complex exponential function (see Equation 10) dependent on their distance and are added into the temporary 3D volume (Fig. 7 and Fig. 8, see [4]).

(10) s ( m , τ n ) = fft ( s ( f k , τ n ) ) · exp + j 4 π f 1 Δ R ( m , τ n ) c 0

s ( m , τ n ) describes the weighted value of one sweep for one voxel within the temporary 3D volume, dependent on the quantised range bin m within the sweeps spectrum and the time τ n which depends on the number of the sweep. s ( f k , τ n ) describes the received quantised signal for one single sweep with k frequencies (f). Δ R ( m , τ n ) describes the quantised distance to the appropriated voxel. The frequency f 1 represents the start frequency of the FMCW down sweep or FMCW up sweep (see [4]). The final value of one voxel of the temporary 3D volume represents the sum of the calculated values for s ( m , τ n ) of all 196 sweeps.

Figure 7 
The signal processing chain of the SAR algorithm.
Figure 7

The signal processing chain of the SAR algorithm.

Figure 8 
Filling the voxels of the temporary 3D volume with extracted data of the FFT spectrum.
Figure 8

Filling the voxels of the temporary 3D volume with extracted data of the FFT spectrum.

3.3 Stepwise image build-up

After all sweeps of the semicircle have been processed, this temporary 3D volume is added into the final 3D volume (Fig. 9, see [4]). The temporary 3D volume is always inserted in the same position within the final 3D volume in order to achieve an endless image build up. Therefore the final 3D volume has to be shifted along the axis of the conveyor belt movement before the temporary 3D volume can be inserted. In this case the speed of the conveyor belt has to be adjusted to the chosen voxel size of the 3D volumes because the rotation speed of the antenna is assumed to be fixed (see [4], see section 2.3).

The content of the final 3D volume can be represented in different ways. One option is to display one single layer of the final 3D volume. This can be an x-layer, y-layer or z-layer. Another way is to display a sum of all layers along a specified axis in order to create a x-, y- and z-perspective.

Figure 9 
Filling the final 3D volume with temporary 3D volumes of each semicircle.
Figure 9

Filling the final 3D volume with temporary 3D volumes of each semicircle.

4 Implementation

The scanner software is divided into a CPU and a GPU part. The CPU part is separated into three threads and is responsible for GUI/visualisation, data capture (with hardware communication) and CPU-GPU data transfer. On the GPU the whole image processing chain is processed so that the array operations are calculated in parallel.

Before the actual scanning process can be started and the measurement data evaluation can be processed, the precalculated look up tables (two for each sweep) are merged into one (to save memory) and are copied to the GPU so that no transfer of this data is necessary during the measurement. In this case a memory transfer (of precalculated data) is no longer necessary during the measurement which improves the execution time of the whole signal processing chain. The different steps of the signal processing chain (Fig. 7) are implemented in single kernels, so that different CUDA® kernels represent the single steps of the signal processing chain. The sizes of the arrays (which are processed within the different kernels) vary in size. In relation to this most of the different kernels vary in their thread and block configuration.

In this implementation the implemented kernels are only responsible for a simple array operation and multiple access is not necessary. Therefore the use of shared memory by copying the data from the global memory into the shared memory is waived.

By using coalesced access to the global memory the execution time is much faster compared to uncoalesced access. In this case most of the kernels use a coalesced access to the memory in order to improve the speed of the algorithm. The kernel which rearranges the input data and the kernel which extracts the data of the FFT spectrums are much slower than the other kernels because their memory access is uncoalesced. The reason for this is that neighbouring GPU threads have to access elements which are not neighbours.

The recorded raw data and the three output images are saved within pinned memory in order to optimise the execution speed of the algorithm.

On the graphics card the FFT spectrums are calculated using the CUDA® FFT routine. The implementation makes use of the batched mode of the CUDA® FFT in order to calculate the multiple 1D FFTs (196 in total) of one semicircle at once and not 196 single 1D FFTs. In this case the FFT routine has only to be called one time for each semicircle.

For the purpose of allowing further signal processing steps (for example phase unwrapping) the implementation takes advantage of CUDA® streams so that the semicircles can be processed in parallel by putting them into different GPU streams. To achieve this each signal processing chain for one stream has to be controlled by its own CPU thread. In this case each GPU stream is mapped with its separate plan of a batched 1D FFT.

Asynchronous memory transfer (host to device or device to host) is used so that each CPU thread can copy the data within its corresponding GPU stream. The same applies to the emptying process of the temporary 3D volume.

5 Evaluation of the algorithm and implementation

In order to generate a dataset with rawdata of a circular measurement of the SAR-based SAMMI scanner, a measurement dataset of a Siemens Star (Fig. 10, left) made with an XY raster scanner is used. With the help of the Siemens star and the holes and patterns in different sizes around the Siemens Star the resolution of the scanning system can be illustrated. In this case it is used to determine which size of objects and which shapes of objects can be recognised in the scan.

Figure 10 
Left: The Siemens star pattern (metal, 25 cm × 25 cm). Right: An XY raster measurement of the Siemens star pattern. Within the area marked by the white line the raw data of sweeps is extracted and used to simulate the raw data of the rotating SAMMI scanner.
Figure 10

Left: The Siemens star pattern (metal, 25 cm × 25 cm). Right: An XY raster measurement of the Siemens star pattern. Within the area marked by the white line the raw data of sweeps is extracted and used to simulate the raw data of the rotating SAMMI scanner.

Figure 11 
A: The three different perspectives of the final 3D output volume with amplitude values (top left: x-perspective, bottom left: z-perspective, bottom right: y-perspective). P: The three different perspectives of the final 3D output volume with wrapped phase values (top left: x-perspective, bottom left: z-perspective, bottom right: y-perspective).
Figure 11

A: The three different perspectives of the final 3D output volume with amplitude values (top left: x-perspective, bottom left: z-perspective, bottom right: y-perspective). P: The three different perspectives of the final 3D output volume with wrapped phase values (top left: x-perspective, bottom left: z-perspective, bottom right: y-perspective).

The XY raster scanner moves the radar along the horizontal axis. During the horizontal movement a predefined amount of FMCW sweeps is recorded. After the measurements are recorded, the radar is moved by a defined step along the vertical axis and another horizontal measurement is created. This is repeated until the entire measuring object has been scanned.

The XY raster scanner uses a bandwidth f BW of 24 GHz (68 GHz to 92 GHz), a sweep duration t sweep of 4 ms and a sampling frequency f s of 1 MHz. In this case the number of sampling points N p amounts to 4000, so that the maximum measurable distance r max is 12.5 m (taking account of the Nyquist–Shannon sampling theorem). With the same bandwidth f BW of 20 GHz and the same sampling frequency f s , the range resolution r res would be the same, too. Due to the higher bandwidth f BW of 24 GHz, the range resolution r res is 6.2 mm. With 4000 sampling points and a 16384 points FFT, the FFT interpolation factor results in 4.1 so that the accuracy of the range resolution r res can be interpolated from 6.2 mm to 1.52 mm. In this situation the significantly longer sweep duration t sweep only affects the maximum measurable distance r max .

The FMCW sweeps are extracted from the XY scanner dataset at the corresponding coordinates (Fig. 10, right). The sweeps are extracted semi circularly and the raw data of each semicircle is forwarded to the scanner software at the appropriate time interval depending on the simulated rotation frequency.

5.1 Results

The scanner software represents three different perspectives of the final 3D output volume. In the current implementation it can be switched between amplitude and phase images. Figure 11 depicts the three different perspectives of the final 3D output volume with amplitude (A) and phase data (P).

The amplitude values are summed up along each axis of the final 3D output volume for generating the three different perspectives (Fig. 11, A). Optionally only one layer of each axis can be extracted and can be displayed within the user interface. With this feature it is possible to show the attenuation at a certain position within the measured object.

Contrary to the amplitude image, the phase image is generated by extracting only one layer (Fig. 13) of each axis from the final 3D output volume. These layers are written into the three images (Fig. 11, P). The amplitude y-perspective (Fig. 11, A, bottom right) shows that the measuring object is tilted along the x-axis. Due to the used bandwidth f BW of 24 GHz (68 GHz to 92 GHz) the centre frequency is 80 GHz so that the wave length λ amounts to 3.75 mm in vacuum. In total five phase cycles along the x-axis are visible in Figure 11, P (z-perspective, bottom left). Due to the fact that the measuring object is not tilted along the y-axis (Fig. 11, A, x-perspective, top left) the phase cycles are only visible in x-direction. The result of 1D phase unwrapping is shown in Fig. 12. It is obvious that a 2D phase unwrapping algorithm is necessary especially if the measuring object is tilted along two axes (see [10]).

Figure 12 
Phase unwrapping (left: 1D in x-direction, right: 1D in y-direction).
Figure 12

Phase unwrapping (left: 1D in x-direction, right: 1D in y-direction).

In this view only one z-layer is represented. It is necessary to mention that the phase values depicted in this layer represent the current phase position of the electromagnetic wave considering all previously penetrated z-layers. In other measured objects these other previously penetrated z-layers can be different materials with different refractive indexes.

The phase image shows the diffraction effects of the electromagnetic waves which occur at the edges of the measuring object. The layer extracted here is located in the area of the measuring object. This is the layer which contains the highest values in the amplitude image. In this layer the strongest diffraction effects can be measured (Fig. 13). If a layer is selected which is further in front of or further behind the object, a phase image of the object can be also reconstructed. But the greater the distance to the object is, the lower the reconstruction quality of the measuring object is. This is because the diffraction effects become less the greater the distance from the object is.

This is not the case for the amplitude image. The amplitude image can only be extracted in the area of the measuring object.

Figure 13 
Position and angle of the measured object with the selected phase layer.
Figure 13

Position and angle of the measured object with the selected phase layer.

5.2 Kernel runtimes and memory utilisation

The largest GPU array of the implementation is the look up table which includes the voxel mask and the distances to be analysed for each voxel. In order to show the entire volume of the slot (distance between antenna and the conveyor belt), the measurement will be executed for a height of 60 mm so that no border areas are missing. In low resolution mode (2.5 mm) the final output has a size of 120 px (300 mm) × 160 px (400 mm) × 24 px (60 mm) (in total 460800 elements). In high resolution mode (1.0 mm) the final output has a size of 300 px (300 mm) × 400 px (400 mm) × 60 px (60 mm) (in total 7 200 000 elements). To get the size of the look up table the total number of elements has to be multiplied with the number of sweeps per semicircle (low resolution mode: 0.362 GB, high resolution mode: 5.645 GB, Tab. 1). Without combining the voxel mask and the distance look up table into one array the memory consumption would be twice as high. The look up table (marked in red in Tab. 1) is the largest array of all GPU arrays.

If the slot size is increased the measuring range has to be increased, too (low resolution mode: 0.063 GB, high resolution mode: 0.979 GB per an additional height of 10 mm).

Table 1

GPU memory consumption for the low and high resolution mode (here 4 byte float values per element, sw.: sweeps, s. p.: sampling points, cpx.: complex number with one real element and one imaginary element).

Array Low resolution (2.5 mm per pixel length) High resolution (1.0 mm per pixel length)
Raw data input 0.3 MB (196 sw. × 200 s. p. × cpx.) 0.3 MB (196 sw. × 200 s. p. × cpx.)
Window function 0.8 KB (200 s. p.) 0.8 KB (200 s. p.)
FFT in 26 MB (196 sw. × 16384 s. p. × cpx.) 26 MB (196 sw. × 16384 s. p. × cpx.)
FFT out 26 MB (196 sw. × 16384 s. p. × cpx.) 26 MB (196 sw. × 16384 s. p. × cpx.)
Look up table
Temp. 3D vol. 3.7 MB (120 px × 160 px × 24 px × cpx.) 58 MB (300 px × 400 px × 60 px × cpx.)
Fin. 3D vol. 3.7 MB (120 px × 160 px × 24 px × cpx.) 58 MB (300 px × 400 px × 60 px × cpx.)
Fin. 3D vol. copy 3.7 MB (120 px × 160 px × 24 px × cpx.) 58 MB (300 px × 400 px × 60 px × cpx.)
Fin. 3D vol. abs 1.9 MB (120 px × 160 px × 24 px) 29 MB (300 px × 400 px × 60 px)
Fin. 3D vol. ph. 1.9 MB (120 px × 160 px × 24 px) 29 MB (300 px × 400 px × 60 px)
2D X image (amp.) 16 KB (160 px × 24 px) 96 KB (400 px × 60 px)
2D Y image (amp.) 12 KB (120 px × 24 px) 72 KB (300 px × 60 px)
2D Z image (amp.) 77 KB (120 px × 160 px) 480 KB (300 px × 400 px)
2D X image (ph.) 16 KB (160 px × 24 px) 96 KB (400 px × 60 px)
2D Y image (ph.) 12 KB (120 px × 24 px) 72 KB (300 px × 60 px)
2D Z image (ph.) 77 KB (120 px × 160 px) 480 KB (300 px × 400 px)
Total 428 MB 5928 MB

Table 2 shows the runtimes of the different kernels of the signal processing chain. The high resolution mode is not available on the smaller device due to the high memory consumption (see Tab. 1). The kernel which executes the nearest neighbour method to get the values from the corresponding distances within the FFT spectrum is most time consuming kernel due to the uncoalesced memory access (marked in red in Tab. 2). The maximum frame rates are much higher than necessary because of the fact that the antenna rotates with 10 Hz and the scanning system delivers only the data of 10 recorded semicircles per second.

Table 2

GPU kernel runtimes on different devices.

Kernel NVIDIA® Quadro™P400 (2 GB) NVIDIA® Quadro RTX™6000 (24 GB)


Low resolution High resolution Low resolution High resolution
HtoD-MemcpyAsync (raw data) 36.799 µs not available 47.840 µs 30.304 µs
Detect and substract DC offset 254.748 µs not available 226.910 µs 230.558 µs
Window and sort data for batched FFT 746.516 µs not available 30.336 µs 30.367 µs
Batched FFT 4480.030 µs not available 198.654 µs 200.286 µs
MemsetAsync (empty temp. 3D vol.) 73.695 µs not available 4.864 µs 49.216 µs
Nearest neighbour method with sum not available
Copy final 3D vol. 276.956 µs not available 10.080 µs 200.862 µs
Shift final 3D vol. 830.610 µs not available 31.160 µs 712.729 µs
Insert temp. 3D vol. 1854.950 µs not available 21.568 µs 692.313 µs
ComplexToAbs/ComplexToPhase 816.179 µs not available 26.752 µs 695.480 µs
Calculate X sums (X image) 210.973 µs not available 39.359 µs 226.398 µs
DtoH-MemcpyAsync (X image) 2.368 µs not available 2.016 µs 8.224 µs
Calculate Y sums (Y image) 344.891 µs not available 67.327 µs 339.549 µs
DtoH-MemcpyAsync (Y image) 2.016 µs not available 1.696 µs 6.784 µs
Calculate Z sums (Z image) 289.179 µs not available 16.064 µs 589.498 µs
DtoH-MemcpyAsync (Z image) 7.392 µs not available 6.720 µs 38.336 µs
Total 18031 µs not available 1094 µs 9688 µs
Maximum frame rate 55.5 Hz not available 914.1 Hz 103.2 Hz

6 Conclusion

On the basis of a raw dataset generated by an XY scanner, a new dataset has been generated which corresponds to the data of the rotating scanning system presented here. The results show that a three-dimensional amplitude image can be generated and the measurement objects can be displayed in three different perspectives. Contrary to the original version of the scanning system the mechanical setup of the new scanning system can be simplified. This is possible due to the use of FMCW radar and SAR. The problem of the high demands of computing capacity for achieving SAR based real-time image processing can be solved by outsourcing the signal processing chain to the graphics card. Due to the fact that the achieved throughput is much higher than necessary, there is enough remaining computational capacity to implement additional 2D phase unwrapping on the graphics card. In order to generate the phase image further steps of 2D phase unwrapping are necessary and must be investigated in the future. For the purpose of evaluating the performance of the final measuring system, it is necessary to scan different samples. This is also important when evaluating the influence of the rotation of the antenna arm during the measurement. Within future projects the rotating antenna could be replaced by a radar sensor array in order to eliminate rotating parts and to simplify the SAR image processing.

About the authors

Christopher Schwäbig

Christopher Schwäbig obtained a B. Sc. degree (measurement engineering and sensor technology) from the Koblenz University of Applied Sciences in 2014 and a M. Eng. degree (electrical systems development) from the Bonn-Rhein-Sieg University of Applied Sciences in 2018. Since 2018 he has been working as a technical employee at the Fraunhofer Institute for High Frequency Physics and Radar Techniques FHR in Wachtberg.

Siying Wang

Siying Wang received the Dipl.Ing. degree in Electrical Engineering and Information Technology from the RWTH University, Germany in 2014. Since then she has been working in the Department of Integrated Circuits and Sensor Systems at Fraunhofer FHR. Her main research interests include SAR and MIMO imaging, as well as various aspects of radar signal processing and tracking.

Sabine Gütgemann

Sabine Gütgemann obtained a B. Sc. degree (biomathematics) from the Koblenz University of Applied Sciences in 2012. Since 2012 she has been working as a technical employee at the Fraunhofer Institute for High Frequency Physics and Radar Techniques FHR in Wachtberg.

References

1. D. Nüßler and J. Jonuscheit, Terahertz based non-destructive testing (NDT), tm - Technisches Messen (2020), 10.1515/teme-2019-0100.Search in Google Scholar

2. S. Wohnsiedler, C. Matheis, J. Jonuscheit and R. Beigang, Zerstörungsfreie Prüfung von Verbundwerkstoffen mit Terahertz-Technik im Vergleich zu etablierten Prüfverfahren, ZfP-Zeitung 138 (2014), 49–52.Search in Google Scholar

3. A. Küter, C. Schwäbig, R. Brauns, S. Kose and D. Nüßler, A stand alone millimetre wave imaging scanner: system design and image analysis setup, in: 48th European Microwave Conference (EuMC) (2018), 1505–1508.10.23919/EuMC.2018.8541396Search in Google Scholar

4. C. Schwäbig, S. Wang and S. Gütgemann, A real-time SAR image processing system for a millimetre wave radar NDT scanner, in: Forum Bildverarbeitung 2020 (2020).Search in Google Scholar

5. C. Bredendiek, K. Aufinger and N. Pohl, Full waveguide E- and W-band fundamental VCOs in SiGe:C technology for next generation FMCW radars sensors, in: 14th European Microwave Integrated Circuits Conference (EuMIC) (2019).10.23919/EuMIC.2019.8909457Search in Google Scholar

6. R. Herschel and S. Pawliczek, 3D Millimeter Wave Screening of Wind Turbine Blade Segments, in: 15th European Radar Conference (EuRAD) (2018), 115–117.10.23919/EuRAD.2018.8546559Search in Google Scholar

7. S. Pawliczek, R. Herschel and N. Pohl, High precision surface reconstruction based on coherent near field synthetic aperture radar scans, in: 19th International Radar Symposium (IRS) (2018).10.23919/IRS.2018.8448051Search in Google Scholar

8. J. Göbel (2011), Radartechnik - Grundlagen und Anwendungen, Berlin, VDE Verlag ISBN: 978-3-8007-3141-1.Search in Google Scholar

9. LeRoy A. Gorham and Linda J. Moore, SAR image formation toolbox for MATLAB, in: Algorithms for Synthetic Aperture Radar Imagery XVII (2010).10.1117/12.855375Search in Google Scholar

10. S. Pawliczek, R. Herschel and N. Pohl, 3D millimeter wave screening for metallic surface defect detection, in: 16th European Radar Conference (EuRAD) (2019), 113–116.Search in Google Scholar

Received: 2021-02-19
Accepted: 2021-05-31
Published Online: 2021-06-24
Published in Print: 2021-08-27

© 2021 Schwäbig et al., published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 6.12.2022 from frontend.live.degruyter.dgbricks.com/document/doi/10.1515/teme-2021-0029/html
Scroll Up Arrow