Simulation study and experimental validation of a neural network-based predictive tracking system for sensor-based sorting

Abstract: Sensor-based sorting offers cutting-edge solutions for separating granular materials. The line-scanning sensors currently in use in such systems only produce a single observation of each object and no data on its movement. According to recent studies, using an area-scan camera has the potential to reduce both characterization and separation error in a sorting process. A predictive tracking approach based on Kalman filters makes it possible to estimate the followed paths and parametrize a unique motion model for each object using a multiobject tracking system. While earlier studies concentrated on physically-motivated motion models, it has been demonstrated that novel machine learning techniques produce predictions that are more accurate. In this paper, we describe the creation of a predictive tracking system based on neural networks. The new algorithm is applied to an experimental sorting system and to a numerical model of the sorter. Although the new approach does not yet fully reach the sorting quality achieved by the existing approaches, it allows the use of the general method without requiring expert knowledge or a fundamental understanding of the parameterization of the particle motion model.


Introduction
Sensor-based sorting provides state-of-the-art solutions for sorting of granular materials. This general term refers to a group of systems that allow for the physical separation of distinct objects from a material stream based on data collected by one or more sensors. It is regarded as a crucial technology for achieving a circular economy [1], among other fields of application. Since particle classification and separation are carried out in separate steps, the technology is sometimes referred to as indirect sorting in contrast to mechanical processes like magnetic or electrostatic sorting, froth flotation, or float/sink processes [2]. Theoretically, it is possible to recognize any number of classes for sorting, and it is also theoretically feasible to divide the material stream into multiple fractions. However, binary sorting, or sorting into "product" and "residue", is preferred in industrial applications because multi-way sorting necessitates intricate mechanical handling.
The following is a brief summary of the functional principle of sensor-based sorting. The material is initially fed into the system using a conveyor mechanism. The material is then moved onward using a transport medium, for instance a conveyor belt. The process of acquiring sensor-based data occurs during the transport. The gathered information is assessed in an effort to identify and classify particles in the material stream. The classification outcome serves as the foundation for the sorting decision, which is carried out by an actuator. The de facto standard is the usage of an array of pneumatic nozzles for this purpose.
The variety of industrially available sensors that can be used in sensor-based sorting systems is a particular strength of the sorting technology. As a result, there is a lot of flexibility in the detectable material properties and the sorting criteria that can be used. Imaging sensors currently hold a dominant position due to their suitability for systems with high material throughput.

Problem statement
Line-scanning sensors are used in current systems, which is practical because the material is perceived during transportation. Line-scan cameras in the visible spectrum are used when sorting criteria based on color, shape, or texture are sufficient. This results in a single observation of each object only, with no knowledge of their movement. Calculating the location and time for separation requires making assumptions about the velocity due to the delay between localization and separation [3,4], which mainly exists due to the required time for data processing and actuator activation. Therefore, it is crucial to guarantee that all objects are transported at uniform velocity. This is frequently a difficult task.
Recent studies have demonstrated that switching from a line-scanning camera to an area-scanning one has the potential to reduce both characterization error [5] and separation error [6] in sensor-based sorting. Individual objects are observed at various times when the frame rate is high enough. This enables the estimation of the followed paths as well as the parametrization of a unique motion model for each object by using a multiobject tracking system. The latter enables precise predictions regarding which actuators should be activated at what time in order to deflect an object and thereby remove it from the material stream. As a result, this method, known as predictive tracking, leads to higher sorting quality.
However, these earlier studies concentrate on physically-motivated motion models. Choosing the motion model is not straightforward, as it is hard to ensure that the model structure is capable of capturing the motion behavior. Furthermore, identifying optimal parameters for such models is cumbersome. This may include identification of constants for initialization of the motion model, e.g., an average time bias in the predictions of when to activate the actuators. Additionally, other design choices need to be made, and it might only be possible to determine a suitable setting experimentally. For a motion model-based approach, this involves deciding on a basic function family, which is typically based on assumptions about the particles' fundamental motion behavior, such as constant velocity or constant acceleration, and on situation-dependent adaptations of these models.

Contribution
In this paper, we present a neural network-based predictive tracking system for sensor-based sorting. The system is integrated both in a numerical simulation as described in ref. [7] and in a laboratory-scale sorting system with an area-scan camera as described in ref. [6]. For this purpose, the complete development cycle required to make such machine learning-based methods applicable in an industrial sorting setting is considered. The approach presented enables the use of predictive tracking to achieve high sorting quality without requiring expert knowledge for motion model setup. Plant operators, who typically have few trained personnel, are thus given access to the technology, because the system is set up from examples instead of mathematical parameters.
The proposed model is based on a multilayer perceptron as described in ref. [8]. It takes observation coordinates of individual objects as an input and generates the predictions for future time points, in our case for the separation stage, as an output. The input coordinates are obtained using the numerical simulation or by means of real-time image processing, respectively, and managed by the multiobject tracking system. To simulate the sorting process, the Discrete Element Method is coupled to Computational Fluid Dynamics (DEM-CFD). The approach is validated both in terms of simulated and actual sorting experiments and compared with existing approaches in terms of the sorting accuracy.

Related work
Contrary to the commonly used line-scanning sensors, usage of area-scan color cameras for sensor-based sorting has been proposed in ref. [9]. The authors propose a multiobject tracking algorithm, in which the parameterization of a motion model can incrementally be updated with each observation and eventually be used to predict a particle's future trajectory. This in turn can be used for calculation of the actuator control signals for separation. The potential of this approach is demonstrated in refs. [10,11] using a numerical simulation of a sorting system and in ref. [6] experimentally on a lab-scale sorting system. The presented results suggest that the approach is particularly advantageous for sorting scenarios in which non-uniform transport velocities exist.
While earlier studies concentrated on physically-motivated motion models, it is demonstrated in refs. [8,12] that cutting-edge machine learning techniques offer a potent tool for improving prediction accuracy, particularly in challenging sorting scenarios. For example, ref. [8] proposes a mixture-of-experts approach for adaptive combination of both physically-motivated models and learned neural networks to enhance both prediction accuracy and generalization capability.
The DEM-CFD simulation method is widespread in the field of process engineering. The method is used to model pneumatic conveying [13], fluidized beds [14] and other processes in which the interaction of a solid and a fluid phase plays a crucial role [15,16]. A comprehensive review is given in ref. [17]. Recently, the DEM-CFD was used to model a full optical sorting system and to compare the results against experiments [7]. In another study, the numerical model was used to optimize the sorting stage. It was shown that the optimized setups can improve the sorting accuracy under certain conditions [18].

Materials and methods
We pick a model sorting scenario from the recycling of construction waste. In this field of application, the material is prepared for the production of recycled construction materials by producing pure fractions from construction and demolition waste [19]. In our hypothetical case, we take into account a brick and sand-lime brick input stream, see Figure 1. The task is to remove brick from the waste stream. Prior to sorting, the material is crushed into grains that are 4-6 mm in size.

Experimental setup
We employ the lab-scale sorting system depicted in Figure 2 for both the acquisition of training data and the experimental validation. A thorough explanation of the system is given in ref. [6]. The material is fed into the system using a vibrating feeder. A 140 mm wide conveyor belt with a total length of 600 mm is used for transportation. An area-scan camera and a ring light are used to record the material stream just before discharge at the belt's end. Fast switching pneumatic valves are used to perform separation after discharge during a flight phase.
In order to locate and classify specific particles, the acquired image data is processed. The sorting decision is based on a classification that relies on the color of a particle. If a particle needs to be removed from the material stream, a control signal is calculated and sent. It specifies when and which valves in the array need to be opened. The focus of the current study is precisely this calculation, which is referred to as the prediction model, see Section 2.3.

Simulation model
Coupled DEM-CFD simulations are used to model the sorting system. With the DEM, particle-particle as well as particle-wall interactions are simulated, while the bursts of compressed air are simulated with the CFD. At the sorting step, both phases are coupled to model the deflection of particles by the air jets. A stationary one-way coupling is used. Here, the fluid field is computed once and thus not influenced by the particles. Consequently, the particle-fluid force acting on each particle is obtained by drag correlations.
The translational particle movements in the DEM are described by the acceleration $\ddot{\vec{x}}_i$ times the mass $m_i$ of particle $i$. They are governed by the sum of acting forces $\vec{F}_i$:
$$ m_i \ddot{\vec{x}}_i = \vec{F}_i . $$
The resulting force is composed of the contact force $\vec{F}_i^c$ and the remaining forces acting on the particle, such as gravity and the particle-fluid force introduced below. Additionally, rotational particle movements are caused by torques originating from contacts, $\vec{T}_i^c$, or from rolling movements, $\vec{T}_i^r$. Analogously to the translation expressed by mass and acceleration, the rotation of particle $i$ is described by the mass tensor of inertia $J_i$, the angular velocity $\vec{\omega}_i$ and the angular acceleration $\dot{\vec{\omega}}_i$:
$$ J_i \dot{\vec{\omega}}_i + \vec{\omega}_i \times \left( J_i \vec{\omega}_i \right) = \Lambda_i^{-1} \left( \vec{T}_i^c + \vec{T}_i^r \right) . $$
The transformation from the global frame to the body-fixed frame is given by $\Lambda_i^{-1}$.
Linear contact models were used for contact force calculation. The normal force component $\vec{F}^n$ is obtained from the linear normal contact model. The tangential contact force is computed such that the product of tangential spring stiffness $k^t$ and tangential displacement $\vec{\xi}^t$ is limited by the Coulomb friction, given by the friction coefficient $\mu_c$ and the normal force component $\vec{F}^n$. This yields
$$ \vec{F}^t = \min\left( k^t |\vec{\xi}^t|,\ \mu_c |\vec{F}^n| \right) \vec{t} $$
with $\vec{t}$ being the tangential vector. The parameters of all contact pairs were calibrated by conducting and simulating small-scale experiments and comparing both. For further details regarding the parameters and the model of rolling friction, we refer the reader to [7]. The air jet was simulated using the commercial CFD software ANSYS Fluent. The pressure difference was 0.75 bar. Using the fluid velocity around the particles, the fluid force acting on the particle was calculated by [20]
$$ \vec{F}^f = \frac{1}{2} \rho_f \left| \vec{u} - \dot{\vec{x}} \right| \left( \vec{u} - \dot{\vec{x}} \right) c_D A_\perp \varepsilon_f^{(1-\chi)} . $$
The force depends on the fluid density $\rho_f$, the difference between fluid velocity $\vec{u}$ and particle velocity $\dot{\vec{x}}$, the drag coefficient $c_D$, the particle area perpendicular to the flow direction $A_\perp$ and the local voidage $\varepsilon_f$, raised to the power of an empirical correction term $(1-\chi)$. The drag correlation of [21] was used to consider the particle shape implicitly in $c_D$. More details on the DEM-CFD coupling are given in ref. [18]. To model the ejection by the nozzle jets, the fluid field was activated and deactivated at the positions and at the times computed by the respective prediction model, see Section 2.3.
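The Coulomb-limited tangential force and the drag law described above can be illustrated with a minimal Python sketch. The function names, argument names and all numerical values are hypothetical and only serve to exercise the formulas. In particular, `chi` is our name for the empirical correction term, and the drag coefficient is passed in as a plain number rather than computed from the correlation of [21].

```python
import math


def tangential_force_magnitude(k_t, xi_t, mu_c, f_n):
    """Tangential spring force k_t * |xi_t|, capped by the Coulomb limit mu_c * |F_n|."""
    return min(k_t * abs(xi_t), mu_c * abs(f_n))


def drag_force(rho_f, u, v, c_d, area_perp, eps_f, chi):
    """Drag force 0.5 * rho_f * |u - v| * (u - v) * c_D * A_perp * eps_f**(1 - chi).

    u and v are the fluid and particle velocity vectors (same length).
    chi is our name for the empirical correction term (assumption).
    """
    rel = [ui - vi for ui, vi in zip(u, v)]
    speed = math.sqrt(sum(r * r for r in rel))
    scale = 0.5 * rho_f * speed * c_d * area_perp * eps_f ** (1.0 - chi)
    return [scale * r for r in rel]


# Hypothetical numbers, chosen only to exercise the functions.
f_t = tangential_force_magnitude(k_t=100.0, xi_t=0.01, mu_c=0.5, f_n=1.0)
f_d = drag_force(rho_f=1.2, u=[10.0, 0.0, 0.0], v=[0.0, 0.0, 0.0],
                 c_d=1.0, area_perp=1e-5, eps_f=1.0, chi=3.7)
```

In the first call the spring force (1.0) exceeds the Coulomb limit (0.5), so the limit is returned. With one-way coupling, `u` would be read from the precomputed fluid field at the particle position.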
Due to the irregular shape of the considered material (see Figure 1), a representation of particles with clusters of overlapping spheres was chosen. This approach yields a reasonable trade-off between computational effort and approximation of the shapes. A time step of 1 · 10⁻⁵ s was used for the simulations. An image of the simulated sorting system is shown in Figure 3. Note that an inflow of randomly generated particles was used to place the material onto the chute and to match the mass flow in the experiments.

Prediction models
We validate the proposed approach comparatively. In total, we consider three prediction models. Besides the newly proposed approach, we also take into account two prediction models that were analyzed in earlier studies for the computation of the separation control signals.
First, we consider a system using a line-scan camera. This is in line with the setup that was prevalent at the time this article was written. There is no knowledge of the motion of the particles in this situation. Consequently, it is necessary to assume a uniform transport velocity. The temporal component of the prediction is computed by adding a fixed, typically experimentally determined delay to the observational time of a particle. It is further assumed that there is no velocity perpendicular to the transport direction. As a result, the valves that need to be opened match the particle's lateral position as seen by the camera.
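The line-scan baseline can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the valve pitch, the number of valves and the delay value are hypothetical and not taken from the sorting system.

```python
def line_scan_prediction(t_obs, y_obs_mm, delay_s, valve_pitch_mm, n_valves):
    """Return (activation time, valve index) for one particle seen by a line-scan camera."""
    # Uniform transport velocity: the arrival time is the observation time plus a fixed delay.
    t_act = t_obs + delay_s
    # No lateral motion: the valve is the one below the observed lateral position.
    idx = int(y_obs_mm // valve_pitch_mm)
    # Clamp to the valve array in case of positions outside the belt width.
    idx = max(0, min(n_valves - 1, idx))
    return t_act, idx


# Hypothetical example: 28 valves with 5 mm pitch, 50 ms delay.
t_act, valve = line_scan_prediction(t_obs=1.0, y_obs_mm=12.5, delay_s=0.05,
                                    valve_pitch_mm=5.0, n_valves=28)
```

Any particle whose real velocity deviates from the assumed one arrives earlier or later than `t_act`, which is the error source the tracking-based models aim to remove.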
Second, we look at the strategy first put forth in ref. [9] and experimentally supported in ref. [6]. Particles contained in the material stream are observed at various points in time and tracked using a multiobject tracking system and a high-speed area-scan camera. This allows for the individual determination of motion parameters for each particle, such as the velocity parallel to and perpendicular to the transport direction. These variables are used to precisely estimate the optimal separation control signal in conjunction with a motion model. The method focuses on applying Kalman filters to the particle centroid for predictive tracking. Constant velocity (CV) and other linear, physically-motivated models are used for this purpose.
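As a minimal sketch of the prediction step under a constant velocity model, the following Python function extrapolates a state estimate to the separation bar. It assumes that position and velocity have already been estimated by the Kalman filter, which is omitted here. The function name, the coordinate convention (x in transport direction, y lateral) and the example values are hypothetical.

```python
def cv_predict_arrival(t, x, y, vx, vy, x_bar):
    """Extrapolate a constant-velocity state to the separation bar at position x_bar.

    Returns (arrival time, lateral position at arrival).
    """
    if vx <= 0.0:
        raise ValueError("particle must move towards the separation bar")
    dt = (x_bar - x) / vx          # remaining travel time in transport direction
    return t + dt, y + vy * dt     # lateral position drifts with vy during dt


# Hypothetical state estimate in SI units (s, m, m/s).
t_arr, y_arr = cv_predict_arrival(t=0.0, x=0.1, y=0.05, vx=1.0, vy=0.02, x_bar=0.2)
```

In contrast to the line-scan baseline, both the delay and the lateral position now depend on the individually estimated velocity of each particle.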
Third, we experimentally validate the novel data-driven approach introduced in this paper. The proposed approach uses a multilayer perceptron with four hidden layers as a predictor, each consisting of 16 neurons. A detailed description of the architecture is provided in ref. [8]. As an input, the model takes the last five captured position measurements of each particle and it directly outputs the control signal for separation, i.e., the estimated arrival time and location of the particle at the separation bar. This distinguishes our new approach from the original predictive tracking algorithm, which relies on the estimated positions and velocities from the underlying Kalman filter for this purpose. However, the input, i.e., the measurements, are obtained by precisely the same multiobject tracker as in the original predictive tracking setup.
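To make the size of such a predictor concrete, the following pure-Python sketch builds a multilayer perceptron with four hidden layers of 16 neurons and runs a forward pass. Several details are assumptions and not taken from the paper or ref. [8]: each of the five measurements is treated as an (x, y) pair, giving ten inputs; ReLU is used as hidden activation; and the weights are random placeholders rather than trained values.

```python
import random

# 5 positions * (x, y) in, four hidden layers of 16 neurons, (arrival time, location) out.
LAYER_SIZES = [10, 16, 16, 16, 16, 2]


def init_params(sizes, seed=0):
    """Random placeholder weights and zero biases for each layer."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        params.append((w, b))
    return params


def forward(params, x):
    """Forward pass: affine map per layer, ReLU on hidden layers (assumption)."""
    for k, (w, b) in enumerate(params):
        x = [sum(wij * xj for wij, xj in zip(row, x)) + bi for row, bi in zip(w, b)]
        if k < len(params) - 1:
            x = [max(0.0, v) for v in x]
    return x


def count_params(params):
    """Total number of weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)


params = init_params(LAYER_SIZES)
# Hypothetical track: five (x, y) measurements flattened into one input vector.
track = [0.10, 0.050, 0.11, 0.050, 0.12, 0.051, 0.13, 0.051, 0.14, 0.052]
prediction = forward(params, track)
n_params = count_params(params)
```

Under these assumptions the network has 1026 weights and biases, which gives a sense of how compact the predictor is.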

Test methodology and experimental results
In the following, results for both the simulation study and the experimental setting are presented. In both cases, the true negative rate (TNR) and true positive rate (TPR) were determined as performance indicators for the sorting quality. In this context, the TNR denotes the proportion of residue material that has been successfully removed. The TPR refers to the proportion of product material that has successfully been accepted, i.e., not been removed.
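The two indicators follow directly from their definitions, as the short Python sketch below shows. The function name and the example quantities are hypothetical, and the sketch does not fix whether the amounts are particle counts or masses.

```python
def sorting_rates(residue_removed, residue_total, product_accepted, product_total):
    """Return (TNR, TPR) as defined in the text.

    TNR: share of residue material that was removed.
    TPR: share of product material that was accepted, i.e., not removed.
    """
    tnr = residue_removed / residue_total
    tpr = product_accepted / product_total
    return tnr, tpr


# Hypothetical batch: 45 of 50 residue units removed, 140 of 150 product units accepted.
tnr, tpr = sorting_rates(45, 50, 140, 150)
```

A perfect sorter reaches 1.0 for both values. Opening more valves than necessary typically raises the TNR at the cost of the TPR.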

Experimental validation
We carry out sorting experiments with the materials and methods discussed in Section 2. One experiment is equivalent to batch sorting 200 g of the material. Three different mixing ratios are investigated in addition to the three prediction models that are discussed in Section 2.3. More specifically, we take into account brick ratios of 10 %, 25 % and 50 %. We perform tests using mass flows of 10 g s⁻¹ and 20 g s⁻¹.

Model training and deployment
A data set of particle tracks, i.e., measured positions, recorded on the lab-scale sorting system described in Section 2.1 was used to train a multilayer perceptron. The tracks were obtained from an earlier offline run of the multiobject tracking algorithm. The multilayer perceptron was trained on just one specification, a mass flow of 20 g s⁻¹ with a ratio of brick of 25 %, where we used the tracks of both brick and sand-lime brick for training. This specification corresponds to the highest mass flow considered in our experiments, which has the advantage that the number of tracks, and thus training examples, is sufficiently high. In contrast, we test the novel approach on several mass flows and mixing ratios, which allows for testing the generalization capability of the approach. A frame rate of 100 Hz was used to capture the images. The belt moved at a speed of about 1 m s⁻¹.
Since the camera does not record the scene at the separation bar, and the temporal resolution is limited, the ground truth for the arrival time and location of the particle was created using the idea of a "virtual separation bar" (see [8,9]). Here, only the images from the area-scan camera are used for training. The prediction is made with respect to a specific row of pixels in the camera image corresponding to the virtual separation bar, and the tracking phase is accordingly shortened. In addition, the coordinate system of the recorded measurements is shifted, i.e., we add a constant offset to the coordinates describing the position in transport direction so that the virtual and the actual separation bar match. In other words, we move the image so that both separation bars coincide, and act as if the image now depicts the scene as it would be captured in the vicinity of the real separation bar. The ground truth is then determined by linearly interpolating between the final measurement taken before and the first measurement taken after the particle crosses the virtual separation bar. When deployed, the trained network is used with non-shifted measurements and the initial configuration. This concept has the advantage of not requiring additional sensors and allowing the network to be trained in an unsupervised manner without incurring additional costs for manually labeling the ground truth. However, it introduces some errors due to interpolation and the assumption of similar particle motion on the belt and in the flight phase.
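The interpolation step for the virtual separation bar can be sketched as follows. This is a minimal Python illustration, assuming that a track is a time-ordered list of (t, x, y) tuples with x in transport direction. The function names and the example track are hypothetical.

```python
def interpolate_crossing(p_before, p_after, x_bar):
    """Linearly interpolate arrival time and lateral position at x = x_bar."""
    (t0, x0, y0), (t1, x1, y1) = p_before, p_after
    a = (x_bar - x0) / (x1 - x0)   # fraction of the step at which the bar is crossed
    return t0 + a * (t1 - t0), y0 + a * (y1 - y0)


def ground_truth_from_track(track, x_bar):
    """Find the last measurement before and the first after the bar, then interpolate.

    Returns None if the track never crosses the bar.
    """
    for before, after in zip(track, track[1:]):
        if before[1] < x_bar <= after[1]:
            return interpolate_crossing(before, after, x_bar)
    return None


# Hypothetical track sampled at 100 Hz, particle moving at about 1 m/s.
track = [(0.00, 0.085, 0.050), (0.01, 0.095, 0.051), (0.02, 0.105, 0.053)]
label = ground_truth_from_track(track, x_bar=0.100)
```

Here the bar at x = 0.100 m lies halfway between the second and third measurement, so the label is roughly t = 0.015 s and y = 0.052 m. The measurements before the crossing would form the network input, and the interpolated pair its training target.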
Early tests showed that, in addition to training the developed model on the basis of the generated image sequences, it is beneficial to include knowledge about the system structure in the implementation, see Figure 4. More precisely, parameters relating to the separation, such as the distance between the camera observation area and the separation bar, were taken into account. To compensate for errors potentially arising due to measurement inaccuracies, parameters for manual configuration of an offset, e.g., with regard to the distance, were implemented.
To use the model under the real-time conditions present in sensor-based sorting, we implement an inference engine in C++ using TensorRT from NVIDIA. This way, inference can be executed on dedicated NVIDIA graphics cards. However, the model has to be converted to be compatible with the framework. We use the ONNX format for this purpose. The overall setting for deployment is depicted in Figure 5.

Experimental results
The experimental results are presented in Figure 6. Each marker represents the result of an individual experiment. The preliminary results demonstrate that the novel system, while not outperforming a highly optimized Kalman filter-based one, achieves results that are comparable to those of the latter. However, given the early stage of development and the potential for improving performance, such as through more carefully selected training data, we believe it to be a promising area for future research. As a result of the novel approach's data-driven nature, it is possible to avoid tedious manual tuning of the motion model's parameters. Instead, these parameters can be learned from examples that are provided.

Simulation study
Analogously to the sorting experiments, simulations of the sorting process are carried out. Identical scenarios are used in terms of mass flows, material proportions and prediction models.

Model training and deployment
A data set of particle tracks was created using the DEM-CFD simulation model, covering a time equivalent of 60 s. As with the experimental data, the data set was recorded at a simulated frame rate of 100 Hz. In contrast to the data obtained on the real sorting system, the identity of each measurement is known, i.e., it is known a priori which particles generated the measurements. Furthermore, there is no noise (stemming, e.g., from a sensor) on the measured positions and no image processing is required to obtain the centers of the particles. Again, the MLP was trained with only one specification, a mass flow of 20 g s⁻¹ and a ratio of bricks of 50 %, and tracks of both particle classes were used for training.
To obtain a ground truth for model training, in contrast to the experimental setup, we observe the separation bar, i.e., we are provided with position measurements recorded in the vicinity of the separation bar. However, we again assume that these measurements are only available at a frame rate of 100 Hz. Thus, we again apply an interpolation as described in Section 3.1.1 to obtain a precise ground truth.
To deploy the trained model, the simulation exchanges data with the predictive tracking algorithm that includes the MLP. Note that, although known to the simulation, we do not directly use the positions of each simulated particle as input to the MLP. Rather, we also apply a multiobject tracking on the measurements provided by the DEM-CFD simulation for a fair comparison between simulation and experiment. Analogous to the experiments, we have to take into account the geometric dimensions, such as the distance between the camera observation area and the separation bar.

Simulation results
The results of the simulated sorting scenarios are shown in Figure 7. For the true negative rate, all prediction models perform with similar accuracy, with rates at or near 100 %. The tracking with MLP slightly outperforms the Kalman filter-based tracking approach. In the true positive rate, a trend of declining accuracy with increasing ratio of reject material can be observed. Again, all three prediction models yield nearly equal results.

Result discussion
In Table 1, the results of the sorting experiments are compared with those of the simulated sorting experiments. The TNR and TPR are listed for each prediction model in every scenario, i.e., mass flow and proportion of reject and accept material. In the first scenario, a gap of up to 13.8 % in the TNR is observed between experiment and simulation. In the simulations, all particles were correctly sorted out. There may be several reasons for the deviation. First, an exact detection of particle centers is applied in the simulations. Second, the transient build-up of the air jet is not considered. Thus, the jet is fully developed as soon as the nozzle is activated by the prediction model. Third, there are fewer sources of scatter in the particle motion in the simulation, such as irregularities in the feeding process. The trend of higher TNR in the simulations is pronounced in all scenarios. Interestingly, the line-scan prediction model outperforms both tracking algorithms in the simulations, while the Kalman filter-based tracking yields the highest TNR in the experiments. As far as the TPR is concerned, both experiment and simulation show a tendency of declining TPR with an increasing ratio of reject material. This is due to the higher frequency of nozzle activations, as shown in ref. [18]. Similarly to the TNR, the overall level of the TPR is higher in the simulations.
In general, the new predictive tracking approach based on neural networks has shown its potential to yield highly accurate sorting results. It outperforms the line-scan model in the experiments and is nearly as accurate as both comparison models in the simulations.

Conclusions
In this paper, we introduced a novel neural network-based predictive tracking system for application in sensor-based sorting. An advantage of the novel system compared to approaches presented so far is that tiresome manual tuning of parameters of the motion model is avoided and thus no expert knowledge for describing the particle motion is required. We validated the approach using both numerical simulation and sorting experiments on a laboratory-scale sorting system that was equipped with an area-scan camera. In both cases, we compared the performance to a line-scan-based system as well as a multiobject tracking system with physically-motivated motion models. With respect to the experimental results, it was shown that the novel system achieves results comparable to a highly optimized Kalman filter-based one, although it does not outperform it. When comparing results obtained via numerical simulation and experiments, it was shown that although results do not match accurately with respect to absolute values, comparable trends were achieved. The full potential of our new approach was even more pronounced in the simulations. The achieved sorting results suggest choosing a Kalman filter-based approach over the novel one for maximum sorting efficiency. However, in the spirit of data-driven methods, the new approach makes it possible to set up the system by providing examples in terms of images recorded by the sorting system.
So far it remains unclear whether the novel approach is capable of also outperforming the Kalman filter-based one. Yet there are measures to be taken in order to potentially increase its performance. It is believed that especially selecting training data more carefully could contribute towards this goal. A second approach is to compensate for errors in ground truth generation and geometrical model mismatch. Furthermore, a system combining physically-motivated as well as machine learning-based models as described in ref. [8] is of great interest.
Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.