A 4D Trajectory Prediction Model Based on the BP Neural Network

Abstract
To solve the problem that traditional trajectory prediction methods cannot meet the requirements of high-precision, multi-dimensional and real-time prediction, a 4D trajectory prediction model based on the backpropagation (BP) neural network was studied. First, the hierarchical clustering algorithm and the k-means clustering algorithm were adopted to analyze the total flight time. Then, cubic spline interpolation was used to interpolate the flight position and extract the main trajectory features. The 4D trajectory prediction model was built on the BP neural network. It was trained on Automatic Dependent Surveillance-Broadcast (ADS-B) trajectories from Qingdao to Beijing and used to predict the flight trajectory at future moments. The model is evaluated with common error measures: maximum absolute error, mean absolute error and root mean square error. The predicted over-point time and altitude are also compared with the actual over-point time and altitude. The results indicate that the predicted 4D trajectory is close to the real flight data: the time error at the crossing point is no more than 1 min and the altitude error at the crossing point is no more than 50 m.


Introduction
4D trajectory prediction is one of the key technologies of air traffic management (ATM) [4]. It is of great significance for enhancing air traffic safety, accelerating air traffic flow and improving ATM efficiency. At present, Single European Sky ATM Research (SESAR) and the Next Generation Air Transportation System (NextGen) have proposed the concept of trajectory-based operation (TBO) and elevated TBO to a new level: Management by Trajectory [3,15]. At the same time, China's New Generation of Air Traffic Management System also proposed that "TBO is used to manage airspace and trajectory, and decision making in each time period is related to 4D trajectory." Accurate prediction and deduction analysis of the trajectory is therefore an essential component of the safe operation of air traffic in the future.
Traditional trajectory prediction methods are mostly based on aerodynamic models and hybrid estimation theory [12,17,20,22]. Based on a kinematics model, Kaneshige et al. proposed a trajectory prediction algorithm that can improve the reliability and robustness of trajectory operation; their simulation experiments showed that trajectory-based prediction can reduce energy consumption and improve the efficiency of flight operations [10]. Li et al. proposed a trajectory prediction method for general aviation aircraft that combines a basic aircraft flight model with data fusion to form a complete 4D trajectory [11]. Both approaches rely on an aircraft dynamics model, which has the disadvantages of too many parameters and low prediction accuracy. Other work is based on hybrid estimation theory. Liu and Hwang predicted the transition between flight states through a stochastic linear hybrid system and studied the key technology of NextGen 4D trajectory prediction and conflict detection [13]. Since flight mode changes are closely related to real-time status, Hwang and Seah proposed a state-dependent mode-switching update algorithm [7]. Maeder et al. introduced wind speed and wind direction information, proposed a state-dependent hybrid estimation algorithm and realized trajectory prediction [14].
In recent years, with the rise of air traffic "big data" research, the use of data mining for 4D trajectory prediction has become an emerging trajectory prediction technology. Wu and Pan proposed a linear regression statistical prediction model to address the large 4D trajectory prediction error of the traditional aircraft dynamics model [19]. Although this model is more accurate than the traditional aircraft dynamics model, it is still a linear model, and each flight must be modeled separately, so the data volume and required storage space are large. Through the analysis of historical radar trajectories, Chen et al. used a radial basis function neural network to construct the mapping relationship between the altitude, speed, approach flight distance and approach flight time of arriving aircraft, which provided a reference for the prediction of approach flight time but did not predict the flight position [2]. For target trajectory prediction in a hot spot area, Qian et al. proposed an air target trajectory prediction model based on the backpropagation (BP) neural network, through which the target trajectory was predicted in advance [16]. However, the model only considered the longitude and latitude of the trajectory, and did not take into account the over-point time and altitude of the trajectory.
In view of the fact that most of the existing methods predict the trajectory in two-dimensional space (horizontal plane), this paper proposes a method for accurately predicting the trajectory in 4D space. As shown in Figure 1, our method mainly includes three parts: trajectory feature extraction, BP neural network building and 4D trajectory real-time prediction.

Trajectory Feature Extraction Based on ADS-B Data Analysis
Automatic Dependent Surveillance-Broadcast (ADS-B) is an aircraft surveillance technology based on satellite positioning and ground/air data link communication. It automatically transmits the 4D position data (time, longitude, latitude and altitude) and aircraft identification information from airborne equipment to the ground through the ground-air data link [8]. By mining historical ADS-B data, valuable feature information can be extracted for trajectory prediction.

Analysis of ADS-B Trajectory Data
The data returned by ADS-B is the trajectory point information of each aircraft at a certain time during the whole flight [18]. The data sampling period of ADS-B is about 1 s. Therefore, the trajectory data of each aircraft is not continuous, but consists of a series of discrete trajectory points.
Suppose there is a historical trajectory set L containing N historical trajectories; then
$$L = \{L_1, L_2, \ldots, L_k, \ldots, L_N\},$$
where $L_k$ represents the kth trajectory in the set L.
Suppose each trajectory is composed of n trajectory points; then
$$L_k = \{m_1, m_2, \ldots, m_i, \ldots, m_n\},$$
where $m_i$ represents the ith trajectory point on the trajectory $L_k$. Each trajectory point consists of s attribute variables; then
$$m_i = \{m_{i1}, m_{i2}, \ldots, m_{ij}, \ldots, m_{is}\},$$
where $m_{ij}$ represents the jth attribute of the trajectory point $m_i$. Each ADS-B trajectory point collected in this paper contains the following attributes: $m_i$ = {time, ICAO address code, flight number, longitude, latitude, altitude, velocity, heading, vertical speed}.

Trajectory Feature Extraction
The trajectory feature of the aircraft at time t is expressed as X(t) = {t, lon, lat, alt, vel}, where t, lon, lat, alt and vel represent time, longitude, latitude, altitude and velocity, respectively. The features of the sequence of points together describe the trajectory of the aircraft. The time information is extracted by clustering methods. The position and velocity information of the aircraft is obtained by cubic spline interpolation. The resulting trajectory features are used as the input data of the trajectory prediction model.
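The relation between a raw ADS-B point and the feature X(t) can be sketched as a small data structure. This is only an illustration: the class name, field names, units and sample values are assumptions made for the example and are not taken from the data set used in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrajectoryPoint:
    """One ADS-B trajectory point m_i with the nine attributes listed above.

    Field names and units are illustrative assumptions.
    """
    time: float            # seconds since the start of the flight (assumed)
    icao_address: str      # ICAO address code
    flight_number: str
    longitude: float       # degrees
    latitude: float        # degrees
    altitude: float
    velocity: float
    heading: float
    vertical_speed: float

    def feature(self) -> tuple:
        """Return X(t) = (t, lon, lat, alt, vel)."""
        return (self.time, self.longitude, self.latitude,
                self.altitude, self.velocity)


# A trajectory L_k is an ordered list of points; all values here are invented.
point = TrajectoryPoint(12.0, "780A1B", "CCA1526", 120.37, 36.27,
                        900.0, 130.0, 315.0, 8.5)
trajectory = [point]
features = [p.feature() for p in trajectory]
```

Only five of the nine attributes enter the feature; the identification fields are needed only to group points into trajectories.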

Cluster Analysis of Total Flight Time
When an aircraft flies a particular route, the total flight time is generally fixed. However, due to weather, traffic regulation and incomplete reception of ADS-B trajectory data, the total flight time fluctuates within a certain range. Therefore, by analyzing the flight times of multiple flights, the central value around which the flight time on the air route from Qingdao to Beijing fluctuates is obtained.
In this paper, agglomerative hierarchical clustering and k-means clustering are adopted. The traditional Euclidean distance is used to measure the similarity of flight time [1]. Suppose the flight times of two trajectories are $t_i$ and $t_j$; then the Euclidean distance between them is
$$d(t_i, t_j) = \sqrt{(t_i - t_j)^2} = |t_i - t_j|.$$
Agglomerative hierarchical clustering with the single-linkage method is performed first, and a tree graph (dendrogram) is drawn to analyze the distribution of flight time, so as to facilitate the subsequent k-means clustering. The k-means algorithm proceeds as follows:
a) The number of clusters k is determined from the hierarchical clustering, and k flight times are selected as the initial cluster centroids.
b) The distance between each flight time and each cluster centroid is calculated, and the flight time is assigned to the cluster with the closest centroid.
c) The mean flight time of each cluster is taken as the new centroid of that cluster.
d) Steps (b) and (c) are repeated until the cluster centroids no longer change.
Thus, the unified running time $T_p$ is obtained by the hierarchical clustering algorithm and the k-means clustering algorithm.
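The two clustering stages can be sketched for one-dimensional flight times as follows. This is a minimal illustration, not the implementation used in this paper. It relies on the fact that, in one dimension, single-linkage clustering into k groups is equivalent to cutting the sorted values at the k − 1 widest gaps. Three choices are assumptions of the sketch: the k-means centroids are initialized with the group means, the largest cluster is averaged to obtain $T_p$, and the flight times are invented.

```python
def single_linkage_1d(times, k):
    """Split 1-D values into k single-linkage clusters.

    In one dimension this equals cutting the sorted values at the
    k - 1 widest gaps between neighbouring values.
    """
    ordered = sorted(times)
    gaps = sorted(range(1, len(ordered)),
                  key=lambda i: ordered[i] - ordered[i - 1], reverse=True)
    cuts = sorted(gaps[:k - 1])
    bounds = [0] + cuts + [len(ordered)]
    return [ordered[a:b] for a, b in zip(bounds, bounds[1:])]


def kmeans_1d(times, centroids, max_iter=100):
    """Assign each time to its nearest centroid, re-average, and repeat."""
    centroids = list(centroids)
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for t in times:
            nearest = min(range(len(centroids)),
                          key=lambda c: abs(t - centroids[c]))
            clusters[nearest].append(t)
        # An empty cluster keeps its previous centroid.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:   # centroids no longer change
            break
        centroids = updated
    return centroids, clusters


# Invented flight times in seconds, with one short and one long outlier.
flight_times = [2450, 2590, 2610, 2625, 2640, 2655, 3100]
groups = single_linkage_1d(flight_times, k=3)
initial = [sum(g) / len(g) for g in groups]
centroids, clusters = kmeans_1d(flight_times, initial)
main_cluster = max(clusters, key=len)
T_p = sum(main_cluster) / len(main_cluster)
```

With these invented values the short and long flights each form their own cluster, so they do not pull the estimate of $T_p$ away from the bulk of the flights.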

Time Normalization and Position Interpolation
In the cluster analysis above, we obtained the unified running time $T_p$ of flights from Qingdao to Beijing. Next, we need to normalize each historical flight to the time interval $[0, T_p]$. The normalization method [19] is as follows. Let $T_i$ denote the flying time of the ith trajectory and $M_{it}$ denote the position at time t. After normalizing to the time interval $[0, T_p]$, the time at which the aircraft is at position $M_{it}$ becomes $t'$, where
$$t' = \frac{T_p}{T_i}\, t.$$
Since trajectory points are lost during reception, a trajectory point is not guaranteed to be available at every moment, so the normalized flight data needs to be interpolated. We split the trajectory into its longitude, latitude, altitude and velocity dimensions and apply cubic spline interpolation [5,6] to each dimension separately. In this way, the historical flight positions at a common, uniform time interval T are obtained.
Through the above methods, the aircraft trajectory features are extracted, and these features are input into the neural network. The neural network is used to learn the running rules of the trajectory on the current route, so as to realize the accurate prediction of the trajectory.
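The normalization and resampling steps can be sketched in a few lines. The sketch assumes that the normalization is the linear rescaling t′ = t · T_p / T_i and that the spline uses natural boundary conditions (zero second derivative at both ends); the paper does not state the boundary condition, so this is one possible choice. The sample times, altitudes and the 20 s resampling step are invented.

```python
from bisect import bisect_right


def normalize_times(times, T_p):
    """Rescale the times of one trajectory linearly onto [0, T_p]."""
    t0, T_i = times[0], times[-1] - times[0]
    return [(t - t0) * T_p / T_i for t in times]


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    # Second derivatives M_0..M_n at the knots; natural ends give M_0 = M_n = 0.
    M = [0.0] * (n + 1)
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
    rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
           for i in range(1, n)]
    # Thomas algorithm for the tridiagonal system: forward elimination ...
    for j in range(1, n - 1):
        w = h[j] / diag[j - 1]
        diag[j] -= w * h[j]
        rhs[j] -= w * rhs[j - 1]
    # ... followed by back substitution.
    for j in range(n - 2, -1, -1):
        M[j + 1] = (rhs[j] - h[j + 1] * M[j + 2]) / diag[j]

    def evaluate(x):
        # Locate the interval [xs[i], xs[i + 1]] containing x (clamped at the ends).
        i = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        a, b = xs[i + 1] - x, x - xs[i]
        return ((M[i] * a ** 3 + M[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - M[i] * h[i] / 6.0) * a
                + (ys[i + 1] / h[i] - M[i + 1] * h[i] / 6.0) * b)

    return evaluate


# Invented example: a 2600 s flight with irregular samples, rescaled to T_p = 2620 s.
raw_t = [0.0, 500.0, 1300.0, 2000.0, 2600.0]
alt = [0.0, 6000.0, 9000.0, 7000.0, 500.0]
T_p = 2620.0
norm_t = normalize_times(raw_t, T_p)
spline = natural_cubic_spline(norm_t, alt)
step = 20.0  # resampling interval in seconds (illustrative)
grid = [k * step for k in range(int(T_p / step) + 1)]
alt_uniform = [spline(t) for t in grid]
```

The same two functions would be applied to longitude, latitude and velocity in turn, so that all dimensions share one uniform time grid.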

4D Trajectory Prediction Model Based on the BP Network
The aircraft trajectory data is a sequence ordered by time of occurrence. Therefore, we can regard the trajectory as a multivariate time series. The BP neural network is one of the most widely used and mature artificial neural networks [9,21]. It is a multi-layer feedforward network trained by the error backpropagation algorithm. Therefore, the BP neural network is used as the learning model of trajectory prediction.

Analysis of Model Parameters
The parameters of the neural network model affect the learning rate and prediction accuracy of the model, which is a crucial factor for the establishment of the model. Through the analysis of neural network parameters, the neural network structure is optimized to improve the accuracy of model prediction.

Selection of the Time Window
The selection of the time window determines the number of nodes in the input layer and the number of nodes in the output layer, and has a certain influence on the prediction accuracy of the model. That is, the model scale changes with the number of time points used for training. If the input dimension is small, the cumulative influence of the historical trajectory information cannot be fully considered, and the prediction accuracy is reduced; if the input dimension is large, the model becomes more complicated and tends to over-fit. Therefore, in this paper, we set the time window to 3, that is, the trajectory data at three consecutive moments is used to predict the trajectory at the next moment.
After the window is selected, starting from the first data of the time, longitude, latitude, altitude and velocity sequences, the trajectory point data at times t − 2, t − 1 and t is taken as a training sample, and the trajectory point at time t + 1 is used as its label. Then the time window is moved forward by one point, starting with the second data of the sequence, and the training samples and labels are selected in the same way until the last trajectory point is reached.
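The sliding-window construction can be sketched as follows. The function name is hypothetical, and the sketch assumes that each label keeps only time, longitude, latitude and altitude, which matches the four output nodes of the model described later in this paper.

```python
def make_samples(features, window=3):
    """Build (input, label) pairs by sliding a window over one trajectory.

    features: list of (t, lon, lat, alt, vel) tuples at equal time steps.
    Input:  `window` consecutive points flattened to window * 5 values.
    Label:  (t, lon, lat, alt) of the point that follows the window.
    """
    inputs, labels = [], []
    for start in range(len(features) - window):
        block = features[start:start + window]
        inputs.append([value for point in block for value in point])
        labels.append(list(features[start + window][:4]))
    return inputs, labels


# Invented trajectory with six equally spaced points.
features = [(float(i), 100.0 + i, 30.0 + i, 1000.0 + i, 200.0 + i)
            for i in range(6)]
inputs, labels = make_samples(features)
```

A trajectory with n points yields n − 3 samples, each with 15 input values.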

Selection of the Number of Hidden Layer Nodes
The number of input and output layer nodes of the model is determined by the length of the time window and the features of the training samples. In the training process, the number of hidden layer nodes has a great influence on the prediction accuracy of the model. If the number of nodes is too small, the neural network will under-fit. If the number of nodes is too large, the training time of the model will increase and over-fitting may occur. For the selection of the number of hidden layer nodes, there is an empirical formula as follows:
$$h = \sqrt{m + n} + a,$$
where h is the number of hidden layer nodes, m and n are the numbers of nodes of the input layer and the output layer, respectively, and a denotes a constant in [0, 10].
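As a worked example, assume the empirical rule takes the widely used form h = √(m + n) + a, and take m = 15 input nodes and n = 4 output nodes, the sizes used by the model in this paper:

```latex
\begin{aligned}
h &= \sqrt{m + n} + a, \qquad a \in [0, 10] \\
\sqrt{15 + 4} &\approx 4.36 \quad\Rightarrow\quad 4.36 \le h \le 14.36
\end{aligned}
```

Under this assumption the integer candidates run from 5 to 14, and the 14 hidden nodes used in the model lie at the upper end of that range.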

Structure of the 4D Trajectory Prediction Model
According to the above analysis of the model parameters and the construction of training samples and labels, a 4D trajectory prediction model based on the BP neural network is established. The model structure is shown in Figure 2. The model has three layers. The input layer x has 15 nodes, which receive the time, longitude, latitude, altitude and velocity at times t − 2, t − 1 and t. The hidden layer h has 14 nodes. The output layer y has four nodes, which correspond to the time, longitude, latitude and altitude predicted for time t + 1. $w_{ij}$ is the connection weight between input layer node i and hidden layer node j, $a_j$ is the offset from the input layer to the hidden layer, $v_{jk}$ is the connection weight between hidden layer node j and output layer node k, and $b_k$ is the offset from the hidden layer to the output layer.
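The training process of the model consists of the forward propagation of the trajectory data and the backpropagation of the error. The specific training process is as follows; the equations are written in the standard form for a three-layer BP network, assuming a hidden-layer activation function f and a linear output layer:
a) Initialize the connection weights.
b) Enter a training sample and compute the output of each node in the hidden layer:
$$h_j = f\Big(\sum_i w_{ij} x_i + a_j\Big)$$
c) Calculate the output of each node of the output layer:
$$y_k = \sum_j v_{jk} h_j + b_k$$
d) Calculate the output error and revise the value of $v_{jk}$:
$$e_k = t_k - y_k, \qquad v_{jk} \leftarrow v_{jk} + \lambda\, h_j e_k$$
where t is the true data of the trajectory sample, and λ is the learning rate.
e) Revise the value of $w_{ij}$:
$$w_{ij} \leftarrow w_{ij} + \lambda\, f'\Big(\sum_i w_{ij} x_i + a_j\Big)\, x_i \sum_k v_{jk} e_k$$
f) The offset $a_j$ is updated as follows:
$$a_j \leftarrow a_j + \lambda\, f'\Big(\sum_i w_{ij} x_i + a_j\Big) \sum_k v_{jk} e_k$$
g) The offset $b_k$ is updated as follows:
$$b_k \leftarrow b_k + \lambda\, e_k$$
h) If the conditions for the end of training are met, end the training process; otherwise, repeat steps (b) to (g).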
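Steps (a) to (h) can be sketched as a small three-layer network trained sample by sample. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the hidden layer uses a sigmoid activation, the output layer is linear, the loss is the squared error, and the weight initialization, class name and toy data are invented for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class BPNetwork:
    """Three-layer feedforward network trained sample by sample with BP."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.01, seed=0):
        rng = random.Random(seed)
        self.lr = lr
        # w[i][j]: input i -> hidden j; v[j][k]: hidden j -> output k.
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                  for _ in range(n_in)]
        self.v = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)]
                  for _ in range(n_hidden)]
        self.a = [0.0] * n_hidden   # hidden-layer offsets a_j
        self.b = [0.0] * n_out      # output-layer offsets b_k

    def forward(self, x):
        h = [sigmoid(sum(x[i] * self.w[i][j] for i in range(len(x))) + self.a[j])
             for j in range(len(self.a))]
        y = [sum(h[j] * self.v[j][k] for j in range(len(h))) + self.b[k]
             for k in range(len(self.b))]
        return h, y

    def train_step(self, x, t):
        """One forward pass and one gradient update; returns the sample loss."""
        h, y = self.forward(x)
        e = [t[k] - y[k] for k in range(len(y))]   # output error
        # Error propagated back to each hidden node, computed before v changes.
        back = [h[j] * (1.0 - h[j])
                * sum(self.v[j][k] * e[k] for k in range(len(e)))
                for j in range(len(h))]
        for j in range(len(h)):
            for k in range(len(e)):
                self.v[j][k] += self.lr * h[j] * e[k]
        for i in range(len(x)):
            for j in range(len(h)):
                self.w[i][j] += self.lr * back[j] * x[i]
        for j in range(len(h)):
            self.a[j] += self.lr * back[j]
        for k in range(len(e)):
            self.b[k] += self.lr * e[k]
        return 0.5 * sum(err * err for err in e)


# Toy data with the paper's layer sizes (15 inputs, 14 hidden nodes, 4 outputs).
rng = random.Random(1)
samples = []
for _ in range(20):
    x = [rng.random() for _ in range(15)]
    t = [sum(x[0:5]) / 5, sum(x[5:10]) / 5, sum(x[10:15]) / 5, sum(x) / 15]
    samples.append((x, t))

net = BPNetwork(15, 14, 4, lr=0.01)
first_epoch = sum(net.train_step(x, t) for x, t in samples)
for _ in range(200):
    last_epoch = sum(net.train_step(x, t) for x, t in samples)
```

In practice the inputs would be scaled to a common range before training, since time, position and altitude differ by orders of magnitude; the toy inputs above already lie in [0, 1].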

Experiments and Results Analysis
In this paper, the aircraft ADS-B trajectory from Qingdao International Airport (ZSQD) to Beijing Capital International Airport (ZBAA) is used for simulation experiments. Flight numbers involved are CCA1526, CCA1580, CBJ5568, CCA1576, CCA1570, CDG4651, CDG4653, CES5227, CES5195 and CES5193. The flight path is shown in Figure 3. Figure 4 shows the three-dimensional display of these trajectory data in the coordinates of longitude, latitude and altitude.

Cluster Simulation Results of Flying Time
First, agglomerative hierarchical clustering is applied to the trajectory flying times, and a clustering tree (dendrogram) is drawn to analyze the time distribution, as shown in Figure 5. Figure 5 suggests that the flying times are best divided into three categories. Next, k-means clustering is performed on the flying times with k = 3 clusters. The clustering results are shown in Figure 6. As can be seen from Figure 6, some points belong to a cluster with long flying times and some to a cluster with short flying times. Since these points have a great influence on the running time $T_p$ of the trajectory, they are eliminated, and the mean value of the remaining sample points is taken as the unified running time of the trajectory. The obtained $T_p$ is 2620 s.
Figure 7A is the interpolation of longitude. Figure 7B is the interpolation of latitude. Figure 7C is the interpolation of altitude. As can be seen from Figure 7, after cubic spline interpolation, the position points (longitude, latitude and altitude) of the flight trajectory become uniformly spaced, flight trajectory data with the same time interval are obtained, and the multi-dimensional flight trajectory features are extracted.

Results of 4D Trajectory Prediction
We set the target error of the network to $1 \times 10^{-5}$, the learning rate to λ = 0.01 and the maximum number of iterations to 1000. We select 140 sets of trajectories as training data (91,420 trajectory points), of which 5% is used for validation; the remaining 20 sets are used as test data (13,060 trajectory points). The test set is predicted with the determined optimal BP neural network structure and the trained model parameters. The prediction results are shown in Figures 8 and 9. It can be seen from Figures 8 and 9 that the predicted trajectory is very close to the actual trajectory. The 3D prediction of the trajectory is shown in Figures 8A and 9A, where the predicted trajectory follows the running curve of the real trajectory. Although the errors are slightly larger in the altitude dimension, the prediction still reflects the actual altitude of the trajectory. Figures 8B and 9B show the latitude and longitude prediction of the trajectory. The predicted latitude and longitude almost overlap with the expected values.
The prediction errors at specific trajectory points are discussed below. Five trajectory points are selected from the predicted trajectory of flight CDG4651, and the predicted over-point time, the predicted over-point altitude, the actual over-point time and the actual over-point altitude are compared in Table 1 (altitudes are converted from feet to meters, with 1 ft = 0.3048 m). As can be seen from Table 1, the time error of the model stays within 1 min, and the predicted altitude follows the trend of the actual trajectory, with a maximum error not exceeding 50 m, so the prediction accuracy is relatively high.
In this paper, common error measures, namely maximum absolute error (MAX), mean absolute error (MAE) and root mean square error (RMSE), are used to evaluate the trajectory prediction model. The smaller the values of the three indicators, the more accurately the trajectory prediction model describes the experimental data. The three error indicators are defined by the following formulas:
$$\mathrm{MAX} = \max_{1 \le i \le n} \big| f_i - \hat{f}_i \big|$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \big| f_i - \hat{f}_i \big|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \big( f_i - \hat{f}_i \big)^2}$$
where $f_i$ is the actual trajectory, $\hat{f}_i$ is the predicted trajectory and n is the number of trajectory points. The BP and support vector machine (SVM) models were evaluated on the same data set. The statistical results of their prediction errors for each single trajectory feature (time, longitude, latitude and altitude) are given in Table 2. As can be seen from Table 2, the prediction errors of the two models for a single trajectory feature are all within the acceptable range, but the prediction error of the BP model is smaller than that of the SVM, so its prediction accuracy is higher. This indicates that the 4D trajectory prediction model based on the BP neural network in this paper can meet the requirements of aircraft trajectory monitoring.
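The three error indicators can be computed directly from paired actual and predicted values, assuming the standard definitions of MAX, MAE and RMSE. The function names and the altitude values in the sketch below are invented for illustration; only the conversion factor is taken from the text.

```python
import math


def max_abs_error(actual, predicted):
    """Largest absolute difference between actual and predicted values."""
    return max(abs(f - p) for f, p in zip(actual, predicted))


def mean_abs_error(actual, predicted):
    """Average absolute difference."""
    return sum(abs(f - p) for f, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Square root of the mean squared difference."""
    return math.sqrt(sum((f - p) ** 2 for f, p in zip(actual, predicted))
                     / len(actual))


FEET_TO_METERS = 0.3048  # conversion factor quoted in the text

# Invented altitudes in feet, converted to meters before scoring.
actual = [a * FEET_TO_METERS for a in (10000.0, 20000.0, 30000.0, 25000.0)]
predicted = [p * FEET_TO_METERS for p in (10100.0, 19900.0, 30000.0, 25200.0)]

max_err = max_abs_error(actual, predicted)
mae = mean_abs_error(actual, predicted)
rms = rmse(actual, predicted)
```

For any data set MAE ≤ RMSE ≤ MAX holds, so a large gap between MAE and MAX points to a few isolated large errors rather than a uniformly poor fit.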
Finally, a quantitative comparison of the 4D trajectory prediction model in this paper, the statistical regression model in [19] and the BP neural network model in [16] is made. The results are compared from the aspects of prediction dimension, prediction timeliness, method linearity and prediction error index, as shown in Table 3.
As can be seen from Table 3, the 4D trajectory prediction model based on the BP neural network in this paper predicts the trajectory in 4D space, while the BP neural network model in [16] only predicts the longitude and latitude of the trajectory, without considering the over-point time and over-point altitude, so its prediction dimension is insufficient. This paper implements real-time prediction of the 4D trajectory. The statistical regression method in [19] fits the trajectory and cannot meet the requirements of real-time prediction. The model output in [16] is the regular trajectory of the target, which also fails to provide real-time trajectory prediction. Moreover, statistical regression is a linear method, whereas the trajectory has nonlinear characteristics. The nonlinear model of this paper can approximate the nonlinear mapping relationship of the trajectory and can therefore better predict the flight trend of the aircraft. In addition, our method reports prediction accuracy indicators for the model (MAX, MAE and RMSE), which are not included in [16,19]. Therefore, while ensuring real-time performance, the model in this paper can predict multi-dimensional trajectories with high precision.

Conclusions
In this paper, a 4D trajectory prediction model based on the BP neural network is studied, which overcomes the limitations of existing trajectory prediction methods and can accurately predict the aircraft trajectory in 4D space. Our trajectory prediction method can effectively learn and identify trajectory features such as the time, longitude, latitude and altitude of the aircraft, so that the 4D trajectory of the aircraft can be predicted accurately and in a timely manner. It provides a basis for grasping air traffic flow and for ATM decision making, and has theoretical and practical significance. Since the predicted results are also affected by external factors such as weather and traffic regulation, future work will improve the model by adding weather information, such as wind speed, and regulatory factors.