Passenger demand forecasting for railway systems

Abstract: The rapid increase in population and in the number of motor vehicles has created today's transportation problem. It has pushed operators to determine vehicle headways during the day, taking passenger demand into account, in order to minimize passenger waiting times at stops and increase passenger satisfaction. During the current pandemic period (COVID-19), passenger demand forecasting has become even more significant, because knowing the number of passengers in advance allows measures to be taken and headways to be planned so that social distance can be maintained. This study considers the significance of demand forecasting in the railway sector and tackles the issue in two stages, on a line basis and on a station basis, which distinguishes it from other studies. In the first stage, passenger demand is forecast on a line basis with statistical techniques, namely regression analysis and the simple average method, and the mean absolute percentage error values are calculated and compared. Regression analysis is conducted with the SPSS Statistics 21.0 programme. In the second stage, passenger demand is forecast on a station basis with an artificial neural network and machine learning (ML) algorithms, and the error values (mean absolute error, BIAS, mean squared error, mean absolute percentage error, and root mean squared error) are compared. The results show that the simple average is the best demand forecasting method on a line basis, while the most successful and reliable station-based forecasts are obtained with the decision tree, one of the ML algorithms.


Introduction
Urbanization, which started with industrialization in the world, has brought many problems with it. Transportation is one of these problems. Public transportation has come to the fore in developed countries in order to solve the problem of urban planning and traffic congestion and to realize efficient passenger transportation [1][2][3][4]. Public transportation types are supported and encouraged to prevent transportation problems. At this point, reliable, fast, and convenient urban rail systems come to the fore for public transportation [5].
Effective management of rail systems, which offer an effective solution to the transportation problem in increasingly crowded big cities, is of great importance for operational efficiency and passenger service satisfaction [6].
Regression models are powerful tools for characterizing the relationship between demand and other important factors, but they require complicated modeling techniques and large amounts of data to produce acceptable results. Expert system models consist of demand forecasting rules built on the knowledge of a human expert, but it is extremely difficult to translate an expert's knowledge into mathematical rules [7,8].
Demand forecasting is important for correctly planning the supply of service needed to meet passenger demand, and it plays an important role in decision making and planning. In recent years, artificial neural networks (ANNs) have often been preferred because they can model both linear and nonlinear functions. The biggest advantage of ANNs over other prediction models is their ability to learn from and work with incomplete and imperfect data. ANNs are used in many areas such as banking, economy, energy demand, tourism forecast modeling, supply chain, and transportation [9][10][11]. Short-term forecasting is the key to the success of transportation operations planning such as train timetabling and operator allocation [8].
Machine learning (ML) is a subset of artificial intelligence, where the ML algorithm acts or performs the task without being explicitly programmed. The machine can learn automatically from the past raw data to generate predictive models based on predesigned algorithms. In general, there are two types of learning algorithms: supervised and unsupervised learning. Supervised ML algorithms learn from labeled data: input and output. The algorithm is responsible for finding the relationship between the input and the output and stops learning when it achieves an acceptable performance level [12].
The literature review shows that past studies mainly deal with different stages of rail transportation systems. In this study, the passenger demand of the future period is estimated by using the previous period data of the Yenikapı-Kirazlı metro line. Unlike the studies in the literature, this study includes two stages for demand forecasting. In the first stage, a line-based demand forecast is made for weekdays and weekends, while in the second stage, the estimation is made on a station basis. Making the study both line based and station based contributes to the literature. Considering the critical importance of passenger demand forecasting, demand is estimated using ANNs and ML algorithms on a station basis, and statistical techniques, namely regression analysis and the simple average method, on a line basis. The ML algorithms used in building the prediction models are decision tree, linear regression, random forest (RF), polynomial regression, and support vector regression (SVR), as well as an ensemble model of these. Finally, the performance of the forecasts is evaluated using the mean absolute error (MAE), BIAS, mean squared error (MSE), mean absolute percentage error (MAPE), and root mean squared error (RMSE).
To the best knowledge of the authors, this study is one of the first attempts to make use of both line-based and station-based forecasting simultaneously.
The article is organized as follows: Section 2 reviews the literature of passenger demand forecasting. Section 3 explains the materials and methods used in the study. Section 4 describes the structure and mathematical formulation of the proposed system. Section 5 compares the predictive performance between the proposed approaches. Finally, Section 6 gives the conclusion of the study.

Literature review
A travel demand model, also known as a travel demand forecasting model, aims to establish a spatial distribution of trips and estimate the travel behavior and travel demand for a specific time frame based on certain assumptions. Travel demand forecasting is an attempt to predict the future travel pattern and to quantify it [13]. Knowing the number of passengers who use public transportation vehicles is very important for planning. A review of the literature shows that ANNs, deep learning, ML, and statistical methods are used to make this prediction.
Few studies exist in the literature for rail transportation systems, and the existing ones use SVR, fuzzy, and ANN algorithms. Zhao and Mi [14] proposed a novel hybrid model, specifically the singular spectrum analysis-wavelet packet decomposition-SVR model, for short-term high-speed railway passenger demand forecasting that explicitly considers the relevance of neighboring time data. Li and Sheng [15] proposed empirical models for forecasting the market share of air and high-speed rail integration service using multinomial logit-based discrete choice models. A stated preference survey was conducted to estimate the parameters in the proposed models. Dou et al. [16] applied fuzzy set theory, portfolio optimization, and train operation adjustment theory. First, a fuzzy passenger demand forecasting model is established to predict passenger numbers during holidays. The results show that the fuzzy forecasts are more accurate than those of the autoregressive integrated moving average (ARIMA) model. Then, the authors study train deployment under sudden large passenger flows, considering the total operation cost of train dispatching, unserved passenger volume, the required number of trains, station capacity, section capacity, and train set configuration. Lastly, a train dispatching optimization model is established, and its validity is illustrated with a case study. Çelebi et al. [8] adopted neural networks to develop short-term passenger demand forecasting models to be used in the operational management of light rail services. A multi-layer perceptron model is preferred owing to both its simple architecture and its proven success in solving approximation problems. To eliminate the significant seasonality in time slots, each time slot is handled independently of the others, and an ANN based on daily data is developed for each.
Jin et al. [17] developed an approach for short-term air passenger forecasting on the basis of the variational mode decomposition, autoregressive moving average model (ARMA), and kernel extreme learning machine. Gong [18] formulated the "intercity passenger demand forecast problem." To solve the problem, an algorithm termed ARMA-GRNN was proposed. Cyril et al. [13] reported forecasts of public transport demand from Trivandrum district to five other districts in Kerala using the ARIMA model. Kim and Shin [19] developed a model for forecasting short-term air passenger demand using big data from search engine queries. To ensure that it had some predictive ability, time shifts ranging from 0 to 11 months, at 1-month intervals, were used to develop the forecasting model.
Ke et al. [20] explored short-term passenger demand forecasting for an on-demand ride service platform via a novel spatio-temporal deep learning (DL) approach. Accurate real-time passenger demand forecasting can provide suggestions for the platform to rebalance the spatial distribution of cruising cars to meet passenger demand in each region, which will improve the car utilization rate and passengers' degree of satisfaction. Li et al. [21] aimed to predict the passenger demand under hybrid ridesharing service modes. Fuloria [22] compared exponential smoothing, multiple regression, and LSTM for forecasting accuracy in training as well as validation datasets. Bai et al. [23] proposed a novel deep learning framework for multi-step citywide passenger demand forecasting, formulated the citywide passenger demand on a graph, and employed a hierarchical graph convolution architecture to extract spatial and temporal correlations simultaneously.
Picano et al. [24] addressed the passenger demand forecasting problem using real data from Didi Chuxing, one of the most famous transportation network companies (TNCs) in China. In order to forecast the future behavior of passenger demand, the authors proposed a chaos theory (CT) approach to deal with the corresponding nonlinear scalar time series. Picano et al. [25] addressed the problem of predicting service requests for TNCs. In particular, different algorithms for different real datasets were presented. The predictive methods designed for the three analyzed datasets are based on CT principles: the corresponding phase space is reconstructed and the chaotic behavior is studied through the analysis of the largest Lyapunov exponent. Furthermore, a different CT-based algorithm is proposed for each of the datasets studied. Alekseev and Seixas [26] developed ANN-based models for air transport passenger demand forecasting. They found that neural processing can outperform the traditional econometric approach used in this field and can accurately generalize the learnt time series behavior, even in practical conditions where only a small number of data points is available. Table 1 summarizes the methods in the literature. It can be seen that these studies are line based. Unlike them, this study is both station based and line based, which distinguishes it.

Materials and methods
This section briefly introduces the demand forecasting methods used: statistical analysis techniques, such as regression analysis and the simple average method, ANNs, and ML algorithms.

Regression analysis technique
Unlike time series models, which create demand forecasts for future periods using past demand data, regression analysis is a statistical forecasting method that uses the relationship between variables [27]. The method assumes that there are two variables and that the relationship between them is linear [28]. Bivariate regression analysis formulates the equation of the line representing the linear relationship between the dependent and independent variables [29]. The formula used in the method is given in equation (1):

$$Y_i = a + b X_i \qquad (1)$$

where $Y_i$ = dependent variable, $a$ = initial value (intercept) of the regression line, $b$ = slope of the regression line, and $X_i$ = independent variable [31].
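As an illustration of how a line with intercept a and slope b can be fitted, the following minimal Python sketch applies ordinary least squares to paired observations. It is not the SPSS procedure used in the study; the function names `fit_line` and `predict` and the sample numbers are hypothetical and chosen only for demonstration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for Y = a + b*X; returns (a, b)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of squared deviations of X and cross-deviations of X and Y.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X values must not all be identical")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope of the regression line
    a = y_mean - b * x_mean    # intercept of the regression line
    return a, b


def predict(a, b, x):
    """Evaluate the fitted line at a new X value."""
    return a + b * x


# Made-up illustrative data (not taken from the study).
weeks = [1, 2, 3, 4, 5]
passengers = [100.0, 102.0, 104.0, 106.0, 108.0]
a, b = fit_line(weeks, passengers)
next_week = predict(a, b, 6)
```

With these made-up numbers the points lie exactly on a line, so the fit recovers an intercept of 98 and a slope of 2; real passenger data would leave residual errors.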

Simple average method
The simple average method sums the data of all past periods and divides the total by the number of periods [32]. Its advantages are that it yields a smoothed forecast by using all periods and that it is easy to apply. The simple average method is given in equation (2) [33]:

$$F_{t+1} = \frac{1}{t} \sum_{i=1}^{t} Y_i \qquad (2)$$

where $t$ = period, $F_{t+1}$ = predicted value of the next period, and $Y_i$ = actual demand value in period $i$. When a new observation, $Y_{t+1}$, becomes available, this new value is added to equation (2) while creating a forecast for time $t + 2$, and equation (3) is obtained [33]:

$$F_{t+2} = \frac{1}{t+1} \sum_{i=1}^{t+1} Y_i \qquad (3)$$
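The simple average forecast and its update when a new observation arrives can be sketched in a few lines of Python. The incremental form in `update_forecast` is an algebraically equivalent rewriting of the full average that is assumed here for illustration; the function names and the demand values are hypothetical.

```python
def simple_average_forecast(history):
    """Forecast for period t+1 as the mean of the t observed values."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return sum(history) / len(history)


def update_forecast(previous_forecast, t, new_observation):
    """Forecast for period t+2 from the forecast for t+1 and the new value Y_{t+1}.

    previous_forecast is the mean of the first t observations, so
    t * previous_forecast recovers their sum without storing them.
    """
    return (t * previous_forecast + new_observation) / (t + 1)


# Made-up demand values for three past periods.
demand = [120.0, 130.0, 125.0]
f4 = simple_average_forecast(demand)            # forecast for period 4
f5 = update_forecast(f4, len(demand), 141.0)    # forecast for period 5
```

Both routes give the same number: recomputing the average over all four observations yields the same forecast as the incremental update.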

ANNs
ANNs are computer systems developed to automatically perform functions of the human brain such as learning, understanding, gaining experience, and deriving new information. ANNs [34,35] consist of three kinds of layers: an input layer, one or more hidden layers, and an output layer. These layers, which are connected in parallel, form a structure modeled on the nervous system of the human brain, and they generate models for the data presented to the network. Using these models, the network examines examples of an event, collects information and generalizes about that event, and then uses what it has learned from the examples to make decisions in situations it encounters later. The connections between the layers determine the function of the network. By adjusting the weight values of the connected layers, the network is trained to realize a certain function. Thus, an output is produced for an input in the

ML techniques
ML techniques are able to learn patterns and solve complex problems by processing (very often) large databases. Probably the most classical ML approach is the ANN paradigm [36].
(a) Decision tree: A decision tree is a tree whose internal nodes can be taken as tests (on input data patterns) and whose leaf nodes can be taken as categories (of these patterns). An input pattern is filtered down through these tests to reach the right output for that pattern. Decision tree algorithms can be applied in many different fields [37].
(b) RF regression: RF is an ensemble learning method for classification and regression, suitable for handling problems involving grouping of data into classes. The algorithm was developed by Breiman and Cutler [38,39].
(c) Linear regression: Linear regression is the most common predictive model for identifying the relationship among variables. Whether the data are univariate or multivariate, the underlying concept is linear [40][41][42].
(d) Polynomial regression: Polynomial regression is a regression algorithm that models the relationship between a dependent variable (y) and an independent variable (x) as an nth-degree polynomial [43].
(e) SVR: The support vector machine (SVM) is one of the supervised learning models for classification and regression [9,10]. SVM applied to regression is specifically called SVR. SVM can be linear or non-linear depending on the kernel function used [40,44,45].

Evaluation of different methods
We use five different measures of forecast error to evaluate model performance and the accuracy of the methods: MAE, MSE, BIAS, MAPE [10,46,47,48], and RMSE. Assume $X_1, X_2, \ldots, X_n$ are the actual data and $F_1, F_2, \ldots, F_n$ are the forecasted data; then the $n$ forecast errors, $e_1, e_2, \ldots, e_n$, are given by $e_i = F_i - X_i$ for $i = 1, \ldots, n$.
(e) RMSE: It measures how much error there is between two data sets:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} e_i^2} \qquad (8)$$
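For readers who want to reproduce such comparisons, the sketch below computes the five measures from paired actual and forecast values, using the error convention e = F − X stated above. The formulas are the conventional textbook definitions and are assumed here, since the exact forms used in the study are not reproduced in this section; in particular, BIAS is taken as the mean signed error and MAPE is returned as a fraction (multiply by 100 for a percentage). The function name `evaluate` and the sample values are hypothetical.

```python
import math


def evaluate(actual, forecast):
    """Return MAE, BIAS, MSE, MAPE and RMSE for paired actual/forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equally long")
    errors = [f - x for x, f in zip(actual, forecast)]  # e_i = F_i - X_i
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "BIAS": sum(errors) / n,  # assumed: mean signed error
        "MSE": mse,
        # Assumed: fraction of the actual value; requires non-zero actual values.
        "MAPE": sum(abs(e) / abs(x) for e, x in zip(errors, actual)) / n,
        "RMSE": math.sqrt(mse),
    }


# Made-up values for illustration only.
scores = evaluate([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
```

In this example the over- and under-forecasts cancel, so BIAS is zero even though MAE and RMSE are not, which is why the measures are reported together.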

Proposed system
In this section, passenger demand forecast for 2020 has been made for Yenikapı M1-Kirazlı M1 line by using 2019 data. Demand forecasting is made by regression analysis technique, simple average method, ANNs, and ML algorithms. Demand forecasting in this study consists of two stages. First, line-based demand forecasting is made using the statistical techniques such as regression analysis technique and simple average method. In the second stage of the study, station-based daily forecasting is made for all stations on Yenikapı M1-Kirazlı M1 line using ANN and ML algorithms. The steps of the study are shown in Figure 1.

Results and discussions
This section presents the results and discusses the structure of the models.

Regression analysis technique
In this section, passenger demand forecasts for 2020 are made for the Yenikapı M1-Kirazlı M1 line by using 2019 data. Demand forecasting is carried out with the least squares method in the SPSS Statistics 21.0 programme. With this method, weekday and weekend passenger demand forecasts are made on a line basis. First, the test data are divided into two groups, weekdays and weekends. Then, the data are transferred to the SPSS Statistics 21.0 programme. According to the results of the regression analysis on the 2019 data, the model is approximately 95% successful for weekdays (Table 2) and 86% successful for weekends (Table 3). The coefficients tables give the regression coefficients and their significance levels to be used in the regression equation (Tables 4 and 5). When the coefficient values in the tables are transformed into a regression model, equation (9) is obtained. The procedure carried out for Monday was applied to all 7 days of the week.
When the X value is substituted into equation (9), the forecast values are obtained.
When the test values are evaluated with this equation, MAPE values of 0.01 for weekdays and 0.04 for weekends are obtained. Figures 2 and 3 compare the regression output with the actual test values for weekdays and weekends, respectively.

Simple average method
The simple average method takes the average of all passenger numbers in the previous period. The previous period data are divided into weekday and weekend groups, and the simple average method is then applied as defined in equations (2) and (3). The previous period data and the forecasts obtained by the simple average method are shown in Table 6. When the test values are evaluated with this method, MAPE values of 0.001 for weekdays and 0.002 for weekends are obtained. Figures 4 and 5 show the actual values and the simple average method output for weekdays and weekends, respectively.
The MAPE is calculated for the forecasts obtained by regression analysis and the simple average method as defined in equation (7). The MAPE values for both methods are given in Table 7. As the forecasting results show, the simple average method has been more successful than regression analysis in estimating passenger demand on a line basis.
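The line-based procedure described in this subsection (split the previous period into weekdays and weekends, average each group, then score the forecast with MAPE) can be sketched as follows. The daily counts are invented for illustration, the treatment of Saturday and Sunday as the weekend is an assumption, and the helper names are hypothetical.

```python
from datetime import date, timedelta


def split_by_day_type(daily_counts):
    """Split a {date: passenger count} mapping into weekday and weekend lists."""
    weekdays, weekends = [], []
    for day, count in sorted(daily_counts.items()):
        if day.weekday() >= 5:  # Monday is 0, so 5 and 6 are Saturday and Sunday
            weekends.append(count)
        else:
            weekdays.append(count)
    return weekdays, weekends


def simple_average(values):
    """Mean of all values in the group."""
    return sum(values) / len(values)


def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction."""
    return sum(abs(f - x) / abs(x) for x, f in zip(actual, forecast)) / len(actual)


# Made-up counts for 14 consecutive days starting on Monday, 7 January 2019.
start = date(2019, 1, 7)
counts = [1000, 1020, 980, 1010, 990, 600, 580,
          1005, 995, 1015, 985, 1000, 620, 600]
history = {start + timedelta(days=i): c for i, c in enumerate(counts)}

weekday_history, weekend_history = split_by_day_type(history)
weekday_forecast = simple_average(weekday_history)
weekend_forecast = simple_average(weekend_history)

# Made-up test values for two later weekdays.
test_weekdays = [990, 1010]
weekday_mape = mape(test_weekdays, [weekday_forecast] * len(test_weekdays))
```

Keeping the two day types separate prevents the lower weekend counts from pulling the weekday forecast down.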

ANN
In this stage of the study, test data are used for station-based forecasting. The training and test data for the ANN method are transferred to the Matlab software environment. To determine the number of hidden layers in the ANN structure, the number of neurons in these layers, the activation functions of the layers, the learning rate α, and the momentum coefficient, various combinations were tried by trial and error, and the parameters with the lowest error were selected. The most successful configuration is an ANN with an input layer of 5 neurons, a hidden layer of 20 neurons, and an output layer of 1 neuron. Levenberg-Marquardt was used as the training function. The ANN architecture with the lowest error is given in Figure 6. The most successful combination is compared with the actual data in Figure 7. The ANN generally gives results very close to the station-based test data for the total number of passengers.
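To make the 5-20-1 architecture concrete, the following pure-Python sketch builds a network of that shape and runs one forward pass. It is not the Matlab model of the study: the weights are random and untrained, the tanh hidden activation and linear output are assumptions, and training (for which the text names Levenberg-Marquardt) is not shown. All function names are hypothetical.

```python
import math
import random


def init_network(n_in=5, n_hidden=20, n_out=1, seed=0):
    """Create random weights and biases for an input-hidden-output network."""
    rng = random.Random(seed)

    def layer(rows, cols):
        return {
            "w": [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)],
            "b": [rng.uniform(-0.5, 0.5) for _ in range(rows)],
        }

    return [layer(n_hidden, n_in), layer(n_out, n_hidden)]


def forward(network, inputs):
    """Propagate one input vector through the hidden and output layers."""
    hidden_layer, output_layer = network
    # Assumed activation: tanh on the hidden layer.
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_layer["w"], hidden_layer["b"])
    ]
    # Assumed activation: linear (identity) on the output layer.
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(output_layer["w"], output_layer["b"])
    ]


def count_parameters(network):
    """Total number of trainable weights and biases."""
    return sum(len(l["b"]) + sum(len(row) for row in l["w"]) for l in network)


net = init_network()
output = forward(net, [0.1, 0.2, 0.3, 0.4, 0.5])
n_params = count_parameters(net)  # 5*20 + 20 + 20*1 + 1 = 141
```

The parameter count shows how small this network is: 141 adjustable values in total, which is one reason a second-order training method is practical for it.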

ML algorithms
In this section, passenger demand forecasts for 2020 are made for the Yenikapı M1-Kirazlı M1 line by using 2019 data, and the data are used for station-based forecasting. Decision tree, linear regression, polynomial regression, SVR, and RF are applied in Python in order to forecast the demand. While applying these ML algorithms, the "pandas" library of Python is used [49,50]. First, the actual data are converted to csv format. The csv file is then read for each station separately. The model is trained with the decision tree algorithm and the forecast is made. The same process is applied for the linear regression algorithm, and the forecasting data are obtained. For the RF algorithm, the model is trained with two parameters. The random_state parameter fixes the seed of the random number generator and thereby makes the output of the model reproducible: when a random_state value is specified, the same parameters produce the same results if the same training data are given. The n_estimators parameter indicates the number of decision trees to be created. The model trained with the RF algorithm then makes a prediction according to the data provided. For polynomial regression, the values in the training column are transformed with polynomial features before the model is trained. The degree parameter is the degree of the polynomial; increasing the degree lets the model fit the training data more closely, although too high a degree can lead to overfitting. The forecast is made after the model is trained with the polynomial regression algorithm. Finally, for SVR, the actual data are scaled, the model is trained, and the forecast is made according to the scaled data.
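The roles of the two RF parameters named in the text can be illustrated with a small, dependency-free Python sketch: several regression trees are fitted to bootstrap samples of the data and their predictions are averaged. This is a simplified stand-in written for illustration, not the library implementation used in the study; the parameter names `n_estimators` and `random_state` merely mirror those in the text, and all function names and data are hypothetical.

```python
import random
from statistics import mean


def build_tree(xs, ys, depth=0, max_depth=3, min_leaf=2):
    """Fit a small regression tree on one feature; leaves store the mean target."""
    if depth >= max_depth or len(set(xs)) == 1:
        return {"value": mean(ys)}
    best = None  # (sum of squared errors, threshold)
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold)
    if best is None:  # no split satisfies the minimum leaf size
        return {"value": mean(ys)}
    threshold = best[1]
    left = [(x, y) for x, y in zip(xs, ys) if x <= threshold]
    right = [(x, y) for x, y in zip(xs, ys) if x > threshold]
    return {
        "threshold": threshold,
        "left": build_tree([x for x, _ in left], [y for _, y in left],
                           depth + 1, max_depth, min_leaf),
        "right": build_tree([x for x, _ in right], [y for _, y in right],
                            depth + 1, max_depth, min_leaf),
    }


def predict_tree(tree, x):
    """Follow the threshold tests down to a leaf and return its value."""
    while "value" not in tree:
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree["value"]


def fit_forest(xs, ys, n_estimators=10, random_state=None):
    """Fit n_estimators trees, each on a bootstrap sample drawn with a seeded RNG."""
    rng = random.Random(random_state)
    n = len(xs)
    forest = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(build_tree([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Average the predictions of all trees."""
    return mean([predict_tree(tree, x) for tree in forest])


# Made-up data: day index versus passenger count.
days = list(range(1, 9))
demand = [10.0] * 4 + [50.0] * 4
single_tree = build_tree(days, demand)
forest_a = fit_forest(days, demand, n_estimators=5, random_state=42)
forest_b = fit_forest(days, demand, n_estimators=5, random_state=42)
same = predict_forest(forest_a, 3) == predict_forest(forest_b, 3)
```

Because both forests are built from the same seed, they draw identical bootstrap samples and `same` is true; leaving `random_state` unset would let repeated runs differ.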
The forecasts obtained by using the "pandas" library of Python are given in Table 9. Figure 9 compares the forecast results of the ML algorithms for the stations on the line. The comparison of the total number of passengers by station (actual data, ANN, and decision tree) is given in Figure 10.
The actual data for 2019 and the forecast values of the ANN and ML algorithms can be seen in Table 10. As can be seen in Figure 11, the decision tree, which is an ML algorithm, produced the best passenger demand forecast. After

Conclusion
In big cities like Istanbul, transportation is a huge problem. Public transportation can solve this problem, and rail transportation in particular is close to a remedy for it. The success of strategic and detailed planning of public transportation depends heavily on accurate demand information. Passenger demand forecasting is very important in railway transportation systems for accurate headway scheduling. If the passenger demand is estimated correctly, the headway is optimized, increasing passenger satisfaction and reducing operating costs. In this study, passenger demand is estimated with eight different methods in two stages. In the first stage, demand forecasting was carried out for the Yenikapı-Kirazlı line using the simple average method and regression analysis; the regression was performed with the SPSS Statistics 21.0 programme. The MAPE values of these two techniques were then compared, and the simple average method was observed to give more accurate results.
In the second stage of the study, station-based demand forecasts were made for all stations on the Yenikapı-Kirazlı line. For this forecasting, ANNs and the ML algorithms decision tree, linear regression, RF, polynomial regression, and SVR were used. The BIAS, MAE, MSE, MAPE, and RMSE error values were compared, and for this dataset, the lowest error rates were obtained by the decision tree (0.03% MAPE), RF (0.54%), and SVR (0.57%). As the results show, the most accurate forecasts are obtained by ML algorithms.
This study is important because it will provide input for future headway scheduling studies of railway transportation systems.
In addition, unlike the studies in the literature, this study includes two stages of demand forecasting. In the first stage, a line-based demand forecast is made for weekdays and weekends, while in the second stage, the estimation is made on a station basis. Making the study both line based and station based contributes to the literature: demand is forecast separately within a single study, taking into account both line density and station density. Thus, return stations can be determined according to the density of the stops, so that a train does not have to complete the entire line. Depending on the line density, costs can be saved by not running to the last station every time.
Furthermore, during the current pandemic period (COVID-19), passenger demand forecasting for public transportation has become more important, especially in metropolitan cities such as Istanbul. When the number of passengers demanding rail transportation is known in advance, social distance can be ensured, which is important for preventing infectious diseases and epidemics.
Funding information: There are no funding sources for this study.
Author contributions: Both authors have read and agreed to the published version of the manuscript.