Relevance- and Frequency-Enabled Trip Planning Model Based on Socio-economic Status

Anand Sesham 1 , P. Padmanabham 2 , A. Govardhan 3 , and Rajesh Kulkarni 4
  • 1 Department of Computer Science and Engineering, M.V.S.R Engineering College, Nadergul, Hyderabad 501510, India
  • 2 Department of Computer Science and Engineering, Bapuji Institute of Engineering and Technology, Jawaharlal Nehru Technological University, Hyderabad, India
  • 3 School of Information Technology and Executive Council Member, Jawaharlal Nehru Technological University, Hyderabad, India
  • 4 Department of Computer Engineering, JSPM, Narhe, Pune, India
Corresponding author: Anand Sesham

Abstract

Planning a trip depends not only on the traveling cost, time, and path, but also on the socio-economic status of the traveler. This paper introduces a new trip planning model that works on real-time data with multiple socio-economic constraints. The proposed model processes real-time data to extract the relevant socio-economic attributes and then mines the most frequent as well as the feasible attributes to plan the trip. The relevance of the socio-economic constraints is defined using correlations, whereas the frequent as well as the feasible attributes are mined through a sequential pattern mining approach. Real-time travel information on 38,303 trips was acquired from the Indian city of Hyderabad, and the proposed model was subjected to experimentation. The proposed model maintained a substantial trade-off between multiple performance metrics, although the trip mean model performed better statistically.

1 Introduction

Issues in transport and transport management emerge day by day [5]. Examples include serious interference between regionally distributed traffic and cross-border traffic, heavy traffic load, inadequate static traffic facilities, and severe conflicts that result from mixed person/vehicle travel [16]. These issues arise at urban traffic hubs, which carry heavy traffic and have a fully overlapping travel space filled with a variety of transportation means, including bicycles, public transport, and minibuses. The main cause of traffic and vehicle congestion is the progression in socio-economic factors [11]. Therefore, the intelligent transport management system [simply, the intelligent transport system (ITS)], which effectively controls the transport system, has become crucial in the urban sector [5, 16]. ITSs draw a high level of attention from the automotive industry, transportation professionals, and political decision makers around the globe [5].

Many countries and regions, in particular Europe, the US, and Japan [15], perform a wide variety of research tasks that focus on exploring and setting up ITSs [2]. Executing such tasks allows data communication technology, advanced information technology, electric sensor technology, computer processing technology, and electric control technology to be integrated and applied over the entire transport management system in an effective way, in addition to setting up a complete transport management system that is direct, precise, and effective [9, 14]. Although the methods presented in the literature determine the travel time reliability [1], the generation of realistic reliability measures in the output of traffic simulation models and planning models continues to be a serious issue. This paper addresses a key issue in ITS, namely that little information about socio-economic factors is exploited, and hence derives a simulation model to plan the trip based on such socio-economic factors. We first discuss the impact of socio-economic factors on ITS decisions; the remainder of the paper then reviews related work, formulates the problem, describes the methodology, and reports the results.

1.1 Socio-economic Factors Influencing ITS

With ITS, it is possible to enhance driver support, road transport, and mobility. In the coming years, investments in ITS are expected to rise rapidly. During the appraisal phases of ITS projects, assessments pertaining to technology, user acceptance, traffic, environment, and socio-economy should be considered. However, most ITS assessments address only a few of these factors, with little consideration of the socio-economic factors.

The socio-economic assessments play a lead role in government policy decisions. Extensive work on socio-economic assessment has been carried out, in the past and at present, while appropriate evaluation guidelines for ITS projects in the US and Europe were being formulated. Yet, information on how the impacts can be evaluated is missing. In addition, several of the benefits are difficult to measure or define in the required form. Numerous efforts have been made to disclose a wide variety of potential advantages, while placing less emphasis on cost. Further, comparing the results of two different projects is not easy because each project has its own guidelines and its own cost and benefit evaluation schemes [6].

2 Related Works

Zeng et al. [16] have stressed the need for an intelligent management system in sophisticated urban transport systems. The authors have exploited a number of technologies to design and create an intelligent transport management system for traffic hubs, where it is most vital. They have also addressed the problems associated with design, framework, and functional modules, so that the degree of exploitation, protection, and ease can be enhanced along with the government's decision-making ability.

Dong and Mahmassani [4] have developed an ITS through the inclusion of vehicular technology, in which they have considered the mobility and the speed of vehicles to model the breakdowns that occur frequently in traffic. The objective was to predict the change in travel time produced by such stochastic events. A breakdown was assumed to occur at varying flow limits with some probability and to persist for a random duration. Modeling at the microscopic level was achieved by considering variations in speed, wherein a leading vehicle initiates the breakdown and the following vehicles, which have correlated-distributed behavioral parameters, propagate it. The numerical results obtained using Monte Carlo simulation have revealed the effectiveness of the proposed stochastic modeling approach in reproducing realistic macroscopic traffic flow behavior and generating travel time distributions.

Korhonen et al. [7] have examined the state of telecommunication technology, service, and economy of ITS in the Helsinki Metropolitan Area Council (termed YTV). As a result, they have suggested three ways to develop a successful ITS: (i) acquire the networking services/technologies from a telecommunication operator; (ii) construct the networks individually; or (iii) produce a hybrid solution. The authors have recommended the use of timeline diagrams and investment sensitivity estimations for a successful decision-making process in the planned ITS.

Di Lecce and Amato [3] have introduced a flexible ITS that determines a suitable route for every vehicle so as to mitigate the impact of transporting hazardous materials. They have implemented a negotiation process between intelligent agents. The system was able to closely observe the route that every vehicle follows and to check whether it matches the previously specified route. A flexible on-board unit placed in the vehicles serves as the most important component of the ITS. It has a modular structure, to which several sensors can be connected as the application requires. A range of parameters, including the position, speed, load balance, and acceleration of the vehicles, were examined in detail. With this information, the system was able to detect characteristics such as the activities of the driver and risky operating conditions of the vehicle. The authors have put forth a multi-level, metaphor-based graphical user interface to represent this information. The first level of the interface offers a quickly understood view of the state under observation and is intended for the on-site first response and route monitoring operators. The second level gives the information in a larger and more thorough form to the incident managers and additional operators. A human panel was employed to assess the interface: after using the interface for half an hour, each volunteer was asked to complete a questionnaire. The objective of this work was to outline methods that can be employed for route planning and the user interface.

Zhang et al. [17] have systematically studied the development of data-driven ITS and have described the functions of its key elements. The problems encountered during deployment were also discussed. Their review has summarized the key research gaps and the directions that remain for further development of data-driven systems.

Lathia et al. [8] have worked on a personalized trip time estimation model to notify the traveler. They have used the travel history of the user to develop the model. They have proposed a prediction model for the personalized trip times of travelers, followed by a ranking method that ranks the stations based on the interest of the travelers. In this way, they have predicted future mobility patterns.

2.1 Review

The contributions associated with ITS are vast and diverse, and researchers have addressed various motivating scenarios to improve ITS. However, the primary objective of providing a smart ITS that assists travelers has not been achieved. For instance, in Ref. [4], the authors have simulated vehicle breakdowns on a congested road. Although the simulation is based on vehicular mobility modeling, its primary intention was to aid the prediction of travel time reliability with improved precision. However, it was complex and imprecise, apart from analytical models. Similarly, the transportation of hazardous materials has been seriously considered in Ref. [3] to aid safe route planning, monitoring, and accident management. This reveals that an urban planning decision system depends on the simulation environment, which should consider multiple parameters. For instance, a traveler's decision depends on numerous factors, such as the travel time reliability [4, 10]. Hence, precise modeling of the entire traveling details is significant for any trip or transportation planning system.

A few personalized prediction models [8] reported in the literature have also attracted our attention because they are associated with travel history and statistical analysis. However, these models remain static. For instance, the trip familiarity, the trip context, and the combined models have been tested for their ability to handle non-linear and diverse data characteristics such as the travelers' personal details, transportation mode of interest, etc.

3 Article Overview

3.1 Problem Formulation

Let us assume that a traveler wants to make a trip from source S to destination D. The trip (S,D) involves numerous socio-economic constraints, such as the vehicle to be used, traveling stages, availability of parking facilities, traveling cost, parking cost, etc. These socio-economic constraints remain unpredictable. The traveler benefits if any of these constraints can be precisely estimated and recommended for the trip. For instance, if the traveler knows about the availability of parking facilities and their usage in past trips, it helps him/her greatly in deciding the trip, mode of travel, travel cost, and many other constraints. However, this is not just a process of mining the historical data: it is also challenging to identify the relevant socio-economic constraints that play a key role in the trip.

3.2 Our Contributions

Our contributions in the paper are listed as follows:

  • Contribution 1: An in-depth survey is done to acquire the real-time data that are associated with the trips, which are carried out by various travelers.
  • Contribution 2: A recommendation system is introduced, wherein the algorithm for mining the interesting, significant, and relevant socio-economic constraints that are associated with the trip is proposed. The developed recommendation system for handling the socio-economic factors in the decision-making process is based on an intelligent algorithm. The algorithm extracts the attributes based on the mined information from the data.
  • Contribution 3: An interestingness metric is proposed to evaluate the significance of the mined socio-economic patterns.
  • Contribution 4: Performance consistency and reliability are investigated using a polynomial fitting model.

3.3 Data Acquisition

Data acquisition is a significant task in our work. We have collected the data from the Indian city of Hyderabad. The city has been divided into 147 zones, in which travelers were met and asked to fill in four individual forms. Form 1 and Form 2 were used to collect the family details and the personal socio-economic status of each traveler, in addition to information on how they are connected with the travel. The socio-economic status of the family includes the education level, income level, satisfaction level, vehicle details, traveling schedule, and traveling cost. Form 3 and Form 4 provide specific information about the trip. These forms hold the details of every individual trip of the travelers of these zones, such as origin, destination, vehicles used, parking facilities, travel comfort, travel cost, parking cost, parking, traveling stages, etc. Information concerning 48,853 trips has been acquired, and these trips are associated with 303 places in Hyderabad. As the details were filled in using a human-readable format, they must be converted into a machine-readable format. In the rest of the paper, the socio-economic attributes of the trip or the travelers are simply called attributes or characteristics.

4 Our Methodology

4.1 Pre-processing

The sequence of steps that is involved in our methodology is illustrated in Figure 1. The first step in the proposed methodology, termed as acquisition of raw data, has been described in the earlier section. The data pre-processing steps include the extraction as well as the organization of the socio-economic attributes of user interest, in accordance with the origin and the destination of the trips, so that further processing can be performed easily. Despite the fact that numerous details about the trip and the traveler information have been collected, this paper considers 29 most significant attributes along with the origin and the destination of the trips. The 29 attributes that are considered are listed in Table 1. The data pre-processing step includes three stages of processing, namely filtering, filling, and splitting.

Figure 1: Sequence of Steps Involved in the Proposed Methodology.

Citation: Journal of Intelligent Systems 26, 3; 10.1515/jisys-2016-0012

Table 1: Selected Attributes Associated with Every Trip.

Attribute name
1. Purpose of travel
2. Waiting time in stage I
3. Stage I distance
4. Mode of travel in stage I
5. Travel time in stage I
6. Type of parking in stage I
7. Cost of parking in stage I
8. Travel cost in stage I
9. Waiting time in stage II
10. Stage II distance
11. Mode of travel in stage II
12. Travel time in stage II
13. Type of parking in stage II
14. Cost of parking in stage II
15. Travel cost in stage II
16. Waiting time in stage III
17. Stage III distance
18. Mode of travel in stage III
19. Travel time in stage III
20. Type of parking in stage III
21. Cost of parking in stage III
22. Travel cost in stage III
23. Waiting time in stage IV
24. Stage IV distance
25. Mode of travel in stage IV
26. Travel time in stage IV
27. Type of parking in stage IV
28. Cost of parking in stage IV
29. Travel cost in stage IV

Filtering: As the raw data has been acquired in the field, it includes numerous erroneous as well as missing fields. Only the trips with complete information are helpful for further processing. Among 48,853 trips, 38,303 trips are found to be complete and useful. For these 38,303 trips, the 29 attributes are acquired and the database $D^{raw}$, with $|D^{raw}| = M^{raw} \times N^{raw}$, is constructed. Each element of $D^{raw}$ is represented as $d_{ij}^{raw}: 0 \le i \le M^{raw}-1,\ 0 \le j \le N^{raw}-1$, where $M^{raw}$ and $N^{raw}$ refer to the total number of trips and fields in the raw dataset, respectively. Here, $M^{raw} = 38{,}303$ and $N^{raw}$ consists of the source field, the destination field, and the 29 attributes. Hence, $N^{raw} = 31$.

Filling: $D^{raw}$ is further subjected to the filling process, wherein the irrelevant stage information is filled with zeros. For instance, trip 1 starts from location A and ends in location D. Similarly, trip 2 starts from location A and ends in location B. Trip 1 is a two-stage travel, i.e. from location A to location C and then from location C to D. On the other hand, trip 2 has only one stage, i.e. directly from location A to location B. In these circumstances, $d_{1j}^{raw}$ has filled fields that are related to stage 1 and stage 2; however, $d_{2j}^{raw}$ has no information in the fields that are related to stage 2. Hence, these fields are filled with zeros.
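The filling step can be illustrated with a minimal sketch. The field layout, the numeric codes, and the function name `fill_unused_stages` are hypothetical and chosen only for illustration; blank stage fields are assumed to be represented as `None`.

```python
from typing import Optional

def fill_unused_stages(trip: list[Optional[float]]) -> list[float]:
    """Replace blank (None) stage fields with 0, keep all other values."""
    return [0.0 if value is None else float(value) for value in trip]

# Hypothetical layout (an assumption, not the survey's actual schema):
# [origin, destination, stage I mode, stage I cost, stage II mode, stage II cost]
trip_1 = [1, 4, 2, 15, 1, 10]        # A -> C -> D, two stages, nothing to fill
trip_2 = [1, 2, 2, 12, None, None]   # A -> B, one stage, stage II is blank

filled_1 = fill_unused_stages(trip_1)
filled_2 = fill_unused_stages(trip_2)
print(filled_2)
```

After filling, every trip has the same number of numeric fields, which is what allows the trips to be stacked into one rectangular dataset.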

Splitting: The splitting process segregates $D^{raw}$ into multiple sub-datasets, $d_{tid}^{proc}$, where $0 \le m(tid) \le M_{tid}^{proc}-1$ and $0 \le n(tid) \le N_{tid}^{proc}-1$. Each sub-dataset consists of multiple trips with the same origin as well as destination. The pseudo code of the algorithm that is used to split $D^{raw}$ and construct $D^{proc}$ is given in Figure 2. At first, the splitting algorithm extracts the set of origins $\{O\}$ and the set of destinations (targets) $\{T\}$ from $D^{raw}$.

Figure 2: Pseudo Code for the Splitting Algorithm.

The representation of $\{O\}$ and $\{T\}$ can be expressed as $\{O\} \in d_{j=1}^{i}$ and $\{T\} \in d_{j=2}^{i}$, respectively. Hence, $\{O\}$ and $\{T\}$ exhibit the properties $\{O\}, \{T\} \subset D^{raw}$ and $|O||T| < M^{raw}$. Further, $tid$ refers to the trip ID, and each $tid$ refers to a pair of origin and destination. $D^{proc}$ consists of all the elements of $D^{raw}$ obtained after filling, but it is structured based on the origins and the destinations of the trips.

Remark 1: The number of elements in $D^{proc}$ is equal to the number of elements in the filled $D^{raw}$, i.e. $|D^{proc}| = |D^{raw}|$.
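As an illustration of the splitting idea and of Remark 1, the sketch below groups filled trips by their origin-destination pair. It is not the pseudo code of Figure 2; the row layout (origin first, destination second) and the name `split_by_trip_id` are assumptions made for this example.

```python
from collections import defaultdict

def split_by_trip_id(d_raw):
    """Group filled trips by (origin, destination); each pair acts as one tid."""
    d_proc = defaultdict(list)
    for row in d_raw:
        origin, destination = row[0], row[1]
        d_proc[(origin, destination)].append(row)
    return dict(d_proc)

# Hypothetical filled trips: [origin, destination, attribute values...]
d_raw = [
    ["A", "D", 2, 15, 1, 10],
    ["A", "B", 2, 12, 0, 0],
    ["A", "D", 1, 20, 1, 8],
]
d_proc = split_by_trip_id(d_raw)

origins = {origin for origin, _ in d_proc}   # the set {O}
targets = {target for _, target in d_proc}   # the set {T}

# Remark 1: regrouping neither adds nor drops trips.
total_trips = sum(len(rows) for rows in d_proc.values())
print(len(d_proc), "trip IDs,", total_trips, "trips")
```

Keying the sub-datasets by the pair itself avoids having to maintain a separate mapping from trip ID to origin and destination.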

4.2 Extraction of Relevant Attributes

In order to extract the relevant attributes, we determine the correlation of every attribute with the other attributes. Based on the positivity of correlation, the attributes are defined as relevant and significant for the particular trip. The algorithm for extracting the relevant attributes is illustrated in Figure 3.

Figure 3: Pseudo Code of the Algorithm to Extract the Relevant Attributes.

Definition 1: An attribute is defined here as relevant if it is highly correlated with the other attributes of the trip.

Initially, the algorithm extracts only the attributes $att_{tid}(m)$ from $d_{tid}^{proc}(m,n)$ because $d_{tid}^{proc}(m,n)|_{n=0}$ and $d_{tid}^{proc}(m,n)|_{n=1}$ refer to the origin and the destination of the trip, respectively. The correlation coefficient matrix of $att_{tid}$, denoted $C_{tid}$, is determined, and it is subjected to diagonal elimination, in which the diagonal elements are set to zero. Based on $C_{tid}$, the correlation index for every attribute, $C_{tid}^{idx}$, is determined as follows:

$$C_{tid}^{idx}(l) = \frac{\sum_{m=1}^{M_{tid}^{proc}} C_{tid}(m,l)}{M_{tid}^{proc}} : 0 \le l \le N_{tid}^{proc}-1, \quad (1)$$

where $M_{tid}^{proc}$ is the number of trips that are associated with the trip ID $tid$ and $N_{tid}^{proc}$ is the number of attributes that are considered. The $l$th attribute participates in $d_{tid}^{corr}$ only if $C_{tid}^{idx}(l)$ is positive. Hence, the dataset with highly correlating (relevant) attributes, termed $d_{tid}^{corr}(m,k): d_{tid}^{corr} \subseteq d_{tid}^{proc}$, is constructed such that $0 \le m(tid) \le M_{tid}^{corr}-1$ and $0 \le k(tid) \le N_{tid}^{corr}-1$.

Remark 2: $d_{tid}^{corr}$ is equal to $d_{tid}^{proc}$ when all the $N_{tid}^{proc}$ attributes correlate positively with each other.

Remark 3: The size of $d_{tid}^{corr}$ is a multiple of the number of attributes that correlate positively with the other attributes, provided $M_{tid}^{proc} = M_{tid}^{corr}$.
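The relevance test can be sketched as follows. This is one reading of Eq. (1), not the pseudo code of Figure 3: it builds the attribute-by-attribute Pearson correlation matrix, zeroes the diagonal, and averages each column. The divisor used here is the number of attributes, which is an assumption; any positive divisor gives the same sign and therefore the same selection. Treating a constant column as having zero correlation is also an assumption.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either column is constant (assumption)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def relevant_attributes(att):
    """att: trips of one trip ID, origin and destination already removed.
    Returns (indices of attributes with a positive correlation index, all indices)."""
    cols = list(zip(*att))
    n = len(cols)
    # Correlation matrix with the diagonal eliminated (set to zero).
    c = [[0.0 if i == j else pearson(cols[i], cols[j]) for j in range(n)]
         for i in range(n)]
    c_idx = [sum(c[i][l] for i in range(n)) / n for l in range(n)]
    return [l for l in range(n) if c_idx[l] > 0], c_idx

# Hypothetical trips with three attributes; the third runs against the other two.
att = [[1, 2, 9], [2, 4, 7], [3, 6, 8], [4, 8, 1]]
relevant, c_idx = relevant_attributes(att)
print(relevant)
```

In this toy data the first two attributes rise together and are retained, while the third has a negative correlation index and is dropped.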

4.3 Mining the Frequent as well as the Feasible Patterns

This step first extracts the frequent patterns, and then the frequent patterns are reconstructed to determine the feasible patterns. This step also includes a preliminary process to skip the trips that do not have any relevant attributes. The steps are described in the pseudo code, which is illustrated in Figure 4. This process generates $L$-length patterns from the available attributes of $d_{tid}^{corr}$. Moreover, the possible combinations of the available attributes $att_{tid}^{idx}$ are determined in this step.

Figure 4: Pseudo Code for Mining the Frequent as well as the Feasible Patterns.

In Figure 4, $p_{tid,p}^{freq}(q)$ refers to the frequent pattern elements and $F(p^{freq}(q))_{tid,p}$ refers to the corresponding frequency. Example 1 interprets the pseudo code with sample data.

Example 1: Consider $d_{tid}^{corr}|_{tid=1}$, where $M_{tid}^{corr} = 10$ and $N_{tid}^{corr} = 4$, for which the data samples are presented in Figure 5A. Let $L = 2$, and hence six possible combinations can be generated as in Figure 5B. For every combination, the two-length patterns that exhibit maximum frequencies are presented in Figure 5C. The highly frequent patterns are relocated in the respective attribute column along with their frequencies in Figure 5D. In every attribute, the most frequent pattern element is retained and the other pattern elements are eliminated as in Figure 5E, and the result is presented as in Figure 5F.
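The procedure described in Example 1 can be sketched as below. This is an interpretation of the textual description, not a transcription of Figure 4: for every combination of L attributes the most frequent value tuple is found, and for every attribute the value coming from the most frequent pattern is retained. The function name `mine_plan` and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import comb

def mine_plan(d_corr, length=2):
    """Return one recommended value per attribute from the most frequent patterns."""
    n_attr = len(d_corr[0])
    best = {}  # attribute index -> (value, frequency of the pattern it came from)
    for combo in combinations(range(n_attr), length):
        # Frequency of every value tuple observed for this attribute combination.
        counts = Counter(tuple(row[a] for a in combo) for row in d_corr)
        pattern, freq = counts.most_common(1)[0]
        # Relocate the pattern elements to their attribute columns and keep,
        # per attribute, the element backed by the highest frequency.
        for a, value in zip(combo, pattern):
            if a not in best or freq > best[a][1]:
                best[a] = (value, freq)
    return [best[a][0] for a in range(n_attr)]

# Hypothetical relevant-attribute data for one trip ID (6 trips, 3 attributes).
d_corr = [
    [1, 2, 0],
    [1, 2, 1],
    [1, 2, 0],
    [2, 2, 1],
    [1, 3, 0],
    [1, 2, 0],
]
plan = mine_plan(d_corr, length=2)
n_combinations = comb(4, 2)  # four attributes and L = 2, as in Example 1
print(plan, n_combinations)
```

With four attributes and L = 2 the number of combinations is six, which matches the count stated in Example 1.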

Figure 5: An Example to Illustrate the Algorithm for Mining Frequent and Feasible Patterns.

5 Results and Discussion

The proposed trip planning model is simulated along with three personalized trip planning models to demonstrate its performance. Although a few other personalized models are reported in the literature [8], we consider the trip mean model and the clustering model, which yields two cluster models, because of their personalization effect in handling our data.

The trip mean model takes the mean of every attribute of each trip and recommends it. As it takes the global average of every attribute, it is believed to reflect highly frequent attributes. This model can also be regarded as a simple statistical model. In the clustering model, the data of every trip are grouped into two clusters and their centroids are determined. Here, the clustering is performed using agglomerative hierarchical clustering [12]. The first centroid is the recommendation from cluster model 1, and the second centroid is the recommendation from cluster model 2. The existing models recommend non-integer plans, and hence a tolerance value is introduced. Based on the tolerance setting, the experimentation is categorized into two test cases.
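A minimal sketch of the trip mean baseline is given below; the attribute codes are hypothetical, and the clustering baseline is omitted here. It shows why the baseline plans are generally non-integer even when every recorded attribute is an integer code.

```python
def trip_mean_plan(trips):
    """Recommend the per-attribute mean over all recorded trips of one trip ID."""
    columns = list(zip(*trips))
    return [sum(col) / len(col) for col in columns]

# Hypothetical coded trips: [mode of travel, travel cost level]
trips = [[1, 2], [1, 3], [2, 2], [1, 2]]
plan = trip_mean_plan(trips)
print(plan)
```

A plan such as this does not literally occur in the data, which is the motivation for comparing the models with and without a tolerance.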

5.1 Test Case 1 (Zero Tolerance)

In case 1, no tolerance is considered for the trip plans. Here, the plans from the trip mean model and the cluster models are rounded to the nearest integer, and the frequency of its occurrence is identified. Similarly, the frequency of occurrence of the plans that are recommended by the proposed model is also observed. Based on the results, a normalized interestingness factor is calculated as follows:

$$\mathrm{Interestingness}(mdl) = \frac{F(mdl)}{\max_{mdl}(F)}, \quad (2)$$

where mdl indicates the planning model and F(mdl) represents the frequency of occurrence of the trip plan that is proposed by mdl in the dataset. Despite the fact that the given interestingness factor is not a benchmark metric, it is derived based on the evaluation process that is given in Ref. [13]. Further, the distribution of the interestingness factor is determined using the cumulative distribution function (CDF). Later, the obtained interestingness factor and the distribution metrics for all the trips are plotted as in Figure 6.
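The following sketch shows one way to compute the normalized interestingness factor of Eq. (2) and an empirical CDF of the resulting scores. It assumes that the maximum in Eq. (2) is taken over the compared models; the model names, plans, and score list are hypothetical.

```python
def plan_frequency(plan, trips):
    """F(mdl): how often the recommended plan occurs among the recorded trips."""
    return sum(1 for trip in trips if list(trip) == list(plan))

def interestingness(frequencies):
    """Eq. (2): each model's frequency divided by the largest frequency."""
    top = max(frequencies.values())
    return {mdl: (f / top if top else 0.0) for mdl, f in frequencies.items()}

def empirical_cdf(values, x):
    """Share of values that are less than or equal to x."""
    return sum(1 for v in values if v <= x) / len(values)

# Hypothetical coded trips and already-rounded plans of two models.
trips = [[1, 2], [1, 2], [1, 3], [2, 2]]
plans = {"proposed": [1, 2], "trip_mean": [1, 3]}

freqs = {mdl: plan_frequency(plan, trips) for mdl, plan in plans.items()}
scores = interestingness(freqs)

# Hypothetical interestingness values of one model over four trip IDs.
per_trip_scores = [1.0, 0.5, 0.0, 1.0]
cdf_at_half = empirical_cdf(per_trip_scores, 0.5)
print(scores, cdf_at_half)
```

The guard against a zero maximum is a design choice of this sketch: when no model's plan occurs in the data, every score is reported as 0 instead of raising a division error.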

Figure 6: Performance Comparison Using (A) Normalized Interestingness Factor Over Every Trip Plan and (B) Its Distribution for Zero Tolerance.

5.2 Test Case 2 (with Tolerance)

In this case, a percentage tolerance is defined between the plan and the data elements. For instance, let the trip data be [1, 2]. Here, 1 refers to travel by bus and 2 refers to a medium travel cost. Assume that the trip plan is [1.1, 2.3]; the corresponding deviation is defined as $\mathrm{MEAN}\left(\frac{|1.1-1|}{1}, \frac{|2.3-2|}{2}\right) = 12.50\%$. Here, tolerances of 10%, 20%, and 50% are applied. As 12.50% is below 20% and 50%, the trip plan is considered under the 20% and 50% tolerances. As 12.50% > 10%, the trip plan is not considered under the 10% tolerance. Hence, all the trip plans are investigated under the three tolerances of 10%, 20%, and 50%, and the results are plotted in Figures 7-9, respectively.
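The worked example above can be checked with a short sketch. The function name `deviation` is hypothetical, and the sketch assumes that every data element is non-zero, since the deviation is relative to the data value.

```python
import math

def deviation(plan, data):
    """Mean relative deviation between a plan and the recorded trip data."""
    return sum(abs(p - d) / d for p, d in zip(plan, data)) / len(data)

dev = deviation([1.1, 2.3], [1, 2])  # approximately 0.125, i.e. 12.5%

# The plan counts as matching the data when its deviation is within the tolerance.
accepted = {tol: dev <= tol for tol in (0.10, 0.20, 0.50)}
print(round(dev, 4), accepted)
```

The plan is rejected at 10% and accepted at 20% and 50%. Zero-filled stage fields would need separate handling before such a relative deviation could be applied to them.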

Figure 7: Performance Comparison Using (A) Normalized Interestingness Factor Over Every Trip Plan and (B) Its Distribution for 10% Tolerance.

Figure 8: Performance Comparison Using (A) Normalized Interestingness Factor Over Every Trip Plan and (B) Its Distribution for 20% Tolerance.

Figure 9: Performance Comparison Using (A) Normalized Interestingness Factor Over Every Trip Plan and (B) Its Distribution for 50% Tolerance.

5.3 Statistical Analysis

The statistical analysis includes five first-order statistical metrics, namely best-case performance, worst-case performance, mean performance, median performance, and standard deviation. The best-case performance is defined as the number of trips for which the normalized interestingness factor takes a value of 1, whereas the worst-case performance is the number of trips for which it takes a value of 0. The obtained values are tabulated in Tables 2-5.
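The five metrics can be computed as in the sketch below. The score list is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say whether the population or the sample form was used. The last lines recompute the cumulative mean from the ten batch values reported for the proposed model in the best-case rows of Table 2.

```python
import statistics

def summarize(scores):
    """Five first-order metrics over the interestingness scores of one batch."""
    return {
        "best": sum(1 for s in scores if s == 1),    # trips scoring exactly 1
        "worst": sum(1 for s in scores if s == 0),   # trips scoring exactly 0
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "std": statistics.pstdev(scores),            # population form (assumption)
    }

# Hypothetical normalized interestingness scores for four trips.
summary = summarize([1, 1, 0, 0.5])

# Best-case counts of the proposed model for batches 1-10 (Table 2).
proposed_best = [104, 102, 82, 67, 90, 73, 82, 71, 76, 70]
cumulative_mean = statistics.mean(proposed_best)
print(summary, cumulative_mean)
```

The recomputed cumulative mean agrees with the value of 81.7 listed in Table 2.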

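Table 2: Statistical Report on Trip Planning Models for Zero Tolerance.

| Batch trip | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Cumulative mean | Cumulative rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Best-case scenario | | | | | | | | | | | | |
| Proposed | 104 | 102 | 82 | 67 | 90 | 73 | 82 | 71 | 76 | 70 | 81.7 | 2 |
| Trip mean | 83 | 98 | 100 | 92 | 91 | 90 | 107 | 97 | 111 | 125 | 99.4 | 1 |
| Cluster model 1 | 2 | 4 | 5 | 7 | 6 | 5 | 4 | 8 | 8 | 10 | 5.9 | 4 |
| Cluster model 2 | 6 | 12 | 8 | 2 | 4 | 13 | 7 | 8 | 4 | 3 | 6.7 | 3 |
| Worst-case scenario | | | | | | | | | | | | |
| Proposed | 47 | 52 | 59 | 62 | 55 | 57 | 68 | 65 | 71 | 74 | 61 | 2 |
| Trip mean | 53 | 38 | 34 | 26 | 33 | 36 | 31 | 29 | 19 | 23 | 32.2 | 1 |
| Cluster model 1 | 128 | 117 | 105 | 83 | 95 | 102 | 126 | 108 | 96 | 98 | 105.8 | 4 |
| Cluster model 2 | 132 | 104 | 95 | 101 | 108 | 93 | 113 | 108 | 90 | 105 | 104.9 | 3 |
| Mean score | | | | | | | | | | | | |
| Proposed | 0.67 | 0.65 | 0.57 | 0.50 | 0.61 | 0.56 | 0.54 | 0.52 | 0.51 | 0.49 | 0.56 | 2 |
| Trip mean | 0.56 | 0.67 | 0.70 | 0.73 | 0.67 | 0.66 | 0.72 | 0.71 | 0.78 | 0.79 | 0.70 | 1 |
| Cluster model 1 | 0.08 | 0.12 | 0.12 | 0.15 | 0.15 | 0.12 | 0.09 | 0.13 | 0.17 | 0.17 | 0.13 | 4 |
| Cluster model 2 | 0.11 | 0.18 | 0.14 | 0.12 | 0.12 | 0.18 | 0.13 | 0.14 | 0.17 | 0.15 | 0.14 | 3 |
| Median score | | | | | | | | | | | | |
| Proposed | 1 | 1 | 1 | 0.3333 | 1 | 1 | 0.8571 | 0.6667 | 0.5 | 0.5 | 0.78 | 2 |
| Trip mean | 0.6667 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.96 | 1 |
| Cluster model 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 |
| Cluster model 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 |
| Standard deviation | | | | | | | | | | | | |
| Proposed | 0.44 | 0.45 | 0.47 | 0.48 | 0.47 | 0.47 | 0.47 | 0.47 | 0.47 | 0.47 | 0.47 | 4 |
| Trip mean | 0.45 | 0.42 | 0.42 | 0.40 | 0.42 | 0.44 | 0.41 | 0.41 | 0.36 | 0.37 | 0.41 | 3 |
| Cluster model 1 | 0.18 | 0.22 | 0.22 | 0.26 | 0.25 | 0.24 | 0.20 | 0.25 | 0.26 | 0.27 | 0.23 | 1 |
| Cluster model 2 | 0.23 | 0.29 | 0.25 | 0.21 | 0.22 | 0.31 | 0.25 | 0.26 | 0.24 | 0.2 | 0.25 | 2 |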
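
Table 3: Statistical Report on Trip Planning Models for 10% Tolerance.

| Batch trip | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Cumulative mean | Cumulative rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Best-case scenario | | | | | | | | | | | | |
| Proposed | 111 | 104 | 83 | 73 | 92 | 69 | 86 | 73 | 73 | 67 | 83.1 | 1 |
| Trip mean | 77 | 79 | 84 | 81 | 67 | 74 | 86 | 72 | 92 | 102 | 81.4 | 2 |
| Cluster model 1 | 5 | 1 | 2 | 1 | 2 | 0 | 1 | 2 | 4 | 1 | 1.9 | 4 |
| Cluster model 2 | 12 | 7 | 11 | 6 | 8 | 6 | 7 | 11 | 13 | 5 | 8.6 | 3 |
| Worst-case scenario | | | | | | | | | | | | |
| Proposed | 45 | 40 | 52 | 53 | 48 | 45 | 57 | 49 | 67 | 62 | 51.8 | 2 |
| Trip mean | 68 | 42 | 48 | 42 | 50 | 40 | 41 | 45 | 40 | 35 | 45.1 | 1 |
| Cluster model 1 | 109 | 93 | 86 | 84 | 83 | 82 | 101 | 87 | 76 | 79 | 88 | 4 |
| Cluster model 2 | 76 | 59 | 54 | 51 | 55 | 60 | 73 | 62 | 46 | 51 | 58.7 | 3 |
| Mean score | | | | | | | | | | | | |
| Proposed | 0.68 | 0.70 | 0.59 | 0.56 | 0.65 | 0.59 | 0.59 | 0.59 | 0.51 | 0.52 | 0.60 | 2 |
| Trip mean | 0.51 | 0.62 | 0.62 | 0.62 | 0.55 | 0.62 | 0.64 | 0.60 | 0.66 | 0.71 | 0.61 | 1 |
| Cluster model 1 | 0.12 | 0.13 | 0.13 | 0.12 | 0.14 | 0.13 | 0.10 | 0.12 | 0.18 | 0.15 | 0.13 | 4 |
| Cluster model 2 | 0.20 | 0.20 | 0.23 | 0.21 | 0.21 | 0.20 | 0.17 | 0.21 | 0.28 | 0.20 | 0.21 | 3 |
| Median score | | | | | | | | | | | | |
| Proposed | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.5 | 0.5 | 0.9 | 1 |
| Trip mean | 0.5 | 0.8 | 1 | 1 | 0.7 | 1 | 1 | 1 | 1 | 1 | 0.9 | 1 |
| Cluster model 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0 | 4 |
| Cluster model 2 | 0.0 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.0 | 0.1 | 0.2 | 0.1 | 0.1 | 3 |
| Standard deviation | | | | | | | | | | | | |
| Proposed | 0.45 | 0.43 | 0.46 | 0.48 | 0.46 | 0.46 | 0.47 | 0.47 | 0.48 | 0.47 | 0.46 | 4 |
| Trip mean | 0.47 | 0.43 | 0.45 | 0.46 | 0.45 | 0.46 | 0.44 | 0.46 | 0.44 | 0.43 | 0.45 | 3 |
| Cluster model 1 | 0.22 | 0.19 | 0.20 | 0.20 | 0.21 | 0.20 | 0.17 | 0.20 | 0.23 | 0.20 | 0.20 | 1 |
| Cluster model 2 | 0.28 | 0.25 | 0.27 | 0.25 | 0.27 | 0.26 | 0.25 | 0.29 | 0.29 | 0.24 | 0.27 | 2 |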
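
Table 4: Statistical Report on Trip Planning Models for 20% Tolerance.

| Batch trip | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Cumulative mean | Cumulative rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Best-case scenario | | | | | | | | | | | | |
| Proposed | 88 | 90 | 72 | 71 | 80 | 70 | 71 | 58 | 63 | 70 | 73.3 | 2 |
| Trip mean | 117 | 116 | 123 | 118 | 112 | 103 | 125 | 118 | 133 | 121 | 118.6 | 1 |
| Cluster model 1 | 3 | 1 | 1 | 1 | 0 | 3 | 1 | 2 | 6 | 0 | 1.8 | 4 |
| Cluster model 2 | 7 | 8 | 13 | 12 | 9 | 10 | 5 | 17 | 13 | 10 | 10.4 | 3 |
| Worst-case scenario | | | | | | | | | | | | |
| Proposed | 52 | 50 | 64 | 69 | 65 | 55 | 71 | 74 | 86 | 71 | 65.7 | 2 |
| Trip mean | 39 | 30 | 29 | 26 | 34 | 28 | 22 | 29 | 26 | 37 | 30 | 1 |
| Cluster model 1 | 100 | 89 | 88 | 90 | 96 | 85 | 104 | 101 | 81 | 90 | 92.4 | 4 |
| Cluster model 2 | 56 | 44 | 43 | 54 | 54 | 56 | 65 | 63 | 43 | 50 | 52.8 | 3 |
| Mean score | | | | | | | | | | | | |
| Proposed | 0.60 | 0.61 | 0.51 | 0.50 | 0.53 | 0.53 | 0.50 | 0.45 | 0.44 | 0.49 | 0.52 | 2 |
| Trip mean | 0.70 | 0.75 | 0.76 | 0.76 | 0.71 | 0.74 | 0.78 | 0.76 | 0.78 | 0.73 | 0.75 | 1 |
| Cluster model 1 | 0.13 | 0.15 | 0.14 | 0.14 | 0.12 | 0.15 | 0.11 | 0.11 | 0.19 | 0.14 | 0.14 | 4 |
| Cluster model 2 | 0.22 | 0.25 | 0.27 | 0.25 | 0.23 | 0.23 | 0.18 | 0.25 | 0.28 | 0.24 | 0.24 | 3 |
| Median score | | | | | | | | | | | | |
| Proposed | 0.8 | 1 | 0.5 | 0.48 | 0.67 | 0.5 | 0.5 | 0.33 | 0.26 | 0.5 | 0.55 | 2 |
| Trip mean | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Cluster model 1 | 0 | 0.02 | 0.00 | 0 | 0 | 0 | 0 | 0 | 0.08 | 0.01 | 0.01 | 4 |
| Cluster model 2 | 0.17 | 0.2 | 0.17 | 0.18 | 0.14 | 0.12 | 0.1 | 0.14 | 0.25 | 0.18 | 0.17 | 3 |
| Standard deviation | | | | | | | | | | | | |
| Proposed | 0.44 | 0.44 | 0.45 | 0.46 | 0.46 | 0.45 | 0.46 | 0.45 | 0.46 | 0.46 | 0.45 | 4 |
| Trip mean | 0.43 | 0.39 | 0.39 | 0.38 | 0.42 | 0.40 | 0.37 | 0.40 | 0.38 | 0.41 | 0.40 | 3 |
| Cluster model 1 | 0.21 | 0.20 | 0.19 | 0.20 | 0.17 | 0.23 | 0.18 | 0.18 | 0.25 | 0.18 | 0.20 | 1 |
| Cluster model 2 | 0.25 | 0.25 | 0.29 | 0.28 | 0.26 | 0.28 | 0.23 | 0.31 | 0.28 | 0.26 | 0.27 | 2 |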
Table 5:

Statistical Report on Trip Planning Models for 50% Tolerance.

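| Batch trip | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Cumulative mean | Cumulative rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Best-case scenario | | | | | | | | | | | | |
| Proposed | 67 | 65 | 47 | 47 | 60 | 50 | 51 | 33 | 51 | 47 | 51.8 | 2 |
| Trip mean | 177 | 188 | 192 | 188 | 175 | 176 | 185 | 196 | 193 | 186 | 185.6 | 1 |
| Cluster model 1 | 7 | 4 | 9 | 2 | 10 | 11 | 7 | 8 | 8 | 9 | 7.5 | 4 |
| Cluster model 2 | 6 | 9 | 9 | 4 | 8 | 17 | 14 | 10 | 8 | 16 | 10.1 | 3 |
| Worst-case scenario | | | | | | | | | | | | |
| Proposed | 38 | 39 | 61 | 77 | 67 | 59 | 75 | 76 | 71 | 60 | 62.3 | 2 |
| Trip mean | 9 | 8 | 7 | 1 | 8 | 7 | 5 | 4 | 2 | 0 | 5.1 | 1 |
| Cluster model 1 | 76 | 62 | 60 | 71 | 63 | 65 | 89 | 69 | 53 | 60 | 66.8 | 4 |
| Cluster model 2 | 39 | 23 | 37 | 22 | 34 | 29 | 31 | 38 | 23 | 24 | 30 | 3 |
| Mean score | | | | | | | | | | | | |
| Proposed | 0.58 | 0.57 | 0.47 | 0.43 | 0.46 | 0.46 | 0.43 | 0.38 | 0.43 | 0.47 | 0.47 | 2 |
| Trip mean | 0.92 | 0.94 | 0.95 | 0.96 | 0.93 | 0.94 | 0.94 | 0.96 | 0.96 | 0.97 | 0.95 | 1 |
| Cluster model 1 | 0.18 | 0.21 | 0.22 | 0.17 | 0.20 | 0.23 | 0.16 | 0.19 | 0.24 | 0.21 | 0.20 | 4 |
| Cluster model 2 | 0.24 | 0.30 | 0.27 | 0.27 | 0.27 | 0.30 | 0.28 | 0.28 | 0.30 | 0.31 | 0.28 | 3 |
| Median score | | | | | | | | | | | | |
| Proposed | 0.6 | 0.6 | 0.5 | 0.4 | 0.4 | 0.5 | 0.4 | 0.3 | 0.4 | 0.5 | 0.5 | 2 |
| Trip mean | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Cluster model 1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.2 | 0.1 | 0.1 | 4 |
| Cluster model 2 | 0.2 | 0.3 | 0.2 | 0.3 | 0.2 | 0.2 | 0.2 | 0.3 | 0.3 | 0.3 | 0.2 | 3 |
| Standard deviation | | | | | | | | | | | | |
| Proposed | 0.38 | 0.37 | 0.39 | 0.40 | 0.42 | 0.39 | 0.41 | 0.37 | 0.39 | 0.39 | 0.39 | 4 |
| Trip mean | 0.24 | 0.21 | 0.19 | 0.14 | 0.22 | 0.21 | 0.20 | 0.17 | 0.14 | 0.14 | 0.19 | 1 |
| Cluster model 1 | 0.24 | 0.23 | 0.25 | 0.20 | 0.25 | 0.28 | 0.23 | 0.24 | 0.26 | 0.25 | 0.24 | 2 |
| Cluster model 2 | 0.24 | 0.24 | 0.25 | 0.23 | 0.25 | 0.29 | 0.27 | 0.26 | 0.24 | 0.28 | 0.25 | 3 |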
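
As a cross-check on how the last two columns relate to the batch values, the following minimal sketch recomputes the cumulative mean and cumulative rank for the best-case rows of Table 5. The batch counts are copied from the table; the function name `cumulative_mean_and_rank` and the rule that a higher best-case count ranks first are assumptions made for this illustration, and ties are not handled.

```python
from statistics import mean

# Best-case counts per batch, copied from Table 5 (50% tolerance).
best_case = {
    "Proposed": [67, 65, 47, 47, 60, 50, 51, 33, 51, 47],
    "Trip mean": [177, 188, 192, 188, 175, 176, 185, 196, 193, 186],
    "Cluster model 1": [7, 4, 9, 2, 10, 11, 7, 8, 8, 9],
    "Cluster model 2": [6, 9, 9, 4, 8, 17, 14, 10, 8, 16],
}


def cumulative_mean_and_rank(rows, higher_is_better=True):
    """Return per-model means (one decimal) and ranks (1 = best).

    Hypothetical helper: the ranking direction is an assumption and must be
    flipped (higher_is_better=False) for measures where lower is better.
    """
    means = {model: round(mean(values), 1) for model, values in rows.items()}
    ordered = sorted(means, key=means.get, reverse=higher_is_better)
    ranks = {model: position for position, model in enumerate(ordered, start=1)}
    return means, ranks


means, ranks = cumulative_mean_and_rank(best_case)
print(means)
print(ranks)
```

For the worst-case scenario and the standard deviation, where a lower value is ranked first in the tables, the same helper would be called with `higher_is_better=False`.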

5.4 Discussion

In all the performance illustrations, i.e. Figures 6–9, the trip mean model remains in the first position, whereas the proposed model takes the second position in terms of the normalized interestingness factor. The two cluster models take the third and fourth positions, respectively. This demonstrates that the socio-economic constraints defined by the trip mean model are highly interesting with respect to the user history, whereas the plans of the proposed model are the second most interesting. However, the CDF illustration of each case shows that the trip mean model and the proposed model take the fourth and third positions, respectively, while the cluster models take the first two positions. The interpretation is that the outcomes obtained from the trip mean model are loosely distributed; in other words, its dominating performance is not achieved throughout the trips. This can also be seen from the interestingness plot, where the trip mean model reaches a negative score at the final stage of the trips. Although the cluster models rank highest on distribution, they remain in the final positions on interestingness; in other words, their plans are consistently less interesting throughout the trips. In contrast, the plans of the trip mean model are either highly interesting or not at all interesting. Under these circumstances, the proposed model maintains a good trade-off between these two metrics by securing the second position in the interestingness factor, while its distribution almost reaches 0.5.

Similar results are observed in the statistical report tabulated in Tables 2–5. For instance (refer to Table 2, best-case scenario), the proposed model produces, on average, 81.7 highly interesting patterns (i.e. patterns with a normalized interestingness score of 1) for every trip, whereas the two clustering models produce only 5.9 and 6.7 such patterns, respectively. Similarly, in the worst-case scenario, the proposed model produces 61 non-interesting patterns, whereas the cluster models produce around 105 non-interesting patterns. Although the trip mean model dominates in both scenarios (99.4 interesting patterns and only 32.2 non-interesting patterns), its distribution over the 50% interesting patterns is the worst (see Figures 6B, 7B, 8B, and 9B).

The mean and median scores are not directly comparable with the values reported for the best-case and worst-case scenarios, because the latter are counts. For instance, the value 104 (Table 1, best-case scenario of the proposed model) refers to the number of plans that achieve a normalized interestingness factor of 1, whereas the value 0.67 (Table 1, mean score of the proposed model) refers to the average interestingness factor over all the plans proposed by our model. The proposed model outperforms the clustering models in the statistical report and outperforms the trip mean model in the distribution performance; this demonstrates that it maintains a good trade-off between these two metrics, which is essential to ensure reliable and consistent performance.
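
To make the distinction between the count-based and score-based measures concrete, the sketch below summarizes one batch of normalized interestingness scores. It is only an illustration under stated assumptions: the function name `summarize` and the sample scores are hypothetical, a "non-interesting" plan is assumed to be one with a score of exactly 0, and the population standard deviation is assumed because the paper does not state which estimator is used.

```python
from statistics import mean, median, pstdev


def summarize(scores):
    """Summarize normalized interestingness scores for one batch of plans.

    Hypothetical helper. Best case counts plans scoring exactly 1; worst case
    counts plans scoring exactly 0 (an assumption about "non-interesting").
    """
    return {
        "best_case": sum(1 for s in scores if s == 1.0),   # a count of plans
        "worst_case": sum(1 for s in scores if s == 0.0),  # a count of plans
        "mean": mean(scores),      # average score over all plans
        "median": median(scores),  # middle score over all plans
        "std": pstdev(scores),     # population standard deviation (assumed)
    }


# Made-up scores for illustration only; these are not data from the paper.
scores = [1.0, 1.0, 0.0, 0.5, 0.25, 1.0, 0.0, 0.75]
summary = summarize(scores)
print(summary)
```

The first two entries grow with the number of plans in a batch, whereas the last three stay within the range of the normalized score, which is why the two groups of rows in the tables are on different scales.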

The maintained trade-off can be further substantiated by determining the absolute difference between the ranks obtained for the two measures, averaged over all the tolerance scenarios. Accordingly, the proposed model secures the second and third ranks for the interestingness- and distribution-based measures, respectively. The trip mean model takes the first and fourth positions, cluster model 1 secures the fourth and first positions, and cluster model 2 is ranked third and second. In terms of the absolute difference between the ranks of the two measures, the proposed model and cluster model 2 share the first position (absolute difference of 1, between ranks 2 and 3). In contrast, the trip mean model and cluster model 1 share the third position (absolute difference between 1 and 4 = 3; the first rank is shared between the proposed model and cluster model 2). Hence, cluster model 2 and the proposed model maintain a trade-off because of their low absolute difference (note also that the average rank over the two metrics is the same for all the models, i.e. 2.5). Yet, the proposed model is found to be better when interestingness is prioritized.
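
The rank arithmetic in this paragraph can be reproduced in a few lines. The sketch below is illustrative only: the rank pairs are taken from the text, while the variable names and the tie-breaking rule (lower interestingness rank first, mirroring the stated priority on interestingness) are assumptions made for the example.

```python
# (interestingness rank, distribution rank) for each model, as stated in the text.
ranks = {
    "Proposed": (2, 3),
    "Trip mean": (1, 4),
    "Cluster model 1": (4, 1),
    "Cluster model 2": (3, 2),
}

# Absolute difference between the two ranks: a small value means a balanced model.
tradeoff = {model: abs(i - d) for model, (i, d) in ranks.items()}

# Average of the two ranks: identical for every model, so it cannot separate them.
average = {model: (i + d) / 2 for model, (i, d) in ranks.items()}

# Order by trade-off first, then break ties by the interestingness rank (assumed rule).
ordered = sorted(ranks, key=lambda model: (tradeoff[model], ranks[model][0]))

print(tradeoff)
print(average)
print(ordered)
```

Because every model averages 2.5, the absolute difference is what separates the balanced models (difference 1) from the unbalanced ones (difference 3), and the interestingness rank then places the proposed model ahead of cluster model 2.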

5.5 Practical Implications

As stated earlier, an ITS based on socio-economic factors is crucial. As a follow-up, our trip planning model lays the groundwork for planning travel based on socio-economic constraints. Various governments have distributed numerous guides to travelers, such as the following:

  1. http://www.zettrans.org.uk/sustainabletravel/documents/ChooseAnotherWayDocument.pdf;
  2. http://webarchive.nationalarchives.gov.uk/20120214193844/http:/dft.gov.uk/pgr/sustainable/travelplans/work/essentialguide.pdf.

Such guides are high level and highly technical, while commercial applications have many limitations. These limitations can be overcome using the trip planning model, which provides low-level information to define the trip.

6 Conclusion and Future Scope

This paper has introduced a new trip planning model using data mining approaches. Real-time travel information was acquired from the Indian city of Hyderabad, and experiments were carried out to demonstrate the performance of the proposed planning model. The proposed planning model was able to produce socio-economic constraints that are highly relevant to the trip, rather than merely frequent. Three levels of performance investigation revealed that the proposed model maintains an adequate trade-off between all the performance metrics. The results are encouraging; nevertheless, because the trip mean model dominates in the statistical analysis, the performance of the proposed model can be improved further relative to it. Moreover, the tolerance analysis can be extended with varying pattern lengths. Further, the extraction of the relevant attributes can be strengthened using artificial intelligence techniques.

Bibliography

  • [1] H. Al-Deek and E. B. Emam, New methodology for estimating reliability in transportation networks with degraded link capacities, J. Intell. Transp. Syst. 10 (2006), 117–129.
  • [2] K. Chiew and S. Qin, Scheduling and routing of AMOs in an intelligent transport system, IEEE T. Intell. Transp. Syst. 10 (2009), 547–552.
  • [3] V. Di Lecce and A. Amato, Route planning and user interface for an advanced intelligent transport system, IET Intell. Transp. Syst. 5 (2011), 149–158.
  • [4] J. Dong and H. S. Mahmassani, Stochastic modeling of traffic flow breakdown phenomenon: application to predicting travel time reliability, in: 14th International IEEE Conference on Intelligent Transportation Systems (ITSC), pp. 2112–2117, Washington, DC, 2011.
  • [5] L. Figueiredo, I. S. Jesus, J. A. T. Machado, J. R. Ferreira and J. L. Martins de Carvalho, Towards the development of intelligent transportation systems, in: IEEE Conference on Intelligent Transportation Systems, pp. 1206–1211, Oakland, CA, 2001.
  • [6] Z. Juan, J. Wu and M. McDonald, Socio-economic impact assessment of intelligent transport systems, Tsinghua Sci. Technol. 11 (2006), 339–350.
  • [7] T. Korhonen, T. Väärämäki, V. Riihimäki, R. Salminen and A. Karila, Selecting telecommunications technologies for intelligent transport system services in Helsinki municipality, IET Intell. Transp. Syst. 6 (2012), 18–28.
  • [8] N. Lathia, J. Froehlich and L. Capra, Mining public transport usage for personalised intelligent transport systems, in: 2010 IEEE 10th International Conference on Data Mining (ICDM), pp. 887–892, Sydney, NSW, 2010.
  • [9] I. Masaki, Machine-vision systems for intelligent transportation systems, IEEE Intell. Syst. 13 (1998), 24–31.
  • [10] R. B. Noland and J. W. Polak, Travel time variability: a review of theoretical and empirical issues, Transport Rev. 22 (2002), 39–54.
  • [11] W. Peng, W. Jiang-Ping and X. Jing, The application of particle swarm optimization on intelligent transport system, International Colloquium on Computing, Communication, Control, and Management 4 (2009), 389–391, Sanya, China.
  • [12] L. Rokach and O. Maimon, Clustering methods, in: Data Mining and Knowledge Discovery Handbook, Springer US, pp. 321–352, 2005.
  • [13] S. Shankar and T. Purusothaman, A new utility-emphasized analysis for stock trading rules, Intelligent Data Analysis 17 (2013), 271–294.
  • [14] X. Yan, H. Zhang and C. Wu, Research and development of intelligent transportation systems, in: 11th International Symposium on Distributed Computing and Applications to Business, Engineering & Science (DCABES 2012), pp. 321–327, Guilin, 2012.
  • [15] Y. Yu, The track research on the latest developments of intelligent transportation systems abroad, in: 11th World Congress on Intelligent Control and Automation (WCICA), pp. 5132–5137, Shenyang, 2014.
  • [16] N. Zeng, K. Qin and J. Li, Intelligent transport management system for urban traffic hubs based on an integration of multiple technologies, in: IEEE 17th International Industrial Engineering and Engineering Management (IE&EM), 29–31 October 2010, pp. 1178–1183, Xiamen, 2010.
  • [17] J. Zhang, F. Y. Wang, K. Wang, W. H. Lin, X. Xu and C. Chen, Data-driven intelligent transportation systems: a survey, IEEE T. Intell. Transp. Syst. 12 (2011), 1624–1639.

  • Figure 1: Sequence of Steps Involved in the Proposed Methodology.
  • Figure 2: Pseudo Code for the Splitting Algorithm.
  • Figure 3: Pseudo Code of the Algorithm to Extract the Relevant Attributes.
  • Figure 4: Pseudo Code for Mining the Frequent as well as the Feasible Patterns.
  • Figure 5: An Example to Illustrate the Algorithm for Mining Frequent and Feasible Patterns.
  • Figure 6: Performance Comparison Using (A) Normalized Interestingness Factor Over Every Trip Plan and (B) Its Distribution for Zero Tolerance.
  • Figure 7: Performance Comparison Using (A) Normalized Interestingness Factor Over Every Trip Plan and (B) Its Distribution for 10% Tolerance.
  • Figure 8: Performance Comparison Using (A) Normalized Interestingness Factor Over Every Trip Plan and (B) Its Distribution for 20% Tolerance.
  • Figure 9: Performance Comparison Using (A) Normalized Interestingness Factor Over Every Trip Plan and (B) Its Distribution for 50% Tolerance.