A data-mining methodological framework to reveal drivers' route choice preferences from vehicle trajectory data through traffic-aware characterization and evaluation.
Citation info:
Wan, Z., & Dodge, S. (2025). Revealing drivers’ route choice preferences through traffic-aware characterization and evaluation. Journal of Location Based Services. (Accepted) http://dx.doi.org/10.1080/17489725.2025.2601133
Understanding driver route choice behavior is essential for advancing navigation systems, autonomous driving, and intelligent transportation planning. While existing studies have identified a wide range of potential route choice factors across different regions, spatial and cultural heterogeneity may influence their relevance. To address this, we propose a traffic-aware, data-driven framework for revealing route choice preferences through interpretable modeling and clustering. The framework comprises three core components: (1) constructing a traffic-aware road network with hourly resolution from large-scale trajectory data, (2) identifying region-specific influential route choice factors by contrasting observed routes to randomized alternatives using a random forest model, and (3) clustering routes based on these factors to reveal distinct preference profiles. Applied to a dense vehicle trajectory dataset from Shenzhen, China, the framework uncovers three dominant route choice preferences: systematic time-focused routing, adaptive strategies prioritizing maneuver simplicity and traffic avoidance, and moderately efficient, habitual or routine-based behaviors. We further evaluate whether standard, non-preference-driven route generation algorithms can replicate these preferences and find that they fall short in capturing the full spectrum of driver behavior, particularly for context-sensitive and habitual routing styles. These findings underscore the value of the proposed framework in capturing meaningful behavioral diversity and point toward the need for preference-aware route generation strategies.
Because of GitHub's file size limits, the data are not included in this repository. Please refer to our Figshare project for the complete code and data.
- Preprocessing
Run `Preprocessing.ipynb` in "Preprocessing" to preprocess the raw vehicle trajectory data. Run `osrm_tracepoints.py` and `osrm_pts2edges.py` in "Preprocessing/Map_Matching" to map-match the trajectories and obtain routes.
- Traffic-aware road network construction
This framework uses a traffic-aware road network with time-varying travel time information. Run `Construct_Traffic-aware_Road_Net.py` in "Traffic-aware_Road_Net" to construct it by estimating real travel times from the vehicle trajectory data.
- Synthetic route generation
Run `syn_routes_labeling.py`, `syn_routes_lkpen.py`, `syn_routes_lkelim.py`, `syn_routes_sim.py`, and `syn_routes_kshort.py` in "Synthetic_Route_Generation" to generate synthetic routes using the labeling, link penalty, link elimination, simulation, and k-shortest-time paths algorithms, respectively.
- Route characterization
Run `charact_routes.py` in "Route_Characterization" to characterize the observed and generated routes.
- Route choice factor identification
Run `Get_Factor_Dataset.ipynb` and `Random_Forest_Classifier.ipynb` in "Significant_Choice_Factors" to identify and quantify the most influential region-specific route choice factors. The notebooks contrast observed routes with randomly generated alternatives using a flexible and interpretable random forest model.
- Preference-driven route choice set generation
In "Preference_Choice_Sets", run `Universal_Choice_Sets.ipynb` to generate universal choice sets, `Train_Pref_Clf.ipynb` to train an MLP-based route choice preference classifier, and `Pref_Choice_Set_Generation.ipynb` to generate preference-aligned route choice sets. Finally, run `Examine_Replication.ipynb` to examine replication rates.
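
As a rough illustration of the traffic-aware road network step, the sketch below aggregates map-matched edge traversals into an hourly travel-time table. It is not the code in `Construct_Traffic-aware_Road_Net.py`. The record layout `(edge_id, entry_time, exit_time)`, the use of the median as the estimator, and the helper names `hourly_edge_travel_times` and `lookup` are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

# Hypothetical map-matched traversal records: (edge_id, entry_time, exit_time).
# The actual repository data format may differ.
traversals = [
    ("e1", "2020-01-06 08:05:00", "2020-01-06 08:05:40"),
    ("e1", "2020-01-06 08:20:00", "2020-01-06 08:21:00"),
    ("e1", "2020-01-06 08:40:00", "2020-01-06 08:40:50"),
    ("e1", "2020-01-06 14:10:00", "2020-01-06 14:10:20"),
    ("e2", "2020-01-06 08:06:00", "2020-01-06 08:06:30"),
]

FMT = "%Y-%m-%d %H:%M:%S"


def hourly_edge_travel_times(records):
    """Return {(edge_id, hour_of_day): median travel time in seconds}."""
    samples = defaultdict(list)
    for edge_id, t_in, t_out in records:
        start = datetime.strptime(t_in, FMT)
        end = datetime.strptime(t_out, FMT)
        seconds = (end - start).total_seconds()
        if seconds <= 0:
            continue  # skip records with inconsistent timestamps
        # Bin each traversal by the hour in which the vehicle entered the edge.
        samples[(edge_id, start.hour)].append(seconds)
    return {key: median(values) for key, values in samples.items()}


def lookup(table, edge_id, hour, default):
    """Travel time for an edge at a given hour, with a fallback when unobserved."""
    return table.get((edge_id, hour), default)


travel_time = hourly_edge_travel_times(traversals)
print(lookup(travel_time, "e1", 8, default=45.0))   # observed hour
print(lookup(travel_time, "e2", 14, default=45.0))  # unobserved hour, uses fallback
```

The median is used here only because it is robust to a few unusually slow traversals. Edges with no observations in a given hour need some fallback, such as a free-flow estimate, which the `default` argument stands in for.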

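
The link penalty approach named in the synthetic route generation step can be sketched in general terms as follows: repeatedly find the fastest path, then inflate the travel time of its edges so that the next search is pushed toward a different route. This is a minimal sketch of the generic technique, not the implementation in `syn_routes_lkpen.py`. The graph layout, the function names, and the `n_iter` and `penalty` parameters are assumptions made for this example.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra over graph = {node: {neighbor: travel_time}}; returns a node list or None."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    # Walk predecessors back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def link_penalty_routes(graph, source, target, n_iter=5, penalty=1.5):
    """Collect distinct routes by penalizing the edges of each path found."""
    work = {u: dict(nbrs) for u, nbrs in graph.items()}  # leave the input untouched
    routes = []
    for _ in range(n_iter):
        path = shortest_path(work, source, target)
        if path is None:
            break
        if path not in routes:
            routes.append(path)
        # Inflate the travel time of every edge on the path just found.
        for u, v in zip(path, path[1:]):
            work[u][v] *= penalty
    return routes


# Hypothetical toy network with travel times in seconds.
graph = {
    "A": {"B": 60.0, "C": 90.0},
    "B": {"D": 60.0},
    "C": {"D": 50.0},
    "D": {},
}
routes = link_penalty_routes(graph, "A", "D", n_iter=3, penalty=2.0)
print(routes)
```

In a traffic-aware setting, the edge weights would presumably come from the hourly travel-time estimates for the departure hour rather than from fixed values.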