Date of Award


Degree Name

MS in Civil and Environmental Engineering


Civil and Environmental Engineering


Anurag Pande


This thesis describes the development and evaluation of real-time crash risk assessment models for four freeway corridors: US-101 northbound (NB) and southbound (SB), and I-880 NB and SB. Crash data for these freeway segments for the 16-month period from January 2010 through April 2011 are used to link historical crash occurrences with real-time traffic patterns observed through loop detector data.

The analysis techniques adopted for this study are logistic regression and classification trees, which are among the most common data mining tools. The crash risk assessment models are developed using a binary classification approach (crash and non-crash outcomes), with traffic parameters measured at surrounding vehicle detection station (VDS) locations as the independent variables. The classification performance assessment methodology accounts for the rarity of crashes relative to non-crash cases in the sample, rather than relying on the more common classification based on a pre-specified threshold.
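To illustrate why a pre-specified threshold can be unsuitable when crashes are rare, the following minimal Python sketch flags the highest-risk fraction of observations instead of applying a fixed cutoff such as 0.5. The abstract does not state the exact procedure used in the thesis, so the ranking rule, the function names, and the toy scores below are all assumptions made for illustration only.

```python
def flag_top_fraction(scores, fraction):
    """Mark the highest-scoring `fraction` of cases as predicted crashes.

    Hypothetical helper: ranking by predicted risk is an assumed stand-in
    for the rarity-aware assessment mentioned in the abstract.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n_flag = max(1, round(len(scores) * fraction))
    # Indices sorted from highest to lowest predicted risk.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = [False] * len(scores)
    for i in order[:n_flag]:
        flagged[i] = True
    return flagged


def sensitivity_and_false_alarm_rate(flagged, labels):
    """Return (share of crashes flagged, share of non-crashes flagged)."""
    tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    fp = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg


# Made-up predicted crash probabilities and outcomes (1 = crash).
scores = [0.02, 0.30, 0.05, 0.01, 0.12, 0.03, 0.08, 0.04, 0.25, 0.06]
labels = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]

# A fixed 0.5 cutoff flags nothing, because rare events yield low probabilities.
fixed_cutoff = [s >= 0.5 for s in scores]

# Flagging the top 20% of cases by predicted risk identifies one of the two crashes.
flagged = flag_top_fraction(scores, 0.2)
print(sensitivity_and_false_alarm_rate(flagged, labels))
```

In this toy sample no predicted probability reaches 0.5, so the fixed cutoff would classify every case as a non-crash, while the ranking rule still separates higher-risk from lower-risk conditions.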

Before the models were developed, data-related issues such as data cleaning and aggregation were addressed. The modeling efforts showed that turbulence, in terms of speed variation, is significantly associated with crash risk on the US-101 NB corridor. The models estimated with data from US-101 NB were evaluated on their classification performance, not only on US-101 NB but also on the other three freeways, to assess transferability. It was found that the predictive model derived from one freeway can be readily applied to other freeways, although the classification performance decreases. The models that transferred best to other roadways were those that use the fewest VDSs, that is, one upstream and one downstream station rather than two or three.
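As a rough illustration of how speed variation at a detector station might be quantified, the sketch below computes the coefficient of variation of speed (standard deviation divided by mean) over one aggregation window. The abstract does not say which measure of speed variation or which aggregation interval the thesis uses, so the choice of this statistic, the function name, and the sample readings are assumptions.

```python
from statistics import mean, pstdev


def speed_coefficient_of_variation(speeds):
    """Coefficient of variation of the speed readings in one window.

    Assumed measure of traffic turbulence; the thesis may define speed
    variation differently.
    """
    avg = mean(speeds)
    if avg == 0:
        raise ValueError("mean speed is zero; coefficient of variation is undefined")
    return pstdev(speeds) / avg


# Made-up speed readings (mph) from one station within a single window.
smooth_flow = [60, 62, 58, 61, 59]
turbulent_flow = [60, 35, 55, 20, 50]

print(speed_coefficient_of_variation(smooth_flow))
print(speed_coefficient_of_variation(turbulent_flow))
```

A larger value indicates more turbulent flow at that station, which is the kind of traffic parameter that could enter a crash risk model as an independent variable.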

The classification accuracy of the models is discussed in terms of how they can be used for real-time crash risk assessment. This may help authorities responsible for freeway segments with newly installed traffic surveillance equipment, since real-time crash risk assessment models from nearby freeways with existing infrastructure would be able to provide a reasonable estimate of crash risk. These models can also be applied to developing and testing variable speed limit (VSL) and ramp metering strategies that proactively attempt to reduce crash risk.

The robustness of the model output is assessed by location, time of day, and day of week. The analysis shows that at some locations the models may require further learning because of higher-than-expected false positive rates (e.g., at the I-680/I-280 interchange on US-101 NB) or false negative rates. This approach to post-processing the model results provides ideas for refining the model before or during implementation.