Postprint version. Published in Safety Science, Volume 47, Issue 1, January 1, 2009, pages 145-154.
Copyright © 2009 Elsevier.
The definitive version is available at http://dx.doi.org/10.1016/j.ssci.2007.12.001.
Data mining is becoming increasingly popular across a wide range of fields, and the analysis of crash data is no exception. Many data mining methodologies have been applied to crash data in the recent past. However, one application conspicuously missing from the traffic safety literature until recently is association analysis, also known as market basket analysis. Retailers all over the world use this methodology to determine which items are purchased together. In this study, crashes are analyzed as supermarket transactions to detect interdependence among crash characteristics. The analysis yields simple rules that indicate which crash characteristics are associated with each other. The application is demonstrated using non-intersection crash data from the state of Florida for the year 2004. The proposed methodology does not require any variable to be designated as the dependent variable. Hence, it is useful for identifying previously unknown patterns in data obtained from large jurisdictions (such as the State of Florida), as opposed to data from a single roadway or intersection. Based on the association rules discovered in the analysis, it was concluded that there is a significant correlation between lack of illumination and high crash severity. Furthermore, it was found that under rainy conditions straight sections with vertical curves are particularly crash prone. The results are consistent with the current understanding of crash characteristics and point to the potential of this methodology for analyzing crash data collected by state and federal agencies. This potential may be realized in the form of a decision support tool for traffic safety administrators.
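To illustrate the idea of treating crashes as supermarket transactions, the following is a minimal sketch and not the study's implementation. The crash records, attribute names, thresholds, and the helper names `support` and `mine_pair_rules` are all hypothetical, and the sketch only considers rules with one item on each side, scored with the standard support, confidence, and lift measures.

```python
from itertools import combinations

# Hypothetical toy data (not from the Florida 2004 dataset): each crash is a
# "basket" whose items are attribute=value strings.
CRASHES = [
    {"light=dark_no_streetlight", "severity=severe", "weather=clear"},
    {"light=dark_no_streetlight", "severity=severe", "weather=rain"},
    {"light=daylight", "severity=minor", "weather=clear"},
    {"light=daylight", "severity=minor", "weather=rain"},
    {"light=dark_no_streetlight", "severity=minor", "weather=clear"},
    {"light=daylight", "severity=severe", "weather=clear"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_pair_rules(transactions, min_support=0.2, min_confidence=0.6):
    """Return single-item rules (lhs, rhs, support, confidence, lift).

    No attribute is designated as the dependent variable: every item may
    appear on either side of a rule. Thresholds are illustrative assumptions.
    """
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # the pair co-occurs too rarely to be of interest
        for lhs, rhs in ((a, b), (b, a)):
            # confidence = P(rhs | lhs); lift compares it with P(rhs)
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                lift = confidence / support({rhs}, transactions)
                rules.append((lhs, rhs, pair_support, confidence, lift))
    # Strongest associations (highest lift) first
    return sorted(rules, key=lambda r: r[4], reverse=True)


if __name__ == "__main__":
    for lhs, rhs, s, c, l in mine_pair_rules(CRASHES):
        print(f"{lhs} -> {rhs}  support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

On this toy data, the rule from `light=dark_no_streetlight` to `severity=severe` has support 1/3, confidence 2/3, and lift 4/3. A lift above 1 means the two characteristics co-occur more often than they would if they were independent, which is the kind of pattern the abstract describes.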
Civil and Environmental Engineering