Available at: https://digitalcommons.calpoly.edu/theses/1535
MS in Civil and Environmental Engineering
Civil and Environmental Engineering
Anurag Pande, PhD
This study develops, analyzes, and applies transit-system-specific regression tree models that identify and prioritize transit system improvements using ridership, Census, routing and scheduling, and transit stop characteristic data. A regression tree identifies and ranks the independent variables that split a dependent-variable dataset into meaningful subsets, according to how strongly each independent variable is related to the dependent variable; the resulting models can therefore be used to identify and prioritize transit system improvements. In this study, the dependent variables are ridership datatypes (i.e., boardings and alightings), and the independent variables are Census, routing and scheduling, and transit stop characteristic datatypes. Data associated with the San Luis Obispo Regional Transit Authority (RTA) is the basis of this study.
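As a rough illustration of the splitting idea described above, the sketch below finds, for each independent variable, the single binary split that most reduces the sum of squared errors of a dependent variable, then ranks the variables by that reduction. It is a minimal sketch of one common regression tree criterion and is not necessarily the procedure or software used in the thesis. The field names (`trips_per_day`, `pop_density`, `has_trash_can`, `boardings`) and all values are hypothetical, invented for illustration, and are not RTA data.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split_for_feature(rows, target, feature):
    """Return (feature, threshold, sse_reduction) for the best binary split
    of `rows` on `feature`; threshold is None if the feature never varies."""
    parent = sse([r[target] for r in rows])
    best = (feature, None, 0.0)
    distinct = sorted({r[feature] for r in rows})
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [r[target] for r in rows if r[feature] <= threshold]
        right = [r[target] for r in rows if r[feature] > threshold]
        reduction = parent - (sse(left) + sse(right))
        if reduction > best[2]:
            best = (feature, threshold, reduction)
    return best


def rank_features(rows, target, features):
    """Rank features by the SSE reduction of their best single split."""
    splits = [best_split_for_feature(rows, target, f) for f in features]
    return sorted(splits, key=lambda s: s[2], reverse=True)


# Hypothetical stop-level records (made-up values, NOT data from the study).
stops = [
    {"trips_per_day": 10, "pop_density": 500, "has_trash_can": 0, "boardings": 5},
    {"trips_per_day": 12, "pop_density": 3000, "has_trash_can": 0, "boardings": 8},
    {"trips_per_day": 14, "pop_density": 800, "has_trash_can": 1, "boardings": 7},
    {"trips_per_day": 40, "pop_density": 700, "has_trash_can": 1, "boardings": 60},
    {"trips_per_day": 45, "pop_density": 2500, "has_trash_can": 0, "boardings": 70},
    {"trips_per_day": 50, "pop_density": 2800, "has_trash_can": 1, "boardings": 65},
]

ranking = rank_features(
    stops, "boardings", ["trips_per_day", "pop_density", "has_trash_can"]
)
for feature, threshold, reduction in ranking:
    print(f"{feature}: split at {threshold}, SSE reduction {reduction:.1f}")
```

A full regression tree would apply the same search recursively to each resulting subset until a stopping rule is met; the top-ranked variable here corresponds to the first split of such a tree.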
The literature review for this study identified no other studies that use regression trees to identify or prioritize transit system improvements. The analysis method presented here can help identify and prioritize improvements to any transit system. The findings themselves may apply to other transit systems to the extent that those systems can be assumed to resemble the San Luis Obispo Regional Transit Authority system.
This study evaluates relationships between transit ridership and independent variables that may be effective predictors of it. Independent variables traditionally used to forecast transit ridership include population and employment densities, land use types, income distributions, service frequencies, and transit stop accessibility. Other independent variables that may be significant predictors include transit stop amenities and characteristics, as well as connecting and nearby infrastructure.
The data needed for the analysis presented in this study comes from four sources. Ridership data can be obtained from transit agencies. Census data is available through the United States Census Bureau. Routing and scheduling data can be extracted from local transit system schedules. Transit stop characteristic data can be gathered with a survey instrument during field visits.
The regression tree models developed in this study show a positive relationship in the RTA system between transit ridership and three groups of variables: residential population density (specifically the densities of Asian residents and of residents aged twenty to twenty-four), the number of trips serving transit stops, and transit stop characteristics (specifically the presence of a trash can). Based on these findings, this study offers recommendations for improvements to RTA’s transit system and to its marketing and planning strategies. More general conclusions, applicable to more transit systems, could be drawn if the analysis method used in this study were applied to more or larger datasets (e.g., other transit agency, regional, statewide, national, or global datasets) composed of more robust, accurate, and precise datatypes. This concept is the basis for the future work recommended by this study.