Date of Award


Degree Name

MS in Computer Science


Computer Science


College of Engineering


Alexander Dekhtyar

Advisor Department

Computer Science

Advisor College

College of Engineering


Over the past two decades there has been a rapid decline in public oversight of state and local governments. From 2003 to 2014, the number of journalists assigned to cover proceedings in state houses declined by more than 30%. During the same period, non-profit projects such as Digital Democracy sought to collect and store legislative bill and hearing information on behalf of the public. More recently, AI4Reporters, an offshoot of Digital Democracy, seeks to actively summarize interesting legislative data.

This thesis presents STRAINER, a project parallel to AI4Reporters, as an active data retrieval and filtering system for surfacing newsworthy legislative data. Within STRAINER we define and implement a processing pipeline by which information about legislative bill discussion events can be collected from a variety of sources and aggregated into feature sets suitable for machine learning. Using two independent labeling techniques, we trained a variety of SVM and Logistic Regression models to predict the newsworthiness of bill discussions that took place in the California State Legislature during the 2017-2018 session. We found that our models were able to correctly retrieve more than 80% of newsworthy discussions.
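
The share of newsworthy discussions that a model retrieves is commonly reported as recall, and the sketch below shows how such a figure could be computed. It assumes that interpretation of the result above; the function name `recall` and the labels are hypothetical, made up for illustration, and are not taken from the thesis data or code.

```python
def recall(y_true, y_pred):
    """Fraction of truly newsworthy items (label 1) that were also predicted 1."""
    true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_positives = sum(1 for t in y_true if t == 1)
    if actual_positives == 0:
        raise ValueError("recall is undefined when there are no positive labels")
    return true_positives / actual_positives


# Made-up labels for ten hypothetical bill discussions (1 = newsworthy).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# Four of the five newsworthy discussions are retrieved.
example_recall = recall(y_true, y_pred)
```

Recall ignores false alarms (the sixth discussion here is flagged but not newsworthy), so it is usually read alongside precision when judging a filtering system.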