DOI: https://doi.org/10.15368/theses.2018.152
Available at: https://digitalcommons.calpoly.edu/theses/1960
Date of Award
12-2018
Degree Name
MS in Computer Science
Department/Program
Computer Science
Advisor
Alex Dekhtyar
Abstract
Physical activity can have immediate and long-term health benefits and can reduce the risk of chronic disease. Valid measures of physical activity are needed to improve our understanding of the exact relationship between physical activity and health. Activity monitors have become a standard for measuring physical activity; accelerometers in particular are widely used in research and consumer products because they are objective, inexpensive, and practical. Previous studies have experimented with different monitor placements and classification methods. However, the majority of these methods were developed using data collected in controlled, laboratory-based settings, which is not reliably representative of real-life data. Therefore, more work is required to validate these methods in free-living settings.
For our work, 25 participants were directly observed by trained observers for two two-hour activity sessions over a seven-day span. During the sessions, the participants wore accelerometers on the wrist, thigh, and chest. In this thesis, we tested a battery of machine learning techniques, including a hierarchical classification schema and a confusion matrix boosting method, to predict activity type, activity intensity, and sedentary time in one-second intervals. To do this, we created a dataset containing almost 100 hours' worth of observations from three sets of accelerometer data: an ActiGraph wrist monitor, a BioStampRC thigh monitor, and a BioStampRC chest monitor. Random forest and k-nearest neighbors are shown to consistently perform the best among our traditional machine learning techniques. In addition, we reduce the severity of error from our traditional random forest classifiers on some monitors using a hierarchical classification approach, and combat the imbalanced nature of our dataset using a multi-class (confusion matrix) boosting method. Of the three monitors, our models most accurately predict activity using either or both of the BioStamp accelerometers (with the exception of the chest BioStamp when predicting sedentary time). Our results show that we outperform previous methods while still predicting behavior at a more granular level.
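As a rough illustration of the hierarchical classification idea mentioned above, the following minimal sketch first predicts a coarse activity group and then predicts a fine-grained activity only among training examples in that group, so a mistake at the second stage stays within the right group. Everything specific here is an assumption made for illustration and is not taken from the thesis: the two-level label hierarchy in `PARENT`, the two-dimensional toy feature vectors, the helper names, and the use of a small hand-written k-nearest-neighbors classifier at both stages (the thesis applies the hierarchical approach to random forest classifiers).

```python
import math
from collections import Counter

# Hypothetical two-level label hierarchy (NOT the schema used in the thesis).
PARENT = {
    "sitting": "sedentary",
    "lying": "sedentary",
    "walking": "active",
    "running": "active",
}


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(
        range(len(train_x)), key=lambda i: math.dist(train_x[i], query)
    )[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def hierarchical_predict(train_x, train_y, query, k=3):
    """Predict (coarse group, fine activity) for one feature vector."""
    # Stage 1: classify into a coarse group using every training example.
    coarse_y = [PARENT[label] for label in train_y]
    group = knn_predict(train_x, coarse_y, query, k)

    # Stage 2: classify the fine activity using only examples in that group.
    subset = [(x, y) for x, y in zip(train_x, train_y) if PARENT[y] == group]
    sub_x = [x for x, _ in subset]
    sub_y = [y for _, y in subset]
    return group, knn_predict(sub_x, sub_y, query, k)


# Made-up per-second feature vectors, chosen only so the classes separate.
TRAIN_X = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),  # sitting
    (0.0, 1.0), (0.1, 1.0), (0.0, 1.1),  # lying
    (5.0, 0.0), (5.1, 0.0), (5.0, 0.1),  # walking
    (5.0, 1.0), (5.1, 1.0), (5.0, 1.1),  # running
]
TRAIN_Y = (
    ["sitting"] * 3 + ["lying"] * 3 + ["walking"] * 3 + ["running"] * 3
)

if __name__ == "__main__":
    for query in [(0.05, 0.05), (4.9, 0.2), (5.05, 0.95)]:
        print(query, hierarchical_predict(TRAIN_X, TRAIN_Y, query))
```

In a sketch like this, the per-stage classifier is interchangeable; the point is only the two-stage structure, in which the second-stage prediction is restricted to activities belonging to the group chosen at the first stage.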
Included in
Other Computer Sciences Commons, Other Public Health Commons, Software Engineering Commons