Available at: https://digitalcommons.calpoly.edu/theses/3341
Date of Award
6-2026
Degree Name
MS in Electrical Engineering
Department/Program
Electrical Engineering
College
College of Engineering
Advisor
Ria Kanjilal
Advisor Department
Electrical Engineering
Advisor College
College of Engineering
Abstract
This thesis investigates deep learning approaches for affect recognition using wearable physiological signals and facial image data. The sensor-based component evaluates stress and affect recognition on the WESAD dataset using wrist-based physiological windows and examines multiple temporal modeling strategies, including convolutional, recurrent, hybrid CNN-LSTM, attention-based, ensemble, and time-frequency approaches.
The image-based component evaluates hard-label facial expression recognition on AffectNet+ using pre-trained ResNet-50, EfficientNet-B3, and ConvNeXt-Tiny architectures across Easy, Challenging, and Difficult subsets representing different levels of expression ambiguity.
Experimental results show that the proposed Multi-Branch Attention CNN-BiLSTM (MBA-CNN-BiLSTM) model achieves the strongest wearable stress- and affect-recognition performance among the evaluated WESAD models under the random sample-level evaluation setting. It reaches 99.91% binary accuracy and 98.10% three-class accuracy, reducing the error rate by 5.67 and 11.03 percentage points, respectively, relative to the reproduced 1D CNN baseline.
On AffectNet+, the best mean accuracies were 88.82% on the Easy subset (EfficientNet-B3) and 59.48% and 41.83% on the Challenging and Difficult subsets, respectively (both ConvNeXt-Tiny). These correspond to error-rate reductions of 2.96, 7.86, and 7.49 percentage points relative to the published ResNet-50 Hard-FER reference targets.
Together, these results demonstrate that physiological and visual modalities capture complementary aspects of affective state, motivating future multimodal affect-recognition systems that integrate internal physiological responses with external expressive cues.
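The error-rate reductions quoted in the abstract can be related to the accuracies with simple arithmetic. The sketch below assumes that error rate is defined as 100% minus accuracy and that each reduction is an absolute difference in percentage points, as the wording suggests. The helper name `implied_reference_accuracy` and the dictionary labels are illustrative and do not come from the thesis. The resulting reference accuracies are implied by the reported figures, not quoted from the thesis.

```python
def implied_reference_accuracy(accuracy_pct, error_reduction_pp):
    """Return the reference accuracy (in %) implied by a reported accuracy
    and an absolute error-rate reduction in percentage points.

    Assumes error rate = 100 - accuracy, so the reference error rate is
    the reported error rate plus the reduction.
    """
    error_pct = 100.0 - accuracy_pct
    reference_error_pct = error_pct + error_reduction_pp
    return round(100.0 - reference_error_pct, 2)


# (reported accuracy %, error-rate reduction in percentage points),
# taken from the abstract; the labels are illustrative only.
reported = {
    "WESAD binary": (99.91, 5.67),
    "WESAD three-class": (98.10, 11.03),
    "AffectNet+ Easy": (88.82, 2.96),
    "AffectNet+ Challenging": (59.48, 7.86),
    "AffectNet+ Difficult": (41.83, 7.49),
}

for task, (accuracy, reduction) in reported.items():
    reference = implied_reference_accuracy(accuracy, reduction)
    print(f"{task}: reported {accuracy:.2f}%, implied reference {reference:.2f}%")
```

Under these assumptions, the reproduced 1D CNN baseline would sit at roughly 94.24% binary and 87.07% three-class accuracy, and the ResNet-50 reference targets at roughly 85.86%, 51.62%, and 34.34% for the Easy, Challenging, and Difficult subsets.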