Available at: https://digitalcommons.calpoly.edu/theses/3231
Date of Award
3-2026
Degree Name
MS in Statistics
Department/Program
Statistics
College
College of Science and Mathematics
Advisor
Kelly Bodwin
Advisor Department
Statistics
Advisor College
College of Science and Mathematics
Abstract
Semantic segmentation of eelgrass from drone imagery is crucial for coastal habitat monitoring, restoration, and management, as these habitats continue to change rapidly due to climate change and human influence. Reliably generalizing a deployed classification model, however, requires both high-accuracy segmentation and robust uncertainty quantification that holds up when conditions change across years or locations. Conformal prediction (CP) converts a classifier's output into prediction sets with a guaranteed average coverage level for in-distribution data, but the “vanilla” conformal score often under-covers in hard or out-of-distribution (OOD) regions under drift. Motivated by difficulty-adaptive conformal approaches, this study investigates normalizing conformal scores by ensemble disagreement, a proxy for epistemic uncertainty. Nonconformity scores are rescaled by an estimate of local difficulty learned on the calibration set, which yields a locally adaptive score for which the global quantile is conservative where OOD risk is high. Two normalization methods are evaluated: a parametric linear normalization and a nonparametric normalization. On drone imagery of the Morro Bay estuary, California (2018-2022), models were trained on 2018-2021 data, calibrated on 2021 data, and evaluated on OOD points from 2022. Normalized scores recover all or almost all of the coverage lost by vanilla split conformal prediction, increase the percentage of singletons, and reduce the variability in spatial coverage. This work provides an easy-to-implement, drop-in framework for regaining coverage in challenging regions and on out-of-distribution data, without retraining or labels at test time.
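As a rough illustration of the idea in the abstract, the following minimal sketch shows split conformal prediction with a disagreement-normalized score. It is not the thesis's implementation: the abstract does not give the exact form of the normalization, so the linear scale `1 + lam * disagreement`, the parameter `lam`, the base score `1 - p(true class)`, and all function names here are assumptions chosen for illustration. The `disagreement` input is assumed to be a nonnegative per-pixel value, for example the spread of predicted probabilities across ensemble members.

```python
import numpy as np


def conformal_quantile(scores, alpha):
    """Finite-sample split-conformal quantile of the calibration scores."""
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(scores, level, method="higher")


def normalized_scores(probs, labels, disagreement, lam=1.0):
    """Base score 1 - p(true class), divided by an assumed linear
    difficulty scale 1 + lam * disagreement (hypothetical form)."""
    base = 1.0 - probs[np.arange(len(labels)), labels]
    return base / (1.0 + lam * disagreement)


def prediction_sets(probs, disagreement, q_hat, lam=1.0):
    """Boolean mask of shape (n_points, n_classes): True where the class
    is in the prediction set. Higher disagreement loosens the threshold,
    so sets grow where the ensemble is less certain."""
    scale = (1.0 + lam * disagreement)[:, None]
    return (1.0 - probs) / scale <= q_hat


# Toy, made-up calibration data (two classes, four pixels).
cal_probs = np.array([[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.7, 0.3]])
cal_labels = np.array([0, 0, 1, 1])
cal_disagreement = np.array([0.05, 0.20, 0.10, 0.40])

q_hat = conformal_quantile(
    normalized_scores(cal_probs, cal_labels, cal_disagreement), alpha=0.25
)

# Toy test pixels: same probabilities, different ensemble disagreement.
test_probs = np.array([[0.8, 0.2], [0.8, 0.2]])
test_disagreement = np.array([0.0, 1.0])
print(prediction_sets(test_probs, test_disagreement, q_hat))
```

Setting `lam=0` reduces the sketch to vanilla split conformal prediction with a single global threshold. With `lam > 0`, the same calibrated quantile translates into a looser per-pixel threshold wherever disagreement is high, which is the sense in which the global quantile becomes conservative in difficult regions.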
Included in
Applied Statistics Commons, Artificial Intelligence and Robotics Commons, Data Science Commons, Environmental Monitoring Commons