Available at: https://digitalcommons.calpoly.edu/theses/3283
Date of Award
6-2026
Degree Name
MS in Statistics
Department/Program
Statistics
College
College of Science and Mathematics
Advisor
Kelly Bodwin
Advisor Department
Statistics
Advisor College
College of Science and Mathematics
Abstract
Unsupervised clustering algorithms are used across a wide variety of fields, such as biology, engineering, and industry, to group observations when labels are not provided. The resulting clusters can reveal important latent information about the observations within each group, as well as insight into the groups themselves. Many methods exist for judging the optimal number of clusters for an unsupervised clustering algorithm, such as the Elbow Method and the Silhouette Score; however, these methods have drawbacks and are not necessarily flexible enough to apply across many unsupervised methods. We present a novel clustering score framework based on a resampling approach that classifies pairs of points as either “rivals” or “buddies” and uses these classifications to construct a score for a given number of clusters. We then present results demonstrating the properties of this metric on various simulated and real datasets, and we also measure the performance of the algorithm.