Available at: https://digitalcommons.calpoly.edu/theses/3329
Date of Award
6-2026
Degree Name
MS in Statistics
Department/Program
Statistics
College
College of Science and Mathematics
Advisor
Kelly Bodwin
Advisor Department
Statistics
Advisor College
College of Science and Mathematics
Abstract
Unsupervised clustering often faces the challenge of determining the correct number of clusters in the absence of a true target variable. Traditional methods such as the Elbow Method and the Silhouette Score can produce ambiguous results and rely on assumptions about cluster shape or separation. To address this, we created the Clustering Rivals and Buddies (CRAB) algorithm, which evaluates clusters based on their stability across multiple subsamples. CRAB uses pairwise classifications to identify points that consistently group together, called “Buddies,” and points that remain separated, called “Rivals.” Applied with K-means, CRAB accurately recovers underlying cluster structures in both spherical and non-spherical datasets, even under varying sample sizes and variances. An accompanying R package implements CRAB with a tidy workflow and easy-to-follow syntax, making it accessible for research and educational use. By emphasizing stability, CRAB offers a robust alternative for selecting cluster numbers in datasets where traditional methods may fall short.
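The idea of judging a clustering by how consistently pairs of points group together across subsamples can be illustrated with a small sketch. This is not the CRAB algorithm or the API of its R package, neither of which is specified in the abstract. It is a minimal Python illustration under stated assumptions: the function names (`kmeans`, `co_clustering_rates`, `label_pairs`), the 80% subsample fraction, the 0.9 and 0.1 thresholds, and the farthest-first seeding of a plain Lloyd's K-means are all hypothetical choices made for this example.

```python
import random
from itertools import combinations


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd's K-means with farthest-first seeding (an assumption made
    for this sketch, not necessarily what CRAB uses). Returns one label per point."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Next seed: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def co_clustering_rates(points, k, n_subsamples=30, frac=0.8, seed=0):
    """For every pair of points that was subsampled together at least once,
    return the fraction of those subsamples in which the pair shared a cluster."""
    rng = random.Random(seed)
    n = len(points)
    together, seen = {}, {}
    for _ in range(n_subsamples):
        idx = sorted(rng.sample(range(n), int(frac * n)))
        labels = kmeans([points[i] for i in idx], k, rng)
        for (i, li), (j, lj) in combinations(zip(idx, labels), 2):
            seen[(i, j)] = seen.get((i, j), 0) + 1
            together[(i, j)] = together.get((i, j), 0) + (li == lj)
    return {pair: together[pair] / seen[pair] for pair in seen}


def label_pairs(rates, hi=0.9, lo=0.1):
    """Hypothetical thresholds: pairs that almost always co-cluster are
    'buddies', pairs that almost never do are 'rivals', the rest 'unstable'."""
    return {
        pair: "buddies" if r >= hi else "rivals" if r <= lo else "unstable"
        for pair, r in rates.items()
    }


if __name__ == "__main__":
    # Two small, well-separated groups of points (toy data for illustration).
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0), (10.5, 10.5)]
    for pair, kind in sorted(label_pairs(co_clustering_rates(pts, k=2)).items()):
        print(pair, kind)
```

In a sketch like this, a candidate number of clusters would look stable when most pairs fall clearly into the "buddies" or "rivals" group and few remain "unstable". How CRAB itself scores and compares candidate cluster numbers is described in the thesis, not here.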