Date of Award

6-2026

Degree Name

MS in Statistics

Department/Program

Statistics

College

College of Science and Mathematics

Advisor

Kelly Bodwin

Advisor Department

Statistics

Advisor College

College of Science and Mathematics

Abstract

Unsupervised clustering often faces the challenge of determining the correct number of clusters in the absence of a true target variable. Traditional methods such as the Elbow Method and the Silhouette Score can produce ambiguous results and rely on assumptions about cluster shape or separation. To address this, we created the Clustering Rivals and Buddies (CRAB) algorithm, which evaluates clusters based on their stability across multiple subsamples. CRAB uses pairwise classifications to identify points that consistently group together, called “Buddies,” and points that remain separated, called “Rivals.” Applied with K-means, CRAB accurately recovers underlying cluster structures in both spherical and non-spherical datasets, even under varying sample sizes and variances. An accompanying R package implements CRAB with a tidy workflow and easy-to-follow syntax, making it accessible for research and educational use. By emphasizing stability, CRAB offers a robust alternative for selecting the number of clusters in datasets where traditional methods may fall short.

Included in

Data Science Commons
