Available at: https://digitalcommons.calpoly.edu/theses/1958
Date of Award
MS in Electrical Engineering
As the concept of the Smart Home is being embraced globally, IoT devices such as the Amazon Echo, Google Home, and Nest Thermostat are becoming a part of more and more households. In the data-driven world we live in today, internet service providers (ISPs) and companies are collecting large amounts of data and using it to learn about their customers. As a result, it is becoming increasingly important to understand what information ISPs are capable of collecting. IoT devices in particular exhibit distinct behavior patterns and specific functionality which make them especially likely to reveal sensitive information. Collection of this data provides valuable information and can have some serious privacy implications. In this work I present an approach to fingerprinting IoT devices behind private networks while only examining last-mile internet traffic . Not only does this attack only rely on traffic that would be available to an ISP, it does not require changes to existing infrastructure. Further, it does not rely on packet contents, and therefore works despite encryption. Using a database of 64 million packets logged over 15 weeks I was able to train machine learning models to classify the Amazon Echo Dot, Amazon Echo Show, Eufy Genie, and Google Home consistently. This approach combines unsupervised and supervised learning and achieves a precision of 99.95\%, equating to one false positive per 2,000 predictions. Finally, I discuss the implication of identifying devices within a home.