Science graduates from many disciplines will be required to work with and analyze massive data sets. These large data sets, called “Big Data,” cannot be handled by conventional database hardware and software; they need special software and analytics tuned to the volume of data, the velocity at which it streams, and the variety of unstructured data that needs to be analyzed. Students need a place to work with big data before graduating, and hands-on use of big data will leave them better prepared for the world of work.

The “Big Data Studio” is a place to educate students in big data literacy by creating a controlled environment, a ‘sandbox’ where students can work with and analyze these data sets. A cross-disciplinary project at a polytechnic university, the Big Data Studio relies on faculty and students from the computer science department to set up, tune, and maintain the hardware and software. Statistics students and faculty will help with the analytics and R programming. Students from science, business, and other disciplines will explore, mine, and analyze the large data sets for results. Data sets might include data from the Ocean Observatories Initiative (OOI), clickstream data from online retailers, or seismic data from USArray. Students will improve their big data literacy and statistical literacy by working with the database tools in the Big Data Studio.


Library and Information Science



URL: https://digitalcommons.calpoly.edu/lib_fac/77