DOI: https://doi.org/10.15368/theses.2018.79
Available at: https://digitalcommons.calpoly.edu/theses/1849
Date of Award
6-2018
Degree Name
MS in Computer Science
Department/Program
Computer Science
Advisor
Brian Granger
Abstract
With the emergence of big data, scientific data analysis and visualization (DAV) tools have become critical components of the data science software ecosystem, and their usability is increasingly important for facilitating next-generation scientific discoveries. JupyterLab has been considered one of the best polyglot, web-based, open-source data science tools. As the next-generation extensible interface for the classic IPython Notebook, it supports interactive data science and scientific computing across multiple programming languages with good performance. Despite these advantages, previous heuristic evaluation studies have shown that JupyterLab has significant flaws in data visualization. The current DAV system in JupyterLab relies heavily on users’ familiarity with specific visualization libraries, and does not support the visual information-seeking mantra of “overview first, zoom and filter, then details-on-demand”. These limitations often create a workflow bottleneck at the start of a project.
In this thesis, we present ‘JupyterLab_Voyager’, an extension for JupyterLab that improves the usability of the DAV system. It provides a graphical user interface (GUI) for data visualization operations and couples faceted browsing with visualization recommendation to support exploration of multivariate, tabular data. The plugin works with various types of datasets in the JupyterLab ecosystem; with it, users can perform a high-level graphical analysis of the fields in a dataset without writing code and without leaving the JupyterLab environment. It helps analysts learn about a dataset and supports both open-ended exploration and targeted question answering. User testing and evaluation demonstrated that this implementation has good usability and significantly improves the DAV system in JupyterLab.