Title
Exploring Mathematical Strategies for Finding Hidden Features in Multi-Dimensional Big Datasets
Recommended Citation
October 1, 2016.
Abstract
Advances in brighter sources and larger, faster detectors are causing the amount of data generated at national user facilities such as SLAC to increase exponentially. Humans have a superb ability to recognize patterns in complex and noisy data, and therefore data are still curated and analyzed by humans. However, the human brain cannot keep up with the accelerating pace of data generation, and as a consequence the rate of new discoveries has not kept pace with the rate of data creation. New procedures to quickly assess and analyze the data are therefore needed. Machine learning approaches are effective in reducing the complexity of data and finding hidden trends and contrasts in large datasets. The primary goal of this project is to develop a new algorithm that applies recent advances in image processing, machine learning techniques, and different distance metrics, such as Euclidean, Manhattan, and cosine, to a large amount of diffraction data collected at a synchrotron beamline in high-throughput experimentation. The new algorithm enables on-the-fly, near-real-time analysis and extraction of hidden features from a large multi-dimensional dataset with minimal computational cost and human intervention. When applied to a large number of x-ray diffraction patterns, the algorithm can be used to find structural phase boundaries, leading to the discovery of the composition-structure relationship, which is often an end goal of materials science experiments.
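As an illustration of the three distance metrics named in the abstract, the following is a minimal sketch and not the project's algorithm. It assumes each diffraction pattern has been reduced to a one-dimensional intensity array of equal length; the function names and the synthetic Gaussian "patterns" are hypothetical and exist only for this example.

```python
import numpy as np


def euclidean_distance(a, b):
    """Straight-line (L2) distance between two intensity arrays."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))


def manhattan_distance(a, b):
    """Sum of absolute differences (L1) between two intensity arrays."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sum(np.abs(a - b)))


def cosine_distance(a, b):
    """One minus the cosine of the angle between two intensity arrays."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return float(1.0 - np.dot(a, b) / denom)


def pairwise_distances(patterns, metric):
    """Symmetric matrix of distances between every pair of patterns."""
    n = len(patterns)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = metric(patterns[i], patterns[j])
    return dist


# Synthetic stand-ins for diffraction patterns (assumption: not real data).
# Three patterns have a peak near x = 3 and three have a peak near x = 7,
# mimicking two structural phases along a composition spread.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
centers = [3.0, 3.0, 3.0, 7.0, 7.0, 7.0]
patterns = np.array(
    [np.exp(-((x - c) ** 2) / 0.1) + 0.01 * rng.random(x.size) for c in centers]
)

for name, metric in [
    ("Euclidean", euclidean_distance),
    ("Manhattan", manhattan_distance),
    ("cosine", cosine_distance),
]:
    print(name)
    print(np.round(pairwise_distances(patterns, metric), 3))
```

In a sketch like this, patterns from the same phase give small distances to one another and large distances to patterns from the other phase, so a jump in distance between neighboring compositions is one way a phase boundary could show up.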
Disciplines
Algebra | Logic and Foundations | Numerical Analysis and Scientific Computing | Plasma and Beam Physics
Mentor
Fang Ren, Apurva Mehta
Lab site
SLAC National Accelerator Laboratory (SLAC)
Funding Acknowledgement
This material is based upon work supported by the National Science Foundation through the Robert Noyce Teacher Scholarship Program under grant #1546150. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. The research was made possible by the California State University STEM Teacher Researcher Program.
Included in
Algebra Commons, Logic and Foundations Commons, Numerical Analysis and Scientific Computing Commons, Plasma and Beam Physics Commons
URL: https://digitalcommons.calpoly.edu/star/412