Available at: https://digitalcommons.calpoly.edu/theses/1228
Date of Award
MS in Computer Science
Many modern-day Bioinformatics algorithms rely heavily on statistical models to analyze their biological data. Some of these statistical models lend themselves nicely to standard high performance computing optimizations such as parallelism, while others do not. One such algorithm is Markov Chain Monte Carlo (MCMC). In this thesis, we present a heterogeneous compute solution for optimizing GenSel, a genetic selection analysis tool. GenSel utilizes a MCMC algorithm to perform Bayesian inference using Gibbs sampling.
Optimizing an MCMC algorithm is a difficult problem because it is inherently sequential, containing a loop carried dependence between each Markov Chain iteration. The optimization presented in this thesis utilizes GPU computing to exploit the data-level parallelism within each of these iterations. In addition, it allows for the efficient management of memory, the pipelining of CUDA kernels, and the use of multiple GPUs. The optimizations presented show performance improvements of up to 1.84 times that of the original algorithm.