Date of Award


Degree Name

MS in Computer Science


Computer Science


Zoe Wood


Simulating light is computationally expensive. Major motion picture companies implement and use a wide variety of global illumination algorithms to render interesting and believable scenes, and every algorithm strives to balance speed against accuracy. The Point Based Approximate Color Bleeding (PBACB) algorithm is one of the most widely used algorithms in the field today. It is based on the central idea that the geometry and direct illumination of a scene can be approximated by a point cloud representation, which can then be used to generate the indirect illumination. The most basic unit of the point cloud is a surfel: a two-dimensional circle in space that contains the direct illumination for that section of space. The surfels are gathered into a tree structure, and approximations are generated for the different levels of the tree. This tree is then used to calculate the appropriate color bleeding effect to apply to the surfaces in a rendered image. The main goal of this project was to explore the possibility of applying CUDA to the PBACB global illumination algorithm. CUDA is an extension of the C/C++ programming languages that allows for parallel programming on the GPU. In this paper, we present our GPU-based implementation of the PBACB algorithm. The algorithm involves three central steps: creating a surfel point cloud, generating the spherical harmonic approximations for the point cloud, and using the surfel point cloud to generate an approximation of global illumination. For this project, CUDA was applied to two of these steps: the generation of the spherical harmonic representations and the application of the surfel point cloud to generate indirect illumination. Our final GPU algorithm obtained a 4.0 times speedup over our CPU version.
We also discuss future work, including the use of CUDA’s Dynamic Parallelism and a stack-free implementation, which could increase the speedups seen by our algorithm.