Available at: https://digitalcommons.calpoly.edu/theses/3246
Date of Award
3-2026
Degree Name
MS in Computer Science
Department/Program
Computer Science
College
College of Engineering
Advisor
Jonathan Ventura
Advisor Department
Computer Science
Advisor College
College of Engineering
Abstract
Novel view synthesis (NVS) aims to generate images of a scene from unseen camera viewpoints. Recent work, such as Stable Virtual Camera, shows that large-scale image diffusion models like Stable Diffusion can be adapted for pose-conditioned view synthesis by combining video-generation techniques with camera conditioning. In this thesis, we introduce MVFlow, a new NVS model that extends this approach to a different image generation architecture: a flow-matching diffusion transformer, specifically FLUX.1, which has demonstrated strong performance in image synthesis. We evaluate MVFlow under varying input-view counts and pose-distance settings. Our results show that this architectural transfer is feasible; however, the current architecture and conditioning strategy do not yet match the performance of a mature baseline such as Stable Virtual Camera.