Available at: https://digitalcommons.calpoly.edu/theses/3244
Date of Award
3-2026
Degree Name
MS in Computer Science
Department/Program
Computer Science
College
College of Engineering
Advisor
John Clements
Advisor Department
Computer Science
Advisor College
College of Engineering
Abstract
Recent advancements in generative artificial intelligence have revolutionized music generation, yet research has predominantly focused on raw audio synthesis rather than on music in symbolic form, i.e., as a score. This thesis presents the first neurosymbolic model designed to generate imitative Renaissance counterpoint in symbolic (MIDI) format. By leveraging an autoregressive Transformer architecture, this research explores the capacity of deep learning models to manage independent voices and strict stylistic constraints.
We compare multiple data representation strategies, each with a distinct tokenization method. The proposed model incorporates a symbolic component that enforces fundamental contrapuntal rules. Additionally, this thesis contributes a preprocessed dataset of Renaissance polyphony to facilitate future research in polyphonic music generation. Ultimately, this work demonstrates that flattened sequence models, when combined with appropriate tokenization and symbolic constraints, are capable of generating rich, multi-layered contrapuntal music.
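To make the idea of a symbolic component concrete, the following is a minimal sketch of one way a contrapuntal rule could filter a model's candidate pitches. It is an illustration only, not the thesis's implementation: the abstract does not say which rules are enforced or how, so the choice of rule (no parallel perfect fifths or octaves between two voices) and the names `is_parallel_perfect` and `allowed_pitches` are assumptions. Pitches are MIDI note numbers.

```python
# Hypothetical sketch: the rule and all names here are assumptions,
# not taken from the thesis.

# Interval classes (semitones mod 12) treated as perfect consonances:
# 0 = unison/octave, 7 = perfect fifth.
PERFECT_INTERVALS = {0, 7}


def interval_class(pitch_a: int, pitch_b: int) -> int:
    """Return the interval between two MIDI pitches, reduced mod 12."""
    return abs(pitch_b - pitch_a) % 12


def is_parallel_perfect(prev_a: int, prev_b: int, next_a: int, next_b: int) -> bool:
    """True if both voices move in the same direction and keep the
    same perfect interval (parallel fifths or parallel octaves/unisons)."""
    before = interval_class(prev_a, prev_b)
    after = interval_class(next_a, next_b)
    # A positive product means both voices moved, and in the same direction.
    same_direction = (next_a - prev_a) * (next_b - prev_b) > 0
    return before in PERFECT_INTERVALS and before == after and same_direction


def allowed_pitches(prev_a: int, prev_b: int, next_a: int, candidates: list[int]) -> list[int]:
    """Keep only the candidate pitches for voice B that do not form a
    forbidden parallel with voice A's next pitch."""
    return [p for p in candidates if not is_parallel_perfect(prev_a, prev_b, next_a, p)]


# Voice A moves C4 -> D4 while voice B sits on G4 (a fifth above C4).
# Moving voice B to A4 would create parallel fifths, so it is removed.
remaining = allowed_pitches(prev_a=60, prev_b=67, next_a=62, candidates=[69, 65, 67])
print(remaining)  # [65, 67]
```

In an autoregressive setting, a filter of this kind could in principle be applied at each decoding step by masking the rejected pitches before sampling. Whether the thesis constrains generation this way is not stated in the abstract.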