Master's Theses

Closing the Sim-to-Real Gap in Multirotor Swarm Learning Through Scalable Digital-Twin Learning Pipelines

Cameron R. Wolff, California Polytechnic State University, San Luis Obispo

Available at: https://digitalcommons.calpoly.edu/theses/3375

Date of Award

6-2026

Degree Name

MS in Computer Science

Department/Program

Computer Science

College

College of Engineering

Advisor

Siavash Farzan

Advisor Department

Electrical Engineering

Advisor College

College of Engineering

Abstract

Deploying coordinated multirotor swarms requires solving three coupled problems within a single experimental pipeline: (i) agile low-level flight control, (ii) decentralized multi-agent coordination, and (iii) simulation-to-hardware transfer. This thesis formulates, designs, and validates a scalable digital-twin learning pipeline for nano quadrotor reinforcement learning in NVIDIA Isaac Lab, using the Crazyflie 2.1 as the physical validation platform.

The digital twin is formulated as a deployment-oriented vehicle abstraction for nano-class quadrotors rather than a static simulation asset. It couples a collective-thrust body-rate (CTBR) control interface with first-order asymmetric actuator dynamics, thrust-coefficient and battery-ceiling randomization, aerodynamic drag, command delay, an observation-noise curriculum, and ONNX policy export. A single-agent validation path demonstrates that this formulation transfers: a hover policy trained entirely in simulation deploys zero-shot to a Crazyflie 2.1 brushless quadrotor, achieving 0.081 m mean position error across 28 hardware flights, and a trajectory-tracking policy attains 0.24 m mean error (0.27 m RMSE) with 96

The same vehicle model and control interface are then extended to decentralized swarm learning through a manager-based multi-agent environment layer designed for centralized training with decentralized execution and multi-agent proximal policy optimization. Two attention-based swarm tasks are designed on this layer: waypoint navigation with local signed-distance obstacle observations, and formation flight. Stage 1 evaluations demonstrate single-active-drone waypoint navigation with 100

The result is a reproducible baseline that connects simulation, training, policy export, standardized logging, and hardware deployment through a shared nano-quadrotor vehicle model and CTBR interface. Because the swarm policies inherit the hardware-validated digital twin and action interface, multi-agent sim-to-real transfer becomes a direct extension of the pipeline rather than a re-engineering effort, and is identified as future work.
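As an illustration of the "first-order asymmetric actuator dynamics" named in the abstract, the following is a minimal sketch of one common way to model such a lag, not the thesis's implementation. It assumes the lag acts on a normalized per-motor thrust value and that spin-up and spin-down use different time constants. The function names and all numeric values (`tau_up`, `tau_down`, `dt`) are hypothetical placeholders chosen for the example.

```python
import math


def step_motor(current, command, dt, tau_up, tau_down):
    """Advance one motor's normalized thrust by dt using a first-order lag.

    The time constant depends on direction: tau_up while the command is above
    the current value (spin-up), tau_down otherwise (spin-down).
    """
    tau = tau_up if command > current else tau_down
    # Exact discretization of dx/dt = (command - x) / tau for a constant command.
    alpha = 1.0 - math.exp(-dt / tau)
    return current + alpha * (command - current)


def simulate(command, steps, dt=0.002, tau_up=0.03, tau_down=0.06, start=0.0):
    """Hold a constant command for `steps` steps and return the state history.

    Default values are illustrative assumptions, not measured Crazyflie data.
    """
    state = start
    history = []
    for _ in range(steps):
        state = step_motor(state, command, dt, tau_up, tau_down)
        history.append(state)
    return history


if __name__ == "__main__":
    rise = simulate(command=1.0, steps=15)             # 0.03 s of spin-up
    fall = simulate(command=0.0, steps=15, start=1.0)  # 0.03 s of spin-down
    print(f"after 0.03 s spin-up:   {rise[-1]:.3f}")
    print(f"after 0.03 s spin-down: {fall[-1]:.3f}")
```

With these placeholder constants the motor covers more of a step command when speeding up than when slowing down over the same interval, which is the asymmetry a symmetric single-time-constant model would miss.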
