Master's Theses

Comparison of Modern Controls and Reinforcement Learning for Robust Control of Autonomously Backing Up Tractor-Trailers to Loading Docks

Journey McDowell, California Polytechnic State University, San Luis Obispo

DOI: https://doi.org/10.15368/theses.2019.117
Available at: https://digitalcommons.calpoly.edu/theses/2100

Date of Award

11-2019

Degree Name

MS in Mechanical Engineering

Department/Program

Mechanical Engineering

College

College of Engineering

Advisor

Charles Birdsong

Advisor Department

Mechanical Engineering

Advisor College

College of Engineering

Abstract

The performance of two controllers is assessed for generalization in the path-following task of autonomously backing up a tractor-trailer. Starting from random locations and orientations, paths are generated to loading docks with arbitrary pose using Dubins curves. The combination vehicles can vary in wheelbase, hitch length, weight distribution, and tire cornering stiffness. The closed-form calculation of the gains for the Linear Quadratic Regulator (LQR) relies heavily on having an accurate model of the plant. However, real-world applications cannot expect to have an updated model for each new trailer. Finding alternative controllers that remain robust when the trailer model changes was the motivation for this research.
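To illustrate why LQR gains are tied to the plant model, the following is a minimal sketch that computes a discrete-time LQR gain by iterating the Riccati recursion to a fixed point. It is not the thesis's controller: the double-integrator plant, the time step `dt`, and the weights `Q` and `R` are assumptions chosen only for illustration, and the function name `dlqr_gain` is hypothetical.

```python
import numpy as np


def dlqr_gain(A, B, Q, R, tol=1e-10, max_iter=10_000):
    """Return K for u = -K x by iterating the discrete-time Riccati recursion."""
    P = Q.copy()
    for _ in range(max_iter):
        # Gain implied by the current cost-to-go matrix P.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati update: P <- Q + A'PA - A'PB (R + B'PB)^-1 B'PA.
        P_next = Q + A.T @ P @ (A - B @ K)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


# Assumed illustrative plant: a discretized double integrator (not a trailer model).
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
Q = np.eye(2)            # assumed state weights
R = np.array([[1.0]])    # assumed input weight

K = dlqr_gain(A, B, Q, R)
closed_loop_poles = np.linalg.eigvals(A - B @ K)
print("gain:", K)
print("closed-loop pole magnitudes:", np.abs(closed_loop_poles))
```

Because `K` is computed entirely from `A` and `B`, a gain designed for one plant carries no guarantee for a plant whose matrices differ, which is the sensitivity the abstract describes.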

Reinforcement learning, with neural networks as its function approximators, can allow for generalized control from learned experience that is characterized by a scalar reward value. The Linear Quadratic Regulator and the Deep Deterministic Policy Gradient (DDPG) are compared for robust control when the trailer is changed. This investigation quantifies the capabilities and limitations of both controllers in simulation using a kinematic model. The controllers are evaluated for generalization by altering the kinematic model's trailer wheelbase, hitch length, and velocity from the nominal case.
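As a rough illustration of how a scalar reward characterizes experience, reinforcement learning methods such as DDPG are trained to maximize the expected discounted sum of rewards. The sketch below computes that sum for two made-up reward sequences. The reward values, the discount factor `gamma`, and the function name `discounted_return` are assumptions for illustration and are not taken from the thesis.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * rewards[t], accumulated from the last step backward."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episodes: one collects small rewards early and then fails,
# the other accepts smaller early rewards and finishes with a large one.
short_sighted = [1.0, 1.0, -10.0]
far_sighted = [0.5, 0.5, 10.0]

print(discounted_return(short_sighted, gamma=0.9))
print(discounted_return(far_sighted, gamma=0.9))
```

Under these assumed numbers, the second sequence has the higher return even though its early rewards are lower.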

To close the gap between simulation and reality, the control methods are also assessed with sensor noise and various controller frequencies. The root mean squared and maximum errors from the path are used as metrics, along with the number of times the controllers cause the vehicle to jackknife or reach the goal. In the runs where the LQR did not cause the trailer to jackknife, it tended to have slightly better precision. DDPG, however, controlled the trailer successfully on the paths where the LQR jackknifed. Reinforcement learning was found to sacrifice short-term reward, such as precision, to maximize the expected future reward, such as reaching the loading dock. The reinforcement learning agent learned a policy that imposed nonlinear constraints such that it never jackknifed, even with trailers it had not trained on.
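The two path-error metrics named above can be sketched as follows. This is a minimal example that assumes the error is logged as a list of signed lateral offsets from the path in meters. The sample values and the function name `path_error_metrics` are hypothetical, and the thesis may compute its metrics over different error signals.

```python
import math


def path_error_metrics(errors):
    """Return (root mean squared error, maximum absolute error) for one run."""
    if not errors:
        raise ValueError("errors must contain at least one sample")
    rms = math.sqrt(sum(e * e for e in errors) / len(errors))
    max_abs = max(abs(e) for e in errors)
    return rms, max_abs


# Hypothetical signed lateral errors in meters, one per control step.
lateral_errors = [0.3, -0.4, 0.0, 0.5]
rms_error, max_error = path_error_metrics(lateral_errors)
print(f"RMS error: {rms_error:.3f} m, max error: {max_error:.3f} m")
```

The RMS value summarizes tracking precision over a whole run, while the maximum picks out the single worst deviation.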

controlTrailerKinematics.zip (7308 kB)
MATLAB Simulink Tractor-Trailer Control and Simulator
gym-truck-backerupper.zip (10 kB)
Python OpenAI Gym Tractor-Trailer Simulator
DDPG.zip (11 kB)
Deep Deterministic Policy Gradient Python Code Using TensorFlow 1.12
ModernControls_v_ReinforcementLearning.zip (19 kB)
Scripts for running comparison of controllers, saving logs, and generating reports & figures

Included in

Navigation, Guidance, Control, and Dynamics Commons
