Available at: https://digitalcommons.calpoly.edu/theses/2894
Date of Award
9-2024
Degree Name
MS in Mechanical Engineering
Department/Program
Mechanical Engineering
College
College of Engineering
Advisor
Mohammad Hasan
Advisor Department
Mechanical Engineering
Advisor College
College of Engineering
Abstract
Deep reinforcement learning offers a promising approach for developing autonomous mobile robots that efficiently complete tasks and transport objects. Reinforcement learning continues to show strong potential in robotics applications through self-learning and biological plausibility. Despite these advancements, challenges remain in applying such machine learning techniques to dynamic environments. This thesis examines the performance of Deep Q-Networks (DQN), using images as input, for mobile robot navigation in dynamic maze puzzles and aims to contribute to deep reinforcement learning applications for simulated and real-life robotic systems; the project is a step toward implementation on a hardware-based system. The proposed approach uses a DQN algorithm with experience replay and an epsilon-greedy annealing schedule. Experiments train DQN agents in static and dynamic maze environments, and various reward functions and training strategies are explored to optimize learning outcomes. In this context, the dynamic aspect involves training the agent on fixed mazes and then testing it on modified mazes, in which obstacles such as walls alter previously optimal paths to the goal. In game play, the agent achieved a 100% win rate in both 4x4 and 10x10 static mazes, reaching the goal regardless of slip conditions. The rewards obtained during game-play episodes indicate that the agent took the optimal path in all 100 episodes of the 4x4 maze without the slip condition, and in 99 of 100 episodes of the 4x4 maze with the slip condition. Compared to the 4x4 maze, the agent more frequently chose sub-optimal paths in the larger 10x10 maze, as indicated by the number of episodes in which it maximized the rewards obtained: in the 10x10 static maze, it took the optimal path in 96 of 100 episodes without the slip condition and in 93 of 100 episodes with the slip condition. In the dynamic maze experiment, the agent solved 7 of 8 mazes with a 100% win rate in both the original and modified maze environments. The results indicate that adequate exploration, well-designed reward functions, and diverse training data significantly affected both training performance and game-play outcomes. The findings suggest that DQN approaches are plausible solutions for handling stochastic outcomes, but further research and extensions of the proposed method are needed to improve the methodology. This study highlights the need for continued work on deep reinforcement learning applications in dynamic environments.
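
For illustration only, the following is a minimal Python/PyTorch sketch of the kind of agent the abstract describes: a DQN with an experience replay buffer and a linearly annealed epsilon-greedy policy. This is not the thesis implementation; the thesis uses images as input (which would call for a convolutional network), whereas this sketch assumes a flat state vector, and the network sizes, hyperparameters, and the QNetwork/DQNAgent names are assumptions made here for brevity.

import random
from collections import deque

import torch
import torch.nn as nn
import torch.optim as optim


class QNetwork(nn.Module):
    """Small fully connected Q-network mapping a state vector to action values."""

    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, x):
        return self.net(x)


class DQNAgent:
    """Illustrative DQN agent: experience replay + linearly annealed epsilon-greedy."""

    def __init__(self, state_dim, n_actions, buffer_size=10_000, batch_size=64,
                 gamma=0.99, lr=1e-3, eps_start=1.0, eps_end=0.05,
                 eps_decay_steps=10_000):
        self.q_net = QNetwork(state_dim, n_actions)
        self.target_net = QNetwork(state_dim, n_actions)
        self.target_net.load_state_dict(self.q_net.state_dict())
        self.optimizer = optim.Adam(self.q_net.parameters(), lr=lr)
        self.replay = deque(maxlen=buffer_size)   # experience replay buffer
        self.batch_size = batch_size
        self.gamma = gamma
        self.n_actions = n_actions
        # Linear annealing schedule: explore heavily early, exploit later.
        self.eps_start, self.eps_end = eps_start, eps_end
        self.eps_decay_steps = eps_decay_steps
        self.step_count = 0

    def epsilon(self):
        frac = min(self.step_count / self.eps_decay_steps, 1.0)
        return self.eps_start + frac * (self.eps_end - self.eps_start)

    def act(self, state):
        """Epsilon-greedy action selection using the current annealed epsilon."""
        self.step_count += 1
        if random.random() < self.epsilon():
            return random.randrange(self.n_actions)
        with torch.no_grad():
            q_values = self.q_net(torch.as_tensor(state, dtype=torch.float32))
            return int(q_values.argmax().item())

    def store(self, state, action, reward, next_state, done):
        self.replay.append((state, action, reward, next_state, done))

    def learn(self):
        """Sample a minibatch from the replay buffer and take one gradient step."""
        if len(self.replay) < self.batch_size:
            return
        batch = random.sample(self.replay, self.batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        states = torch.as_tensor(states, dtype=torch.float32)
        actions = torch.as_tensor(actions, dtype=torch.int64).unsqueeze(1)
        rewards = torch.as_tensor(rewards, dtype=torch.float32)
        next_states = torch.as_tensor(next_states, dtype=torch.float32)
        dones = torch.as_tensor(dones, dtype=torch.float32)

        # Q(s, a) for the actions actually taken.
        q_sa = self.q_net(states).gather(1, actions).squeeze(1)
        # Bootstrapped target from the (periodically synced) target network.
        with torch.no_grad():
            next_q = self.target_net(next_states).max(1).values
            target = rewards + self.gamma * next_q * (1 - dones)
        loss = nn.functional.smooth_l1_loss(q_sa, target)

        self.optimizer.zero_grad()
        loss.backward()
        self.optimizer.step()

    def sync_target(self):
        """Periodically copy the online weights into the target network."""
        self.target_net.load_state_dict(self.q_net.state_dict())

In a training loop, the agent would call act() to step through the maze, store() to record each transition, learn() every step, and sync_target() every few hundred steps; reward shaping and maze layouts would follow the experimental setup described in the thesis.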