Available at: https://digitalcommons.calpoly.edu/theses/2894
Date of Award
9-2024
Degree Name
MS in Mechanical Engineering
Department/Program
Mechanical Engineering
College
College of Engineering
Advisor
Mohammad Hasan
Advisor Department
Mechanical Engineering
Advisor College
College of Engineering
Abstract
Deep reinforcement learning offers a promising approach for developing autonomous mobile robots that efficiently complete tasks and transport objects. Reinforcement learning continues to show strong potential in robotics applications through self-learning and biological plausibility. Despite these advancements, challenges remain in applying such machine learning techniques to dynamic environments. This thesis examines the performance of Deep Q-Networks (DQN), using images as input, for mobile robot navigation in dynamic maze puzzles and aims to contribute to deep reinforcement learning applications for simulated and real-life robotic systems; the project is a step toward implementation on a hardware-based system. The proposed approach uses a DQN algorithm with experience replay and an epsilon-greedy annealing schedule. Experiments train DQN agents in static and dynamic maze environments, and various reward functions and training strategies are explored to optimize learning outcomes. In this context, the dynamic aspect involves training the agent on fixed mazes and then testing it on modified mazes, in which obstacles such as walls alter previously optimal paths to the goal. In game play, the agent achieved a 100% win rate in both 4x4 and 10x10 static mazes, reaching the goal regardless of slip conditions. The rewards obtained during game-play episodes indicate that the agent took the optimal path in all 100 episodes of the 4x4 maze without the slip condition, and in 99 of 100 episodes of the 4x4 maze with the slip condition. Compared to the 4x4 maze, the agent more frequently chose sub-optimal paths in the larger 10x10 maze, as indicated by the number of episodes in which it maximized the rewards obtained: in the 10x10 static maze, it took the optimal path in 96 of 100 episodes without the slip condition and in 93 of 100 episodes with the slip condition. In the dynamic maze experiment, the agent solved 7 of 8 mazes with a 100% win rate in both the original and modified maze environments. The results indicate that adequate exploration, well-designed reward functions, and diverse training data significantly affected both training performance and game-play outcomes. The findings suggest that DQN approaches are plausible solutions for handling stochastic outcomes, but further research and extensions of the proposed method are needed to improve the methodology. This study highlights the need for continued work on deep reinforcement learning applications in dynamic environments.
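
For illustration only, the following is a minimal Python/PyTorch sketch of the kind of agent the abstract describes: a DQN with an experience replay buffer and a linearly annealed epsilon-greedy policy. This is not the thesis implementation; the thesis uses images as input (which would call for a convolutional network), whereas this sketch assumes a flat state vector, and the network sizes, hyperparameters, and the QNetwork/DQNAgent names are assumptions made here for brevity.

import random
from collections import deque

import torch
import torch.nn as nn
import torch.optim as optim


class QNetwork(nn.Module):
    """Small fully connected Q-network mapping a state vector to action values."""

    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, x):
        return self.net(x)


class DQNAgent:
    """Illustrative DQN agent: experience replay + linearly annealed epsilon-greedy."""

    def __init__(self, state_dim, n_actions, buffer_size=10_000, batch_size=64,
                 gamma=0.99, lr=1e-3, eps_start=1.0, eps_end=0.05,
                 eps_decay_steps=10_000):
        self.q_net = QNetwork(state_dim, n_actions)
        self.target_net = QNetwork(state_dim, n_actions)
        self.target_net.load_state_dict(self.q_net.state_dict())
        self.optimizer = optim.Adam(self.q_net.parameters(), lr=lr)
        self.replay = deque(maxlen=buffer_size)   # experience replay buffer
        self.batch_size = batch_size
        self.gamma = gamma
        self.n_actions = n_actions
        # Linear annealing schedule: explore heavily early, exploit later.
        self.eps_start, self.eps_end = eps_start, eps_end
        self.eps_decay_steps = eps_decay_steps
        self.step_count = 0

    def epsilon(self):
        frac = min(self.step_count / self.eps_decay_steps, 1.0)
        return self.eps_start + frac * (self.eps_end - self.eps_start)

    def act(self, state):
        """Epsilon-greedy action selection using the current annealed epsilon."""
        self.step_count += 1
        if random.random() < self.epsilon():
            return random.randrange(self.n_actions)
        with torch.no_grad():
            q_values = self.q_net(torch.as_tensor(state, dtype=torch.float32))
            return int(q_values.argmax().item())

    def store(self, state, action, reward, next_state, done):
        self.replay.append((state, action, reward, next_state, done))

    def learn(self):
        """Sample a minibatch from the replay buffer and take one gradient step."""
        if len(self.replay) < self.batch_size:
            return
        batch = random.sample(self.replay, self.batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        states = torch.as_tensor(states, dtype=torch.float32)
        actions = torch.as_tensor(actions, dtype=torch.int64).unsqueeze(1)
        rewards = torch.as_tensor(rewards, dtype=torch.float32)
        next_states = torch.as_tensor(next_states, dtype=torch.float32)
        dones = torch.as_tensor(dones, dtype=torch.float32)

        # Q(s, a) for the actions actually taken.
        q_sa = self.q_net(states).gather(1, actions).squeeze(1)
        # Bootstrapped target from the (periodically synced) target network.
        with torch.no_grad():
            next_q = self.target_net(next_states).max(1).values
            target = rewards + self.gamma * next_q * (1 - dones)
        loss = nn.functional.smooth_l1_loss(q_sa, target)

        self.optimizer.zero_grad()
        loss.backward()
        self.optimizer.step()

    def sync_target(self):
        """Periodically copy the online weights into the target network."""
        self.target_net.load_state_dict(self.q_net.state_dict())

In a training loop, the agent would call act() to step through the maze, store() to record each transition, learn() every step, and sync_target() every few hundred steps; reward shaping and maze layouts would follow the experimental setup described in the thesis.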