MazeCov-Q: An Efficient Maze-Based Reinforcement Learning Accelerator for Coverage
Abstract: Reinforcement learning (RL) is a machine learning paradigm that does not require pre-assigned labeled data to learn. It is applied in many areas, such as robotics, games, finance, healthcare, transportation, and energy. In this paper, we present a reinforcement learning accelerator for area coverage, called MazeCov-Q (Maze-Based Coverage Q-Learning), and its implementation in a mobile robot. We define a novel state that is divided into two components, directions and visit counters, which are used for the Q-value calculation. The experimental results show that MazeCov-Q achieves more than 74% path efficiency on average. Moreover, the coverage-based Q-learning accelerator achieves 48.3 Mps and 169.05 Mps on a 50 MHz PYNQ-Z1 board and a 175 MHz ZCU104 board, respectively. This research is useful for surveillance, resource allocation, environmental monitoring, and autonomous navigation.
I. Syafalni, M. I. Firdaus, A. M. Riyadhus Ilmy, N. Sutisna and T. Adiono, “MazeCov-Q: An Efficient Maze-Based Reinforcement Learning Accelerator for Coverage,” 2023 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS), Tokyo, Japan, 2023, pp. 1-6, doi: 10.1109/COOLCHIPS57690.2023.10122120.
URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10122120&isnumber=10121921
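
Note: The abstract says the state combines directions and visit counters and that this state feeds the Q-value calculation, but it does not give the encoding or the reward. The following is a minimal software sketch of how such a state could be paired with the standard tabular Q-learning update. It is not the authors' hardware design. The names `encode_state`, `coverage_reward`, and `q_update`, the saturation limit `MAX_COUNT`, the reward values, and the defaults for `alpha` and `gamma` are all assumptions made for illustration.

```python
# Illustrative sketch only: the state encoding, reward, and constants below
# are assumptions, not details taken from the MazeCov-Q paper.

ACTIONS = ["N", "E", "S", "W"]
MOVES = {"N": (-1, 0), "E": (0, 1), "S": (1, 0), "W": (0, -1)}
MAX_COUNT = 3  # assumed: visit counters saturate at this value


def neighbour(pos, action):
    """Return the cell reached from pos by taking action."""
    dr, dc = MOVES[action]
    return (pos[0] + dr, pos[1] + dc)


def is_free(grid, cell):
    """A cell is free if it lies inside the grid and is not a wall (0 = free)."""
    r, c = cell
    return 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == 0


def encode_state(grid, visits, pos):
    """Build a two-part state: open-direction flags and saturated visit counters."""
    dirs = tuple(int(is_free(grid, neighbour(pos, a))) for a in ACTIONS)
    counts = tuple(
        min(visits.get(neighbour(pos, a), 0), MAX_COUNT) if d else 0
        for a, d in zip(ACTIONS, dirs)
    )
    return (dirs, counts)


def coverage_reward(visit_count):
    """Hypothetical reward: favour unvisited cells, penalise revisits."""
    return 1.0 if visit_count == 0 else -0.1 * visit_count


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update; q maps (state, action) to a value."""
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


if __name__ == "__main__":
    grid = [[0, 0], [1, 0]]   # 1 marks a wall
    visits = {(0, 0): 1}      # the robot starts at (0, 0)
    q = {}
    s = encode_state(grid, visits, (0, 0))
    nxt = neighbour((0, 0), "E")
    r = coverage_reward(visits.get(nxt, 0))
    visits[nxt] = visits.get(nxt, 0) + 1
    s_next = encode_state(grid, visits, nxt)
    print(s, s_next, q_update(q, s, "E", r, s_next))
```

Saturating the counters keeps the state space finite, which is one plausible reason a table-based hardware implementation would bound them. Whether MazeCov-Q does so is not stated in the abstract.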