Muhammad Aurangzeb, Frank L. Lewis, and Manfred Huber
Keywords: Reinforcement learning, distributed RL, intelligent graph search
This paper addresses the problem of steering a swarm of autonomous agents out of an unknown graph toward a goal at an unknown location. To address this task, an ε-greedy, collaborative reinforcement learning method using only local information exchanges is introduced to balance exploration and exploitation in the unknown graph and to optimize the swarm's ability to exit the graph. The learning and routing algorithm stores, at each node, the data needed to represent a collaborative utility function built from the experiences of previous agents visiting that node, so that routing decisions improve over time. Two theorems establish the theoretical soundness of the proposed learning method and illustrate the importance of the stored information in improving routing decisions. Simulation examples show that the introduced simple rules for learning from past experience significantly outperform both random search and search based on ant colony optimization, a metaheuristic algorithm.