A HIERARCHICAL REINFORCEMENT LEARNING-BASED APPROACH TO MULTI-ROBOT COOPERATION FOR TARGET SEARCHING IN UNKNOWN ENVIRONMENTS

Yifan Cai, Simon X. Yang, and Xin Xu
