
Analysis of Coordination Structures of Partially Observing Cooperative Agents by Multi-Agent Deep Q-Learning

Smith Ken, Waseda University

2021.03.15

Abstract

We compare the coordination structures of agents using different types of inputs for their deep Q-networks (DQNs) by having agents play a distributed task execution game. The efficiency and performance of many multi-agent systems can be significantly affected by the coordination structures formed by agents. One important factor that may affect these structures is the information provided to an agent's DQN. In this study, we analyze the differences in coordination structures in an environment involving walls that obstruct visibility and movement. Additionally, we introduce a new DQN input, which performs better than past inputs in a dynamic setting. Experimental results show that agents whose DQN inputs include their absolute locations exhibit a fine-grained division of labor in some settings, and that the consistency of agents' starting locations significantly affects their coordination structures and performance.
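The abstract's central comparison is between DQN input variants that do and do not include an agent's absolute location, under partial observability imposed by walls. Since the exact encoding is not given in this summary, the short Python sketch below is only an assumed illustration of how such an input vector might be assembled for a grid-world task execution game; the function name, the 5x5 visible window, and the grid dimensions are hypothetical.

import numpy as np

def build_observation(local_view, agent_pos, grid_size, include_absolute=True):
    # local_view: 2-D array of the cells the agent can currently see
    # (partial observability; walls block sight and movement).
    # agent_pos / grid_size: the agent's (row, col) and the full grid shape.
    # include_absolute: toggles the input variant the abstract compares,
    # i.e. whether the agent's absolute location is part of the DQN input.
    features = [local_view.astype(np.float32).ravel()]
    if include_absolute:
        # Normalize coordinates to [0, 1] so they share the scale of
        # the local-view features.
        features.append(np.array(agent_pos, dtype=np.float32)
                        / np.array(grid_size, dtype=np.float32))
    return np.concatenate(features)

# Example: a 5x5 visible window around an agent at (3, 7) in a 20x20 grid.
obs = build_observation(np.zeros((5, 5)), (3, 7), (20, 20))
print(obs.shape)  # (27,): 25 local cells + 2 normalized coordinates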
