[1] F. Chollet et al. Keras. https://github.com/fchollet/keras, 2015.
[2] E. A. O. Diallo and T. Sugawara. Coordination in Adversarial Multi-Agent with Deep Reinforcement Learning Under Partial Observability. In 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), 2019.
[3] J. Fan, Z. Wang, Y. Xie, and Z. Yang. A theoretical analysis of deep Q-learning. In A. M. Bayen, A. Jadbabaie, G. Pappas, P. A. Parrilo, B. Recht, C. Tomlin, and M. Zeilinger, editors, Proceedings of the 2nd Conference on Learning for Dynamics and Control, volume 120 of Proceedings of Machine Learning Research, pages 486–489, The Cloud, 10–11 Jun 2020. PMLR.
[4] J. N. Foerster, Y. M. Assael, N. de Freitas, and S. Whiteson. Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks. CoRR, abs/1602.02672, 2016.
[5] J. K. Gupta, M. Egorov, and M. Kochenderfer. Cooperative Multi-agent Control Using Deep Reinforcement Learning. In Autonomous Agents and Multiagent Systems, pages 66–83. Springer International Publishing, 2017.
[6] M. Hausknecht and P. Stone. Deep Recurrent Q-Learning for Partially Observable MDPs. CoRR, abs/1507.06527, 2015.
[7] M. Hessel, J. Modayil, H. van Hasselt, T. Schaul, G. Ostrovski, W. Dabney, D. Horgan, B. Piot, M. G. Azar, and D. Silver. Rainbow: Combining Improvements in Deep Reinforcement Learning. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18) New Orleans, Louisiana, USA, February 2-7, 2018, pages 3215–3222. AAAI Press, 2018.
[8] G. Lample and D. S. Chaplot. Playing FPS Games with Deep Reinforcement Learning. CoRR, abs/1609.05521, 2016.
[9] J. Z. Leibo, V. Zambaldi, M. Lanctot, J. Marecki, and T. Graepel. Multi-Agent Reinforcement Learning in Sequential Social Dilemmas. AAMAS ’17, pages 464–473, Richland, SC, 2017. International Foundation for Autonomous Agents and Multiagent Systems.
[10] T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra. Continuous control with deep reinforcement learning. CoRR, abs/1509.02971, 2015.
[11] L.-J. Lin. Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine Learning, 8(3-4):293–321, May 1992.
[12] R. Lowe, Y. Wu, A. Tamar, J. Harb, P. Abbeel, and I. Mordatch. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments. CoRR, abs/1706.02275, 2017.
[13] Y. Miyashita and T. Sugawara. Cooperation and Coordination Regimes by Deep Q-Learning in Multi-agent Task Executions. In Artificial Neural Networks and Machine Learning – ICANN 2019: Theoretical Neural Computation, 2019.
[14] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller. Playing Atari with Deep Reinforcement Learning. CoRR, abs/1312.5602, 2013.
[15] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, and D. Hassabis. Human-level control through deep reinforcement learning. Nature, 518(7540):529–533, Feb. 2015.
[16] V. Nair and G. E. Hinton. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning, ICML’10, pages 807–814, Madison, WI, USA, 2010. Omnipress.
[17] D. Portugal and R. Rocha. A survey on multi-robot patrolling algorithms. In L. M. Camarinha-Matos, editor, Technological Innovation for Sustainability, pages 139–146, Berlin, Heidelberg, 2011. Springer Berlin Heidelberg.
[18] A. Sugiyama, V. Sea, and T. Sugawara. Emergence of divisional cooperation with negotiation and re-learning and evaluation of flexibility in continuous cooperative patrol problem. Knowledge and Information Systems, 60(3):1587–1609, Dec. 2018.
[19] T. Tieleman and G. Hinton. Neural Networks for Machine Learning - Lecture 6a - Overview of mini-batch gradient descent. 2012.
[20] H. van Hasselt, A. Guez, and D. Silver. Deep Reinforcement Learning with Double Q-learning. CoRR, abs/1509.06461, 2015.
[21] Z. Wang, T. Schaul, M. Hessel, H. Van Hasselt, M. Lanctot, and N. De Freitas. Dueling Network Architectures for Deep Reinforcement Learning. In Proceedings of the 33rd International Conference on Machine Learning - Volume 48, ICML’16, pages 1995–2003. JMLR.org, 2016.
[22] C. J. C. H. Watkins and P. Dayan. Q-learning. Machine Learning, 8(3-4):279–292, May 1992.