1. Stochastic games, Shapley, L. S., 39(10):1095–1100, 1953
2. Collaborative Multiagent Reinforcement Learning by Payoff Propagation, Vlassis, N., Kok, J. R., 7:1789–1828, 2006
3. QSOD: Hybrid policy gradient for deep multi-agent reinforcement learning, Yi, S., Choi, G. S., Rehman, H. M. R. U., On, B. W., Ningombam, D. D., 9:129728–129741, 2021
4. Multi-agent reinforcement learning as a rehearsal for decentralized planning, Banerjee, B., Kraemer, L., 190:82–94, 2016