1 G. Brockman, "OpenAI Gym"
2 A. Weinstein, "Open-Loop planning in Large-Scale Stochastic Domains" 1436-1442, 2013
3 D. Silver, "Mastering the game of Go with deep neural networks and tree search" 529: 484-489, 2016
4 D. Hafner, "Learning Latent Dynamics for Planning from Pixels" 2019
5 M. P. Deisenroth, "Gaussian processes for data-efficient learning in robotics and control" 37: 408-423, 2015
6 M. Lewis, "Deal or no deal? end-to-end learning of negotiation dialogues" 2443-2453, 2017
7 M. Kobilarov, "Cross-entropy motion planning" 31: 855-871, 2012
8 N. Lipovetzky, "Classical Planning with Simulators: Results on the Atari Video Games" 1610-1616, 2015
9 L. Kocsis, "Bandit based Monte-Carlo planning" 282-293, 2006
10 P. Boer, "A tutorial on the cross-entropy method" 134: 19-67, 2005
11 C. Browne, "A survey of Monte Carlo tree search methods" 4: 1-43, 2012