MNIH V, KAVUKCUOGLU K, SILVER D, et al. Playing Atari with deep reinforcement learning[OL]. https://arxiv.org /abs/1312.5602v1, 2013.12.
|
SUTTON R S and BARTO A G. Reinforcement Learning: An Introduction[M]. MA: MIT Press, 1998: 3-24. doi: 10.1109/ TNN.1998.712192.
|
MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human- level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533. doi: 10.1038/nature14236.
|
OSBAND I, BLUNDELL C, PRITZEL A, et al. Deep exploration via bootstrapped DQN[C]. Proceedings of the 29th Neural Information Processing Systems, Barcelona, 2016: 4026-4034.
|
BELLEMARE M, SRINIVASAN S, OSTROVSKI G, et al. Unifying count-based exploration and intrinsic motivation[C]. Proceedings of the 29th Neural Information Processing Systems, Barcelona, 2016: 1471-1479.
|
HOUTHOOFT R, CHEN X, DUAN Y, et al. VIME: Variational information maximizing exploration[C]. Proceedings of the 29th Neural Information Processing Systems, Barcelona, 2016: 1109-1117.
|
DAVENPORT T H, PRUSAK L, and PRUSAK L. Working Knowledge: How Organizations Manage What They Know [M]. Boston: Harvard Business School Press, 1997: 1-24. doi: 10.1145/347634.348775.
|
SANTOS M and BOTELLA G. Dyna-H: A heuristic planning reinforcement learning algorithm applied to role-playing game strategy decision systems[J]. Knowledge-Based Systems, 2012, 32(8): 28-36.
|
BIANCHI R A C, ROS R, and MANTARAS R L D. Improving reinforcement learning by using case based heuristics[C]. Proceedings of the International Conference on Case-Based Reasoning: Case-Based Reasoning Research and Development, Burlin, 2009: 75-89.
|
KUHLMANN G, STONE P, MOONEY R, et al. Guiding a reinforcement learner with natural language advice: Initial results in RoboCup soccer[C]. Proceedings of the 19th National Conference on Artificial Intelligence Workshop on Supervisory Control of Learning and Adaptive Systems, California, 2004: 30-35.
|
LI Deyi, CHEUNG D, SHI Xuemei, et al. Uncertainty reasoning based on cloud models in controllers[J]. Computers Mathematics with Applications, 1998, 35(3): 99-123.
|
SINGH S P. Learning to solve Markovian decision processes [D]. [Ph.D. dissertation], University of Massachusetts, Amherst, 1994: 66-72.
|
HASSELT H V, GUEZ A, and SILVER D. Deep reinforcement learning with double Q-learning[C]. Proceedings of the 30th AAAI Conference on Articial Intelligence, Phoenix, 2016: 2094-2100.
|