Cloud Reasoning Model-Based Exploration for Deep Reinforcement Learning

LI Chenxi; CAO Lei; CHEN Xiliang; ZHANG Yongliang; XU Zhixiong; PENG Hui; DUAN Liwen

doi:10.11999/JEIT170347

Volume 40 Issue 1

Jan. 2018


LI Chenxi, CAO Lei, CHEN Xiliang, ZHANG Yongliang, XU Zhixiong, PENG Hui, DUAN Liwen. Cloud Reasoning Model-Based Exploration for Deep Reinforcement Learning[J]. Journal of Electronics & Information Technology, 2018, 40(1): 244-248. doi: 10.11999/JEIT170347



doi: 10.11999/JEIT170347 cstr: 32379.14.JEIT170347

1. Institute of Command Information System, PLA University of Science and Technology, Nanjing 210007, China
2. College of Mechanical Engineering, Zhejiang University, Hangzhou 310027, China

Funds:

The Advanced Research of China Electronics Technology Group Corporation (6141B08010101), China Postdoctoral Science Foundation (2015T81081, 2016M602974), The Jiangsu Natural Science Foundation for Youths (BK20140075)

Received Date: 2017-04-18
Revision Received Date: 2017-09-30
Publish Date: 2018-01-19

Abstract

Reinforcement learning, which is self-improving and learns online, obtains a task policy by interacting with the environment. However, its trial-and-error mechanism usually requires a large number of training episodes. Knowledge includes human experience and cognition of the environment. This paper introduces qualitative rules into reinforcement learning and represents them with a cloud reasoning model, which serves as a heuristic exploration strategy to guide action selection. An empirical evaluation in the OpenAI Gym environment CartPole-v0 shows that the exploration strategy based on the cloud reasoning model significantly improves the performance of the learning process.
- Cloud reasoning
- Deep reinforcement learning
- Knowledge
- Exploration strategy
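
The idea in the abstract, qualitative rules expressed as cloud concepts that steer exploration, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives neither the rule set nor the cloud parameters, so the two rules, the (Ex, En, He) values, and the function names below are hypothetical. It assumes a standard X-condition normal cloud generator and replaces the random branch of epsilon-greedy action selection with the action of the most strongly activated rule.

```python
import math
import random


def x_condition_cloud(x, ex, en, he, rng):
    """Membership degree of x in a qualitative concept (Ex, En, He).

    X-condition normal cloud generator: the entropy En is perturbed by the
    hyper-entropy He, so repeated calls return slightly different degrees.
    """
    en_prime = rng.gauss(en, he)
    if en_prime == 0.0:
        return 1.0 if x == ex else 0.0
    return math.exp(-((x - ex) ** 2) / (2.0 * en_prime ** 2))


# Hypothetical qualitative rules for a cart-pole task (not from the paper):
# "if the pole leans left, push left (0)"; "if it leans right, push right (1)".
# Each antecedent is a cloud (Ex, En, He) over the pole angle in radians.
RULES = [
    {"concept": (-0.10, 0.05, 0.005), "action": 0},
    {"concept": (0.10, 0.05, 0.005), "action": 1},
]


def heuristic_action(pole_angle, rng):
    """Return the action of the most strongly activated rule."""
    degrees = [x_condition_cloud(pole_angle, *r["concept"], rng) for r in RULES]
    best = max(range(len(RULES)), key=lambda i: degrees[i])
    return RULES[best]["action"]


def select_action(q_values, pole_angle, epsilon, rng):
    """Epsilon-greedy selection whose exploratory branch follows the rules
    instead of sampling an action uniformly at random."""
    if rng.random() < epsilon:
        return heuristic_action(pole_angle, rng)
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

In a training loop, `select_action` would be called with the network's Q-value estimates for the current state. Annealing `epsilon` toward zero would hand control from the rules to the learned policy, although whether the paper uses such a schedule is not stated in the abstract.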


