A Deep Reinforcement Learning Communication Jamming Resource Allocation Algorithm Fused with Noise Network

PENG Xiang; XU Hua; JIANG Lei; RAO Ning; SONG Bailin

doi:10.11999/JEIT220066

Volume 45 Issue 3

Mar. 2023


PENG Xiang, XU Hua, JIANG Lei, RAO Ning, SONG Bailin. A Deep Reinforcement Learning Communication Jamming Resource Allocation Algorithm Fused with Noise Network[J]. Journal of Electronics & Information Technology, 2023, 45(3): 1043-1054. doi: 10.11999/JEIT220066



Information and Navigation College, Air Force Engineering University, Xi’an 710077, China

Received Date: 2022-01-13
Rev Recd Date: 2022-07-12

Available Online: 2022-07-15

Publish Date: 2023-03-10

Abstract


Traditional jamming resource allocation algorithms need relatively complete prior information when solving nonlinear combinatorial optimization problems, and they support only a small decision dimension, so they cannot meet the requirements of modern communication countermeasures. To address these problems, a Deep Reinforcement Learning communication jamming resource allocation algorithm Fused with Noise Network (FNNDRL) is proposed. Drawing on the idea of the noise network, the algorithm designs a twin noise evaluation network, which avoids overestimation of the Q value and increases the randomness of the evaluation network so that exploration is maintained during training. Based on the physical significance of probability entropy, an improved strategy network loss function built on the strategy distribution entropy is designed. It maximizes both the cumulative reward and the strategy distribution entropy, which keeps strategy optimization from converging to a local optimum. Simulation results show that the proposed algorithm outperforms the average allocation and reinforcement learning methods on the jamming resource allocation problem. The algorithm is also highly stable and adapts well to high-dimensional decision spaces.
Keywords:
- Jamming resource allocation
- Deep Reinforcement Learning (DRL)
- Noise network
- Entropy of strategy distribution
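
The two ideas named in the abstract, a twin noise evaluation network and a strategy loss that includes the strategy distribution entropy, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no architecture or hyperparameters, so the factorized Gaussian noise, the layer sizes, the names `NoisyLinear`, `NoisyCritic`, `twin_target`, `policy_entropy` and `policy_loss`, the discrete strategy distribution, and the values of `sigma0`, `gamma` and `alpha` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def scaled_noise(size):
    """Noise transform f(x) = sign(x) * sqrt(|x|) on standard normal samples."""
    x = rng.standard_normal(size)
    return np.sign(x) * np.sqrt(np.abs(x))

class NoisyLinear:
    """Linear layer with weights mu + sigma * eps; eps is resampled on every call."""
    def __init__(self, n_in, n_out, sigma0=0.5):  # sigma0 is an assumed value
        bound = 1.0 / np.sqrt(n_in)
        self.w_mu = rng.uniform(-bound, bound, (n_out, n_in))
        self.b_mu = rng.uniform(-bound, bound, n_out)
        self.w_sigma = np.full((n_out, n_in), sigma0 * bound)
        self.b_sigma = np.full(n_out, sigma0 * bound)

    def __call__(self, x, noisy=True):
        if not noisy:  # deterministic pass through the mean parameters
            return x @ self.w_mu.T + self.b_mu
        eps_in = scaled_noise(self.w_mu.shape[1])
        eps_out = scaled_noise(self.w_mu.shape[0])
        w = self.w_mu + self.w_sigma * np.outer(eps_out, eps_in)
        b = self.b_mu + self.b_sigma * eps_out
        return x @ w.T + b

class NoisyCritic:
    """Small two-layer evaluation network: (state, action) -> scalar Q value."""
    def __init__(self, state_dim, action_dim, hidden=32):
        self.l1 = NoisyLinear(state_dim + action_dim, hidden)
        self.l2 = NoisyLinear(hidden, 1)

    def __call__(self, s, a, noisy=True):
        h = np.maximum(self.l1(np.concatenate([s, a], axis=-1), noisy), 0.0)
        return self.l2(h, noisy)[..., 0]

def twin_target(q1, q2, s_next, a_next, r, gamma=0.99):
    """Bootstrapped target built from the smaller of the two critics' estimates."""
    return r + gamma * np.minimum(q1(s_next, a_next), q2(s_next, a_next))

def policy_entropy(probs, eps=1e-12):
    """Shannon entropy of a discrete strategy distribution, one value per row."""
    return -np.sum(probs * np.log(probs + eps), axis=-1)

def policy_loss(q_values, probs, alpha=0.01):
    """Negative of (Q + alpha * entropy), averaged over the batch."""
    return -np.mean(q_values + alpha * policy_entropy(probs))

# Usage with random placeholder data (batch of 8, 4-dim state, 2-dim action).
q1, q2 = NoisyCritic(4, 2), NoisyCritic(4, 2)
s = rng.standard_normal((8, 4))
a = rng.standard_normal((8, 2))
r = rng.standard_normal(8)
y = twin_target(q1, q2, s, a, r)
```

Taking the minimum of two independently initialized critics is a common way to counter Q-value overestimation, and resampling the weight noise on each forward pass makes the critics' outputs stochastic, which is one way to obtain the exploration effect the abstract describes. Minimizing `policy_loss` raises both the estimated return and the entropy, with `alpha` setting the trade-off between them.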
