Recurrent Neural Networks Based Wireless Network Intrusion Detection and Classification Model Construction and Optimization
-
Abstract: To improve the overall performance of wireless network intrusion detection, a Recurrent Neural Network (RNN) is used to build a wireless network intrusion detection and classification model. The imbalanced distribution of training samples in wireless network intrusion detection causes the classification model to overfit. To address this problem, the raw data are first preprocessed (cleaning, transformation, and feature selection), and a window-based instance selection algorithm is then proposed to reduce the training dataset. The network structure, activation function, and reusability of the attack classification model are optimized experimentally to obtain the final optimized model. The optimized model reaches a classification accuracy of 98.6699%, and its running time after model reuse optimization is 9.13 s. Compared with other machine learning algorithms, the proposed approach achieves good results in both classification accuracy and execution efficiency, and its overall performance is better than that of traditional intrusion detection classification models.
-
Table 1 Importance scores of the 20 most important features
Feature     Importance score    Feature     Importance score
frame.len   0.8671              RA          0.6850
SA          0.7897              Subtype     0.6506
wep.iv      0.7764              type_sub    0.6373
TA          0.7587              reason_c    0.6327
wep.icv     0.7458              wep.key     0.6161
DA          0.7365              bssid       0.5971
DS          0.7283              Pwrmgt      0.5872
Duration    0.7135              type.cck    0.5866
RSS         0.7112              Protected   0.5865
Seq         0.7100              Datarate    0.5860

Table 2 Pseudocode of SamSelect
Algorithm 1 Window-based instance selection algorithm SamSelect(DA, w)
Input: AWID training set DA, window size w
Output: sampled training set DB
initialize the normal-sample counter c = 0
for t = 1 to |DA| do:
    if Tt = normal then:
        c = c + 1
        if c ≤ w then:
            add the current sample to DB
        end if
    end if
    if Tt ≠ normal then:
        c = 0
        add the current sample to DB
    end if
end for
return DB

Table 3 Window threshold size and distribution of the sampled data
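Sample count by label      Window threshold = 5    Window threshold = 2
Normal-labeled samples     368038                  201007
Attack-labeled samples     162385                  162385

Table 4 RNN classification prediction results with a window size of 5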
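Class                          Precision (%)   Recall (%)   F-measure (%)   Number of samples
Normal traffic                 95.93           99.11        97.49           530785
Flooding attack traffic        74.16           61.47        67.22           8097
Impersonation attack traffic   22.63           4.34         7.28            20079
Injection attack traffic       99.80           99.99        99.90           16682

Table 5 RNN classification prediction results with a window size of 2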
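Class                          Precision (%)   Recall (%)   F-measure (%)   Number of samples
Normal traffic                 96.04           98.27        97.14           530785
Flooding attack traffic        69.31           66.26        67.75           8097
Impersonation attack traffic   15.95           6.40         9.14            20079
Injection attack traffic       99.63           99.99        99.81           16682

Table 6 Comparison of network unit structure optimization results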
Unit structure   Hidden layers   Hidden-layer nodes   Learning rate   Training epochs   Time (s)   Accuracy (%)
RNN              1               20                   0.001           429               649.81     95.01
GRU              1               20                   0.001           286               681.05     95.19
LSTM             1               20                   0.001           277               663.50     95.19
LSTM             2               20                   0.001           141               526.65     95.21
LSTM             3               20                   0.001           145               545.47     95.14
LSTM             2               10                   0.001           186               454.61     95.06
LSTM             2               30                   0.001           175               988.17     95.00
LSTM             2               10                   0.010           64                165.58     95.22
LSTM             2               10                   0.020           86                205.83     95.27
LSTM             2               10                   0.005           53                129.61     95.07

Table 7 Experimental comparison of classification models
Algorithm                   Accuracy (%)   Time (s)
KNN                         95.87          528.84
SVM                         94.92          6757.97
NB                          92.49          4.41
RFC                         93.27          7.93
DT                          93.19          6.43
AdaBoost                    87.43          66.97
GB                          95.14          53.13
RNN-LSTM                    98.67          1717.00
RNN-LSTM (reuse-optimized)  98.67          9.13
-
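The window-based instance selection procedure SamSelect (Algorithm 1 in Table 2) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name sam_select, the (features, label) tuple layout, and the label string "normal" are hypothetical choices made for illustration. The logic follows the pseudocode: at most w consecutive normal samples are kept, every attack sample is kept, and each attack sample resets the counter.

```python
from typing import Iterable, List, Sequence, Tuple

# Assumed sample layout (not specified in the source): (feature vector, label).
Sample = Tuple[Sequence[float], str]


def sam_select(samples: Iterable[Sample], w: int,
               normal_label: str = "normal") -> List[Sample]:
    """Keep at most w consecutive normal samples and every attack sample."""
    selected: List[Sample] = []
    c = 0  # number of consecutive normal samples seen since the last attack
    for features, label in samples:
        if label == normal_label:
            c += 1
            if c <= w:
                selected.append((features, label))
        else:
            c = 0  # an attack sample resets the window
            selected.append((features, label))
    return selected


if __name__ == "__main__":
    # Toy stream: four normal records, one attack, three normal records.
    labels = ["normal"] * 4 + ["flooding"] + ["normal"] * 3
    stream = [([float(i)], lab) for i, lab in enumerate(labels)]
    kept = sam_select(stream, w=2)
    print([lab for _, lab in kept])
```

On the toy stream with w = 2, the third and fourth normal records of the first run and the third normal record of the second run are discarded, while the attack record is always retained. This matches the pattern in Table 3, where shrinking the window reduces the number of normal samples but leaves the attack sample count unchanged.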