Volume 43 Issue 4
Apr.  2021
Hong TANG, Dan LIU, LiShuang YAO, Yunfeng WANG, Zuofei PEI. Feature Selection Algorithm for Class Imbalanced Internet Traffic[J]. Journal of Electronics & Information Technology, 2021, 43(4): 923-930. doi: 10.11999/JEIT190992

Feature Selection Algorithm for Class Imbalanced Internet Traffic

doi: 10.11999/JEIT190992
Funds:  Changjiang Scholars and Innovative Research Team in University (IRT_16R72)
  • Received Date: 2019-12-11
  • Revision Received Date: 2021-02-22
  • Available Online: 2021-03-04
  • Publish Date: 2021-04-20
  • Abstract: Class imbalance is pervasive in network traffic classification. To address it, a new feature selection algorithm based on Weighted Symmetric Uncertainty (WSU) and the Approximate Markov Blanket (AMB) is proposed. First, a feature metric biased toward minority classes is defined using class distribution information, which makes it easier to pick out features that correlate strongly with minority classes. Then, taking into account both feature-class and feature-feature correlation, WSU and AMB are used to remove irrelevant and redundant features. Finally, the feature dimension is further reduced, and the optimal feature subset determined, using a correlation-based feature evaluation function combined with a sequential search algorithm. Experimental results show that the algorithm effectively improves classification performance on minority classes without sacrificing overall classification accuracy.
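The first two steps named in the abstract (relevance filtering with a class-weighted symmetric uncertainty, then redundancy removal with an approximate Markov blanket) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the WSU formula, so the inverse-class-frequency weighting in `class_balancing_weights`, the `threshold` value, and all function and feature names are hypothetical. The symmetric uncertainty SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)) and the approximate Markov blanket test (a kept feature F_i makes F_j redundant when SU(F_i, C) >= SU(F_j, C) and SU(F_i, F_j) >= SU(F_j, C)) follow their standard definitions. The final sequential-search step is not covered.

```python
from collections import Counter
from math import log2


def entropy(values, weights):
    """Weighted Shannon entropy (in bits) of a discrete sequence."""
    total = sum(weights)
    mass = Counter()
    for v, w in zip(values, weights):
        mass[v] += w
    return -sum((m / total) * log2(m / total) for m in mass.values() if m > 0)


def symmetric_uncertainty(x, y, weights):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x, weights), entropy(y, weights)
    if hx + hy == 0:
        return 0.0
    hxy = entropy(list(zip(x, y)), weights)  # joint entropy H(X, Y)
    return 2.0 * (hx + hy - hxy) / (hx + hy)


def class_balancing_weights(labels):
    """ASSUMPTION: weight each sample by 1 / (size of its class), so every
    class carries equal total mass. The paper's actual weighting may differ."""
    counts = Counter(labels)
    return [1.0 / counts[c] for c in labels]


def select_features(features, labels, threshold=0.1):
    """features: dict mapping a feature name to its list of discrete values.
    threshold is a hypothetical relevance cut-off."""
    w = class_balancing_weights(labels)
    relevance = {name: symmetric_uncertainty(col, labels, w)
                 for name, col in features.items()}
    # Step 1: drop irrelevant features, rank the rest by relevance.
    ranked = sorted((n for n in relevance if relevance[n] > threshold),
                    key=lambda n: relevance[n], reverse=True)
    # Step 2: approximate Markov blanket. A candidate is redundant if an
    # already kept (more relevant) feature correlates with it at least as
    # strongly as the candidate correlates with the class.
    selected = []
    for cand in ranked:
        redundant = any(
            symmetric_uncertainty(features[kept], features[cand], w)
            >= relevance[cand]
            for kept in selected
        )
        if not redundant:
            selected.append(cand)
    return selected


# Toy imbalanced data: four flows of class "a", two of class "b".
labels = ["a", "a", "a", "a", "b", "b"]
features = {
    "size_bin": [0, 0, 0, 0, 1, 1],       # perfectly predicts the class
    "size_bin_copy": [0, 0, 0, 0, 1, 1],  # duplicate, hence redundant
    "alternating": [0, 1, 0, 1, 0, 1],    # carries no class information
}
print(select_features(features, labels))  # ['size_bin']
```

Because the weights are applied when estimating every probability, a feature that separates a small class contributes as much to the score as one that separates a large class, which is the intuition behind biasing the metric toward minority classes.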

    Figures(9)  / Tables(5)
