Internet Traffic Classification Based on Host Connection Graph
doi: 10.3724/SP.J.1146.2012.01040
Abstract: To address the concept drift problem in machine-learning-based traffic classification, this paper proposes a traffic classification mechanism based on the Host Connection Graph (HCG). Taking {IP Address, Port} as the unique user identifier, the algorithm constructs a host connection graph and introduces the concept of user similarity. Applying graph mining theory, it partitions the host connection graph into mutually disjoint behavior clusters, so that communications among users are abstracted as social communities. It then defines an information-entropy-based User Behavior Mode (UBM) to analyze the traffic characteristics underlying each behavior cluster, and uses UBM together with Port to map application labels onto the behavior clusters, thereby achieving traffic classification. Simulations on a real network trace show that HCG overcomes the concept drift problem and reduces computational complexity without sacrificing identification accuracy.
Key words:
- Traffic classification
- Host Connection Graph (HCG)
- User similarity
- Graph mining
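The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define user similarity, the partitioning method, or UBM, so the sketch assumes Jaccard overlap of neighbour sets for similarity, connected components as a stand-in for the graph-mining partition, and Shannon entropy over endpoint ports as a rough proxy for an entropy-based behavior measure. The flow records and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical flow records: (src_ip, src_port, dst_ip, dst_port).
FLOWS = [
    ("10.0.0.1", 5001, "10.0.0.2", 80),
    ("10.0.0.3", 5002, "10.0.0.2", 80),
    ("10.0.0.4", 6001, "10.0.0.5", 53),
]

def build_graph(flows):
    """Undirected graph whose nodes are (ip, port) endpoints."""
    graph = defaultdict(set)
    for src_ip, src_port, dst_ip, dst_port in flows:
        a, b = (src_ip, src_port), (dst_ip, dst_port)
        graph[a].add(b)
        graph[b].add(a)
    return graph

def jaccard_similarity(graph, u, v):
    """Assumed similarity measure: overlap of the two neighbour sets."""
    nu, nv = graph[u], graph[v]
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0

def disjoint_clusters(graph):
    """Connected components, used here only as a simple stand-in
    for the paper's graph-mining partition into disjoint clusters."""
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(component)
    return clusters

def port_entropy(cluster):
    """Shannon entropy (bits) of the port distribution in a cluster;
    an assumed proxy for an entropy-based behavior measure."""
    counts = Counter(port for _, port in cluster)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

graph = build_graph(FLOWS)
clusters = disjoint_clusters(graph)
for cluster in clusters:
    print(sorted(cluster), round(port_entropy(cluster), 3))
```

In this toy input, the two clients that contact the same web endpoint fall into one cluster and the DNS pair into another. Mapping clusters to application labels would additionally need the UBM definition and port rules from the full paper.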