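A Cross-Dimensional Collaborative Framework for Header-Metadata-Driven Encrypted Traffic Identification

WANG Menghan; ZHOU Zhengchun; JI Qingbing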

doi: 10.11999/JEIT250434 cstr: 32379.14.JEIT250434

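1. Southwest Jiaotong University, Chengdu 611756, China
2. National Key Laboratory of Security Communications, Chengdu 610041, China

Funds: The Innovation Group Project of Sichuan Provincial Natural Science Foundation (2024NSFTD0015), The Stability Program of National Key Laboratory of Security Communication (WD202403)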

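Author biographies:
WANG Menghan: female, PhD candidate. Research interests: network security and artificial intelligence

ZHOU Zhengchun: male, professor. Research interests: sequence coding, compressed sensing, and network security and artificial intelligence

JI Qingbing: male, research fellow. Research interests: secure communication

Corresponding author:
JI Qingbing, jqbdxy@163.com

CLC number: TN915.08; TP393.08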
Publication history
- Received: 2025-05-20
- Revised: 2025-09-25
- Published online: 2025-10-20
- Published in issue: 2025-11-10

A Cross-Dimensional Collaborative Framework for Header-Metadata-Driven Encrypted Traffic Identification

1. Southwest Jiaotong University, Chengdu 611756, China
2. National Key Laboratory of Security Communications, Chengdu 610041, China

Funds: The Innovation Group Project of Sichuan Provincial Natural Science Foundation (2024NSFTD0015), The Stability Program of National Key Laboratory of Security Communication (WD202403)

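Abstract: With encryption now widely used in network communication, encrypted traffic identification has become a core problem in network security. Traditional payload-based identification methods risk feature invalidation as encryption algorithms are continually upgraded, which creates detection blind spots in dynamic network environments. Meanwhile, the value of the structured features of packet headers, a key carrier of protocol interaction, has not been fully exploited. In addition, as encryption protocols evolve, existing identification methods suffer from insufficient feature interpretability and weak model robustness under adversarial attacks. To address these challenges, this paper proposes a header-feature-driven cross-dimensional collaborative identification framework for encrypted traffic. It analyzes three dimensions: network traffic feature selection and identification performance, interpretability evaluation that quantifies feature contributions, and the effect of adversarial perturbations on model robustness. The analysis systematically reveals and demonstrates that header features play the dominant role in encrypted traffic identification, overcoming the limitations of traditional single-perspective analysis and overturning the established assumption that identification must rely on payload data. The framework can analyze the performance boundaries of deep models and assess the credibility of their decisions. Through effective feature screening it also prunes redundancy, improving resistance to interference in encrypted scenarios while reducing model complexity, which supports the design of lighter and more robust encrypted traffic identification models. Comparative experiments on the ISCXVPN2016 and ISCXTor2016 datasets show the following. For identification performance, the model using only header features achieves an F1 score up to 6% higher than the model using complete traffic and up to 61% higher than the model using only payload features, verifying the effectiveness of header features for classification. For interpretability, the feature-contribution quantification shows that the average share of relevance scores attributed to header features exceeds that of payload features by up to 89.8%, highlighting their dominant influence on model decisions. For robustness, under equal bandwidth perturbation the maximum performance retention rate of the model that includes header features is markedly higher than that of the payload-only model, with a maximum gap of 98.46%, confirming the key role of header features in strengthening model robustness.
Keywords: Header features; Encrypted traffic identification; Interpretability; Adversarial perturbation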
Abstract:
Objective: With the widespread adoption of network communication encryption technologies, encrypted traffic identification has become a critical problem in network security. Traditional identification methods based on payload content face the risk of feature invalidation due to the continuous evolution of encryption algorithms, leading to detection blind spots in dynamic network environments. Meanwhile, the structured information embedded in packet headers, an essential carrier for protocol interaction, remains underutilized. Furthermore, as encryption protocols evolve, existing encrypted traffic identification approaches encounter limitations such as poor feature interpretability and weak model robustness against adversarial attacks. To address these challenges, this paper proposes a cross-dimensional collaborative identification framework for encrypted traffic, driven by header metadata features. The framework systematically reveals and demonstrates the dominant role of header features in encrypted traffic identification, overcoming the constraints of single-perspective analyses and reducing dependence on payload data. It further enables the assessment of deep model performance boundaries and decision credibility. Through effective feature screening and pruning, redundant attributes are eliminated, enhancing the framework's anti-interference capability in encrypted scenarios. This approach reduces model complexity while improving interpretability and robustness, facilitating the design of lighter and more reliable encrypted traffic identification models.
Methods: This study performs a three-dimensional analysis covering (1) network traffic feature selection and identification performance, (2) quantitative evaluation of feature importance in classification, and (3) assessment of model robustness under adversarial perturbations. First, the characteristics, differences, and effects on identification performance of three forms of encrypted traffic packets are compared using a One-Dimensional Convolutional Neural Network (1D-CNN). This comparison verifies the dominant role of header features in encrypted traffic identification. Second, two explainability algorithms, Layer-wise Relevance Propagation (LRP) and Deep Taylor Decomposition (DTD), are employed to further confirm the essential contribution of header features to network traffic classification. The relative importance of header and payload features is quantified from two perspectives: (1) the relevance obtained by backpropagation and (2) the contribution coefficients derived from Taylor series expansion, thereby enhancing feature interpretability. Finally, adversarial attack experiments are conducted using Projected Gradient Descent (PGD) and random perturbations. By injecting carefully constructed adversarial perturbation data into the initial and terminal parts of the payload, or by adding randomly generated noise to produce adversarial traffic, the study examines the effect of these perturbations on model decision-making. This analysis evaluates the stability and anti-interference capabilities of the encrypted traffic identification model under adversarial conditions.
Results and Discussions: Comparative experiments conducted on the ISCXVPN2016 and ISCXTor2016 datasets yield three key findings.
(1) Recognition performance. The model based solely on header features achieves an F1 score up to 6% higher than that of the model using complete traffic, and up to 61% higher than that of the model using only payload features. These results verify that header features possess irreplaceable significance in encrypted traffic identification. The structural information embedded in headers plays a dominant role in enabling the model to accurately classify traffic types. Even without payload data, high identification accuracy can be achieved using header information alone (Figure 2, Table 4).
(2) Interpretability evaluation. The LRP and DTD methods are used to quantify the contribution of header features to model classification. The relevance scores attributed to header features are markedly higher than those attributed to payload features, with the average share of the relevance score exceeding that of payload features by up to 89.8% (Figures 3 and 4, Table 5). This result is highly consistent with the classification behavior of the 1D-CNN, further confirming the critical importance and dominant influence of header features in encrypted traffic identification.
(3) Anti-interference robustness. The combined Header-Payload model exhibits strong robustness under adversarial attacks. Particularly under low-bandwidth conditions, the model incorporating header features shows a markedly higher maximum performance retention rate under equivalent bandwidth perturbation than the pure payload model, with the maximum difference reaching 98.46%. This finding confirms the essential role of header features in enhancing model robustness (Figures 5 and 6). Header-based models maintain stable recognition performance, whereas payload information is more susceptible to interference, leading to sharp performance degradation.
In addition, the identification performance, contribution quantification, and anti-attack effectiveness of header features are influenced by data type and distribution characteristics. In certain cases, payload features provide auxiliary support, suggesting a complementary relationship between the two feature domains.
Conclusions: This study addresses core challenges in encrypted traffic identification, including feature degradation, limited interpretability, and weak adversarial robustness in traditional payload-dependent methods. A cross-dimensional collaborative identification framework driven by header features is proposed. Through systematic theoretical analysis and experimental validation from three perspectives, the framework demonstrates the irreplaceable value of header features in network traffic identification and overcomes the limitations of conventional single-perspective approaches. It provides a theoretical foundation for improving the efficiency, interpretability, and robustness of encrypted traffic identification models. Future work will focus on enhancing dynamic adaptability, integrating multi-modal features, implementing lightweight architectures, and strengthening adversarial defense mechanisms. These directions are expected to advance encrypted traffic identification technology toward higher intelligence, adaptability, and resilience.
Keywords: Header metadata features; Encrypted traffic identification; Interpretability; Adversarial perturbation

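Figure 1 The header-feature-driven cross-dimensional collaborative identification framework for encrypted traffic

Figure 2 F1 scores of the three data forms on the two datasets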
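
One way to picture the three data forms compared in Figure 2 (H, P and HP) is as three byte-level views of the same packet. The sketch below is illustrative only: the 40-byte header length, the fixed input length of 64 bytes, zero-padding and scaling to [0, 1] are assumptions made for the example and are not settings reported in the source, and `make_views` and `to_fixed_input` are hypothetical helpers.

```python
from typing import Dict, List

HEADER_LEN = 40   # assumed header length in bytes (not from the source)
INPUT_LEN = 64    # assumed fixed model input length (not from the source)


def to_fixed_input(raw: bytes, length: int = INPUT_LEN) -> List[float]:
    """Truncate or zero-pad to `length` bytes and scale each byte to [0, 1]."""
    padded = raw[:length].ljust(length, b"\x00")
    return [b / 255.0 for b in padded]


def make_views(packet: bytes, header_len: int = HEADER_LEN) -> Dict[str, List[float]]:
    """Build the header-only (H), payload-only (P) and combined (HP) views."""
    header, payload = packet[:header_len], packet[header_len:]
    return {
        "H": to_fixed_input(header),
        "P": to_fixed_input(payload),
        "HP": to_fixed_input(packet),
    }


# Toy packet: 40 header bytes of 0x45 followed by 10 payload bytes of 0xFF.
packet = bytes([0x45] * 40) + bytes([0xFF] * 10)
views = make_views(packet)
```

Each view has the same length, so a single classifier architecture could be trained on any of the three forms and the resulting scores compared like for like.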

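Figure 3 Relevance scores of the input bytes for each traffic type in the two datasets under the LRP method

Figure 4 Relevance scores of the input bytes for each traffic type in the two datasets under the DTD method

Figure 5 F1 scores of HP under the two perturbations at different BW

Figure 6 F1 scores of P under the two perturbations at different BW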
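
Figures 5 and 6 report F1 scores under two kinds of perturbation at different BW values. As a rough illustration of the random-noise variant, the sketch below pads a payload with random bytes at its start and end. Everything specific here is an assumption for the example: `inject_random` is a hypothetical helper, `bw` is read as a count of injected bytes, the budget is split evenly between the two ends, and the header is left untouched. The source does not state these details.

```python
import random


def inject_random(header: bytes, payload: bytes, bw: int, seed: int = 0) -> bytes:
    """Insert `bw` random bytes around the payload, leaving the header unchanged."""
    rng = random.Random(seed)          # seeded so the example is repeatable
    head_n = bw // 2                   # bytes injected before the payload
    tail_n = bw - head_n               # bytes injected after the payload
    head = bytes(rng.randrange(256) for _ in range(head_n))
    tail = bytes(rng.randrange(256) for _ in range(tail_n))
    return header + head + payload + tail


# Toy packet parts (hypothetical values).
header = bytes([0x45] * 20)
payload = b"example-payload"
perturbed = inject_random(header, payload, bw=8)
```

Keeping the header bytes fixed mirrors the idea that only the payload region is perturbed, so any change in a header-only model's score would have to come from elsewhere.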

Table 1 Sample distribution of the datasets after preprocessing (number of samples)

Traffic type	ISCXVPN2016 samples	ISCXTor2016 samples
Chat	2160	28284
Email	2488	16579
Filetransfer	25630	76912
P2P	21703	55818
Streaming	15103	40544
VoIP	12462	27557

Table 2 Sample distribution after perturbation injection (number of samples)

Traffic type	ISCXVPN2016 samples	ISCXTor2016 samples
Chat	720	9428
Email	829	5526
Filetransfer	8543	25637
P2P	7234	18606
Streaming	5034	13514
VoIP	4154	9186

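Table 3 Parameter settings

Method	Parameter	Symbol	Value
1D-CNN	Learning rate	l_r	0.002
1D-CNN	Weight decay	weight_decay	0.001
1D-CNN	Training epochs	epoch	50
1D-CNN	Batch size	batch_size	1024
PGD	Training set/test set	train/test	8:2
PGD	Maximum number of iterations	max_iter	10
PGD	Perturbation threshold	eps	0.3
PGD	Gradient perturbation step size	eps_iter	0.03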
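
To show how the PGD settings in Table 3 (eps = 0.3, eps_iter = 0.03, max_iter = 10) interact, here is a minimal sketch of a projected gradient descent attack. It is not the authors' implementation: the toy logistic-regression classifier stands in for the 1D-CNN, the L-infinity norm, the absence of a random start and the clipping of inputs to [0, 1] are assumptions, and the weights and input are hypothetical values.

```python
import math
from typing import List

EPS, EPS_ITER, MAX_ITER = 0.3, 0.03, 10  # values listed in Table 3


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def loss_grad(x: List[float], y: int, w: List[float], b: float) -> List[float]:
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]


def pgd_attack(x: List[float], y: int, w: List[float], b: float) -> List[float]:
    """L-infinity PGD: signed gradient ascent steps, projected after every step."""
    x_adv = list(x)
    for _ in range(MAX_ITER):
        grad = loss_grad(x_adv, y, w, b)
        for i, g in enumerate(grad):
            step = EPS_ITER * ((g > 0) - (g < 0))        # sign of the gradient
            v = x_adv[i] + step
            v = min(max(v, x[i] - EPS), x[i] + EPS)       # project into the eps-ball
            x_adv[i] = min(max(v, 0.0), 1.0)              # keep a valid normalized byte
    return x_adv


# Toy stand-in classifier (hypothetical weights) and one normalized input.
w, b = [2.0, -1.0, 0.5], 0.0
x, y = [0.6, 0.2, 0.5], 1
x_adv = pgd_attack(x, y, w, b)
```

With these settings, ten steps of size 0.03 can move a feature by at most 0.3, so the step size and iteration count are just enough to reach the perturbation threshold.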

Table 4 Traffic identification performance of the model on the two datasets (H: header only; P: payload only; HP: header and payload)

Dataset	Traffic type	Precision (H)	Precision (P)	Precision (HP)	Recall (H)	Recall (P)	Recall (HP)	F1 score (H)	F1 score (P)	F1 score (HP)	Accuracy (H)	Accuracy (P)	Accuracy (HP)
ISCXVPN2016	Chat	0.92	0.63	0.90	0.94	0.50	0.95	0.93	0.59	0.93	0.94	0.55	0.95
ISCXVPN2016	Email	0.92	0.86	0.96	0.88	0.52	0.83	0.90	0.65	0.89	0.88	0.52	0.83
ISCXVPN2016	Filetransfer	0.99	0.54	0.99	0.99	0.94	1.00	0.99	0.69	0.99	0.99	0.94	0.99
ISCXVPN2016	P2P	1.00	0.91	1.00	1.00	0.60	1.00	1.00	0.73	1.00	1.00	0.60	1.00
ISCXVPN2016	Streaming	0.99	0.53	0.99	1.00	0.29	1.00	0.99	0.38	0.99	1.00	0.29	1.00
ISCXVPN2016	VoIP	0.99	0.81	0.98	0.97	0.51	0.98	0.98	0.63	0.98	0.97	0.51	0.98
ISCXTor2016	Chat	0.88	0.84	0.64	0.53	0.24	0.67	0.67	0.38	0.65	0.50	0.37	0.81
ISCXTor2016	Email	0.98	0.98	0.97	0.90	0.55	0.83	0.94	0.70	0.89	0.95	0.90	0.96
ISCXTor2016	Filetransfer	0.99	1.00	0.99	0.98	0.71	0.86	0.98	0.83	0.92	0.98	0.94	0.98
ISCXTor2016	P2P	0.95	0.87	0.90	0.97	0.89	0.97	0.96	0.88	0.94	0.97	0.97	0.99
ISCXTor2016	Streaming	0.85	0.55	0.85	0.97	0.89	0.97	0.91	0.68	0.91	0.97	0.96	0.98
ISCXTor2016	VoIP	0.88	0.81	0.86	0.79	0.82	0.84	0.83	0.81	0.85	0.81	0.86	0.92
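
The headline figures in the abstract (header-only F1 up to 6% above the complete-traffic model and up to 61% above the payload-only model) can be checked against the F1 columns of Table 4. The short script below does that arithmetic; it assumes the percentages are per-class differences in percentage points, which is an interpretation rather than something the source states, and `max_gap` is a helper name introduced here.

```python
# F1 scores copied from Table 4, per class in the order
# Chat, Email, Filetransfer, P2P, Streaming, VoIP.
F1 = {
    "ISCXVPN2016": {
        "H":  [0.93, 0.90, 0.99, 1.00, 0.99, 0.98],
        "P":  [0.59, 0.65, 0.69, 0.73, 0.38, 0.63],
        "HP": [0.93, 0.89, 0.99, 1.00, 0.99, 0.98],
    },
    "ISCXTor2016": {
        "H":  [0.67, 0.94, 0.98, 0.96, 0.91, 0.83],
        "P":  [0.38, 0.70, 0.83, 0.88, 0.68, 0.81],
        "HP": [0.65, 0.89, 0.92, 0.94, 0.91, 0.85],
    },
}


def max_gap(form_a: str, form_b: str) -> int:
    """Largest per-class F1 difference (form_a - form_b), in percentage points."""
    return max(
        round((a - b) * 100)
        for scores in F1.values()
        for a, b in zip(scores[form_a], scores[form_b])
    )


print(max_gap("H", "HP"))  # 6  (ISCXTor2016, Filetransfer: 0.98 vs 0.92)
print(max_gap("H", "P"))   # 61 (ISCXVPN2016, Streaming: 0.99 vs 0.38)
```

Under that reading, both maxima match the values quoted in the abstract.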

Table 5 Average share of relevance scores assigned to the header (H) and payload (P) for each traffic type in the two datasets under the LRP and DTD interpretability methods

Method	Dataset	Feature	Chat	Email	Filetransfer	P2P	Streaming	VoIP
LRP	ISCXVPN2016	H	0.89	0.95	0.88	0.87	0.86	0.89
LRP	ISCXVPN2016	P	0.11	0.05	0.12	0.13	0.14	0.11
LRP	ISCXTor2016	H	0.72	0.62	0.57	0.75	0.57	0.82
LRP	ISCXTor2016	P	0.28	0.38	0.43	0.25	0.43	0.18
DTD	ISCXVPN2016	H	0.88	0.82	0.97	0.84	0.88	0.88
DTD	ISCXVPN2016	P	0.12	0.18	0.03	0.16	0.12	0.12
DTD	ISCXTor2016	H	0.76	0.70	0.59	0.72	0.75	0.83
DTD	ISCXTor2016	P	0.24	0.30	0.41	0.28	0.25	0.17
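
The header and payload shares in Table 5 come from per-byte relevance scores. As a hedged illustration of how such a share could be obtained, the sketch below applies the LRP epsilon rule to a tiny bias-free ReLU network and then sums relevance over header and payload positions. The network, its weights, the two-byte "header" and the use of absolute relevance in `header_share` are all assumptions for the example; the source does not specify the exact LRP rule or aggregation it used.

```python
from typing import List

EPS = 1e-9  # stabiliser for the epsilon rule


def linear(a: List[float], W: List[List[float]]) -> List[float]:
    """z_k = sum_j a_j * W[j][k] (no bias, so relevance is conserved)."""
    return [sum(a[j] * W[j][k] for j in range(len(a))) for k in range(len(W[0]))]


def lrp_linear(a: List[float], W: List[List[float]], R_out: List[float]) -> List[float]:
    """Redistribute relevance R_out from a layer's outputs back to its inputs."""
    z = linear(a, W)
    s = [R_out[k] / (z[k] + (EPS if z[k] >= 0 else -EPS)) for k in range(len(z))]
    return [a[j] * sum(W[j][k] * s[k] for k in range(len(z))) for j in range(len(a))]


def header_share(relevance: List[float], header_len: int) -> float:
    """Fraction of total absolute relevance that falls on header positions."""
    total = sum(abs(r) for r in relevance)
    return sum(abs(r) for r in relevance[:header_len]) / total


# Toy network: 4 input bytes (first 2 treated as header) -> 2 ReLU units -> 1 logit.
W1 = [[1.0, 0.5], [0.8, -0.2], [0.1, 0.3], [-0.1, 0.2]]
W2 = [[1.0], [0.5]]
x = [0.9, 0.7, 0.4, 0.2]

hidden = [max(0.0, z) for z in linear(x, W1)]
logit = linear(hidden, W2)
R_hidden = lrp_linear(hidden, W2, logit)   # start from the output score
R_input = lrp_linear(x, W1, R_hidden)      # relevance per input byte
share = header_share(R_input, header_len=2)
```

Because the layers have no bias, the input relevances sum to the output score, which makes the header and payload shares add up to one, as the paired H and P rows in Table 5 do.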
