A Class Incremental Learning Algorithm with Dual Separation of Data Flow and Feature Space for Various Classes
-
Abstract: To address the catastrophic forgetting problem in Class Incremental Learning (CIL), this paper proposes a class incremental learning algorithm with dual separation of the data flow and the feature space of different classes. The dual separation (S2) algorithm comprises two stages per incremental task. In the first stage, the network is trained under the joint constraint of a classification loss, distillation losses, and a contrastive loss. The data flows of different classes are separated according to module function, which strengthens the new network's ability to recognize new classes, while the contrastive loss enlarges the inter-class distances in the feature space, preventing the feature space of old classes from being eroded by new classes owing to the incompleteness of the retained old-class samples. In the second stage, the imbalanced dataset is dynamically resampled into a balanced dataset, which is then used to dynamically fine-tune the new network. A high-resolution range profile incremental learning dataset of aircraft targets was constructed from measured and simulated data. Experimental results show that, compared with several other algorithms, the proposed algorithm achieves higher stability while maintaining high plasticity, and thus better overall performance.
-
Algorithm 1 Construction of the new-class exemplar set
Input: dataset ${{\boldsymbol{D}}^i}$ of task ${T_i}$, containing ${N_{{\text{NC}}}}$ classes
Input: number of exemplars retained per class $m$
Input: backbone network $\text{CONV}(\cdot)$
1: for $j = 1,2, \cdots ,{N_{{\text{NC}}}}$
2:  class-mean embedding $ {{\boldsymbol{\mu }}_j} \leftarrow \frac{1}{{\left| {{{\boldsymbol{D}}_j}} \right|}}\sum\limits_{{\boldsymbol{x}} \in {{\boldsymbol{D}}_j}} {{\text{CONV}}({\boldsymbol{x}})} $
3:  for $k = 1,2, \cdots ,m$
4:   ${{\boldsymbol{z}}_{jk}} \leftarrow \mathop {\arg \min }\limits_{{\boldsymbol{x}} \in {{\boldsymbol{D}}_j}} \left\| {{{\boldsymbol{\mu }}_j} - \dfrac{1}{k}\left[ {{\text{CONV}}({\boldsymbol{x}}) + \displaystyle\sum\limits_{l = 1}^{k - 1} {{\text{CONV}}({{\boldsymbol{z}}_{jl}})} } \right]} \right\|$
5:  end for
6:  ${{\boldsymbol{Z}}_j} \leftarrow \left( {{{\boldsymbol{z}}_{j1}},{{\boldsymbol{z}}_{j2}}, \cdots ,{{\boldsymbol{z}}_{jm}}} \right)$
7: end for
Output: $ {{\boldsymbol{Z}}^{{\text{NC}}}} \leftarrow {{\boldsymbol{Z}}_1} \cup {{\boldsymbol{Z}}_2} \cup \cdots \cup {{\boldsymbol{Z}}_{{N_{{\text{NC}}}}}} $
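For reference, below is a minimal NumPy sketch of the herding selection in Algorithm 1 for a single class. The embeddings are assumed to be precomputed with the backbone $\text{CONV}(\cdot)$, and excluding already-chosen samples is an implementation detail the pseudocode leaves implicit.

```python
import numpy as np

def select_exemplars(embeddings: np.ndarray, m: int) -> list[int]:
    """Herding selection (Algorithm 1, lines 2-5) for one class.

    embeddings: (n, d) array of CONV(x) outputs for every sample of the class.
    Returns the indices of the m selected exemplars.
    """
    mu = embeddings.mean(axis=0)           # class-mean embedding (line 2)
    selected: list[int] = []
    running_sum = np.zeros_like(mu)        # sum of embeddings chosen so far
    for k in range(1, m + 1):
        # distance between the class mean and each candidate running mean (line 4)
        dists = np.linalg.norm(mu - (running_sum + embeddings) / k, axis=1)
        dists[selected] = np.inf           # avoid picking a sample twice
        idx = int(np.argmin(dists))
        selected.append(idx)
        running_sum += embeddings[idx]
    return selected
```

Calling `select_exemplars` once per class and concatenating the results yields ${{\boldsymbol{Z}}^{{\text{NC}}}}$.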
Algorithm 2 S2 training procedure for the initial (non-incremental) task
Input: initial dataset ${{\boldsymbol{D}}^0}$
Input: number of exemplars retained per class $m$
Input: initial network parameters ${\boldsymbol{W}}$
1: ${\boldsymbol{W}} \leftarrow \mathop {\arg \min }\limits_{\boldsymbol{W}} {\text{los}}{{\text{s}}_{{\text{cls}}}}({{\boldsymbol{D}}^0},{\boldsymbol{W}})$
2: select new-class exemplars with Algorithm 1: ${{\boldsymbol{Z}}^{{\text{NC}}}} \leftarrow {{\boldsymbol{D}}^0}$
3: replay dataset ${{\boldsymbol{Z}}^0} \leftarrow {{\boldsymbol{Z}}^{{\text{NC}}}}$
Output: ${\boldsymbol{W}}$, ${{\boldsymbol{Z}}^0}$
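Steps 2 and 3 amount to running the herding routine above over every class of ${{\boldsymbol{D}}^0}$. A sketch, where the `conv` callable and the per-class data layout are assumptions for illustration, not the paper's API:

```python
import numpy as np

def build_replay_set(conv, dataset_by_class: dict, m: int) -> dict:
    """Algorithm 2, steps 2-3: herd m exemplars from each new class."""
    replay = {}
    for cls, samples in dataset_by_class.items():  # samples: (n, ...) array
        emb = conv(samples)                        # CONV(x), shape (n, d)
        idx = select_exemplars(emb, m)             # Algorithm 1, one class
        replay[cls] = samples[np.asarray(idx)]
    return replay
```

The returned mapping is the replay dataset ${{\boldsymbol{Z}}^0}$ that seeds the incremental tasks of Algorithm 3.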
Algorithm 3 S2 incremental training procedure
Input: new-class dataset ${{\boldsymbol{D}}^i}$
Input: replay dataset ${{\boldsymbol{Z}}^{i - 1}}$
Input: number of exemplars retained per class $m$
Input: current network parameters ${\boldsymbol{W}}$
1: /* Stage 1 */
2: for $k = 1,2, \cdots ,{\text{epoc}}{{\text{h}}_{{\text{train}}}}$
3:  randomly sample a batch ${{\boldsymbol{D}}_{{\text{batch}}}} = {\text{RandomSample}}\left( {{{\boldsymbol{D}}^i} \cup {{\boldsymbol{Z}}^{i - 1}}} \right)$
4:  separate the old and new data flows: ${\boldsymbol{D}}_{{\text{batch}}}^{\text{O}},{\boldsymbol{D}}_{{\text{batch}}}^{\text{N}} = {\text{separate}}({{\boldsymbol{D}}_{{\text{batch}}}})$, with $\left( {{{\boldsymbol{x}}^{{\text{OC}}}},{y^{{\text{OC}}}}} \right) \in {\boldsymbol{D}}_{{\text{batch}}}^{\text{O}}$ and $\left( {{{\boldsymbol{x}}^{{\text{AC}}}},{y^{{\text{AC}}}}} \right) \in {{\boldsymbol{D}}_{{\text{batch}}}}$
5:  feature extraction: ${{\boldsymbol{e}}^{{\text{ONOC}}}} = {\text{CON}}{{\text{V}}^{{\text{ON}}}}({{\boldsymbol{x}}^{{\text{OC}}}})$, $ {{\boldsymbol{e}}^{{\text{NNOC}}}} = {\text{CON}}{{\text{V}}^{{\text{NN}}}}({{\boldsymbol{x}}^{{\text{OC}}}}) $, $ {{\boldsymbol{e}}^{{\text{NNAC}}}} = {\text{CON}}{{\text{V}}^{{\text{NN}}}}({{\boldsymbol{x}}^{{\text{AC}}}}) $
6:  compute the embedding distillation loss $ {\text{los}}{{\text{s}}_{{\text{ED}}}} $
7:  classifier outputs: ${{\boldsymbol{l}}^{{\text{ONOC}}}} = {\text{F}}{{\text{C}}^{{\text{ON}}}}({{\boldsymbol{e}}^{{\text{ONOC}}}})$, ${{\boldsymbol{l}}^{{\text{NNOC}}}} = {\text{F}}{{\text{C}}^{{\text{NN}}}}({{\boldsymbol{e}}^{{\text{NNOC}}}})$, ${{\boldsymbol{l}}^{{\text{NNAC}}}} = {\text{F}}{{\text{C}}^{{\text{NN}}}}({{\boldsymbol{e}}^{{\text{NNAC}}}})$
8:  compute the soft-label distillation loss $ {\text{los}}{{\text{s}}_{{\text{LD}}}} $
9:  compute the classification loss $ {\text{los}}{{\text{s}}_{{\text{cls}}}} $
10: projection: $ {{\boldsymbol{p}}^{{\text{NNAC}}}} = {\text{PROJECTION}}({{\boldsymbol{e}}^{{\text{NNAC}}}}) $
11: compute the contrastive loss $ {\text{los}}{{\text{s}}_{{\text{SCL}}}} $
12: compute the total loss $ {\text{los}}{{\text{s}}_{{\text{total}}}} $
13: update ${\boldsymbol{W}}$ with $ \nabla {\text{los}}{{\text{s}}_{{\text{total}}}} $
14: end for
15: /* Stage 2 */
16: for $k = 1,2, \cdots ,{\text{epoc}}{{\text{h}}_{{\text{ft}}}}$
17: balanced dataset ${{\boldsymbol{D}}^{\text{B}}} \leftarrow {\text{BalanceSample}}({{\boldsymbol{D}}^i},{{\boldsymbol{Z}}^{i - 1}})$
18: fine-tune ${\boldsymbol{W}} \leftarrow \mathop {\arg \min }\limits_{\boldsymbol{W}} {\text{los}}{{\text{s}}_{{\text{cls}}}}({{\boldsymbol{D}}^{\text{B}}},{\boldsymbol{W}})$
19: end for
20: /* Replay dataset management */
21: select old-class exemplars ${{\boldsymbol{Z}}^{{\text{OC}}}} \leftarrow {{\boldsymbol{Z}}^{i - 1}}$
22: select new-class exemplars with Algorithm 1: ${{\boldsymbol{Z}}^{{\text{NC}}}} \leftarrow {{\boldsymbol{D}}^i}$
23: replay dataset ${{\boldsymbol{Z}}^i} \leftarrow {{\boldsymbol{Z}}^{{\text{OC}}}} \cup {{\boldsymbol{Z}}^{{\text{NC}}}}$
Output: ${\boldsymbol{W}}$, ${{\boldsymbol{Z}}^i}$
(Superscripts: ON = old network, NN = new network, OC = old classes, AC = all classes.)
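To make Stage 1 concrete, the sketch below implements one training step in PyTorch. The concrete loss forms (MSE for embedding distillation, temperature-free KL for soft-label distillation, a SupCon-style contrastive term) and the unit loss weights are plausible stand-ins rather than the paper's exact formulations; `old_net`, `new_net`, and `projection` are assumed module handles.

```python
import torch
import torch.nn.functional as F

def supcon_loss(p, y, tau=0.1):
    """Minimal supervised contrastive loss on L2-normalized projections p."""
    eye = torch.eye(len(y), dtype=torch.bool, device=y.device)
    pos = (y.unsqueeze(0) == y.unsqueeze(1)) & ~eye      # positive pairs
    sim = p @ p.t() / tau                                # scaled similarities
    sim = sim - sim.max(dim=1, keepdim=True).values.detach()
    log_prob = sim - torch.log((sim.exp() * ~eye).sum(dim=1, keepdim=True))
    return -(log_prob * pos).sum(1).div(pos.sum(1).clamp(min=1)).mean()

def stage1_step(old_net, new_net, projection, x, y, old_class_ids):
    """One Stage-1 update of Algorithm 3 (all loss weights set to 1 here).

    old_net / new_net expose .conv and .fc; old_net is frozen.
    old_class_ids: 1-D tensor of old-class labels. Old-class samples are
    routed through both networks (data-flow separation); all samples go
    through the new network only.
    """
    is_old = torch.isin(y, old_class_ids)      # data-flow separation (line 4)
    e_nnac = new_net.conv(x)                   # e^{NNAC}: new net, all classes
    l_nnac = new_net.fc(e_nnac)                # l^{NNAC}
    loss_cls = F.cross_entropy(l_nnac, y)      # classification loss (line 9)

    loss_ed = loss_ld = x.new_zeros(())
    if is_old.any():
        with torch.no_grad():
            e_onoc = old_net.conv(x[is_old])   # e^{ONOC}: old net, old classes
            l_onoc = old_net.fc(e_onoc)        # l^{ONOC}
        e_nnoc, l_nnoc = e_nnac[is_old], l_nnac[is_old]
        loss_ed = F.mse_loss(e_nnoc, e_onoc)   # embedding distillation (line 6)
        n_old = l_onoc.size(1)                 # assume old-class logits first
        loss_ld = F.kl_div(F.log_softmax(l_nnoc[:, :n_old], dim=1),
                           F.softmax(l_onoc, dim=1),
                           reduction='batchmean')  # soft-label distillation

    p = F.normalize(projection(e_nnac), dim=1)     # p^{NNAC} (line 10)
    loss_scl = supcon_loss(p, y)                   # contrastive loss (line 11)
    return loss_cls + loss_ed + loss_ld + loss_scl  # loss_total (line 12)
```

Stage 2 then needs no extra machinery: each fine-tuning epoch draws a class-balanced set via BalanceSample and minimizes the same cross-entropy loss.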
Table 1 Aircraft size parameters

| Aircraft model | Length (m) | Width (m) | Height (m) | Number of polygons |
|---|---|---|---|---|
| Aircraft 1 | 28.72 | 30.04 | 9.10 | 166,338 |
| Aircraft 2 | 17.51 | 15.46 | 4.84 | 69,446 |
| Aircraft 3 | 6.84 | 15.00 | 1.74 | 64,606 |
| Aircraft 4 | 12.58 | 7.60 | 3.33 | 119,940 |
| Aircraft 5 | 16.18 | 7.40 | 2.44 | 51,736 |
| Aircraft 6 | 7.93 | 9.01 | 3.09 | 141,343 |
| Aircraft 7 | 15.28 | 13.02 | 4.99 | 114,166 |
| Aircraft 8 | 7.36 | 9.20 | 2.89 | 76,851 |
| Yak-42 | 36.38 | 34.88 | 9.83 | - |
| Citation | 14.40 | 15.90 | 4.57 | - |
| An-26 | 23.80 | 29.20 | 9.83 | - |
Table 2 Selected algorithm parameters

| Parameter | Value |
|---|---|
| Backbone network | RMsCNN |
| Number of epochs | 25 |
| Initial learning rate | 0.01 |
| Learning-rate schedule | Cosine annealing |
| Optimizer | SGD |
| Batch size | 256 |
| Weight decay | 0.0002 |
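The Table 2 settings map directly onto a PyTorch optimizer and scheduler. A minimal sketch, assuming the RMsCNN backbone is available as an ordinary `torch.nn.Module` (its definition is not reproduced in this section):

```python
import torch
from torch.utils.data import DataLoader

def make_training_setup(model: torch.nn.Module, train_set, epochs: int = 25):
    """Optimizer, scheduler, and loader configured as in Table 2."""
    optimizer = torch.optim.SGD(model.parameters(),
                                lr=0.01,            # initial learning rate
                                weight_decay=2e-4)  # weight decay 0.0002
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer,
                                                           T_max=epochs)
    loader = DataLoader(train_set, batch_size=256, shuffle=True)
    return optimizer, scheduler, loader
```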
Table 3 Results of the module ablation experiment

| No. | Classification loss | Distillation loss | Data-flow separation | Contrastive loss | Dynamic fine-tuning | Classifier: NME | Classifier: CNN | Accuracy (%) |
|---|---|---|---|---|---|---|---|---|
| 1 | √ | √ | √ | √ | √ | | √ | 97.88 |
| 2 | √ | √ | √ | √ | | | √ | 96.36 |
| 3 | √ | √ | √ | | | | √ | 95.33 |
| 4 | √ | √ | | | | | √ | 94.35 |
| 5 | √ | √ | | | | √ | | 91.94 |