Explicit Discrimination-Driven Automatic Unknown Class Clustering for Open-World Semi-Supervised Learning
Abstract: Traditional target recognition usually relies on the closed-set assumption that every test target belongs to a class present in the training set. The real world is open, however: beyond the known classes in the training set, a recognition task must also interpret previously unseen targets, so the model has to recognize known classes and also discover and cluster unknown ones. To address this, the paper proposes a transductive open-world semi-supervised learning method based on explicit discrimination-driven automatic unknown-class clustering. The model is trained on a small number of labeled known-class samples together with a large number of unlabeled test samples drawn from both known and unknown classes, and it performs known-class recognition and unknown-class clustering jointly. The method has two modules. Dynamic Known-Unknown Class Discrimination (DKUCD) separates unlabeled known-class samples from unknown-class samples: it models the boundary distribution of the labeled known classes with extreme value theory and dynamically refines that distribution during semi-supervised learning using high-confidence known-class samples, which improves discrimination. Neighbor Intersection-over-Union Clustering (NIOUC) takes the samples identified as unknown with high confidence and merges neighboring clusters according to the intersection-over-union of their neighbor relations, so the unknown classes are clustered automatically. DKUCD and NIOUC are optimized iteratively. Experiments on the optical CIFAR-10 dataset and on measured radar data show that the method performs well on both known-class recognition and unknown-class clustering.
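The abstract names the idea behind NIOUC (merging neighbors by an intersection-over-union ratio) but does not give the merge rule, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each sample has a neighbor set made of itself and its `k` nearest samples under Euclidean distance, and that two samples join the same cluster when the IoU of their neighbor sets reaches `iou_threshold`. The names `knn_sets`, `merge_by_neighbor_iou`, `k`, and `iou_threshold` are hypothetical and chosen here for illustration.

```python
from itertools import combinations
from math import dist


def knn_sets(points, k):
    """For each point, return the set holding its own index and its k nearest neighbours."""
    sets = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: dist(p, points[j]))
        sets.append({i, *others[:k]})
    return sets


def merge_by_neighbor_iou(points, k=2, iou_threshold=0.5):
    """Assumed merge rule: link two points when their neighbour sets have IoU >= threshold.

    Returns one cluster id per point; the number of clusters is not fixed in advance.
    """
    neigh = knn_sets(points, k)
    parent = list(range(len(points)))  # union-find forest, one tree per cluster

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps trees shallow
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        iou = len(neigh[i] & neigh[j]) / len(neigh[i] | neigh[j])
        if iou >= iou_threshold:
            parent[find(i)] = find(j)  # merge the two clusters

    roots = [find(i) for i in range(len(points))]
    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    relabel = {r: c for c, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]


# Toy features: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = merge_by_neighbor_iou(points, k=2, iou_threshold=0.5)
print(labels)  # prints [0, 0, 0, 1, 1, 1]
```

Because clusters emerge from the merge criterion, the cluster count is an output rather than an input, which matches the abstract's claim of automatic clustering. In a real pipeline the points would be learned feature vectors of the high-confidence unknown samples, and the pairwise loop would likely need an approximate nearest-neighbor index to scale.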
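Table 1. RAUS (%), $ {\text{ACC}}_{\text{known}} $ (%), and $ {\text{ACC}}_{\text{cluster}} $ (%) of different methods on the CIFAR-10 dataset, together with the estimated $ \text{K+U} $ count. For each metric, "count known" is the result when the number of unknown classes is given, and "count unknown" is the result when it is not given, averaged over learning runs from three random K-means initializations.

| Category | Method | RAUS (%), count known | RAUS (%), count unknown | $ {\text{ACC}}_{\text{known}} $ (%), count known | $ {\text{ACC}}_{\text{known}} $ (%), count unknown | $ {\text{ACC}}_{\text{cluster}} $ (%), count known | $ {\text{ACC}}_{\text{cluster}} $ (%), count unknown | $ \text{K+U} $ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Unsupervised clustering | BootSC | - | - | 87.70 | 76.07 | 88.92 | 77.42 | [8,8,9] |
| Unsupervised clustering | SNSCC | - | - | 88.24 | 78.10 | 89.66 | 78.35 | [8,8,9] |
| Open-world semi-supervised learning | ORCA | 91.71 | 82.45 | 88.20 | 80.02 | 90.40 | 81.22 | [9,11,11] |
| Open-world semi-supervised learning | OpenLDN | 94.54 | 86.55 | 93.40 | 88.35 | 93.65 | 84.14 | [9,11,11] |
| Open-world semi-supervised learning | TRSSL | 93.20 | 87.25 | 95.15 | 89.07 | 92.61 | 84.67 | [9,11,11] |
| Open-world semi-supervised learning | PKOSSL | 94.00 | 87.11 | 94.90 | 89.10 | 91.80 | 85.00 | [9,11,11] |
| Open-world semi-supervised learning | LeGoGCD | 98.80 | 92.25 | 64.64 | 62.62 | 98.45 | 89.42 | [10,11,11] |
| Open-world semi-supervised learning | AFGCD | 98.04 | 91.88 | 87.85 | 83.72 | 97.68 | 88.31 | [10,11,11] |
| Open-world semi-supervised learning | Proposed method | 97.25 | 95.33 | 96.12 | 93.86 | 96.40 | 92.74 | [10,10,11] |

Table 2. RAUS (%), $ {\text{ACC}}_{\text{known}} $ (%), and $ {\text{ACC}}_{\text{cluster}} $ (%) of different methods on the MSTAR and ATRNet-STAR datasets, together with the estimated $ \text{K+U} $ count. All results are obtained with the number of unknown classes not given, and each is the average over learning runs from three random K-means initializations.

| Category | Method | RAUS (%), MSTAR | RAUS (%), ATRNet-STAR | $ {\text{ACC}}_{\text{known}} $ (%), MSTAR | $ {\text{ACC}}_{\text{known}} $ (%), ATRNet-STAR | $ {\text{ACC}}_{\text{cluster}} $ (%), MSTAR | $ {\text{ACC}}_{\text{cluster}} $ (%), ATRNet-STAR | $ \text{K+U} $, MSTAR | $ \text{K+U} $, ATRNet-STAR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Unsupervised clustering | BootSC | - | - | 72.80 | 68.62 | 61.18 | 25.99 | [7,8,8] | [36,37,38] |
| Unsupervised clustering | SNSCC | - | - | 75.15 | 70.33 | 62.02 | 26.96 | [7,8,8] | [36,37,38] |
| Open-world semi-supervised learning | ORCA | 70.03 | 54.68 | 80.73 | 83.76 | 67.99 | 31.70 | [8,9,9] | [38,39,41] |
| Open-world semi-supervised learning | OpenLDN | 72.32 | 58.53 | 83.09 | 88.12 | 70.66 | 34.24 | [8,9,9] | [38,39,41] |
| Open-world semi-supervised learning | TRSSL | 73.55 | 60.12 | 83.82 | 87.05 | 70.24 | 37.98 | [8,9,9] | [38,39,41] |
| Open-world semi-supervised learning | PKOSSL | 74.18 | 63.66 | 81.87 | 85.20 | 72.61 | 38.13 | [8,9,9] | [38,39,41] |
| Open-world semi-supervised learning | LeGoGCD | 80.90 | 75.08 | 60.95 | 89.17 | 78.45 | 42.22 | [8,9,10] | [38,40,42] |
| Open-world semi-supervised learning | AFGCD | 78.60 | 72.66 | 62.45 | 88.41 | 76.68 | 40.37 | [8,9,10] | [38,40,42] |
| Open-world semi-supervised learning | Proposed method | 86.59 | 84.94 | 88.27 | 90.15 | 84.21 | 50.09 | [10,10,10] | [39,39,40] |

Table 3. Ablation results for the proposed modules: RAUS (%), $ {\text{ACC}}_{\text{known}} $ (%), and $ {\text{ACC}}_{\text{cluster}} $ (%) on the CIFAR-10 and ATRNet-STAR datasets. A ✓ marks a module that is enabled and a × marks one that is disabled.

| Configuration | Unlabeled known/unknown discrimination module | Automatic unknown-class clustering module | RAUS (%), CIFAR-10 | RAUS (%), ATRNet-STAR | $ {\text{ACC}}_{\text{known}} $ (%), CIFAR-10 | $ {\text{ACC}}_{\text{known}} $ (%), ATRNet-STAR | $ {\text{ACC}}_{\text{cluster}} $ (%), CIFAR-10 | $ {\text{ACC}}_{\text{cluster}} $ (%), ATRNet-STAR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Baseline | × | × | 87.25 | 60.12 | 89.07 | 87.05 | 83.67 | 37.98 |
| Proposed modules | ✓ | × | 95.09 | 80.97 | 91.54 | 88.60 | 86.35 | 44.85 |
| Proposed modules | × | ✓ | 86.76 | 62.79 | 90.19 | 87.33 | 89.01 | 39.01 |
| Proposed modules | ✓ | ✓ | 95.33 | 84.94 | 93.86 | 90.15 | 92.74 | 50.09 |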