Multi-Scale Occluded Person Re-Identification Guided by Key Fine-Grained Information

ZHOU Yu; ZHAO Xiaofeng; WANG Yi; SUN Yanjing; LI Song

doi:10.11999/JEIT230686

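ZHOU Yu^{1, 2},
ZHAO Xiaofeng^1,
WANG Yi^{1, 3},
SUN Yanjing^1,
LI Song^1

1. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
2. Xuzhou First People's Hospital, Xuzhou 221116, China
3. Jiangsu Normal University Kewen College, Xuzhou 221132, China

Funding: National Natural Science Foundation of China (62001475), Natural Science Foundation of Jiangsu Province (BK20200649)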

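Details

About the authors:
ZHOU Yu: female, associate professor; research interests: artificial intelligence, image processing, person re-identification

ZHAO Xiaofeng: male, master's student; research interests: image processing, occluded person re-identification

WANG Yi: male, PhD student; research interests: multimedia image processing, person re-identification

SUN Yanjing: male, professor; research interests: image processing, person re-identification

LI Song: male, associate professor; research interests: image processing, person re-identification

Corresponding author:
SUN Yanjing, yjsun@cumt.edu.cn

CLC number: TN911.73; TP391.41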
Publication history
- Received: 2023-07-07
- Revised: 2024-01-19
- Published online: 2024-01-26

Multi-Scale Occluded Person Re-Identification Guided by Key Fine-Grained Information

ZHOU Yu^{1, 2},
ZHAO Xiaofeng^1,
WANG Yi^{1, 3},
SUN Yanjing^1,
LI Song^1

1. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
2. Xuzhou First People's Hospital, Xuzhou 221116, China
3. Jiangsu Normal University Kewen College, Xuzhou 221132, China

Funds: The National Natural Science Foundation of China (62001475), The Natural Science Foundation of Jiangsu Province (BK20200649)

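Abstract

Abstract: To reduce the impact of interference such as background and occlusion on the accuracy of person re-identification (ReID), and to fully exploit the complementarity between fine-grained and coarse-grained information, this paper proposes a multi-scale occluded person ReID network guided by key fine-grained information. First, the image is divided into overlapping patches of two different sizes, and a multi-scale recognition network containing both a fine-grained and a coarse-grained information extraction branch is built, so as to better simulate the multi-scale nature of human image observation and the continuity with which humans observe adjacent regions. Then, because the fine-grained branch can extract more image detail, and because fine-grained and coarse-grained information share commonalities as well as differences, a fine-grained attention module lets fine-grained information guide the coarse-grained learning branch. The fine-grained information that takes part in this guidance is the key information retained after the Interference Information Elimination (IIE) module filters out interference. Finally, the key information relevant to person identity is obtained through double difference, and identity prediction is achieved through joint supervision across multiple dimensions such as labels and features. Extensive experiments on several public person ReID datasets demonstrate the superior performance of the algorithm and the effectiveness and necessity of each of its modules.
- Occluded person re-identification /
- Multi-scale /
- Fine-grained information /
- Coarse-grained information /
- Interference information elimination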
Abstract: To reduce the influence of interference such as background and occlusion on the accuracy of person Re-IDentification (ReID), and to make full use of the complementarity between fine-grained and coarse-grained information, a multi-scale occluded person ReID network guided by key fine-grained information is proposed. First, the image is divided into overlapping patches of two different sizes, and a multi-scale recognition network containing both fine-grained and coarse-grained information extraction branches is constructed; this better simulates the multi-scale way humans observe images and the continuity with which they observe adjacent regions. Then, since the fine-grained branch captures more image detail and fine-grained and coarse-grained information have both similarities and differences, a fine-grained attention module is employed so that the fine-grained branch guides the coarse-grained branch. The fine-grained information used for guidance is the key information retained after the Interference Information Elimination (IIE) module filters out interference. Finally, the key information related to person identity is obtained by double difference (DD), and identity prediction is realized through multi-dimensional joint supervision over labels and features. Extensive experiments on several public person ReID datasets demonstrate the superiority of the algorithm and the effectiveness and necessity of each module.
- Occluded person ReID /
- Multi-scale /
- Fine-grained information /
- Coarse-grained information /
- Interference information elimination
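
The overlapping two-scale patch division described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the 256 x 128 input size, the strides, and the helper name `overlapping_patches` are hypothetical and not taken from the paper; only the 12 x 12 and 16 x 16 patch sizes come from the patch-size results reported for the algorithm.

```python
import numpy as np

def overlapping_patches(image, patch, stride):
    """Split an H x W x C image into square patches of side `patch`.

    Patches overlap whenever stride < patch. Returns an array of shape
    (num_patches, patch, patch, C), ordered row by row.
    """
    h, w, _ = image.shape
    rows = range(0, h - patch + 1, stride)
    cols = range(0, w - patch + 1, stride)
    return np.stack([image[r:r + patch, c:c + patch] for r in rows for c in cols])

# Assumed input size and strides (hypothetical); patch sizes 12 and 16 are the
# fine-grained and coarse-grained scales of the best-performing configuration.
image = np.zeros((256, 128, 3), dtype=np.float32)
fine = overlapping_patches(image, patch=12, stride=8)     # fine-grained branch input
coarse = overlapping_patches(image, patch=16, stride=12)  # coarse-grained branch input
print(fine.shape, coarse.shape)
```

With a stride smaller than the patch side, neighbouring patches share pixels, which is one simple way to keep adjacent regions continuous across patch boundaries.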

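Figure 1 Block diagram of the proposed algorithm

Figure 2 Structure of the interference information elimination (IIE) module

Figure 3 Structure of the fine-grained information guided encoding module

Figure 4 Effect of parameters $\alpha$ and $\eta$ on algorithm performance

Figure 5 Visualization of feature maps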

Table 1 Results of the proposed algorithm and compared methods on the Occluded-Duke dataset (%)

Metric	PCB^[5]	PGFA^[8]	PVPM^[10]	ISP^[20]	HOReID^[11]	MoS^[4]	PAT^[21]	TransReID*^[14]	PFD^[5]	Proposed
Rank-1	42.6	51.4	47.0	62.8	55.1	61.0	64.5	66.4	69.5	71.2
mAP	33.7	37.3	37.7	52.3	43.8	49.2	53.6	59.2	61.8	62.3
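
For readers unfamiliar with the two metrics reported in these tables, the sketch below shows one common way to compute Rank-k accuracy and mean average precision (mAP) from a query-by-gallery distance matrix. It is a simplified illustration rather than the evaluation protocol used in the paper: the function name `rank_k_and_map` and the toy data are hypothetical, and the same-camera filtering that standard ReID benchmarks apply is omitted.

```python
import numpy as np

def rank_k_and_map(dist, query_ids, gallery_ids, ks=(1, 5, 10)):
    """Rank-k accuracy and mAP from a (num_query, num_gallery) distance matrix.

    Simplifying assumption: no same-camera filtering is applied.
    """
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)
    order = np.argsort(dist, axis=1)                    # closest gallery items first
    matches = gallery_ids[order] == query_ids[:, None]  # boolean hits per ranked position
    matches = matches[matches.any(axis=1)]              # skip queries with no true match
    # Rank-k: a query is correct if any of its top-k results has the same identity.
    rank = {k: float(matches[:, :k].any(axis=1).mean()) for k in ks}
    # Average precision: mean of the precision values at each true-match position.
    hits = np.cumsum(matches, axis=1)
    precision = hits / np.arange(1, matches.shape[1] + 1)
    ap = (precision * matches).sum(axis=1) / matches.sum(axis=1)
    return rank, float(ap.mean())

# Toy example: 2 queries, 3 gallery images (made-up distances and identities).
dist = np.array([[0.1, 0.9, 0.3],
                 [0.2, 0.8, 0.5]])
rank, mean_ap = rank_k_and_map(dist, query_ids=[1, 2], gallery_ids=[1, 2, 1], ks=(1, 3))
print(rank, mean_ap)
```

In the toy example the first query retrieves both of its true matches first, while the second query finds its only true match in last place, so Rank-1 is 0.5 and the mean of the two average precisions (1 and 1/3) is 2/3.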

Table 2 Results of the proposed algorithm and compared methods on the Occluded-REID dataset (%)

Metric	PCB^[5]	PVPM^[10]	HOReID^[11]	PAT^[21]	PFD^[5]	DSR^[22]	Yang^[6]	Yan^[23]	Proposed
Rank-1	41.3	66.8	80.3	81.6	81.5	72.8	81.0	78.5	86.3
mAP	38.9	59.5	70.2	72.1	83.0	62.8	71.0	72.9	81.3

Table 3 Results of the proposed algorithm and compared methods on the holistic datasets Market-1501 and DukeMTMC (%)

Method	Market-1501 Rank-1	Market-1501 mAP	DukeMTMC Rank-1	DukeMTMC mAP
PCB^[5]	92.3	71.4	81.8	66.1
PGFA^[8]	91.2	76.8	82.6	65.5
ISP^[20]	95.3	88.6	89.6	80.0
HOReID^[11]	94.2	84.9	86.9	75.6
MoS^[4]	95.4	89.0	90.6	80.2
PAT^[21]	95.4	88.0	88.8	78.2
TransReID*^[14]	95.2	88.9	90.7	82.0
PFD^[5]	95.5	89.7	91.2	83.2
Proposed	95.7	89.5	90.7	82.5

Table 4 Contribution of each module in the proposed algorithm (%)

Method	Rank-1	Rank-5	Rank-10	mAP
Baseline	61.9	78.2	83.8	53.1
Baseline + IIE	65.1	80.3	85.4	55.6
Without FGI	65.7	80.0	84.6	58.0
Without DD	64.2	80.2	85.1	57.0
Coarse-grained -> fine-grained	69.1	82.7	86.7	61.5
Coarse-grained <-> fine-grained	48.4	66.9	73.6	43.0
Proposed	71.2	82.9	86.9	62.3

Table 5 Effect of hyperparameters $\omega_1$ and $\omega_2$ on algorithm performance, reported as Rank-1 (mAP)

	0	1	2	3
0	67.1(56.5)	67.1(56.8)	67.3(57.1)	67.9(58.5)
1	66.4(56.4)	68.5(58.1)	68.5(57.9)	67.3(57.4)
2	67.3(57.5)	67.9(57.2)	70.2(62.1)	68.7(58.9)
3	66.6(56.5)	67.6(57.4)	68.6(58.7)	71.2(62.3)

Table 6 Effect of patch sizes on algorithm performance (%)

Patch sizes	$4\times4$, $8\times8$	$4\times4$, $12\times12$	$4\times4$, $16\times16$	$4\times4$, $20\times20$	$8\times8$, $12\times12$	$8\times8$, $16\times16$	$8\times8$, $20\times20$	$12\times12$, $16\times16$	$12\times12$, $20\times20$	$16\times16$, $20\times20$
Rank-1	63.3	66.4	64.3	64.8	69.3	68.7	70.2	71.2	69.6	66.5
mAP	52.9	55.7	53.8	55.4	59.9	59.1	60.2	62.3	59.4	57.2

Table 7 Effect of three-scale input on algorithm performance

Patch sizes	$8\times8$, $12\times12$, $16\times16$	$8\times8$, $12\times12$, $20\times20$	$8\times8$, $16\times16$, $20\times20$	$12\times12$, $16\times16$, $20\times20$
Rank-1	64.0	65.0	66.7	67.3
mAP	53.8	54.3	56.0	57.4
Parameters (M)	313.47	313.47	313.47	313.47
Response time (s)	12.57	12.37	12.08	10.25

References (23)

[1]	SHI Yuexiang and ZHOU Yue. Person re-identification based on stepped feature space segmentation and local attention mechanism[J]. Journal of Electronics & Information Technology, 2022, 44(1): 195–202. doi: 10.11999/JEIT201006
[2]	XU Wenzheng, HUANG Tianhuan, BEN Xianye, et al. Cross-view gait recognition: A review[J]. Journal of Image and Graphics, 2023, 28(5): 1265–1286. doi: 10.11834/jig.220458
[3]	SUN Yifan, XU Qin, LI Yali, et al. Perceive where to focus: Learning visibility-aware part-level features for partial person re-identification[C]. Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 393–402.
[4]	JIA Mengxi, CHENG Xinhua, ZHAI Yunpeng, et al. Matching on sets: Conquer occluded person re-identification without alignment[C]. Proceedings of the 35th AAAI Conference on Artificial Intelligence, 2021: 1673–1681. doi: 10.1609/aaai.v35i2.16260
[5]	WANG Tao, LIU Hong, SONG Pinhao, et al. Pose-guided feature disentangling for occluded person re-identification based on transformer[C]. Proceedings of the 36th AAAI Conference on Artificial Intelligence, 2022: 2540–2549. doi: 10.1609/aaai.v36i3.20155
[6]	YANG Jinrui, ZHANG Jiawei, YU Fufu, et al. Learning to know where to see: A visibility-aware approach for occluded person re-identification[C]. Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021: 11865–11874.
[7]	CHENG Xinhua, JIA Mengxi, WANG Qian, et al. More is better: Multi-source dynamic parsing attention for occluded person re-identification[C]. Proceedings of the 30th ACM International Conference on Multimedia, Lisbon, Portugal, 2022: 6840–6849.
[8]	SOMERS V, DE VLEESCHOUWER C, and ALAHI A. Body part-based representation learning for occluded person re-identification[C]. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, USA, 2023: 1613–1623.
[9]	MIAO Jiaxu, WU Yu, LIU Ping, et al. Pose-guided feature alignment for occluded person re-identification[C]. Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea (South), 2019: 542–551.
[10]	GAO Shang, WANG Jingya, LU Huchuan, et al. Pose-guided visible part matching for occluded person ReID[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 11741–11749.
[11]	WANG Guan’an, YANG Shuo, LIU Huanyu, et al. High-order information matters: Learning relation and topology for occluded person re-identification[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 6448–6457.
[12]	JIA Mengxi, CHENG Xinhua, LU Shijian, et al. Learning disentangled representation implicitly via transformer for occluded person re-identification[J]. IEEE Transactions on Multimedia, 2023, 25: 1294–1305. doi: 10.1109/tmm.2022.3141267
[13]	DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale[C]. 9th International Conference on Learning Representations, Austria, 2021.
[14]	HE Shuting, LUO Hao, WANG Pichao, et al. TransReID: Transformer-based object re-identification[C]. Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021: 14993–15002.
[15]	ZHOU Qinqin, ZHONG Bineng, LAN Xiangyuan, et al. Fine-grained spatial alignment model for person re-identification with focal triplet loss[J]. IEEE Transactions on Image Processing, 2020, 29: 7578–7589. doi: 10.1109/TIP.2020.3004267
[16]	ZHUO Jiaxuan, CHEN Zeyu, LAI Jianhuang, et al. Occluded person re-identification[C]. 2018 IEEE International Conference on Multimedia and Expo (ICME), San Diego, USA, 2018: 1–6.
[17]	ZHENG Liang, SHEN Liyue, TIAN Lu, et al. Scalable person re-identification: A benchmark[C]. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 1116–1124.
[18]	ZHENG Zhedong, ZHENG Liang, and YANG Yi. Unlabeled samples generated by GAN improve the person re-identification baseline in vitro[C]. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 2017: 3774–3782.
[19]	KRIZHEVSKY A, SUTSKEVER I, and HINTON G E. Imagenet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84–90. doi: 10.1145/3065386
[20]	SUN Yifan, ZHENG Liang, YANG Yi, et al. Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline)[C]. 15th European Conference on Computer Vision (ECCV), Munich, Germany, 2018: 501–518.
[21]	LI Yulin, HE Jianfeng, ZHANG Tianzhu, et al. Diverse part discovery: Occluded person re-identification with part-aware transformer[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 2897–2906.
[22]	HE Lingxiao, LIANG Jian, LI Haiqing, et al. Deep spatial feature reconstruction for partial person re-identification: Alignment-free approach[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 7073–7082.
[23]	YAN Cheng, PANG Guansong, JIAO Jile, et al. Occluded person re-identification with single-scale global representations[C]. Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021: 11855–11864.