ZHANG Hongying, FAN Shiyu, LUO Qian, ZHANG Tao. Combining Visual-Textual Matching and Graph Embedding for Visible-Infrared Person Re-identification[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT240318

Combining Visual-Textual Matching and Graph Embedding for Visible-Infrared Person Re-identification

doi: 10.11999/JEIT240318
Funds: Key Supported Project of the Civil Aviation Joint Research Fund of the National Natural Science Foundation of China (U2133211); Graduate Research Innovation Grant Program of Civil Aviation University of China (2023YJSKC05005)
  • Received Date: 2024-04-22
  • Revised Date: 2024-06-24
  • Available Online: 2024-06-27

Abstract: For visible-infrared cross-modal person re-identification, methods based on modality conversion and adversarial networks can extract associative information between modalities, but they remain weak at effective feature recognition. This paper therefore proposes a two-stage method that combines visual-textual matching and graph embedding to improve re-identification performance. The method uses a context-optimization scheme to construct learnable text templates, which generate person descriptions that serve as associative information between modalities. In the first stage, built on the Contrastive Language-Image Pre-training (CLIP) model, unified text descriptions of the same person across modalities are used as prior information to help reduce modality differences. In the second stage, a cross-modal constraint framework based on graph embedding is applied, and a modality-adaptive loss function is designed to improve person recognition accuracy. Extensive experiments on the SYSU-MM01 and RegDB datasets confirm the method's effectiveness: on SYSU-MM01 it achieves a Rank-1 accuracy of 64.2% and a mean Average Precision (mAP) of 60.2%, a significant improvement in cross-modal person re-identification.
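
The two reported metrics, Rank-1 accuracy and mAP, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `rank1_and_map`, the distance-matrix input, and the toy data are assumptions made for illustration, and dataset-specific protocol details (such as camera filtering or repeated gallery sampling) are omitted.

```python
from typing import List, Sequence, Tuple


def rank1_and_map(dist: Sequence[Sequence[float]],
                  query_ids: Sequence[int],
                  gallery_ids: Sequence[int]) -> Tuple[float, float]:
    """Return (Rank-1 accuracy, mAP) for a query-by-gallery distance matrix.

    dist[q][g] is the distance between query q and gallery item g
    (smaller means more similar). Hypothetical helper for illustration.
    """
    rank1_hits = 0
    average_precisions: List[float] = []
    for q, row in enumerate(dist):
        # Rank gallery items from most to least similar.
        order = sorted(range(len(row)), key=lambda g: row[g])
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        if not any(matches):
            continue  # a query with no true match in the gallery is skipped
        # Rank-1: is the top-ranked gallery item the same identity?
        rank1_hits += int(matches[0])
        # Average precision: mean of precision values at each true match.
        hits = 0
        precision_sum = 0.0
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precision_sum += hits / rank
        average_precisions.append(precision_sum / hits)
    n_valid = len(average_precisions)
    if n_valid == 0:
        raise ValueError("no query has a matching gallery identity")
    return rank1_hits / n_valid, sum(average_precisions) / n_valid


# Toy example: 2 queries (e.g. infrared), 4 gallery items (e.g. visible).
query_ids = [0, 1]
gallery_ids = [0, 1, 0, 1]
dist = [
    [0.1, 0.4, 0.3, 0.9],  # query 0: both true matches ranked first
    [0.2, 0.5, 0.6, 0.3],  # query 1: a wrong identity ranked first
]
rank1, mean_ap = rank1_and_map(dist, query_ids, gallery_ids)
print(f"Rank-1 = {rank1:.3f}, mAP = {mean_ap:.3f}")
```

In the toy data, query 0 retrieves both of its matches at the top of the ranking, while query 1 retrieves a wrong identity first. Rank-1 only looks at the top result, whereas mAP rewards placing all true matches early in the ranking, which is why the two numbers are usually reported together.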
