Volume 45 Issue 12
Dec.  2023
GUAN Xin, GUO Jiaen, LU Yu. Discriminant Adversarial Hashing Transformer for Cross-modal Vessel Image Retrieval[J]. Journal of Electronics & Information Technology, 2023, 45(12): 4411-4420. doi: 10.11999/JEIT220980

Discriminant Adversarial Hashing Transformer for Cross-modal Vessel Image Retrieval

doi: 10.11999/JEIT220980
Funds:  Taishan Scholar Engineering Special Fund (ts 201712072), The National Defense Science and Technology Excellence Youth Talent Fund (2017-JCJQ-ZQ-003)
  • Received Date: 2022-07-22
  • Rev Recd Date: 2023-01-27
  • Available Online: 2023-02-08
  • Publish Date: 2023-12-26
  • Mainstream cross-modal image retrieval algorithms based on the Convolutional Neural Network (CNN) paradigm cannot extract the details of ship images effectively, and the cross-modal “heterogeneous gap” is difficult to eliminate. To address these problems, a Discriminant Adversarial Hashing Transformer (DAHT) is proposed for fast cross-modal retrieval of ship images. The network adopts a dual-stream Vision Transformer (ViT) structure and relies on the self-attention mechanism of ViT to extract discriminative features of ship images. On this basis, a Hash Token structure is designed for hash code generation. To eliminate the cross-modal differences between images of the same category, the whole retrieval framework is trained adversarially, and modal confusion is achieved by performing modality discrimination on the generated hash codes. At the same time, a Normalized discounted cumulative gain Weighting based Discriminant Cross-modal Quintuplet Loss (NW-DCQL) is designed to maintain the semantic discrimination between images of different categories. In four types of cross-modal retrieval tasks carried out on two datasets, the proposed method improves on the second-best retrieval results by 9.8%, 5.2%, 19.7%, and 21.6% (32-bit codes), and it also shows some performance advantage in unimodal retrieval tasks.
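
As a rough illustration of why hash codes make retrieval fast, the sketch below ranks a small gallery by Hamming distance to a query code. It is a generic example, not the DAHT implementation: the sign-based binarization, the integer packing, and all function names (`binarize`, `pack`, `hamming`, `rank_by_hamming`) are assumptions made for illustration, and the 8-bit toy values are invented (the abstract reports results for 32-bit codes).

```python
# Illustrative only: generic hash-code retrieval, not the DAHT implementation.

def binarize(features):
    """Map real-valued outputs to bits: 1 where the value is positive, else 0 (assumed rule)."""
    return [1 if x > 0 else 0 for x in features]


def pack(bits):
    """Pack a list of bits into one integer so two codes can be compared with XOR."""
    code = 0
    for b in bits:
        code = (code << 1) | b
    return code


def hamming(a, b):
    """Count the bit positions in which two packed codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code, gallery_codes):
    """Return gallery indices ordered by increasing Hamming distance to the query."""
    return sorted(
        range(len(gallery_codes)),
        key=lambda i: hamming(query_code, gallery_codes[i]),
    )


# Hypothetical toy data: a query from one modality, a gallery from the other.
query = pack(binarize([0.9, -0.2, 0.4, -0.7, 0.1, 0.3, -0.5, 0.8]))
gallery = [
    pack([0, 1, 0, 1, 0, 0, 1, 0]),  # every bit differs from the query
    pack([1, 0, 1, 0, 1, 1, 0, 1]),  # identical to the query
    pack([1, 0, 1, 0, 1, 1, 1, 1]),  # one bit differs from the query
]
ranking = rank_by_hamming(query, gallery)
```

Because each comparison reduces to an XOR and a bit count, the cost per gallery item is small and independent of image size, which is the usual motivation for hashing in large-scale retrieval.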

    Figures(6)  / Tables(5)
