
LFTA: Lightweight Feature Extraction and Additive Attention-based Feature Matching Method

GUO Zhiqiang, WANG Zihan, WANG Yongsheng, CHEN Pengyu

Citation: GUO Zhiqiang, WANG Zihan, WANG Yongsheng, CHEN Pengyu. LFTA: Lightweight Feature Extraction and Additive Attention-based Feature Matching Method[J]. Journal of Electronics & Information Technology, 2025, 47(8): 2872-2882. doi: 10.11999/JEIT250124


doi: 10.11999/JEIT250124 cstr: 32379.14.JEIT250124
Funds: Taiyuan City's "Double Hundred Key Technology Breakthrough Initiative" open-competition project (2014TYJB0126)
Article information
    Author biographies:

    GUO Zhiqiang: male, professor; research interests include signal processing and image processing

    WANG Zihan: male, master's student; research interests include machine vision and image processing

    WANG Yongsheng: male, senior experimentalist; research interests include intelligent driving perception and machine vision

    CHEN Pengyu: male, master's student; research interests include image processing and intelligent driving

    Corresponding author:

    WANG Yongsheng, wysh@whut.edu.cn

  • CLC number: TN911.3; TP391.4

LFTA:Lightweight Feature Extraction and Additive Attention-based Feature Matching Method

Funds: Taiyuan City’s “Double Hundred Key Technology Breakthrough Initiative” (2014TYJB0126)
  • Abstract: In recent years, feature matching has been widely applied in computer vision tasks such as 3D reconstruction, visual localization, and Simultaneous Localization And Mapping (SLAM). However, existing matching algorithms face a trade-off between accuracy and efficiency: high-accuracy methods often rely on complex model designs that drive up computational cost, making real-time operation difficult, while fast matching strategies achieve sub-linear time complexity through feature simplification or approximate computation, but their limited representational capacity and accumulated errors prevent them from meeting the accuracy requirements of practical applications. To address this, this paper proposes LFTA, a lightweight feature matching method based on additive attention. The method generates efficient feature representations with a lightweight multi-scale feature extraction network and introduces a triple-swap fusion attention mechanism to improve feature robustness in complex scenes. It further proposes an adaptive Gaussian kernel for generating keypoint heatmaps together with a dynamic Non-Maximum Suppression (NMS) algorithm to improve keypoint extraction accuracy. In addition, a lightweight module combining an additive Transformer attention mechanism with depthwise separable convolution positional encoding is designed to refine coarse matching results into high-accuracy pixel-level correspondences. To verify the effectiveness of the proposed method, experiments are conducted on the public MegaDepth and ScanNet datasets, with ablation and comparison studies validating the contribution of each module and the overall performance of the model. The results show that the proposed algorithm significantly outperforms lightweight algorithms in pose estimation while substantially reducing inference time compared with higher-accuracy algorithms, achieving a balance between efficiency and accuracy.
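The additive Transformer attention named in the abstract refers to a family of linear-complexity attention mechanisms in which tokens interact through pooled global vectors rather than a full pairwise score matrix. The paper's exact DWP-AT formulation (projection sizes, depthwise-separable positional encoding) is not reproduced on this page, so the sketch below is a minimal, illustrative NumPy version of additive attention in the Fastformer style; all weight names and shapes are assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def additive_attention(x, wq, wk, wv, alpha, beta):
    """Additive (linear-complexity) attention sketch.

    x: (N, d) token features; wq/wk/wv: (d, d) projections;
    alpha/beta: (d,) scoring vectors. Cost is O(N*d) because tokens
    never attend pairwise -- they interact via pooled global vectors.
    """
    q, k, v = x @ wq, x @ wk, x @ wv                        # (N, d) each
    # Pool all query vectors into one global query.
    a = softmax(q @ alpha / np.sqrt(q.shape[-1]), axis=0)   # (N,)
    g = (a[:, None] * q).sum(axis=0)                        # (d,)
    # Modulate keys by the global query, then pool into a global key.
    p = k * g                                               # (N, d)
    b = softmax(p @ beta / np.sqrt(k.shape[-1]), axis=0)    # (N,)
    gk = (b[:, None] * p).sum(axis=0)                       # (d,)
    # Modulate values by the global key; residual with the queries.
    return gk * v + q                                       # (N, d)
```

Because each token interacts only with the two pooled global vectors, the cost grows linearly in the number of tokens instead of quadratically, which is what makes an attention-based refinement stage feasible in a lightweight matcher.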
  • Figure 1  LFTA feature matching network architecture

    Figure 2  Triple-swap fusion attention mechanism

    Figure 3  DWP-AT: Transformer module based on additive self-attention and cross-attention

    Figure 4  Comparison on ScanNet

    Figure 5  Comparison on MegaDepth

    Figure 6  Comparative experimental results on the MegaDepth dataset

    Table 1  Performance evaluation on the MegaDepth dataset

    Model            Pose estimation AUC (%)      Inference time (ms)
                     @5°      @10°     @20°
    Xfeat[16]        42.60    56.40    67.70      35
    LoFTR[17]        52.83    69.19    81.18      560
    Alike[11]        49.40    61.80    71.40      200
    MatchFormer[19]  53.30    69.70    81.80      688
    SP[10]+LG[14]    49.90    67.00    80.10      446
    SP+SG[13]        49.70    67.10    80.60      483
    LFTA             51.94    67.81    79.77      162

    Table 2  Performance evaluation on the ScanNet dataset

    Model            Pose estimation AUC (%)      Inference time (ms)
                     @5°      @10°     @20°
    Xfeat            16.70    32.60    45.24      32
    LoFTR            22.06    40.80    57.62      554
    Alike            8.00     16.40    25.90      197
    MatchFormer      22.89    42.68    60.55      681
    SP+SG            14.80    30.80    47.50      443
    SP+LG            15.47    32.22    50.13      476
    LFTA             20.91    37.51    54.89      154

    Table 3  Ablation study on the MegaDepth dataset

    Method                         Triple-swap attention  Adaptive module  DWP-AT  AUC@5°  AUC@10°  AUC@20°  Time (ms)
    Baseline                       ×                      ×                ×       42.10   56.20    67.30    34
    Triple-swap attention only     √                      ×                ×       43.56   57.63    68.74    62
    Adaptive module only           ×                      √                ×       42.90   56.82    68.13    38
    DWP-AT only                    ×                      ×                √       51.48   66.64    78.88    102
    Without DWP-AT                 √                      √                ×       43.78   57.89    68.91    67
    Without triple-swap attention  ×                      √                √       51.24   67.13    79.03    107
    Without adaptive module        √                      ×                √       51.50   67.52    79.41    153
    LFTA                           √                      √                √       51.94   67.81    79.77    162

    Table 4  Ablation study on the ScanNet dataset

    Method                         Triple-swap attention  Adaptive module  DWP-AT  AUC@5°  AUC@10°  AUC@20°  Time (ms)
    Baseline                       ×                      ×                ×       16.20   32.10    44.80    29
    Triple-swap attention only     √                      ×                ×       17.45   33.24    46.60    57
    Adaptive module only           ×                      √                ×       17.14   33.11    45.83    36
    DWP-AT only                    ×                      ×                √       20.13   36.85    53.95    97
    Without DWP-AT                 √                      √                ×       17.71   34.14    47.23    62
    Without triple-swap attention  ×                      √                √       20.40   37.15    54.26    99
    Without adaptive module        √                      ×                √       20.72   37.26    54.67    149
    LFTA                           √                      √                √       20.91   37.51    54.89    154
  • [1] ZHANG Jian, XIE Hongtu, ZHANG Lin, et al. Information extraction and three-dimensional contour reconstruction of vehicle target based on multiple different pitch-angle observation circular synthetic aperture radar data[J]. Remote Sensing, 2024, 16(2): 401. doi: 10.3390/rs16020401.
    [2] LUO Haitao, ZHANG Jinming, LIU Xiongfei, et al. Large-scale 3D reconstruction from multi-view imagery: A comprehensive review[J]. Remote Sensing, 2024, 16(5): 773. doi: 10.3390/rs16050773.
    [3] GAO Lei, ZHAO Yingbao, HAN Jingchang, et al. Research on multi-view 3D reconstruction technology based on SFM[J]. Sensors, 2022, 22(12): 4366. doi: 10.3390/s22124366.
    [4] ZHANG He, JIN Lingqiu, and YE Cang. An RGB-D camera based visual positioning system for assistive navigation by a robotic navigation aid[J]. IEEE/CAA Journal of Automatica Sinica, 2021, 8(8): 1389–1400. doi: 10.1109/JAS.2021.1004084.
    [5] YAN Chi, QU Delin, XU Dan, et al. GS-SLAM: Dense visual slam with 3D Gaussian splatting[C]. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2024: 19595–19604. doi: 10.1109/CVPR52733.2024.01853.
    [6] WANG Hengyi, WANG Jingwen, and AGAPITO L. CO-SLAM: Joint coordinate and sparse parametric encodings for neural real-time slam[C]. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, Canada, 2023: 13293–13302. doi: 10.1109/CVPR52729.2023.01277.
    [7] PANCHAL P M, PANCHAL S R, and SHAH S K. A comparison of SIFT and SURF[J]. International Journal of Innovative Research in Computer and Communication Engineering, 2013, 1(2): 323–327.
    [8] YU Huai and YANG Wen. A fast feature extraction and matching algorithm for unmanned aerial vehicle images[J]. Journal of Electronics & Information Technology, 2016, 38(3): 509–516. doi: 10.11999/JEIT150676.
    [9] CHEN Shurong, LI Bo, DONG Rong, et al. Contourlet-SIFT feature matching algorithm[J]. Journal of Electronics & Information Technology, 2013, 35(5): 1215–1221. doi: 10.3724/SP.J.1146.2012.01132.
    [10] DETONE D, MALISIEWICZ T, and RABINOVICH A. SuperPoint: Self-supervised interest point detection and description[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, USA, 2018: 337–33712. doi: 10.1109/CVPRW.2018.00060.
    [11] ZHAO Xiaoming, WU Xingming, MIAO Jinyu, et al. ALIKE: Accurate and lightweight keypoint detection and descriptor extraction[J]. IEEE Transactions on Multimedia, 2023, 25: 3101–3112. doi: 10.1109/TMM.2022.3155927.
    [12] JAKUBOVIĆ A and VELAGIĆ J. Image feature matching and object detection using brute-force matchers[C]. 2018 International Symposium ELMAR, Zadar, Croatia, 2018: 83–86. doi: 10.23919/ELMAR.2018.8534641.
    [13] SARLIN P E, DETONE D, MALISIEWICZ T, et al. SuperGlue: Learning feature matching with graph neural networks[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 4937–4946. doi: 10.1109/CVPR42600.2020.00499.
    [14] LINDENBERGER P, SARLIN P E, and POLLEFEYS M. LightGlue: Local feature matching at light speed[C]. 2023 IEEE/CVF International Conference on Computer Vision, Paris, France, 2023: 17581–17592. doi: 10.1109/ICCV51070.2023.01616.
    [15] SHI Yan, CAI Junxiong, SHAVIT Y, et al. ClusterGNN: Cluster-based coarse-to-fine graph neural network for efficient feature matching[C]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 12507–12516. doi: 10.1109/CVPR52688.2022.01219.
    [16] POTJE G, CADAR F, ARAUJO A, et al. XFeat: Accelerated features for lightweight image matching[C]. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2024: 2682–2691. doi: 10.1109/CVPR52733.2024.00259.
    [17] SUN Jiaming, SHEN Zehong, WANG Yuang, et al. LoFTR: Detector-free local feature matching with transformers[C]. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 8918–8927. doi: 10.1109/CVPR46437.2021.00881.
    [18] CHEN Hongkai, LUO Zixin, ZHOU Lei, et al. ASpanFormer: Detector-free image matching with adaptive span transformer[C]. The 17th European Conference on Computer Vision, Tel Aviv, Israel, 2022: 20–36. doi: 10.1007/978-3-031-19824-3_2.
    [19] WANG Qing, ZHANG Jiaming, YANG Kailun, et al. MatchFormer: Interleaving attention in transformers for feature matching[C]. The 16th Asian Conference on Computer Vision, Macao, China, 2022: 256–273. doi: 10.1007/978-3-031-26313-2_16.
    [20] YU Jiahuan, CHANG Jiahao, HE Jianfeng, et al. Adaptive spot-guided transformer for consistent local feature matching[C]. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, Canada, 2023: 21898–21908. doi: 10.1109/CVPR52729.2023.02097.
Publication history
  • Received: 2025-03-03
  • Revised: 2025-07-01
  • Published online: 2025-07-08
  • Issue date: 2025-08-27
