
3D Model Classification Based on Shannon Entropy Representative Feature and Voting Mechanism

GAO Xueyao, YAN Shaokang, ZHANG Chunxiang

Citation: GAO Xueyao, YAN Shaokang, ZHANG Chunxiang. 3D Model Classification Based on Shannon Entropy Representative Feature and Voting Mechanism[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT230405


doi: 10.11999/JEIT230405
Funds: The National Natural Science Foundation of China (61502124, 60903082), the China Postdoctoral Science Foundation (2014M560249), and the Natural Science Foundation of Heilongjiang Province (LH2022F031, LH2022F030, F2015041, F201420)
Details
    Author biographies:

    GAO Xueyao: Female, Professor. Research interests: graphics and image processing, natural language processing, machine learning

    YAN Shaokang: Male, Master's student. Research interest: graphics and image processing

    ZHANG Chunxiang: Male, Professor. Research interests: natural language processing, graphics and image processing, machine learning

    Corresponding author:

    ZHANG Chunxiang, z6c6x666@163.com

  • CLC number: TP391.7

Funds: The National Natural Science Foundation of China (61502124, 60903082), China Postdoctoral Science Foundation (2014M560249), Heilongjiang Provincial Natural Science Foundation of China (LH2022F031, LH2022F030, F2015041, F201420)
  • Abstract: Existing view-based 3D model classification methods suffer from insufficient visual information in a single view and redundant information across multiple views; moreover, treating all views equally ignores the differences between projection viewpoints. To address these problems, this paper proposes a 3D model classification method based on Shannon-entropy representative features and a voting mechanism. First, several viewpoint groups are placed uniformly around the 3D model to obtain multiple view sets that characterize it. To extract deep view features effectively, a channel attention mechanism is introduced into the feature-extraction network. Then, for the discriminative view features output by the Softmax function, Shannon entropy is used to select representative features, avoiding multi-view feature redundancy. Finally, a voting mechanism over the representative features of the viewpoint groups completes the 3D model classification. Experiments show that the method achieves a classification accuracy of 96.48% on the ModelNet10 dataset, an outstanding result.
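The entropy-based selection step described in the abstract can be sketched as follows. The minimum-entropy criterion (pick the view whose Softmax output is most confident) and the function names are illustrative assumptions about the selection rule, not the paper's exact formulation:

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in nats) of a softmax probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def representative_view(view_probs):
    """Select the view whose class distribution is most discriminative,
    i.e. has the lowest Shannon entropy (an illustrative assumption).
    Returns the index of the chosen view and its probability vector."""
    entropies = [shannon_entropy(p) for p in view_probs]
    best = min(range(len(view_probs)), key=entropies.__getitem__)
    return best, view_probs[best]
```

A uniform distribution over C classes maximizes the entropy at log C, so peaked, confident view predictions rank first under this rule.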
  • Figure 1  Projection views of a 3D model

    Figure 2  3D model classification framework

    Figure 3  RegNet model structure

    Figure 4  ECA module structure

    Figure 5  Feature-extraction network based on RegNet and ECA

    Figure 6  Embedding positions of the ECA module within a block

    Figure 7  Confusion matrices based on R-HV and R-SV

    Figure 8  Examples of misclassified models

    Table 1  Classification accuracy with different numbers of views (%)

    | Network | Voting | 3V    | 6V    | 9V    | 12V   | 18V   |
    |---------|--------|-------|-------|-------|-------|-------|
    | RegNet  | HV     | 94.05 | 93.83 | 93.98 | 94.82 | 94.27 |
    | RegNet  | SV     | 94.05 | 94.60 | 94.27 | 94.71 | 94.60 |

    Table 2  Classification accuracy of different RegNet networks (%)

    | RegNet model | stage1  | stage2  | stage3  | stage4   | FLOPs (B) | Params (M) | HV    | SV    |
    |--------------|---------|---------|---------|----------|-----------|------------|-------|-------|
    | RegNet2X     | 1×block | 1×block | 4×block | 7×block  | 0.2       | 2.7        | 92.51 | 93.83 |
    | RegNet4X     | 1×block | 2×block | 7×block | 12×block | 0.4       | 5.2        | 93.50 | 94.16 |
    | RegNet6X     | 1×block | 3×block | 5×block | 7×block  | 0.6       | 6.2        | 93.83 | 94.60 |
    | RegNet8X     | 1×block | 3×block | 7×block | 5×block  | 0.8       | 7.3        | 94.16 | 94.71 |

    Table 3  Effect of ECA embedding position on RegNet classification (%)

    | Voting | ECA1  | ECA2  | ECA3  | ECA4  |
    |--------|-------|-------|-------|-------|
    | HV     | 93.72 | 93.72 | 93.94 | 94.05 |
    | SV     | 94.60 | 94.49 | 95.26 | 94.60 |

    Table 4  Effect of ECA on RegNet classification (%)

    | Network    | Voting | View 1 | View 2 | View 3 | View 4 | View 5 | View 6 |
    |------------|--------|--------|--------|--------|--------|--------|--------|
    | RegNet     | HV     | 92.51  | 92.95  | 93.06  | 93.83  | 94.05  | 88.66  |
    | RegNet     | SV     | 93.06  | 93.94  | 94.05  | 94.60  | 94.49  | 89.43  |
    | RegNet+ECA | HV     | 93.28  | 94.27  | 93.39  | 93.94  | 94.05  | 88.88  |
    | RegNet+ECA | SV     | 93.72  | 94.71  | 94.93  | 95.26  | 95.04  | 90.42  |
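The channel attention introduced into the feature-extraction network is the ECA module (Wang et al., ECA-Net): global average pooling per channel, a local 1D convolution across neighbouring channels, then a sigmoid gate. A minimal sketch follows; the uniform kernel here stands in for ECA-Net's learned convolution weights, and the pure-Python representation of feature maps is only for illustration:

```python
import math

def eca_channel_weights(feature_maps, k=3):
    """Compute ECA-style channel attention weights for a list of 2D
    feature maps (one per channel). The kernel is uniform here for
    illustration; in ECA-Net it is learned."""
    # Global average pooling: one scalar per channel.
    gap = [sum(sum(row) for row in fm) / (len(fm) * len(fm[0])) for fm in feature_maps]
    # 1D convolution across the channel dimension (local cross-channel interaction).
    pad = k // 2
    padded = [0.0] * pad + gap + [0.0] * pad
    kernel = [1.0 / k] * k
    conv = [sum(kernel[j] * padded[i + j] for j in range(k)) for i in range(len(gap))]
    # Sigmoid gate: one weight in (0, 1) per channel, used to rescale that channel.
    return [1.0 / (1.0 + math.exp(-x)) for x in conv]
```

Each output weight rescales its channel, so channels with stronger pooled responses (and strong neighbours) are emphasized, which is the effect Table 4 quantifies.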

    Table 5  Per-class comparison of multi-view representative features (%)

    | Class       | View 1 | View 2 | View 3 | View 4 | View 5 | View 6 | R-SV   | R-HV   |
    |-------------|--------|--------|--------|--------|--------|--------|--------|--------|
    | bathtub     | 92.00  | 98.00  | 98.00  | 100.00 | 96.00  | 88.00  | 100.00 | 100.00 |
    | bed         | 99.00  | 100.00 | 100.00 | 100.00 | 100.00 | 97.00  | 100.00 | 100.00 |
    | chair       | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 97.00  | 100.00 | 100.00 |
    | desk        | 91.86  | 88.37  | 89.53  | 86.05  | 91.86  | 75.58  | 91.86  | 91.86  |
    | dresser     | 83.72  | 91.86  | 95.35  | 94.19  | 89.53  | 87.21  | 94.19  | 94.19  |
    | monitor     | 100.00 | 99.00  | 100.00 | 100.00 | 99.00  | 99.00  | 100.00 | 100.00 |
    | night_stand | 89.53  | 82.56  | 79.07  | 82.56  | 88.37  | 72.09  | 83.72  | 86.05  |
    | sofa        | 99.00  | 98.00  | 98.00  | 99.00  | 98.00  | 96.00  | 99.00  | 99.00  |
    | table       | 92.00  | 91.00  | 85.00  | 87.00  | 88.00  | 87.00  | 92.00  | 93.00  |
    | toilet      | 96.00  | 99.00  | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
    | Average     | 94.71  | 94.93  | 94.60  | 94.93  | 95.26  | 90.53  | 96.15  | 96.48  |
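The two voting rules compared throughout the tables, hard voting (HV/R-HV) and soft voting (SV/R-SV), can be sketched as follows, assuming each viewpoint group contributes one probability vector. The helper names are illustrative, not the paper's implementation:

```python
from collections import Counter

def hard_vote(group_probs):
    """Hard voting: each group casts a vote for its argmax class;
    the majority class wins (ties broken by first occurrence)."""
    labels = [max(range(len(p)), key=p.__getitem__) for p in group_probs]
    return Counter(labels).most_common(1)[0][0]

def soft_vote(group_probs):
    """Soft voting: average the groups' probability vectors,
    then take the argmax of the averaged distribution."""
    n = len(group_probs)
    num_classes = len(group_probs[0])
    avg = [sum(p[c] for p in group_probs) / n for c in range(num_classes)]
    return max(range(num_classes), key=avg.__getitem__)
```

The two rules can disagree: a single very confident group can swing soft voting even when a majority of groups weakly prefer another class, which is why Tables 1-5 report both.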

    Table 6  Accuracy comparison of view-based classification methods (%)

    | Method             | Views | Accuracy |
    |--------------------|-------|----------|
    | DeepPano[16]       | 1     | 88.66    |
    | Geometry Image[17] | 1     | 88.40    |
    | MVCLN[19]          | 12    | 95.68    |
    | CNN-Voting[20]     | 12    | 92.85    |
    | FusionNet[21]      | 60    | 93.11    |
    | Ours (R-SV)        | 36    | 96.15    |
    | Ours (R-HV)        | 36    | 96.48    |
  • [1] KRIZHEVSKY A, SUTSKEVER I, and HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84–90. doi: 10.1145/3065386.
    [2] SZEGEDY C, LIU Wei, JIA Yangqing, et al. Going deeper with convolutions[C]. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, USA, 2015: 1–9. doi: 10.1109/CVPR.2015.7298594.
    [3] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, USA, 2016: 770–778. doi: 10.1109/CVPR.2016.90.
    [4] CHARLES R Q, SU Hao, MO Kaichun, et al. PointNet: Deep learning on point sets for 3D classification and segmentation[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, USA, 2017: 77–85. doi: 10.1109/CVPR.2017.16.
    [5] QI C R, YI Li, SU Hao, et al. PointNet++: Deep hierarchical feature learning on point sets in a metric space[C]. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, 2017: 5105–5114.
    [6] SONG Yupeng, HE Fazhi, DUAN Yansong, et al. A kernel correlation-based approach to adaptively acquire local features for learning 3D point clouds[J]. Computer-Aided Design, 2022, 146: 103196. doi: 10.1016/j.cad.2022.103196.
    [7] ZHANG Su and YANG Jun. 3D model classification using spatial structure information[J]. Journal of Chinese Computer Systems, 2021, 42(4): 779–784. doi: 10.3969/j.issn.1000-1220.2021.04.018.
    [8] HASSAN R, FRAZ M M, RAJPUT A, et al. Residual learning with annularly convolutional neural networks for classification and segmentation of 3D point clouds[J]. Neurocomputing, 2023, 526: 96–108. doi: 10.1016/j.neucom.2023.01.026.
    [9] ZHOU Ruqin, LI Xixing, and JIANG Wanshou. SCANet: A spatial and channel attention based network for partial-to-partial point cloud registration[J]. Pattern Recognition Letters, 2021, 151: 120–126. doi: 10.1016/j.patrec.2021.08.002.
    [10] WU Zhirong, SONG Shuran, KHOSLA A, et al. 3D ShapeNets: A deep representation for volumetric shapes[C]. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, USA, 2015: 1912–1920. doi: 10.1109/CVPR.2015.7298801.
    [11] XU Xu and TODOROVIC S. Beam search for learning a deep convolutional neural network of 3D shapes[C]. 2016 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 2016: 3506–3511. doi: 10.1109/ICPR.2016.7900177.
    [12] KIM S, CHI H G, and RAMANI K. Object synthesis by learning part geometry with surface and volumetric representations[J]. Computer-Aided Design, 2021, 130: 102932. doi: 10.1016/j.cad.2020.102932.
    [13] MA Ziping, ZHOU Jie, MA Jinlin, et al. A novel 3D shape recognition method based on double-channel attention residual network[J]. Multimedia Tools and Applications, 2022, 81(22): 32519–32548. doi: 10.1007/s11042-022-12041-9.
    [14] CAI Weiwei, LIU Dong, NING Xin, et al. Voxel-based three-view hybrid parallel network for 3D object classification[J]. Displays, 2021, 69: 102076. doi: 10.1016/j.displa.2021.102076.
    [15] HE Yunqian, XIA Guihua, LUO Yongkang, et al. DVFENet: Dual-branch voxel feature extraction network for 3D object detection[J]. Neurocomputing, 2021, 459: 201–211. doi: 10.1016/j.neucom.2021.06.046.
    [16] SHI Baoguang, BAI Song, ZHOU Zhichao, et al. DeepPano: Deep panoramic representation for 3-D shape recognition[J]. IEEE Signal Processing Letters, 2015, 22(12): 2339–2343. doi: 10.1109/LSP.2015.2480802.
    [17] SINHA A, BAI Jing, and RAMANI K. Deep learning 3D shape surfaces using geometry images[C]. The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 223–240. doi: 10.1007/978-3-319-46466-4_14.
    [18] SU Hang, MAJI S, KALOGERAKIS E, et al. Multi-view convolutional neural networks for 3D shape recognition[C]. 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 2015: 945–953. doi: 10.1109/ICCV.2015.114.
    [19] LIANG Qi, WANG Yixin, NIE Weizhi, et al. MVCLN: Multi-view convolutional LSTM network for cross-media 3D shape recognition[J]. IEEE Access, 2020, 8: 139792–139802. doi: 10.1109/ACCESS.2020.3012692.
    [20] BAI Jing, SI Qinglong, and QIN Feiwei. 3D model classification and retrieval based on CNN and voting scheme[J]. Journal of Computer-Aided Design & Computer Graphics, 2019, 31(2): 303–314. doi: 10.3724/SP.J.1089.2019.17160.
    [21] HEGDE V and ZADEH R. FusionNet: 3D object classification using multiple data representations[EB/OL]. https://arxiv.org/abs/1607.05695, 2016.
    [22] JIN Xun and LI De. Rotation prediction based representative view locating framework for 3D object recognition[J]. Computer-Aided Design, 2022, 150: 103279. doi: 10.1016/j.cad.2022.103279.
    [23] ZHU Feng, XU Junyu, and YAO Chuanming. Local information fusion network for 3D shape classification and retrieval[J]. Image and Vision Computing, 2022, 121: 104405. doi: 10.1016/j.imavis.2022.104405.
    [24] RADOSAVOVIC I, KOSARAJU R P, GIRSHICK R, et al. Designing network design spaces[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2020: 10425–10433. doi: 10.1109/CVPR42600.2020.01044.
    [25] WANG Qilong, WU Banggu, ZHU Pengfei, et al. ECA-Net: Efficient channel attention for deep convolutional neural networks[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2020: 11531–11539. doi: 10.1109/CVPR42600.2020.01155.
Figures (8) / Tables (6)
Publication history
  • Received: 2023-05-12
  • Revised: 2023-12-12
  • Published online: 2023-12-20
