6D Pose Estimation Network in Complex Point Cloud Scenes

CHEN Haiyong, LI Longteng, CHEN Peng, MENG Rui

Citation: CHEN Haiyong, LI Longteng, CHEN Peng, MENG Rui. 6D Pose Estimation Network in Complex Point Cloud Scenes[J]. Journal of Electronics & Information Technology, 2022, 44(5): 1591-1601. doi: 10.11999/JEIT211000

doi: 10.11999/JEIT211000
Funds: The National Natural Science Foundation of China (U21A20482, 62073117), The Central Leading Local Science and Technology Development Fund Project (206Z1701G)
Details
    About the authors:

    CHEN Haiyong: male, born in 1980, professor and Ph.D. supervisor; research interests include image processing, machine vision, and pattern recognition

    LI Longteng: male, born in 1996, master's student; research interests include point cloud processing and 3D vision

    CHEN Peng: male, born in 1981, lecturer; research interests include intelligent robotics, machine vision, and 3D environment perception

    MENG Rui: female, born in 1997, master's student; research interests include machine vision, pattern recognition, and deep learning

    Corresponding author:

    CHEN Haiyong, haiyong.chen@hebut.edu.cn

  • CLC number: TP391.4

  • Abstract: To address robotic grasping of point cloud targets in complex industrial scenes with weak textures and scattered, randomly placed objects, this paper proposes a deep learning network for 6D pose estimation. First, the physical environment of point cloud targets randomly placed in multiple poses in complex scenes is simulated to generate a dataset with ground-truth labels. A deep 6D pose estimation network is then designed: the Multi-scale Point Cloud Segmentation Network (MPCS-Net) performs instance segmentation directly on the complete geometric point cloud, removing the dependence on RGB information and on point cloud segmentation pre-processing. Next, the Multi-layer Feature Pose Estimation Network (MFPE-Net) is proposed, which effectively handles pose estimation for symmetric objects. Finally, experimental results and analysis confirm that, compared with traditional point cloud registration methods and existing deep learning pose estimation methods that operate on cropped point clouds, the proposed method achieves higher accuracy and more stable performance, and shows strong robustness when estimating the poses of symmetric objects.
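To make the pipeline concrete, the sketch below summarizes the processing order the abstract describes. It is a hypothetical skeleton only: `segment_instances` and `regress_pose` are placeholder names standing in for MPCS-Net and MFPE-Net inference, not a published API of this paper.

```python
import numpy as np

def segment_instances(scene: np.ndarray) -> list:
    """Split an (N, 3) scene cloud into per-object point sets.
    Placeholder for MPCS-Net inference (no RGB, no pre-segmentation)."""
    raise NotImplementedError

def regress_pose(instance: np.ndarray) -> tuple:
    """Return a (3, 3) rotation and a (3,) translation for one instance.
    Placeholder for MFPE-Net inference."""
    raise NotImplementedError

def grasp_poses(scene: np.ndarray) -> list:
    """Full pipeline: segment the raw cloud, then estimate one 6D pose
    per instance; ICP refinement (as in Tables 5-6) would follow."""
    return [regress_pose(inst) for inst in segment_instances(scene)]
```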
  • Figure  1  Robotic grasping and assembly system

    Figure  2  Flow chart of dataset generation

    Figure  3  Workpiece CAD models and sample point clouds of simulated scenes

    Figure  4  Overall network architecture

    Figure  5  MPCS-Net architecture

    Figure  6  Flow chart of the feature clustering and sampling module

    Figure  7  MFPE-Net architecture

    Figure  8  Structure of the pose feature extraction module

    Figure  9  Point cloud instance segmentation results

    Figure  10  Examples of instance prediction errors

    Figure  11  Dimensionality reduction of high-dimensional instance features

    Figure  12  Pose estimation results for objects to be grasped

    Figure  13  Registration results for object C

    Table  1  Basic training configuration

    Setting                Value   Setting                             Value
    Total dataset size     10000   Average point spacing (horizontal)  1 mm
    Objects per scene      4-7     Optimizer                           SGD
    Training set size      9000    Training iterations                 500
    Test set size          1000    Batch size                          16
    Initial learning rate  0.01    Learning rate decay step            50
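Read as a recipe, the optimizer settings in Table 1 map onto a standard training setup; the minimal PyTorch sketch below is one illustration of that reading, not the authors' released code. The framework choice and the step-decay factor (gamma=0.5) are assumptions, since the table gives only the decay step.

```python
import torch

model = torch.nn.Linear(3, 64)  # placeholder for the actual network
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # initial lr 0.01
scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                            step_size=50,  # decay step 50
                                            gamma=0.5)     # assumed factor

for epoch in range(500):        # 500 training iterations (Table 1)
    # ... iterate the 9000 training scenes in batches of 16,
    #     compute the loss, and call loss.backward() here ...
    optimizer.step()
    scheduler.step()            # reduce the learning rate every 50 epochs
```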

    Table  2  Semantic segmentation accuracy (%) and average time (s)

    Method      Overall  Time (s)  Obj. A  Obj. B  Obj. C  Obj. D  Obj. E  Obj. F  Obj. G
    PointNet++  82.93    0.286     86.74   80.23   83.33   78.53   83.51   85.74   88.73
    MT-PNet     89.79    0.305     89.74   87.97   84.69   92.42   88.05   87.21   95.50
    MV-CRF      91.03    2.973     91.27   92.03   89.65   89.02   92.78   89.95   94.47
    Ours        99.02    0.324     98.79   99.28   98.99   98.93   98.61   98.97   99.67

    Table  3  Instance segmentation accuracy (%) and average time (s)

    Method   Overall  Time (s)  Obj. A  Obj. B  Obj. C  Obj. D  Obj. E  Obj. F  Obj. G
    MT-PNet  80.84    4.973     78.87   75.55   83.48   86.99   75.06   87.85   84.25
    MV-CRF   84.45    8.934     83.03   80.21   85.77   88.96   80.57   89.11   89.48
    Ours     94.35    5.312     92.74   96.85   93.53   95.06   94.67   93.83   93.51

    Table  4  Accuracy of different instance clustering methods (%)

    Method     Overall  Obj. A  Obj. B  Obj. C  Obj. D  Obj. E  Obj. F  Obj. G
    HAC        72.05    54.87   83.68   72.08   75.06   78.84   67.19   79.48
    DBSCAN     89.75    83.51   92.06   94.05   80.83   85.47   92.59   90.64
    MeanShift  94.35    92.74   96.85   93.53   95.06   94.67   93.83   93.51
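A plausible reason MeanShift wins in Table 4 is that it needs only a kernel bandwidth and infers the number of clusters itself, which suits scenes whose object count varies (4 to 7 per scene in Table 1). Below is a minimal scikit-learn sketch on synthetic per-point embeddings; the 8-D embedding size, bandwidth, and point counts are illustrative assumptions, not values from the paper.

```python
import numpy as np
from sklearn.cluster import MeanShift

# Synthetic stand-in for per-point instance embeddings: three objects,
# 200 points each, in an 8-D embedding space.
rng = np.random.default_rng(0)
embeddings = np.vstack([rng.normal(loc=c, scale=0.1, size=(200, 8))
                        for c in (0.0, 1.0, 2.0)])

# MeanShift groups points around density modes; the cluster count
# falls out of the bandwidth rather than being fixed in advance.
labels = MeanShift(bandwidth=0.5).fit_predict(embeddings)
print("instances found:", len(np.unique(labels)))  # expected: 3
```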

    Table  5  Pose estimation accuracy (%)

              FPFH+ICP       PPF+ICP        CloudPose+ICP  Ours+ICP
              AD     AD-S    AD     AD-S    AD     AD-S    AD     AD-S
    Object A  88.13  99.88   97.72  99.77   88.53  97.21   98.32  100
    Object B  77.86  96.47   71.67  72.07   85.82  93.66   96.30  97.68
    Object C  61.02  96.36   93.17  99.80   71.86  96.73   96.51  98.91
    Object D  87.83  97.23   98.04  98.54   97.53  98.36   97.85  99.25
    Object E  3.72   94.82   10.89  99.02   12.54  96.73   12.24  99.08
    Object F  48.17  97.80   42.44  99.21   53.36  92.63   49.56  98.91
    Object G  28.04  96.54   23.82  96.76   32.02  91.36   17.07  97.25
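The AD and AD-S columns are read here as the standard average-distance pose metrics: AD averages the distance between corresponding model points under the ground-truth and predicted poses, while AD-S takes the closest predicted point instead, so poses within a symmetric object's equivalence class are not penalized (which is why objects such as E-G can score low on AD yet high on AD-S). A minimal NumPy sketch under that reading:

```python
import numpy as np

def ad(model_pts, R_gt, t_gt, R_pr, t_pr):
    """AD: mean distance between *corresponding* model points under the
    ground-truth and predicted poses (sensitive to symmetry)."""
    gt = model_pts @ R_gt.T + t_gt
    pr = model_pts @ R_pr.T + t_pr
    return np.linalg.norm(gt - pr, axis=1).mean()

def ad_s(model_pts, R_gt, t_gt, R_pr, t_pr):
    """AD-S: mean distance to the *closest* predicted point, so any pose
    in a symmetric object's equivalence class scores the same."""
    gt = model_pts @ R_gt.T + t_gt
    pr = model_pts @ R_pr.T + t_pr
    pairwise = np.linalg.norm(gt[:, None, :] - pr[None, :, :], axis=2)
    return pairwise.min(axis=1).mean()
```

A pose is then typically counted correct when the metric falls below a fixed fraction of the object diameter; the exact threshold used is not stated on this page.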

    Table  6  Recognition time per instance (s)

    Method                         FPFH+ICP  PPF+ICP  MPCS-Net+CloudPose+ICP  Ours+ICP
    Average time per instance (s)  3.72      4.43     0.62                    0.58
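For context, the FPFH+ICP baseline in Tables 5 and 6 is the classical two-stage registration: a coarse global alignment from FPFH feature correspondences, refined locally by ICP. The sketch below shows one common way to realize it with Open3D (assuming a recent release, 0.12 or later); the voxel size and distance thresholds are illustrative assumptions.

```python
import open3d as o3d

def fpfh_icp(source, target, voxel=2.0):
    """Classical baseline: FPFH-based global registration + ICP refinement.
    `voxel` is in the cloud's units (e.g. mm) and is an assumed value."""
    def preprocess(pcd):
        down = pcd.voxel_down_sample(voxel)
        down.estimate_normals(
            o3d.geometry.KDTreeSearchParamHybrid(radius=2 * voxel, max_nn=30))
        fpfh = o3d.pipelines.registration.compute_fpfh_feature(
            down,
            o3d.geometry.KDTreeSearchParamHybrid(radius=5 * voxel, max_nn=100))
        return down, fpfh

    src, src_fpfh = preprocess(source)
    tgt, tgt_fpfh = preprocess(target)

    # Coarse pose from RANSAC over FPFH correspondences.
    coarse = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
        src, tgt, src_fpfh, tgt_fpfh, True, 1.5 * voxel,
        o3d.pipelines.registration.TransformationEstimationPointToPoint(False),
        3, [],
        o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))

    # Fine pose from point-to-point ICP, seeded with the coarse result.
    fine = o3d.pipelines.registration.registration_icp(
        source, target, 0.5 * voxel, coarse.transformation,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return fine.transformation
```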
Publication history
  • Received: 2021-09-18
  • Revised: 2022-04-06
  • Accepted: 2022-04-08
  • Available online: 2022-04-10
  • Issue published: 2022-05-25
