Volume 46, Issue 1, January 2024
Citation: GUO Hongda, LOU Jingtao, YANG Zhenzhen, XU Youchun. Research on Dispersion Strategy for Multiple Unmanned Ground Vehicles Based on Auction Multi-agent Deep Deterministic Policy Gradient[J]. Journal of Electronics & Information Technology, 2024, 46(1): 287-298. doi: 10.11999/JEIT221582

Research on Dispersion Strategy for Multiple Unmanned Ground Vehicles Based on Auction Multi-agent Deep Deterministic Policy Gradient

doi: 10.11999/JEIT221582
  • Received Date: 2023-01-02
  • Rev Recd Date: 2023-05-12
  • Available Online: 2023-05-22
  • Publish Date: 2024-01-17
  • Multiple Unmanned Ground Vehicle (multi-UGV) dispersion is commonly used in military combat missions. Existing dispersion methods are complex, time-consuming, and of limited applicability. To address these problems, a multi-UGV dispersion strategy based on the AUction Multi-Agent Deep Deterministic Policy Gradient (AU-MADDPG) algorithm is proposed. Building on a single-vehicle model, a multi-UGV dispersion model is established within a deep reinforcement learning framework. The MADDPG structure is then optimized: an auction algorithm assigns each unmanned vehicle the dispersion point that minimizes the total path length, reducing the randomness of dispersion-point allocation, while the MADDPG algorithm plans the paths, improving both training efficiency and runtime efficiency. The reward function is optimized to account for constraints both during training and at its end, converting the multi-constraint problem into a reward-function design problem and maximizing the reward. Simulation results show that, compared with the traditional MADDPG algorithm, the proposed algorithm reduces training time by 3.96% and total path length by 14.5%; it solves the dispersion problem more effectively and can serve as a general solution to dispersion problems.
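The abstract names an auction algorithm for matching vehicles to dispersion points but does not spell out the variant used. The idea of minimizing total path length through bidding can be sketched as below; this is a minimal illustration, assuming a Bertsekas-style ε-auction over a Euclidean cost matrix. The function name, the ε value, and the straight-line cost model are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def auction_assignment(cost, eps=1e-3):
    """Assign one dispersion point to each UGV via a Bertsekas-style auction.

    cost[i, j] is the estimated path length from UGV i to dispersion point j.
    Bidders maximize value = -cost, so the auction minimizes total path
    length (to within n * eps of the optimum).
    """
    n = cost.shape[0]
    value = -np.asarray(cost, dtype=float)
    prices = np.zeros(n)                   # current price of each point
    owner = np.full(n, -1, dtype=int)      # owner[j]: UGV holding point j
    assigned = np.full(n, -1, dtype=int)   # assigned[i]: point held by UGV i
    unassigned = list(range(n))
    while unassigned:
        i = unassigned.pop()
        gains = value[i] - prices          # net benefit of each point for UGV i
        j = int(np.argmax(gains))
        best = gains[j]
        gains[j] = -np.inf
        second = gains.max() if n > 1 else best
        prices[j] += best - second + eps   # bid up the contested point
        if owner[j] >= 0:                  # evict and re-queue previous owner
            assigned[owner[j]] = -1
            unassigned.append(owner[j])
        owner[j] = i
        assigned[i] = j
    return assigned

# Toy usage: 4 UGVs bidding on 4 candidate dispersion points in the plane.
rng = np.random.default_rng(0)
ugvs = rng.uniform(0, 10, size=(4, 2))
points = rng.uniform(0, 10, size=(4, 2))
cost = np.linalg.norm(ugvs[:, None] - points[None, :], axis=-1)
print(auction_assignment(cost))            # point index assigned to each UGV
```

For an exact optimum, scipy.optimize.linear_sum_assignment solves the same assignment problem directly; the auction form is sketched here because it is the mechanism the paper names, and its bidding structure distributes naturally across agents.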