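Parallel MoM Using the Six Hundred Thousand Cores on Domestically-made and Many-core Supercomputer

Zongjing GU, Haoxiang WU, Xunwang ZHAO, Zhongchao LIN, Yu ZHANG, Qi ZHANG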

Citation: Zongjing GU, Haoxiang WU, Xunwang ZHAO, Zhongchao LIN, Yu ZHANG, Qi ZHANG. Parallel MoM Using the Six Hundred Thousand Cores on Domestically-made and Many-core Supercomputer[J]. Journal of Electronics & Information Technology, 2019, 41(4): 845-850. doi: 10.11999/JEIT180562

doi: 10.11999/JEIT180562
Funding: National Key Research and Development Program of China (2017YFB0202102, 2016YFE0121600); China Postdoctoral Science Foundation (2017M613068)
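Details
    About the authors:

    Zongjing GU: male, born 1989, Ph.D. candidate. Research interests: computational electromagnetics, large-scale parallel method of moments, domain decomposition algorithms.

    Haoxiang WU: male, born 1995, master's student. Research interests: computational electromagnetics, large-scale parallel method of moments.

    Xunwang ZHAO: male, born 1983, associate professor. Research interests: analysis of large airborne antenna arrays.

    Zhongchao LIN: male, born 1988, lecturer. Research interests: computational electromagnetics.

    Yu ZHANG: male, born 1978, professor. Research interests: computational electromagnetics, large-scale parallel algorithms.

    Corresponding author:

    Xunwang ZHAO xwzhao@mail.xidian.edu.cn

  • CLC number: TN820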

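  • Abstract:

    To make electromagnetic computation secure, reliable, and independently controllable, this paper develops a large-scale parallel Method of Moments (MoM) on the “Tianhe-2” domestically made many-core supercomputer platform. The goals are to reduce the communication load on the cluster during large-scale parallel computation and to speed up the solution of the MoM integral equation. Observing that the matrix produced by discretizing the electric field integral equation is diagonally dominant, the paper proposes a new LU decomposition algorithm, the diagonal-block-matrix pivoting LU decomposition (BDPLU). BDPLU reduces the amount of computation in the panel column factorization and, more importantly, completely eliminates the MPI communication overhead of the pivoting step. With BDPLU, the parallel MoM scales beyond 6×10⁵ CPU cores, the largest parallel MoM computation achieved so far on a domestic supercomputing platform, and the parallel efficiency of the matrix solution reaches 51.95%. Numerical results show that the parallel MoM can solve large-scale electromagnetic problems accurately and efficiently on domestic supercomputing platforms.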
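The abstract does not spell out the BDPLU steps, so the following is only a minimal serial sketch of one plausible reading: a blocked right-looking LU in which the pivot search for each panel is confined to the rows of the diagonal block. It is not the authors' implementation. The names `bdplu` and `reconstruct`, the block size `nb`, and the small test matrix are all hypothetical, and plain Python lists stand in for a distributed matrix.

```python
def bdplu(a, nb):
    """Blocked LU with pivoting restricted to each diagonal block (assumed reading of BDPLU).

    a  : square matrix as a list of row lists (real or complex entries)
    nb : block size (hypothetical parameter)
    Returns (lu, perm). lu packs unit-lower L below the diagonal and U on and
    above it; perm[i] is the original row now at position i, so A[perm] = L @ U.
    """
    n = len(a)
    lu = [row[:] for row in a]            # work on a copy
    perm = list(range(n))
    for k in range(0, n, nb):
        e = min(k + nb, n)
        # 1) Panel factorization: pivots are searched only in rows j..e-1,
        #    i.e. inside the diagonal block, never in the rows below it.
        for j in range(k, e):
            p = max(range(j, e), key=lambda r: abs(lu[r][j]))
            if lu[p][j] == 0:
                raise ZeroDivisionError("zero pivot inside diagonal block")
            if p != j:
                lu[j], lu[p] = lu[p], lu[j]          # swap whole rows
                perm[j], perm[p] = perm[p], perm[j]
            for r in range(j + 1, n):
                lu[r][j] /= lu[j][j]                 # multiplier (entry of L)
                for c in range(j + 1, e):            # update rest of the panel
                    lu[r][c] -= lu[r][j] * lu[j][c]
        # 2) Row block of U: forward substitution with the unit-lower diagonal block.
        for j in range(k, e):
            for r in range(j + 1, e):
                for c in range(e, n):
                    lu[r][c] -= lu[r][j] * lu[j][c]
        # 3) Trailing update: A22 <- A22 - L21 @ U12.
        for r in range(e, n):
            for c in range(e, n):
                lu[r][c] -= sum(lu[r][j] * lu[j][c] for j in range(k, e))
    return lu, perm


def reconstruct(lu):
    """Multiply the packed factors back together and return L @ U."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for t in range(min(i, j) + 1):
                l_it = 1.0 if t == i else lu[i][t]   # unit diagonal of L
                s += l_it * lu[t][j]
            out[i][j] = s
    return out


# Small diagonally dominant test matrix (made up for illustration).
A = [[10.0, 1.0, 2.0, 0.0, 1.0],
     [1.0, 12.0, 0.0, 3.0, 1.0],
     [2.0, 0.0, 9.0, 1.0, 1.0],
     [0.0, 3.0, 1.0, 11.0, 2.0],
     [1.0, 1.0, 1.0, 2.0, 8.0]]
lu, perm = bdplu(A, nb=2)
LU = reconstruct(lu)
err = max(abs(LU[i][j] - A[perm[i]][j]) for i in range(5) for j in range(5))
print("row order:", perm, "max reconstruction error:", err)
```

Under this reading, row swaps never cross a block-row boundary. In a distributed block layout, the pivot search would then involve only the process holding the diagonal block, which is presumably where the saved MPI communication comes from. Skipping the search over the full column is safe only when the diagonal block supplies adequate pivots, which is why the diagonal dominance noted in the abstract matters.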

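  • Figure 1  Distribution of matrix properties during LU decomposition

    Figure 2  Schematic of the BDPLU algorithm

    Figure 3  Simulation model of aircraft I and bistatic RCS results

    Figure 4  Bistatic RCS results of aircraft II

    Figure 5  Speedup and parallel efficiency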

    Table 1  Matrix solution time of the CALU and BDPLU algorithms

    FT2000+ cores   Solution time (s)      Parallel efficiency (%)
                    CALU      BDPLU        CALU      BDPLU
    2000            796.54    742.57       100       100
    3000            567.78    518.68       93.53     95.44
    4000            463.93    421.12       85.85     88.17
    5000            386.89    338.24       82.35     87.82
    10000           226.57    187.83       70.31     79.07
    15000           172.91    139.05       61.42     71.20
    20000           133.97    118.38       59.46     62.73
    40000           72.83     64.61        54.68     57.47

    Table 2  Speedup and parallel efficiency of the matrix solution with the BDPLU algorithm

    FT2000+ cores   Solution time (s)   Speedup   Parallel efficiency (%)
    9600            29183.33            1         100
    48000           6336.53             4.61      92.11
    96000           3501.73             8.33      83.34
    192000          2035.35             14.34     71.69
    240000          1764.53             16.54     66.16
    336000          1328.13             21.97     62.78
    384000          1227.91             23.77     59.42
    432000          1133.64             25.74     57.21
    480000          1043.45             27.97     55.94
    504000          997.94              29.24     55.70
    552000          937.82              31.12     54.12
    600000          898.81              32.49     51.95
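The page does not state how the speedup and efficiency columns of Table 2 are defined. The short sketch below assumes the usual relative definitions, taking the 9600-core run as the baseline: speedup is the baseline time divided by the measured time, and efficiency is the speedup divided by the growth in core count. The function name `scaling` is hypothetical; the data are the core counts and solution times from Table 2.

```python
# (FT2000+ cores, matrix solution time in seconds), taken from Table 2.
TABLE2 = [
    (9600, 29183.33), (48000, 6336.53), (96000, 3501.73),
    (192000, 2035.35), (240000, 1764.53), (336000, 1328.13),
    (384000, 1227.91), (432000, 1133.64), (480000, 1043.45),
    (504000, 997.94), (552000, 937.82), (600000, 898.81),
]


def scaling(rows):
    """Return (cores, speedup, efficiency in percent) relative to the first row."""
    base_cores, base_time = rows[0]
    result = []
    for cores, seconds in rows:
        speedup = base_time / seconds                        # assumed definition
        efficiency = 100.0 * speedup / (cores / base_cores)  # assumed definition
        result.append((cores, speedup, efficiency))
    return result


for cores, s, e in scaling(TABLE2):
    print(f"{cores:>7d}  speedup {s:6.2f}  efficiency {e:6.2f}%")
```

Recomputed this way, the efficiencies agree with the table to two decimals, including 51.95% at 600000 cores. The speedups agree as well except in the last row, where the recomputed value is about 32.47 against the listed 32.49.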
Publication history
  • Received:  2018-06-04
  • Revised:  2018-12-13
  • Published online:  2018-12-19
  • Issue date:  2019-04-01
