高级搜索

留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

面向自旋存内计算架构的图算法优化设计

王雪岩 陈序航 贾小涛 杨建磊 屈钢 赵巍胜

王雪岩, 陈序航, 贾小涛, 杨建磊, 屈钢, 赵巍胜. 面向自旋存内计算架构的图算法优化设计[J]. 电子与信息学报, 2023, 45(9): 3193-3199. doi: 10.11999/JEIT230371
引用本文: 王雪岩, 陈序航, 贾小涛, 杨建磊, 屈钢, 赵巍胜. 面向自旋存内计算架构的图算法优化设计[J]. 电子与信息学报, 2023, 45(9): 3193-3199. doi: 10.11999/JEIT230371
WANG Xueyan, CHEN Xuhang, JIA Xiaotao, YANG Jianlei, QU Gang, ZHAO Weisheng. Graph Algorithm Optimization for Spintronics-based In-memory Computing Architecture[J]. Journal of Electronics & Information Technology, 2023, 45(9): 3193-3199. doi: 10.11999/JEIT230371
Citation: WANG Xueyan, CHEN Xuhang, JIA Xiaotao, YANG Jianlei, QU Gang, ZHAO Weisheng. Graph Algorithm Optimization for Spintronics-based In-memory Computing Architecture[J]. Journal of Electronics & Information Technology, 2023, 45(9): 3193-3199. doi: 10.11999/JEIT230371

面向自旋存内计算架构的图算法优化设计

doi: 10.11999/JEIT230371
基金项目: 国家自然科学基金(62004011, 62006011, U20A20204, 62072019)
详细信息
    作者简介:

    王雪岩:女,助理教授,研究方向为存算一体架构、图计算、芯片安全

    陈序航:男,硕士,研究方向为存算一体架构、图计算

    贾小涛:男,副教授,研究方向为贝叶斯神经网络、EDA算法设计

    杨建磊:男,副教授,研究方向为智能计算系统与芯片设计

    屈钢:男,教授,研究方向为硬件安全、低功耗设计

    赵巍胜:男,教授,研究方向为自旋电子芯片

    通讯作者:

    赵巍胜 weisheng.zhao@buaa.edu.cn

  • 中图分类号: TN402

Graph Algorithm Optimization for Spintronics-based In-memory Computing Architecture

Funds: The National Natural Science Foundation of China (62004011, 62006011, U20A20204, 62072019)
  • 摘要: 图计算广泛应用于社交网络分析、推荐系统等诸多关键领域,然而,传统的大规模图计算系统面临冯诺依曼架构下访存带来的性能瓶颈。新型存内计算架构成为加速大规模图计算非常有前景的方案,尤其是非易失自旋磁存储器(MRAM)具备超高耐擦写性和超快写入等优点,可使图计算的存内实现更为高效。实现这种潜力的关键挑战之一是如何优化存内计算架构下的图算法设计。该文的前期工作表明,三角形计数算法和图连通分量计算算法可以通过按位运算实现,从而高效地部署在自旋存内处理核中加速。该文探索了更多图算法的优化实现,例如单源最短路径、K-core、链路预测,并提出了面向新型存内计算架构的图算法优化设计模型。该研究对于突破冯诺依曼架构下大规模图计算的内存访问瓶颈具有关键意义。
  • 图  1  基于MRAM的图计算存内加速架构

    图  2  基于位逻辑运算优化实现单源最短路径

    图  3  基于位逻辑运算优化实现链路预测算法

    图  4  图计算存内加速架构实现与CPU实现的能耗标准化结果对比

    表  1  图算法存内计算优化模型

    图计算步骤边计算社区发现结构预测
    连通分量计算[15]单源最短路径三角形计数[12]K-core链路预测
    目标定位逻辑操作(AND)逻辑操作(AND)Scan(A[i][j]=1)Scan(每行)行选择
    属性汇聚逻辑操作(OR)加法逻辑操作(AND)位计数(Bitcount)逻辑操作(AND/OR)
    计算更新Bitcount比较比较Bitcount比较删除顶点Bitcount除法
    下载: 导出CSV

    表  2  MTJ关键参数

    参数参数
    磁性隧道结表面长度40 nm隧道磁电阻100%
    磁性隧道结表面宽度40 nm饱和场106 A/m
    自旋霍尔角0.3吉尔伯特阻尼常数0.03
    磁性隧道结电阻面积乘积10–12 Ω·m2垂直磁各向异性4.5×105 A/m
    氧化物阻挡层厚度0.82 nm温度300 K
    下载: 导出CSV

    表  3  图计算存内加速架构实现与CPU实现速度对比(s)

    图数据集单源最短路径K-core链路预测
    CPUPIMCPUPIMCPUPIM
    p2p-Gnutella060.0630.00140.1870.0060.0010.00012
    p2p-Gnutella311.1870.0211.0840.0310.0020.00016
    email-Enron2.9720.0711.3830.0560.0010.00010
    email-EuAll2.8260.0813.8320.1930.0030.00041
    soc-Slashdot092210.9890.20310.1370.2410.0020.00017
    web-NotreDame12.5670.18443.9750.7540.0050.00037
    amazon030217.8370.32914.3920.3810.0020.00013
    amazon050538.6080.56550.6321.6490.0030.00043
    下载: 导出CSV
  • [1] CHI Ping, LI Shuangchen, XU Cong, et al. PRIME: A novel processing-in-memory architecture for neural network computation in ReRAM-based main memory[J]. ACM SIGARCH Computer Architecture News, 2016, 44(3): 27–39. doi: 10.1145/3007787.3001140
    [2] OZDAL M M, YESIL S, KIM T, et al. Energy efficient architecture for graph analytics accelerators[J]. ACM SIGARCH Computer Architecture News, 2016, 44(3): 166–177. doi: 10.1145/3007787.3001155
    [3] HAM T J, WU Lisa, SUNDARAM N, et al. Graphicionado: A high-performance and energy-efficient accelerator for graph analytics[C]. 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Taipei, China, 2016: 1–13.
    [4] KYROLA A, BLELLOCH G, and GUESTRIN C. GraphChi: Large-scale graph computation on just a PC[C]. Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation, Hollywood, USA, 2012: 31–46.
    [5] LIANG Shengwen, WANG Ying, LIU Cheng, et al. EnGN: A high-throughput and energy-efficient accelerator for large graph neural networks[J]. IEEE Transactions on Computers, 2021, 70(9): 1511–1525. doi: 10.1109/TC.2020.3014632
    [6] DAI Guohao, HUANG Tianhao, CHI Yuze, et al. GraphH: A processing-in-memory architecture for large-scale graph processing[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2018, 38(4): 640–653. doi: 10.1109/TCAD.2018.2821565
    [7] BEAMER S, ASANOVIC K, and PATTERSON D. Locality exists in graph processing: Workload characterization on an ivy bridge server[C]. 2015 IEEE International Symposium on Workload Characterization, Atlanta, USA, 2015: 56–65.
    [8] WANG Mengxing, CAI Wenlong, ZHU Daoqian, et al. Field-free switching of a perpendicular magnetic tunnel junction through the interplay of spin-orbit and spin-transfer torques[J]. Nature Electronics, 2018, 1(11): 582–588. doi: 10.1038/s41928-018-0160-7
    [9] GUO Zongxia, YIN Jialiang, BAI Yue, et al. Spintronics for energy-efficient computing: An overview and outlook[J]. Proceedings of the IEEE, 2021, 109(8): 1398–1417. doi: 10.1109/JPROC.2021.3084997
    [10] JAIN S, RANJAN A, ROY K, et al. Computing in memory with spin-transfer torque magnetic RAM[J]. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2018, 26(3): 470–483. doi: 10.1109/TVLSI.2017.2776954
    [11] ANGIZI S, SUN Jiao, ZHANG Wei, et al. GraphS: A graph processing accelerator leveraging SOT-MRAM[C]. 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE), Florence, Italy, 2019: 378–383.
    [12] WANG Xueyan, YANG Jianlei, ZHAO Yinglin, et al. Triangle counting accelerations: From algorithm to in-memory computing architecture[J]. IEEE Transactions on Computers, 2022, 71(10): 2462–2472. doi: 10.1109/TC.2021.3131049
    [13] LI Shuangchen, XU Cong, ZOU Qiaosha, et al. Pinatubo: A processing-in-memory architecture for bulk bitwise operations in emerging non-volatile memories[C]. The 53rd ACM/EDAC/IEEE Design Automation Conference, Austin, USA, 2016: 1–6.
    [14] HAN Lei, SHEN Zhaoyan, LIU Duo, et al. A novel ReRAM-based processing-in-memory architecture for graph traversal[J]. ACM Transactions on Storage, 2018, 14(1): 9. doi: 10.1145/3177916
    [15] CHEN Xuhang, WANG Xueyan, JIA Xiaotao, et al. Accelerating graph-connected component computation with emerging processing-in-memory architecture[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2022, 41(12): 5333–5342. doi: 10.1109/TCAD.2022.3163628
    [16] PEROZZI B, AL-RFOU R, and SKIENA S. DeepWalk: Online learning of social representations[C]. The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, USA, 2014: 701–710.
    [17] LESKOVEC J and KREVL A. SNAP Datasets: Stanford large network dataset collection[EB/OL]. http://snap.stanford.edu/data, 2014.
  • 加载中
图(4) / 表(3)
计量
  • 文章访问数:  449
  • HTML全文浏览量:  187
  • PDF下载量:  91
  • 被引次数: 0
出版历程
  • 收稿日期:  2023-05-04
  • 修回日期:  2023-07-19
  • 网络出版日期:  2023-07-25
  • 刊出日期:  2023-09-27

目录

    /

    返回文章
    返回