Design of Graph Convolutional Network Accelerator Based on Resistive Random Access Memory

LI Bing, WU Kangjun, WANG Jing, LI Sen, GAO Lan, ZHANG Weigong, NI Tianming

Citation: LI Bing, WU Kangjun, WANG Jing, LI Sen, GAO Lan, ZHANG Weigong, NI Tianming. Design of Graph Convolutional Network Accelerator Based on Resistive Random Access Memory[J]. Journal of Electronics & Information Technology, 2023, 45(1): 106-115. doi: 10.11999/JEIT211435

doi: 10.11999/JEIT211435
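Funds: The National Natural Science Foundation of China (62174001, 61904001), Anhui Provincial Key Research and Development Program (202104b11020032), Anhui Polytechnic University Young and Middle-Aged Top Talent Training Program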
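Details
    Author biographies:

    LI Bing: female, Ph.D., associate research fellow. Research interests: hardware-software co-optimization of artificial intelligence accelerators, in-memory computing, brain-inspired neural computing, and memristors

    WU Kangjun: male, master's student. Research interests: computer architecture, in-memory computing, and memristors

    WANG Jing: female, Ph.D., research fellow. Research interests: computer architecture, high-performance computing, and fault-tolerant computing

    LI Sen: male, Ph.D., engineer. Research interests: computer systems and reliability

    GAO Lan: female, Ph.D., lecturer. Research interests: GPU architecture and parallel programming, CPU-GPU heterogeneous systems, and shared cache resource management

    ZHANG Weigong: male, Ph.D., research fellow. Research interests: computer architecture and highly reliable embedded system design

    NI Tianming: male, Ph.D., associate professor. Research interests: design for testability, reliability, and manufacturability of digital integrated circuits, fault-tolerant design of 3D integrated circuits, radiation-hardened design of digital integrated circuits, and hardware security

    Corresponding author:

    WANG Jing jwang@cnu.edu.cn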

  • CLC number: TN929.5; TN601

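  • Abstract: Graph Convolutional Networks (GCNs) far outperform traditional artificial intelligence algorithms on tasks such as social networks, e-commerce, and molecular structure inference, and have attracted wide attention in recent years. Unlike Convolutional Neural Networks (CNNs), whose input data are independently distributed, a GCN focuses on extracting feature relationships among data and represents those relationships with an adjacency matrix. Its input data and operands are therefore sparser than those of a CNN and involve a large amount of data movement, which makes an efficient GCN accelerator challenging to build. Resistive Random Access Memory (ReRAM), an emerging non-volatile memory, offers high density, fast read access, low power consumption, and in-memory computing. ReRAM-based CNN acceleration has been widely studied, but the extreme sparsity of GCNs makes existing accelerators inefficient. This paper therefore proposes an efficient GCN accelerator based on ReRAM crossbar arrays. First, the computation and memory-access characteristics of the different GCN operands are analyzed, and a method for mapping the weight and adjacency matrices onto ReRAM arrays is proposed; it exploits the compute-intensive nature of these two operands while avoiding the excessive overhead that the memory-intensive feature vectors would otherwise incur. Second, the sparsity of the adjacency matrix is fully exploited through a submatrix partition algorithm and a compressed mapping scheme for the adjacency matrix, which minimize the ReRAM resources that a GCN requires. In addition, the accelerator supports sparse computation: it accepts feature vectors compressed in COOrdinate list (COO) format, so that computation proceeds regularly and efficiently. Experimental results show that the accelerator achieves a 483× speedup and 1569× energy saving over a CPU, and a 28× speedup and 168× energy saving over a GPU.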
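The two-step GCN computation that the abstract refers to can be sketched in plain Python. This is only an illustration under stated assumptions, not the accelerator's dataflow: it assumes the common layer form ReLU(A·(X·W)) from the GCN literature, an adjacency matrix A that is already normalized, and a hypothetical `Coo` container and helper names invented here. Both the features X and the adjacency matrix A are held in COO form, so only nonzero entries are visited.

```python
from collections import namedtuple

# Hypothetical container for a COO (coordinate list) sparse matrix.
Coo = namedtuple("Coo", ["shape", "rows", "cols", "vals"])

def coo_times_dense(x, w):
    """Multiply a COO sparse matrix x (n x f) by a dense matrix w (f x h)."""
    n, f = x.shape
    assert f == len(w), "inner dimensions must match"
    h = len(w[0])
    out = [[0.0] * h for _ in range(n)]
    # Only stored (nonzero) entries are visited; zeros cost nothing.
    for r, c, v in zip(x.rows, x.cols, x.vals):
        for j in range(h):
            out[r][j] += v * w[c][j]
    return out

def gcn_layer(a, x, w):
    """One GCN layer, assumed here to be ReLU(A @ (X @ W)), in two steps."""
    xw = coo_times_dense(x, w)     # step 1: combine features with weights
    agg = coo_times_dense(a, xw)   # step 2: aggregate over neighbors
    return [[max(0.0, v) for v in row] for row in agg]

# Toy 3-node graph: nodes 0 and 1 are linked, node 2 is isolated.
A = Coo((3, 3), [0, 0, 1, 1, 2], [0, 1, 0, 1, 2], [0.5, 0.5, 0.5, 0.5, 1.0])
X = Coo((3, 2), [0, 1, 2], [0, 1, 0], [1.0, 2.0, -1.0])
W = [[1.0, 0.0], [0.0, 1.0]]
H = gcn_layer(A, X, W)
```

Computing X·W first keeps the intermediate result at the hidden width rather than the feature width, which is the usual reason for this ordering; whether the accelerator evaluates the layer in exactly this order is an assumption of the sketch.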
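  • Figure 1  GCN computation

    Figure 2  Crossbar multiply-accumulate operation and GCN mapping scheme on crossbars

    Figure 3  Overall accelerator architecture

    Figure 4  Sparse feature mapping

    Figure 5  Submatrix partition, compression, and mapping of the sparse adjacency matrix

    Figure 6  Number of tiles under different submatrix partitions of the adjacency matrix

    Figure 7  Normalized latency of the two computation steps for each dataset under the optimized method (OPT) and the baseline method (BASE)

    Figure 8  Normalized energy consumption of the two computation steps for each dataset under the optimized method (OPT) and the baseline method (BASE)

    Figure 9  Speedup of the optimized scheme OPT over CPU, GPU, and BASE on different datasets

    Figure 10  Energy reduction of the optimized scheme OPT over CPU, GPU, and BASE on different datasets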

    Table 1  Density, data volume, and dimensions of the datasets

                                 Cora     Citeseer  Pubmed   Nell
    Density (%)      A           0.18     0.11      0.028    0.0073
                     X1          1.27     0.85      10.0     0.011
                     W           100      100       100      100
    Data volume (%)  A           65.01    47.10     97.44    92.12
                     X1          34.40    52.42     2.47     7.58
    Dimensions       Nodes       2708     3327      19717    65755
                     Features    1433     3703      500      5414

    Table 2  Submatrix partition granularity for each dataset

                                 Cora     Citeseer  Pubmed   Nell
    Submatrix partition          62×62    64×64     5×5      10×10
    Number of tiles              116      115       1221     3194
    Baseline number of tiles     121      169       6084     66049
    Tile reduction (×)           1.04     1.47      4.98     20.68
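The arithmetic behind Table 2 can be checked with a short script. The reduction factor is the baseline tile count divided by the tile count after partitioning. Two parts of the sketch are assumptions that this page does not state: the constant `TILE_SPAN = 256` is only an inference (taking ceil(nodes / 256) squared happens to reproduce the baseline row for all four datasets), and `nonempty_blocks` is a simplified illustration of why dropping all-zero submatrices saves storage, not the paper's partition and compression algorithm.

```python
import math

# Values copied from Tables 1 and 2.
nodes = {"Cora": 2708, "Citeseer": 3327, "Pubmed": 19717, "Nell": 65755}
tiles = {"Cora": 116, "Citeseer": 115, "Pubmed": 1221, "Nell": 3194}
baseline = {"Cora": 121, "Citeseer": 169, "Pubmed": 6084, "Nell": 66049}

# Reduction factor = baseline tiles / tiles after submatrix partitioning.
reduction = {k: round(baseline[k] / tiles[k], 2) for k in tiles}

# ASSUMPTION (inferred, not stated on the page): without compression, one
# tile covers a 256 x 256 block of the dense adjacency matrix.
TILE_SPAN = 256
dense_tiles = {k: math.ceil(n / TILE_SPAN) ** 2 for k, n in nodes.items()}

def nonempty_blocks(edges, k):
    """Count the k x k submatrices that hold at least one nonzero entry."""
    return len({(r // k, c // k) for r, c in edges})

# Toy 8-node graph with three undirected edges, split into 2 x 2 blocks.
edges = [(0, 1), (1, 0), (2, 3), (3, 2), (6, 7), (7, 6)]
kept = nonempty_blocks(edges, 2)   # blocks that must be stored
total = math.ceil(8 / 2) ** 2      # blocks in the dense 8 x 8 matrix
```

In the toy graph only 3 of the 16 blocks contain a nonzero, which mirrors the trend in Table 2: the sparser the adjacency matrix (Pubmed, Nell), the larger the share of blocks that can be skipped.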
Publication history
  • Received:  2021-12-06
  • Revised:  2022-04-05
  • Published online:  2022-04-19
  • Issue date:  2023-01-17
