Lightweight AdderNet Circuit Enabled by STT-MRAM In-Memory Absolute Difference Computation

WANG Lixun, ZHANG Yuejun, LI Qikang, ZHANG Huihong, WEN Liang

Citation: WANG Lixun, ZHANG Yuejun, LI Qikang, ZHANG Huihong, WEN Liang. Lightweight AdderNet Circuit Enabled by STT-MRAM In-Memory Absolute Difference Computation[J]. Journal of Electronics & Information Technology, 2025, 47(9): 3252-3261. doi: 10.11999/JEIT250627


doi: 10.11999/JEIT250627 cstr: 32379.14.JEIT250627
Funds: The National Natural Science Foundation of China (62474100, 62174121, 62134002), the “Vanguard Geese Leading and X” Science and Technology Program of Zhejiang Province (2025C01063), the Key R&D Program of Ningbo Science and Technology Yongjiang 2035 (2024Z139), the Key R&D Program of Cixi (CZ2025006), and the Graduate Student Scientific Research and Innovation Project of Ningbo University
Article information
    Author biographies:

    WANG Lixun: Male, Ph.D. candidate; his research focuses on the design and implementation of low-power computing-in-memory circuits

    ZHANG Yuejun: Male, Professor; his research focuses on the theory and design of low-power, high-information-density integrated circuits

    LI Qikang: Male, Ph.D. candidate; his research focuses on the design of efficient deep neural network accelerators

    ZHANG Huihong: Female, Associate Professor; her research focuses on control theory and applications, and on the theory and optimized design of low-power integrated circuits

    WEN Liang: Male, Ph.D.; his research focuses on the design and implementation of low-power memory circuits

    Corresponding author:

    ZHANG Yuejun, zhangyuejun@nbu.edu.cn

  • CLC number: TN403

  • Abstract: As artificial intelligence research advances, the demand for deploying Convolutional Neural Networks (CNN) in resource-constrained environments continues to grow. However, under the von Neumann architecture, the multiply-accumulate operations of CNN accelerators grow superlinearly as model depth increases and convolution kernels are stacked layer by layer. To address this, this paper proposes a lightweight Adder Network (AdderNet) acceleration circuit design based on Spin Transfer Torque-Magnetoresistive Random Access Memory (STT-MRAM). The scheme first introduces the L1 norm into a computing-in-memory architecture and proposes an STT-MRAM in-memory absolute-difference computation method that replaces multiply-accumulate operations with lightweight additions. Second, a configurable full adder based on magnetoresistive-state mapping is designed and combined with a sparsity optimization strategy that skips redundant logic evaluations involving zero operands. Finally, a parallel full-adder array supporting single-cycle carry-chain updates is constructed to enable efficient convolution-kernel mapping and parallel multi-kernel L1-norm computation. Experimental results show that the accelerator achieves 90.66% recognition accuracy on the CIFAR-10 dataset, only 1.18% below the software model, while delivering a maximum throughput of 32.31 GOPS and a peak energy efficiency of 494.56 GOPS/W at 133 MHz.
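    For reference, the AdderNet operation from [10] that this work maps onto STT-MRAM replaces the multiply-accumulate of an ordinary convolution with a negated L1 distance between each input patch and the filter. The NumPy sketch below illustrates only that arithmetic; the function name, shapes, and loop order are illustrative assumptions and do not model the in-memory data path described in the paper.

    import numpy as np

    def adder_conv2d(x, w):
        """Illustrative AdderNet-style 'convolution' (see [10]).
        x: input feature map, shape (C, H, W); w: filters, shape (T, C, K, K).
        Returns Y with Y[t, m, n] = -sum_{c,i,j} |x[c, m+i, n+j] - w[t, c, i, j]|,
        so only additions, subtractions, and absolute values are required."""
        C, H, W = x.shape
        T, _, K, _ = w.shape
        y = np.zeros((T, H - K + 1, W - K + 1))
        for t in range(T):
            for m in range(H - K + 1):
                for n in range(W - K + 1):
                    patch = x[:, m:m + K, n:n + K]
                    # The |x - w| terms are what the in-memory absolute-difference
                    # unit evaluates in place of multiplications.
                    y[t, m, n] = -np.abs(patch - w[t]).sum()
        return y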
  • Figure 1  STT-MRAM device and its interconnection structure

    Figure 2  STT-MRAM-based full-adder cell

    Figure 3  Computation process of the STT-MRAM full-adder cell

    Figure 4  Structure of the STT-MRAM-based full-adder array

    Figure 5  Hardware architecture and implementation of the AdderNet algorithm

    Figure 6  Distribution of STT-MRAM resistance states

    Figure 7  Timing waveforms of the carry Zi+1 computation

    Figure 8  Timing waveforms of the sum bit S computation

    Figure 9  Accuracy and loss curves of AdderNet

    Figure 10  Monte Carlo simulation of the adder read circuit

    Figure 11  Confusion matrix comparison with the baseline model

    Table 1  Operand statistics of the multilayer perceptron

    Layer pair | Total operands | Zero operands | Proportion (%)
    Input layer - hidden layer | 1003250000 | 868548048 | 86.57
    Hidden layer - output layer | 12800000 | 6795030 | 53.08
    Input layer - output layer | 1016320000 | 875613078 | 86.16
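    Table 1 is what motivates the zero-skipping strategy: when most operands are zero, the corresponding add/subtract evaluations can be bypassed. The snippet below is only a hypothetical sketch of how such a statistic could be tallied for one fully connected layer; the function name, tensor shapes, and counting convention are assumptions, not the paper's measurement procedure.

    import numpy as np

    def zero_operand_stats(acts, weights):
        """acts: quantized activations, shape (batch, n_in);
        weights: quantized weights, shape (n_out, n_in).
        Every output neuron touches each (activation, weight) pair once, so each
        sample contributes 2 * n_in * n_out operands. A pair with a zero on
        either side can skip the full add/subtract evaluation."""
        batch, n_in = acts.shape
        n_out, _ = weights.shape
        total = 2 * batch * n_in * n_out
        # A zero activation is reused n_out times; a zero weight is reused batch times.
        zeros = int((acts == 0).sum()) * n_out + int((weights == 0).sum()) * batch
        return total, zeros, 100.0 * zeros / total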

    Table 2  Comparison with related works

    Design: i5-1235U CPU[14] | i7-12700H CPU[14] | i5-1235U iGPU[14] | i7-12700H iGPU[14] | ICCAD 2022[15] | TCAS-I 2024[13] | TCAS-I 2025[16] | This work
    Process (nm): 10 | 10 | 10 | 10 | N/A | 65 | N/A | 40
    Voltage (V): 1.1-1.3 | 1.1-1.3 | 0.9-1.1 | 0.9-1.1 | 12 (FPGA) | 1.2 | 12 (FPGA) | 0.6-1.1
    Cell structure: N/A | N/A | N/A | N/A | LUT | 2T1M | LUT | 3T3M
    Frequency (MHz): 4400 | 4700 | 1200 | 1400 | 200 | 117 | 200 | 133
    Bit width (bit): 8 | 8 | 8 | 8 | 8 | 8 | 6 | 8
    Power (mW): 5500 | 11500 | 1200 | 1650 | 1695 | 63.82 | 381 | 65.33
    Network model: ResNet-50 | ResNet-50 | ResNet-50 | ResNet-50 | ResNet-20 | VGG-8 | ResNet-20 | ResNet-20
    Dataset: N/A | N/A | N/A | N/A | CIFAR-10 | CIFAR-10 | CIFAR-10 | CIFAR-10
    Accuracy (%): N/A | N/A | N/A | N/A | 89.9 | 93.72 | 90.52 | 90.66
    Maximum throughput (GOPS): 151.18 | 424 | 200.32 | 360.49 | 214.6 | 20.93 | 12.08/kLUT | 32.31
    Energy efficiency (GOPS/W): 27.48 | 2.6 | 166.9 | 218.48 | 126.6 | 246.68 | 562.5 | 494.56
  • [1] QI Haoran, QIU Yuwei, LUO Xing, et al. An efficient latent style guided transformer-CNN framework for face super-resolution[J]. IEEE Transactions on Multimedia, 2024, 26: 1589–1599. doi: 10.1109/TMM.2023.3283856.
    [2] CHEN Xiaolei, WANG Xing, ZHANG Xuegong, et al. Adjacent coordination network for salient object detection in 360 degree omnidirectional images[J]. Journal of Electronics & Information Technology, 2024, 46(12): 4529–4541. doi: 10.11999/JEIT240502.
    [3] LI Moqian, YANG Zhizhuo, LI Ru, et al. Evidence sentence extraction for reading comprehension based on multi-scale convolution[J]. Journal of Chinese Information Processing, 2024, 38(8): 128–139,157. doi: 10.3969/j.issn.1003-0077.2024.08.015.
    [4] LU Ye, XIE Kunpeng, XU Guanbin, et al. MTFC: A multi-GPU training framework for cube-CNN-based hyperspectral image classification[J]. IEEE Transactions on Emerging Topics in Computing, 2021, 9(4): 1738–1752. doi: 10.1109/TETC.2020.3016978.
    [5] HONG H, CHOI D, KIM N, et al. Mobile-X: Dedicated FPGA implementation of the MobileNet accelerator optimizing depthwise separable convolution[J]. IEEE Transactions on Circuits and Systems II: Express Briefs, 2024, 71(11): 4668–4672. doi: 10.1109/TCSII.2024.3440884.
    [6] MUN H G, MOON S, KIM B, et al. Bottleneck-stationary compact model accelerator with reduced requirement on memory bandwidth for edge applications[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2023, 70(2): 772–782. doi: 10.1109/TCSI.2022.3222862.
    [7] WANG Tianyu, SHEN Zhaoyan, and SHAO Zili. CNN acceleration with joint optimization of practical PIM and GPU on embedded devices[C]. IEEE 40th International Conference on Computer Design (ICCD), Olympic Valley, CA, USA, 2022: 377–384. doi: 10.1109/ICCD56317.2022.00062.
    [8] BLOTT M, PREUßER T B, FRASER N J, et al. FINN-R: An end-to-end deep-learning framework for fast exploration of quantized neural networks[J]. ACM Transactions on Reconfigurable Technology and Systems, 2018, 11(3): 16. doi: 10.1145/3242897.
    [9] CONTI F, PAULIN G, GAROFALO A, et al. Marsellus: A heterogeneous RISC-V AI-IoT end-node SoC with 2–8 b DNN acceleration and 30%-boost adaptive body biasing[J]. IEEE Journal of Solid-State Circuits, 2024, 59(1): 128–142. doi: 10.1109/JSSC.2023.3318301.
    [10] CHEN Hanting, WANG Yunhe, XU Chunjing, et al. AdderNet: Do we really need multiplications in deep learning?[C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, 2020: 1465–1474. doi: 10.1109/CVPR42600.2020.00154.
    [11] ZHANG Heng, HE Sunan, LU Xin, et al. SSM-CIM: An efficient CIM macro featuring single-step multi-bit MAC computation for CNN edge inference[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2023, 70(11): 4357–4368. doi: 10.1109/TCSI.2023.3301814.
    [12] YONG Ruoxue and JIANG Yanfeng. Research on thermal analysis method of 3D-stacked MRAM[J]. Acta Electronica Sinica, 2023, 51(10): 2775–2782. doi: 10.12263/DZXB.20220275.
    [13] LUO Lichuan, DENG Erya, LIU Dijun, et al. CiTST-AdderNets: Computing in toggle spin torques MRAM for energy-efficient AdderNets[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2024, 71(3): 1130–1143. doi: 10.1109/TCSI.2023.3343081.
    [14] OpenVINO. OpenVINO 2025.3[EB/OL]. https://docs.openvino.ai/2025/index.html, 2025.
    [15] ZHANG Yunxiang, SUN Biao, JIANG Weixiong, et al. WSQ-AdderNet: Efficient weight standardization based quantized AdderNet FPGA accelerator design with high-density INT8 DSP-LUT co-packing optimization[C]. IEEE/ACM International Conference on Computer Aided Design (ICCAD), San Diego, USA, 2022: 1–9.
    [16] ZHANG Yunxiang, AL KAILANI O, ZHOU Bin, et al. AdderNet 2.0: Optimal addernet accelerator designs with activation-oriented quantization and fused bias removal-based memory optimization[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2025. doi: 10.1109/TCSI.2025.3539912.
Publication history
  • Received: 2025-04-28
  • Revised: 2025-09-17
  • Available online: 2025-09-19
  • Published: 2025-09-24
