Design of Lightweight Gated Recurrent Unit Network Model Based on Memristor

HUA Honghu; XU Jia; ZHANG Bohao; WANG Wei; LI Zhiwei; LIU Haijun

doi:10.11999/JEIT260152

cstr: 32379.14.JEIT260152

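College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China

Funds: The National Natural Science Foundation of China (62074166, 62304254, 62104256, 62404253, U23A20322)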

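Author biographies:
HUA Honghu: male, Ph.D. student; research interests include memristor-based intelligent computing architectures.

XU Jia: female, master's student; research interests include memristor-based intelligent computing architectures.

ZHANG Bohao: male, Ph.D. student; research interests include memristor-based intelligent computing architectures.

WANG Wei: male, associate researcher; research interests include memristor materials, devices, and memristor-based brain-inspired chips.

LI Zhiwei: male, associate researcher; research interests include memristor-based intelligent computing architectures.

LIU Haijun: male, associate professor; research interests include memristor-based intelligent computing architectures and advanced integrated circuits.

Corresponding author:
LIU Haijun, liuhaijun@nudt.edu.cn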

CLC number: TN601; TP183
Publication history
- Received: 2026-02-05
- Revised: 2026-06-02
- Accepted: 2026-06-24
- Available online: 2026-07-04

Design of Lightweight Gated Recurrent Unit Network Model Based on Memristor

College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China

Funds: The National Natural Science Foundation of China (62074166, 62304254, 62104256, 62404253, U23A20322)

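Abstract

Abstract: Memristor-based Gated Recurrent Unit (GRU) networks offer a new route to the embedded deployment of time-series data processing systems, but their large network scale and high weight precision make them difficult to deploy directly on embedded edge devices. This paper therefore studies the design of a lightweight memristor-based GRU network model. It constructs a GRU network model that can be deployed with limited resources, designs the mapping onto memristor crossbar arrays, and proposes a device-aware fusion quantization method based on performance analysis. The method jointly considers network performance and the different device implementations of weight deployment and activation function computation: the memristor GRU network model is quantized with symmetric quantization for weights and asymmetric quantization for activations, and weight noise injection is used to improve the model's tolerance to memristor device non-idealities. Simulation results show that the designed memristor GRU network model achieves a classification accuracy of 93.94% on the public UrbanSound8K dataset and 92.68% after quantization to 6 bit, which is 14.68, 0.68, and 3.94 percentage points higher than the full-precision Dilated Convolution, LM-MFCC+GRU, and TFFS-DNN models, respectively. Weight noise training also effectively improves the adaptability of the lightweight network model to memristor device non-idealities. In addition, the model is validated on a true-false trajectory discrimination task: on a self-constructed true-false trajectory dataset, the classification accuracy is 97.35%, and it drops by only 0.84 percentage points after quantization to 6 bit.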
Keywords: Gated Recurrent Unit (GRU); Memristor; Network model quantization; Urban sound classification
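The split between symmetric weight quantization and asymmetric activation quantization mentioned in the abstract can be illustrated with a minimal pure-Python sketch. The function names, the max-absolute-value scale, and the min/max range calibration are assumptions made for illustration; the source does not state how scales and zero points are chosen.

```python
def quantize_symmetric(values, bits):
    """Symmetric uniform quantization: zero-centered grid, no zero point stored."""
    qmax = 2 ** (bits - 1) - 1                  # 31 for 6 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def quantize_asymmetric(values, bits):
    """Asymmetric uniform quantization: the grid spans [min, max] via a zero point."""
    qmax = 2 ** bits - 1                        # 63 for 6 bits
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    codes = [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize(codes, scale, zero_point=0):
    """Map integer codes back to real values."""
    return [(c - zero_point) * scale for c in codes]


weights = [-0.62, -0.10, 0.0, 0.30, 0.62]       # hypothetical GRU weights
acts = [0.05, 0.12, 0.4, 0.93, 0.98]            # hypothetical gate activations

w_codes, w_scale = quantize_symmetric(weights, bits=6)
a_codes, a_scale, a_zp = quantize_asymmetric(acts, bits=6)
w_hat = dequantize(w_codes, w_scale)
a_hat = dequantize(a_codes, a_scale, a_zp)
```

In this sketch the weight codes are signed and need only a scale, while the activation codes are unsigned and use the full 0 to 63 range of a one-sided distribution at the cost of carrying a zero point.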
Abstract:

Objective: With the slowdown of Complementary Metal-Oxide-Semiconductor (CMOS) technology scaling and the inherent memory-computation separation of von Neumann architectures, conventional computing systems face increasing challenges in processing large-scale sequential data. Memristors provide a promising solution because of their high integration density, fast switching speed, and synaptic plasticity. Memristor crossbar arrays naturally support Vector-Matrix Multiplication (VMM) in the analog domain, enabling energy-efficient in-memory computing. As a representative recurrent neural network, the Gated Recurrent Unit (GRU) has achieved excellent performance in sequential tasks such as trajectory prediction and urban sound classification. However, conventional hardware implementations of GRU networks require frequent data transfer between memory and processing units, resulting in high energy consumption and limited throughput. Although memristor-based GRU implementations improve computational efficiency, their large parameter size and high weight precision require substantial hardware resources and reduce deployment reliability on resource-constrained memristor arrays. In addition, device non-idealities, such as conductance fluctuations, further reduce inference accuracy. Existing memristor-based GRU methods generally apply the same quantization strategy to weights and activations without considering their different hardware implementation characteristics, and they provide limited robustness against device variations. This paper addresses these issues through a hardware-algorithm co-design strategy.

Methods: This paper proposes a lightweight memristor-based GRU network model. A 1T1R (one-transistor-one-resistor) memristor crossbar array is adopted for weight mapping and analog Multiply-Accumulate (MAC) operations. Signed weights are represented by differential pairs of positive and negative conductance matrices because memristor conductance values are inherently non-negative. A linear transformation is used to map trained network weights to memristor conductance values. To account for the different hardware implementation paths of weights and activations, a device-aware fusion quantization method based on performance analysis is proposed. Symmetric quantization is applied to weights stored in the memristor array because the zero-centered quantization range eliminates zero-point storage and simplifies write-driver circuit design. In contrast, asymmetric quantization is applied to activation values computed in peripheral circuits, thereby preserving the dynamic range and reducing quantization error. To improve robustness against memristor conductance fluctuations, weight noise training is incorporated into Quantization-Aware Training (QAT). Gaussian noise with an intensity determined by the device variation parameter is injected into quantized weights during each forward pass. This strategy acts as a regularizer that guides the model toward flatter loss minima and improves tolerance to weight perturbations. During backpropagation, the straight-through estimator updates the full-precision floating-point weights, whereas noise is dynamically resampled in every forward pass.

Results and Discussions: On the public UrbanSound8K dataset, the proposed full-precision lightweight memristor-based GRU network model achieves a classification accuracy of 93.94%. After applying the device-aware fusion quantization method, the 6-bit quantized model achieves 92.68% accuracy, a decrease of only 1.26 percentage points while the weight bit width is reduced by 81.25%, from 32 to 6 bits (Table 1). At full precision, the proposed model outperforms Dilated Convolution (78.00%), LM-MFCC+GRU (92.00%), TFFS-DNN (88.74%), TFCNN (93.10%), and CL-Transformer (92.95%), all of which use 32-bit weights (Table 2). Under noisy input conditions with Signal-to-Noise Ratios (SNRs) ranging from −10 dB to 10 dB, the 6-bit quantized model exhibits robustness comparable to or better than that of the full-precision model, demonstrating the effectiveness of the proposed device-aware fusion quantization strategy (Table 3).

From the perspectives of storage, hardware resources, and device feasibility, 6-bit quantization reduces weight storage from 5.6 MB to 1.05 MB, a storage reduction of about 81.2%, while requiring only 2.8 million memristor cells under the 1T1R mapping scheme. Weight noise training also substantially improves robustness against device non-idealities. When the conductance variation reaches 14%, the classification accuracy increases from 82.97% to 91.14%. At the maximum simulated variation of 28%, the accuracy increases from 54.23% to 87.01% (Fig. 7), demonstrating improved tolerance to memristor device variations.

On a self-constructed true-false trajectory dataset, the lightweight memristor-based GRU network model achieves 97.35% accuracy at full precision and 96.51% after 6-bit quantization, a decrease of only 0.84 percentage points, outperforming the Dilated Convolution baseline (Table 4). To further verify its applicability to different sequential tasks, the lightweight memristor-based GRU network model is evaluated on lithium-ion battery State-of-Charge (SOC) estimation using a public dataset. The 6-bit quantized model achieves Root Mean Square Errors (RMSEs) of 1.48%, 0.79%, and 0.74% at 0 °C, 25 °C, and 45 °C, respectively, outperforming the existing memristor-based GRU implementation. The proposed model also achieves lower RMSEs than the comparison method at all evaluated quantization precisions of 6 bits and above (Table 5).

Conclusions: This paper presents a lightweight memristor-based GRU network model for hardware deployment. By combining device-aware fusion quantization with weight noise training integrated into QAT, the model achieves substantial memory compression while maintaining high classification accuracy and improving robustness to memristor device non-idealities. Experimental results on multiple datasets and sequential tasks demonstrate that the 6-bit quantized model preserves competitive accuracy and stable performance, providing an effective solution for deploying GRU networks on resource-constrained memristor-based edge computing platforms.
Keywords: Gated Recurrent Unit (GRU); Memristor; Network model quantization; Urban sound classification
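The differential-pair weight mapping described in the abstract can be sketched as follows. The conductance window `G_MIN`/`G_MAX`, the particular linear map, and all function names are hypothetical; the source states only that a linear transformation maps signed weights onto pairs of non-negative conductances, and this sketch models an ideal array with no device non-idealities.

```python
G_MIN = 1e-6   # assumed low-conductance state in siemens (hypothetical value)
G_MAX = 1e-4   # assumed high-conductance state in siemens (hypothetical value)


def map_weights(weights):
    """Map a signed weight matrix to (G_pos, G_neg, k) with W = (G_pos - G_neg) / k."""
    w_max = max(abs(w) for row in weights for w in row) or 1.0
    k = (G_MAX - G_MIN) / w_max                  # linear weight-to-conductance factor
    g_pos = [[G_MIN + k * max(w, 0.0) for w in row] for row in weights]
    g_neg = [[G_MIN + k * max(-w, 0.0) for w in row] for row in weights]
    return g_pos, g_neg, k


def crossbar_vmm(voltages, g_pos, g_neg, k):
    """Ideal analog VMM: column current I_j = sum_i V_i * G_ij, read differentially."""
    n_cols = len(g_pos[0])
    out = []
    for j in range(n_cols):
        i_pos = sum(v * row[j] for v, row in zip(voltages, g_pos))
        i_neg = sum(v * row[j] for v, row in zip(voltages, g_neg))
        out.append((i_pos - i_neg) / k)          # rescale current back to weight units
    return out


W = [[0.5, -0.25], [-1.0, 0.75]]                 # hypothetical 2x2 weight matrix
x = [0.2, 0.4]                                   # input vector applied as row voltages
g_pos, g_neg, k = map_weights(W)
y = crossbar_vmm(x, g_pos, g_neg, k)             # approximates the product x @ W
```

Because both arrays share the same `G_MIN` offset, the offset cancels in the differential read, so every stored conductance stays non-negative while the recovered product keeps its sign.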

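Fig. 1 GRU network structure and its core unit structure

Fig. 2 1T1R memristor crossbar array

Fig. 3 Device-aware fusion quantization method based on performance analysis

Fig. 4 Visualization example of a dog bark

Fig. 5 Example of a false trajectory

Fig. 6 Training of the GRU network model for the urban sound classification task

Fig. 7 Classification accuracy of the lightweight GRU network model under device variation

Fig. 8 Training of the GRU network model for the true-false trajectory discrimination task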
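The weight noise training behind the device-variation results (Fig. 7) can be illustrated on a toy problem. This is a minimal sketch, not the authors' training code: the two-weight linear model, the noise standard deviation `sigma * w_max`, the learning rate, and all names are assumptions. It shows only the mechanism the abstract describes, namely noise resampled on every forward pass and a straight-through estimator (STE) that updates the full-precision weights.

```python
import random


def fake_quantize(w, bits, w_max):
    """Symmetric fake quantization: snap w to the nearest of 2**bits - 1 grid levels."""
    qmax = 2 ** (bits - 1) - 1
    scale = w_max / qmax
    return max(-qmax, min(qmax, round(w / scale))) * scale


def noisy_forward(latent_w, x, bits, w_max, sigma, rng):
    """Forward pass with quantized weights plus freshly sampled Gaussian noise."""
    w_eff = [fake_quantize(w, bits, w_max) + rng.gauss(0.0, sigma * w_max)
             for w in latent_w]
    return sum(w * xi for w, xi in zip(w_eff, x))


def train_step(latent_w, x, target, bits, w_max, sigma, lr, rng):
    """One SGD step on squared error; the STE treats d(w_eff)/d(latent_w) as 1."""
    err = noisy_forward(latent_w, x, bits, w_max, sigma, rng) - target
    # The gradient w.r.t. w_eff is 2 * err * x_i; the STE copies it to the latent weight.
    return [w - lr * 2.0 * err * xi for w, xi in zip(latent_w, x)]


rng = random.Random(0)
true_w = [0.5, -0.25]                     # hypothetical target weights
latent_w = [0.0, 0.0]                     # full-precision weights kept during training
for _ in range(2000):
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    target = sum(w * xi for w, xi in zip(true_w, x))
    latent_w = train_step(latent_w, x, target, bits=6, w_max=1.0,
                          sigma=0.05, lr=0.05, rng=rng)
```

In a real QAT setup the same three pieces (fake quantization, per-pass noise, STE update of latent weights) would sit inside the GRU layers of a deep learning framework rather than in a hand-written loop.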

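Table 1 Classification performance of the lightweight GRU network model on the urban sound classification task

Quantization precision (bit)	Classification accuracy (%)
2	12.01
3	36.96
4	51.26
5	78.15
6	92.68
7	93.59
8	93.71
16	93.82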

Table 2 Performance comparison with other models on the UrbanSound8K dataset

Model	Classification accuracy (%)	Weight precision (bit)	Parameters (M)
Dilated Convolution[15]	78.00	32	-
LM-MFCC+GRU[16]	92.00	32	0.7
TFCNN[17]	93.10	32	1.6
TFFS-DNN[18]	88.74	32	-
CL-Transformer[19]	92.95	32	-
Ours (before quantization)	93.94	32	1.4
Ours (6-bit quantization)	92.68	6	1.4
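The storage and cell-count figures quoted in the abstract can be reproduced from the 1.4 M parameter count in Table 2 with simple arithmetic. The sketch below assumes 1 MB = 10^6 bytes and one differential pair (two memristor cells) per weight; both assumptions are inferred from the reported numbers rather than stated in the source.

```python
def weight_storage_mb(n_params, bits):
    """Weight storage in megabytes (1 MB = 10**6 bytes) at a given bit width."""
    return n_params * bits / 8 / 1e6


N_PARAMS = 1.4e6                               # parameter count reported in Table 2

full_mb = weight_storage_mb(N_PARAMS, 32)      # 32-bit weights
q6_mb = weight_storage_mb(N_PARAMS, 6)         # 6-bit weights
reduction = 1 - q6_mb / full_mb                # same ratio as 1 - 6/32
cells = 2 * N_PARAMS                           # assumed: one differential pair per weight
```

Under these assumptions the results are 5.6 MB, 1.05 MB, a reduction of 0.8125, and 2.8 million cells, which matches the values reported in the abstract.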

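Table 3 Classification performance of the GRU network model on the UrbanSound8K dataset with noise added at different SNR levels

SNR (dB)	Full-precision model accuracy (%)	6-bit quantized model accuracy (%)
-10	34.44	38.79
-5	69.91	70.25
0	79.52	85.24
5	84.04	87.30
10	92.11	90.96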
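Noisy test inputs at the SNR levels in Table 3 could be generated along the following lines. The source does not specify the noise type or the mixing procedure, so additive white Gaussian noise, the synthetic 440 Hz tone, and the function names here are assumptions for illustration.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise scaled so that 10*log10(P_signal / P_noise) = snr_db."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = p_signal / (10 ** (snr_db / 10))
    sigma = math.sqrt(p_noise)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR of a noisy signal relative to its clean reference."""
    p_signal = sum(c * c for c in clean) / len(clean)
    p_noise = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(p_signal / p_noise)


rng = random.Random(0)
# Hypothetical 1 s, 440 Hz tone sampled at 8 kHz, standing in for an audio clip.
clean = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
noisy = {snr: add_noise_at_snr(clean, snr, rng) for snr in (-10, -5, 0, 5, 10)}
```

Calling `measured_snr_db(clean, noisy[0])` should return a value close to 0 dB; the small deviation comes from sampling a finite number of noise values.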

Table 4 Classification performance of the lightweight GRU network model and the Dilated Convolution[15] model on the true-false trajectory discrimination task

Quantization precision (bit)	Ours accuracy (%)	Dilated Convolution[15] accuracy (%)
2	63.44	65.53
3	63.72	75.53
4	68.28	90.14
5	91.40	90.65
6	96.51	90.56
7	97.16	90.65
8	97.26	90.60
16	97.30	90.98

Table 5 SOC estimation performance, RMSE (%), of different models on FUDS at various ambient temperatures

Model	0 °C	25 °C	45 °C
Memristor-based GRU[20]	2.18	1.36	1.23
Ours (full precision)	1.26	0.58	0.56
Ours (16 bit)	1.57	0.62	0.52
Ours (8 bit)	1.39	0.58	0.52
Ours (7 bit)	1.55	0.73	0.75
Ours (6 bit)	1.48	0.79	0.74
Ours (5 bit)	2.64	1.86	1.76
Ours (4 bit)	13.33	10.50	11.75
Ours (3 bit)	22.62	23.37	24.29
Ours (2 bit)	22.24	22.81	23.82

References (20)

[1]	STRUKOV D B, SNIDER G S, STEWART D R, et al. The missing memristor found[J]. Nature, 2008, 453(7191): 80–83. doi: 10.1038/nature06932.
[2]	LECUN Y, BENGIO Y, and HINTON G. Deep learning[J]. Nature, 2015, 521(7553): 436–444. doi: 10.1038/nature14539.
[3]	LIU Songzuo, WANG Qian, LI Lei, et al. Gated recurrent unit network of particle swarm optimization for drifting buoy trajectory prediction[J]. Journal of Electronics & Information Technology, 2024, 46(8): 3295–3304. doi: 10.11999/JEIT230945.
[4]	WANG Jiayang, JI Xiaoyue, DONG Zhekang, et al. Circuit design of memristor-based GRU and its applications in SOC estimation[C]. 2023 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, USA, 2023: 1–5. doi: 10.1109/ICCE56470.2023.10043585.
[5]	TONG Peiwen, XU Hui, SUN Yi, et al. Lightweight and highly robust memristor-based hybrid neural networks for electroencephalogram signal processing[J]. Chinese Physics B, 2023, 32(7): 078505. doi: 10.1088/1674-1056/ac9cbc.
[6]	LI Yuankun, WANG Ze, ZHANG Qingtian, et al. NAS4CIM: Tailored neural network architecture search for RRAM-based compute-in-memory chips[J]. Journal of Electronics & Information Technology, 2025, 47(12): 4948–4958. doi: 10.11999/JEIT250978.
[7]	LIN Hairong, DUAN Chenxing, DENG Xiaoheng, et al. Dual-memristor brain-like chaotic neural network and its application in IoMT data privacy protection[J]. Journal of Electronics & Information Technology, 2025, 47(7): 2194–2210. doi: 10.11999/JEIT241133.
[8]	BALASKAS K, KARATZAS A, SAD C, et al. Hardware-aware DNN compression via diverse pruning and mixed-precision quantization[J]. IEEE Transactions on Emerging Topics in Computing, 2024, 12(4): 1079–1092. doi: 10.1109/TETC.2023.3346944.
[9]	PERRIN M, GUICQUERO W, PAILLE B, et al. Hardware-aware Bayesian neural architecture search of quantized CNNs[J]. IEEE Embedded Systems Letters, 2025, 17(1): 42–45. doi: 10.1109/LES.2024.3434379.
[10]	CHEN Junren, WU Huaqiang, GAO Bin, et al. Optimization strategy for accelerating multi-bit resistive weight programming on the RRAM array[C]. 2019 IEEE International Workshop on Future Computing (IWOFC), Hangzhou, China, 2019: 1–3. doi: 10.1109/IWOFC48002.2019.9078447.
[11]	HONG Haiqiao, DU Zhiyuan, JIANG Mingrui, et al. Memristor-based adaptive analog-to-digital conversion for efficient and accurate compute-in-memory[J]. Nature Communications, 2025, 16(1): 9749. doi: 10.1038/s41467-025-65233-w.
[12]	LI Can, WANG Zhongrui, RAO Mingyi, et al. Long short-term memory networks in memristor crossbar arrays[J]. Nature Machine Intelligence, 2019, 1(1): 49–57. doi: 10.1038/s42256-018-0001-4.
[13]	HUANG Lixing, YU Hongqi, CHEN Changlin, et al. A training strategy for improving the robustness of memristor-based binarized convolutional neural networks[J]. Semiconductor Science and Technology, 2022, 37(1): 015013. doi: 10.1088/1361-6641/ac31e3.
[14]	SUN Yi, XU Hui, WANG Chao, et al. A Ti/AlO_x/TaO_x/Pt analog synapse for memristive neural network[J]. IEEE Electron Device Letters, 2018, 39(9): 1298–1301. doi: 10.1109/LED.2018.2860053.
[15]	CHEN Yan, GUO Qian, LIANG Xinyan, et al. Environmental sound classification with dilated convolutions[J]. Applied Acoustics, 2019, 148: 123–132. doi: 10.1016/j.apacoust.2018.12.019.
[16]	PENG Ning, CHEN Aibin, ZHOU Guoxiong, et al. Environment sound classification based on visual multi-feature fusion and GRU-AWS[J]. IEEE Access, 2020, 8: 191100–191114. doi: 10.1109/ACCESS.2020.3032226.
[17]	MU Wenjie, YIN Bo, HUANG Xianqing, et al. Environmental sound classification using temporal-frequency attention based convolutional neural network[J]. Scientific Reports, 2021, 11(1): 21552. doi: 10.1038/s41598-021-01045-4.
[18]	WU Bo and ZHANG Xiaoping. Environmental sound classification via time–frequency attention and framewise self-attention-based deep neural networks[J]. IEEE Internet of Things Journal, 2022, 9(5): 3416–3428. doi: 10.1109/JIOT.2021.3098464.
[19]	CHEN Xu, WANG Mei, KAN Ruixiang, et al. Improved patch-mix transformer and contrastive learning method for sound classification in noisy environments[J]. Applied Sciences, 2024, 14(21): 9711. doi: 10.3390/app14219711.
[20]	CHEN Yanan, LUO Wei, CARTER M, et al. Organic electrode for non-aqueous potassium-ion batteries[J]. Nano Energy, 2015, 18: 205–211. doi: 10.1016/j.nanoen.2015.10.015.