Citation: GONG Yu, WANG Liping, WANG You, LIU Weiqiang. Application and Research Progress of Approximate Computing as a New Computing Paradigm in AI Acceleration Systems[J]. Journal of Electronics & Information Technology, 2023, 45(9): 3098–3108. doi: 10.11999/JEIT230352
[1] KRIZHEVSKY A, SUTSKEVER I, and HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84–90. doi: 10.1145/3065386
[2] SIMONYAN K and ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C]. Proceedings of the 3rd International Conference on Learning Representations, San Diego, USA, 2015.
[3] LIN Zichuan, LI Junyou, SHI Jianing, et al. JueWu-MC: Playing Minecraft with sample-efficient hierarchical reinforcement learning[C]. Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, Vienna, Austria, 2022.
[4] LI Zewen, LIU Fan, YANG Wenjie, et al. A survey of convolutional neural networks: Analysis, applications, and prospects[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(12): 6999–7019. doi: 10.1109/TNNLS.2021.3084827
[5] REYNOLDS L and MCDONELL K. Prompt programming for large language models: Beyond the few-shot paradigm[C]. Proceedings of 2021 CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, 2021.
[6] GAWLIKOWSKI J, TASSI C R N, ALI M, et al. A survey of uncertainty in deep neural networks[J]. arXiv: 2107.03342, 2021.
[7] LECUN Y. 1.1 Deep learning hardware: Past, present, and future[C]. Proceedings of 2019 IEEE International Solid-State Circuits Conference, San Francisco, USA, 2019: 536–543.
[8] MENGHANI G. Efficient deep learning: A survey on making deep learning models smaller, faster, and better[J]. ACM Computing Surveys, 2023, 55(12): 259. doi: 10.1145/3578938
[9] CAI Hao, BIAN Zhongjian, HOU Yaoru, et al. 33.4 A 28nm 2Mb STT-MRAM computing-in-memory macro with a refined bit-cell and 22.4–41.5 TOPS/W for AI inference[C]. Proceedings of 2023 IEEE International Solid-State Circuits Conference, San Francisco, USA, 2023.
[10] GUO An, SI Xin, CHEN Xi, et al. A 28nm 64-kb 31.6-TFLOPS/W digital-domain floating-point-computing-unit and double-bit 6T-SRAM computing-in-memory macro for floating-point CNNs[C]. Proceedings of 2023 IEEE International Solid-State Circuits Conference, San Francisco, USA, 2023.
[11] WANG Shaowei, XIE Guangjun, HAN Jie, et al. Highly accurate division and square root circuits by exploiting signal correlation in stochastic computing[J]. International Journal of Circuit Theory and Applications, 2022, 50(4): 1375–1385. doi: 10.1002/cta.3219
[12] XU Wenbing, XIE Guangjun, WANG Shaowei, et al. A stochastic computing architecture for local contrast and mean image thresholding algorithm[J]. International Journal of Circuit Theory and Applications, 2022, 50(9): 3279–3291. doi: 10.1002/cta.3320
[13] LIU Shanshan, TANG Xiaochen, NIKNIA F, et al. Stochastic dividers for low latency neural networks[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2021, 68(10): 4102–4115. doi: 10.1109/TCSI.2021.3103926
[14] LIU Weiqiang and LOMBARDI F. Approximate Computing[M]. Cham: Springer, 2022.
[15] LIU Weiqiang, LOMBARDI F, and SCHULTE M. Approximate computing: From circuits to applications[J]. Proceedings of the IEEE, 2020, 108(12): 2103–2107. doi: 10.1109/JPROC.2020.3033361
[16] REDA S and SHAFIQUE M. Approximate Circuits[M]. Cham: Springer, 2019.
[17] GAO Yue. Approximate design and hardware acceleration of Softmax function for deep neural networks[D]. [Master dissertation], Nanjing University of Aeronautics and Astronautics, 2021. (in Chinese)
[18] YAN Chenggang, ZHAO Xuan, XU Chenyu, et al. Design of high precision low power approximate floating-point multiplier based on partial product probability analysis[J]. Journal of Electronics & Information Technology, 2023, 45(1): 87–95. doi: 10.11999/JEIT211485 (in Chinese)
[19] HUANG Lepeng. Design of approximate computing module for low power keyword recognition[D]. [Master dissertation], Southeast University, 2021. (in Chinese)
[20] ZHU Wentao. Design and implementation of a reconfigurable architecture for low-power keyword spotting in multiple noise scenarios[D]. [Master dissertation], Southeast University, 2021. (in Chinese)
[21] LI Yan. Design of approximate addition unit for binarized weight network[D]. [Master dissertation], Southeast University, 2021. (in Chinese)
[22] ZHANG Ziji. Research on energy-efficient circuit design technology based on approximate computing[D]. [Ph.D. dissertation], University of Electronic Science and Technology of China, 2021. (in Chinese)
[23] PEI Haoran. Design of adaptive filter based on approximate computing[D]. [Master dissertation], University of Electronic Science and Technology of China, 2021. (in Chinese)
[24] XU Chengwen. Cost-efficient and quality assured approximate computing framework using neural network[D]. [Master dissertation], Shanghai Jiao Tong University, 2019. (in Chinese)
[25] WU Xiangyu. A novel quality trade-offs method for approximate acceleration by iterative training[D]. [Master dissertation], Shanghai Jiao Tong University, 2017. (in Chinese)
[26] ZHAO Yue. Design and implementation of low power multiplier based on approximate computing[D]. [Master dissertation], Shanghai Jiao Tong University, 2020. (in Chinese)
[27] JI Yu, ZHANG Youhui, and ZHENG Weimin. Approximate computing method based on memristors[J]. Journal of Tsinghua University: Science and Technology, 2021, 61(6): 610–617. doi: 10.16511/j.cnki.qhdxxb.2020.22.027 (in Chinese)
[28] WANG Zhihui. An energy efficient JPEG encoder with approximate computing paradigm[D]. [Master dissertation], Tsinghua University, 2018. (in Chinese)
[29] ZHANG Shichang, WANG Yujie, XIAO Hang, et al. Binary-weight neural network chip supporting CNN and LSTM[J]. Chinese High Technology Letters, 2021, 31(2): 122–128. doi: 10.3772/j.issn.1002-0470.2021.02.002 (in Chinese)
[30] LU Weina, HU Yu, YE Jing, et al. Throughput-oriented automatic design of FPGA accelerator for convolutional neural networks[J]. Journal of Computer-Aided Design & Computer Graphics, 2018, 30(11): 2164–2173. doi: 10.3724/SP.J.1089.2018.17039 (in Chinese)
[31] ZHU Xinzhong, CHENG Lifu, WU Youyu, et al. Error model based approximate computing design for binarized weight neural network system[J]. Aerospace Shanghai (Chinese & English), 2021, 38(4): 25–30. doi: 10.19328/j.cnki.2096-8655.2021.04.004 (in Chinese)
[32] LIANG Tailin, GLOSSNER J, WANG Lei, et al. Pruning and quantization for deep neural network acceleration: A survey[J]. Neurocomputing, 2021, 461: 370–403. doi: 10.1016/j.neucom.2021.07.045
[33] BASKIN C, LISS N, SCHWARTZ E, et al. UNIQ: Uniform noise injection for non-uniform quantization of neural networks[J]. ACM Transactions on Computer Systems, 2019, 37(1–4): 4. doi: 10.1145/3444943
[34] BANNER R, NAHSHAN Y, and SOUDRY D. Post training 4-bit quantization of convolutional networks for rapid-deployment[C]. Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, Canada, 2019.
[35] VOGEL S, SPRINGER J, GUNTORO A, et al. Self-supervised quantization of pre-trained neural networks for multiplierless acceleration[C]. Proceedings of 2019 Design, Automation & Test in Europe Conference & Exhibition, Florence, Italy, 2019: 1094–1099.
[36] GYSEL P, PIMENTEL J, MOTAMEDI M, et al. Ristretto: A framework for empirical study of resource-efficient inference in convolutional neural networks[J]. IEEE Transactions on Neural Networks and Learning Systems, 2018, 29(11): 5784–5789. doi: 10.1109/TNNLS.2018.2808319
[37] NAGEL M, FOURNARAKIS M, AMJAD R A, et al. A white paper on neural network quantization[J]. arXiv: 2106.08295, 2021.
[38] JUNG S, SON C, LEE S, et al. Learning to quantize deep networks by optimizing quantization intervals with task loss[C]. Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 4345–4354.
[39] CHOI J, VENKATARAMANI S, SRINIVASAN V, et al. Accurate and efficient 2-bit quantized neural networks[C]. Proceedings of Machine Learning and Systems 2019, Stanford, USA, 2019.
[40] ZHOU Shuchang, WU Yuxin, NI Zekun, et al. DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients[J]. arXiv: 1606.06160, 2018.
[41] RASTEGARI M, ORDONEZ V, REDMON J, et al. XNOR-Net: ImageNet classification using binary convolutional neural networks[C]. Proceedings of the 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 525–542.
[42] ANDRI R, CAVIGELLI L, ROSSI D, et al. YodaNN: An architecture for ultralow power binary-weight CNN acceleration[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2018, 37(1): 48–60. doi: 10.1109/TCAD.2017.2682138
[43] GONG Yu, CAI Hao, WU Haige, et al. Quality driven systematic approximation for binary-weight neural network deployment[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2022, 69(7): 2928–2940. doi: 10.1109/TCSI.2022.3164170
[44] SHAFIQUE M, AHMAD W, HAFIZ R, et al. A low latency generic accuracy configurable adder[C]. Proceedings of the 52nd Annual Design Automation Conference, San Francisco, USA, 2015: 86.
[45] HANIF M A, HAFIZ R, HASAN O, et al. QuAd: Design and analysis of quality-area optimal low-latency approximate adders[C]. Proceedings of the 54th Annual Design Automation Conference, Austin, USA, 2017: 42.
[46] ZHU Ning, GOH W L, ZHANG W, et al. Design of low-power high-speed truncation-error-tolerant adder and its application in digital signal processing[J]. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2010, 18(8): 1225–1229. doi: 10.1109/TVLSI.2009.2020591
[47] ZHU Ning, GOH W L, and YEO K S. An enhanced low-power high-speed adder for error-tolerant application[C]. Proceedings of the 2009 12th International Symposium on Integrated Circuits, Singapore, 2009: 69–72.
[48] GUPTA V, MOHAPATRA D, RAGHUNATHAN A, et al. Low-power digital signal processing using approximate adders[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2013, 32(1): 124–137. doi: 10.1109/TCAD.2012.2217962
[49] CAMUS V, SCHLACHTER J, and ENZ C. Energy-efficient inexact speculative adder with high performance and accuracy control[C]. Proceedings of 2015 IEEE International Symposium on Circuits and Systems, Lisbon, Portugal, 2015: 45–48.
[50] CAMUS V, SCHLACHTER J, and ENZ C. A low-power carry cut-back approximate adder with fixed-point implementation and floating-point precision[C]. Proceedings of the 53rd Annual Design Automation Conference, Austin, USA, 2016: 127.
[51] VERMA A K, BRISK P, and IENNE P. Variable latency speculative addition: A new paradigm for arithmetic circuit design[C]. Proceedings of 2008 Design, Automation and Test in Europe, Munich, Germany, 2008: 1250–1255.
[52] MORGENSHTEIN A, YUZHANINOV V, KOVSHILOVSKY A, et al. Full-swing gate diffusion input logic: Case-study of low-power CLA adder design[J]. Integration, 2014, 47(1): 62–70. doi: 10.1016/j.vlsi.2013.04.002
[53] HAN Jie and ORSHANSKY M. Approximate computing: An emerging paradigm for energy-efficient design[C]. Proceedings of 2013 18th IEEE European Test Symposium, Avignon, France, 2013: 1–6.
[54] MAHDIANI H R, AHMADI A, FAKHRAIE S M, et al. Bio-inspired imprecise computational blocks for efficient VLSI implementation of soft-computing applications[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2010, 57(4): 850–862. doi: 10.1109/TCSI.2009.2027626
[55] JOHN V, SAM S, RADHA S, et al. Design of a power-efficient Kogge–Stone adder by exploring new OR gate in 45nm CMOS process[J]. Circuit World, 2020, 46(4): 257–269. doi: 10.1108/CW-12-2018-0104
[56] LIU Bo, WANG Ziyu, WANG Xuetao, et al. An efficient BCNN deployment method using quality-aware approximate computing[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2022, 41(11): 4217–4228. doi: 10.1109/TCAD.2022.3197509
[57] LIU Weiqiang, QIAN Liangyu, WANG Chenghua, et al. Design of approximate radix-4 Booth multipliers for error-tolerant computing[J]. IEEE Transactions on Computers, 2017, 66(8): 1435–1441. doi: 10.1109/TC.2017.2672976
[58] BORO B, REDDY K M, KUMAR Y B N, et al. Approximate radix-8 Booth multiplier for low power and high speed applications[J]. Microelectronics Journal, 2020, 101: 104816. doi: 10.1016/j.mejo.2020.104816
[59] WARIS H, WANG Chenghua, and LIU Weiqiang. Hybrid low radix encoding-based approximate Booth multipliers[J]. IEEE Transactions on Circuits and Systems II: Express Briefs, 2020, 67(12): 3367–3371. doi: 10.1109/TCSII.2020.2975094
[60] YIN Shouyi, OUYANG Peng, ZHENG Shixuan, et al. A 141 μW, 2.46 pJ/neuron binarized convolutional neural network based self-learning speech recognition processor in 28nm CMOS[C]. Proceedings of 2018 IEEE Symposium on VLSI Circuits, Honolulu, USA, 2018: 139–140.
[61] ZHAO Yue, LI Tong, DONG Feng, et al. A new approximate multiplier design for digital signal processing[C]. Proceedings of 2019 IEEE 13th International Conference on ASIC, Chongqing, China, 2019: 1–4.
[62] MITCHELL J N. Computer multiplication and division using binary logarithms[J]. IRE Transactions on Electronic Computers, 1962, EC-11(4): 512–517. doi: 10.1109/TEC.1962.5219391
[63] YIN Peipei, WANG Chenghua, WARIS H, et al. Design and analysis of energy-efficient dynamic range approximate logarithmic multipliers for machine learning[J]. IEEE Transactions on Sustainable Computing, 2021, 6(4): 612–625. doi: 10.1109/TSUSC.2020.3004980
[64] LIU Weiqiang, XU Jiahua, WANG Danye, et al. Design and evaluation of approximate logarithmic multipliers for low power error-tolerant applications[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2018, 65(9): 2856–2868. doi: 10.1109/TCSI.2018.2792902
[65] SAADAT H, JAVAID H, IGNJATOVIC A, et al. REALM: Reduced-error approximate log-based integer multiplier[C]. Proceedings of 2020 Design, Automation & Test in Europe Conference & Exhibition, Grenoble, France, 2020: 1366–1371.
[66] MRAZEK V, HRBACEK R, VASICEK Z, et al. EvoApprox8b: Library of approximate adders and multipliers for circuit design and benchmarking of approximation methods[C]. Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, Lausanne, Switzerland, 2017: 258–261.
[67] VENKATARAMANI S, RANJAN A, ROY K, et al. AxNN: Energy-efficient neuromorphic systems using approximate computing[C]. Proceedings of 2014 IEEE/ACM International Symposium on Low Power Electronics and Design, La Jolla, USA, 2014: 27–32.
[68] GIRALDO J S P and VERHELST M. Laika: A 5uW programmable LSTM accelerator for always-on keyword spotting in 65nm CMOS[C]. Proceedings of 2018 IEEE 44th European Solid State Circuits Conference, Dresden, Germany, 2018: 166–169.
[69] PRICE M, GLASS J, and CHANDRAKASAN A P. 14.4 A scalable speech recognizer with deep-neural-network acoustic models and voice-activated power gating[C]. Proceedings of 2017 IEEE International Solid-State Circuits Conference, San Francisco, USA, 2017: 244–245.
[70] YIN Shouyi, OUYANG Peng, TANG Shibin, et al. A high energy efficient reconfigurable hybrid neural network processor for deep learning applications[J]. IEEE Journal of Solid-State Circuits, 2018, 53(4): 968–982. doi: 10.1109/JSSC.2017.2778281
[71] YIN Shouyi, OUYANG Peng, YANG Jianxun, et al. An energy-efficient reconfigurable processor for binary- and ternary-weight neural networks with flexible data bit width[J]. IEEE Journal of Solid-State Circuits, 2019, 54(4): 1120–1136. doi: 10.1109/JSSC.2018.2881913
[72] LU Wenyan, YAN Guihai, and LI Xiaowei. AdaFlow: Aggressive convolutional neural networks approximation by leveraging the input variability[J]. Journal of Low Power Electronics, 2018, 14(4): 481–495. doi: 10.1166/jolpe.2018.1581
[73] WANG Ying, HE Yintao, CHENG Long, et al. A fast precision tuning solution for always-on DNN accelerators[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2022, 41(5): 1236–1248. doi: 10.1109/TCAD.2021.3089667
[74] KIM Y D, JEONG W, JUNG L, et al. 2.4 A 7nm high-performance and energy-efficient mobile application processor with tri-cluster CPUs and a sparsity-aware NPU[C]. Proceedings of 2020 IEEE International Solid-State Circuits Conference, San Francisco, USA, 2020: 48–50.
[75] MAZAHIR S, HASAN O, HAFIZ R, et al. Probabilistic error modeling for approximate adders[J]. IEEE Transactions on Computers, 2017, 66(3): 515–530. doi: 10.1109/TC.2016.2605382
[76] AYUB M K, HASAN O, and SHAFIQUE M. Statistical error analysis for low power approximate adders[C]. Proceedings of the 54th ACM/EDAC/IEEE Design Automation Conference, Austin, USA, 2017: 1–6.
[77] ZHU Yiying, LIU Weiqiang, HAN Jie, et al. A probabilistic error model and framework for approximate Booth multipliers[C]. Proceedings of 2018 IEEE/ACM International Symposium on Nanoscale Architectures, Athens, Greece, 2018: 1–6.
[78] LIU Chang, YANG Xinghua, QIAO Fei, et al. Design methodology for approximate accumulator based on statistical error model[C]. Proceedings of the 20th Asia and South Pacific Design Automation Conference, Chiba, Japan, 2015: 237–242.
[79] VENKATESAN R, AGARWAL A, ROY K, et al. MACACO: Modeling and analysis of circuits for approximate computing[C]. Proceedings of 2011 IEEE/ACM International Conference on Computer-Aided Design, San Jose, USA, 2011: 667–673.
[80] LIU Zheyu, LI Guihong, QIAO Fei, et al. Concrete: A per-layer configurable framework for evaluating DNN with approximate operators[C]. Proceedings of 2019 IEEE International Conference on Acoustics, Speech and Signal Processing, Brighton, UK, 2019: 1552–1556.
[81] FRENO B A and CARLBERG K T. Machine-learning error models for approximate solutions to parameterized systems of nonlinear equations[J]. Computer Methods in Applied Mechanics and Engineering, 2019, 348: 250–296. doi: 10.1016/j.cma.2019.01.024
[82] JIANG Weiwen, ZHANG Xinyi, SHA E H M, et al. Accuracy vs. efficiency: Achieving both through FPGA-implementation aware neural architecture search[C]. Proceedings of 2019 56th ACM/IEEE Design Automation Conference, Las Vegas, USA, 2019: 1–6.
[83] MRAZEK V, HANIF M A, VASICEK Z, et al. autoAx: An automatic design space exploration and circuit building methodology utilizing libraries of approximate components[C]. Proceedings of 2019 56th ACM/IEEE Design Automation Conference, Las Vegas, USA, 2019: 1–6.
[84] ULLAH S, SAHOO S S, and KUMAR A. CLAppED: A design framework for implementing cross-layer approximation in FPGA-based embedded systems[C]. Proceedings of 2021 58th ACM/IEEE Design Automation Conference, San Francisco, USA, 2021: 475–480.
[85] YU Ye, LI Yingmin, CHE Shuai, et al. Software-defined design space exploration for an efficient DNN accelerator architecture[J]. IEEE Transactions on Computers, 2021, 70(1): 45–56. doi: 10.1109/TC.2020.2983694
[86] MEI Linyan, HOUSHMAND P, JAIN V, et al. ZigZag: Enlarging joint architecture-mapping design space exploration for DNN accelerators[J]. IEEE Transactions on Computers, 2021, 70(8): 1160–1174. doi: 10.1109/TC.2021.3059962
[87] LI Haitong, BHARGAV M, WHATMOUGH P N, et al. On-chip memory technology design space explorations for mobile deep neural network accelerators[C]. Proceedings of 2019 56th ACM/IEEE Design Automation Conference, Las Vegas, USA, 2019: 1–6.
[88] JIANG Weiwen, LOU Qiuwen, YAN Zheyu, et al. Device-circuit-architecture co-exploration for computing-in-memory neural accelerators[J]. IEEE Transactions on Computers, 2021, 70(4): 595–605. doi: 10.1109/TC.2020.2991575
[89] VENKATESAN R, SHAO Y S, WANG Miaorong, et al. MAGNet: A modular accelerator generator for neural networks[C]. Proceedings of 2019 IEEE/ACM International Conference on Computer-Aided Design, Westminster, USA, 2019: 1–8.
[90] LIN Yujun, YANG Mengtian, and HAN Song. NAAS: Neural accelerator architecture search[C]. Proceedings of 2021 58th ACM/IEEE Design Automation Conference, San Francisco, USA, 2021: 1051–1056.
[91] VENIERIS S I and BOUGANIS C S. fpgaConvNet: Mapping regular and irregular convolutional neural networks on FPGAs[J]. IEEE Transactions on Neural Networks and Learning Systems, 2019, 30(2): 326–342. doi: 10.1109/TNNLS.2018.2844093
[92] REAGEN B, HERNÁNDEZ-LOBATO J M, ADOLF R, et al. A case for efficient accelerator design space exploration via Bayesian optimization[C]. Proceedings of 2017 IEEE/ACM International Symposium on Low Power Electronics and Design, Taipei, China, 2017: 1–6.