Volume 45, Issue 9, Sep. 2023
Citation: ZHANG Meng, ZHANG Yu, ZHANG Jingwei, CAO Xinye, LI He. NN-EdgeBuilder: High-performance Neural Network Inference Framework for Edge Devices[J]. Journal of Electronics & Information Technology, 2023, 45(9): 3132-3140. doi: 10.11999/JEIT230325

NN-EdgeBuilder: High-performance Neural Network Inference Framework for Edge Devices

doi: 10.11999/JEIT230325
Funds:  The Research and Development Program of Guangdong Province (2021B1101270006), The Natural Science Foundation of Jiangsu Province (BK20201145)
  • Received Date: 2023-04-26
  • Revised Date: 2023-08-23
  • Available Online: 2023-08-28
  • Publish Date: 2023-09-27
Rapidly developing neural networks have achieved great success in fields such as object detection. An important current research direction is to deploy network models efficiently and automatically on various edge devices through a neural network inference framework. To this end, this paper designs NN-EdgeBuilder, a neural network inference framework for edge FPGAs. NN-EdgeBuilder fully explores the parallelism factors and quantization bit widths of each network layer through a design space exploration algorithm based on multi-objective Bayesian optimization, and then invokes high-performance, general-purpose hardware acceleration operators to generate low-latency, low-power neural network accelerators. In this study, NN-EdgeBuilder is used to deploy the UltraNet and VGG networks on an Ultra96-V2 FPGA. The generated UltraNet-P1 accelerator reduces power consumption by 17.71% and improves energy efficiency by 21.54% compared with the state-of-the-art custom UltraNet accelerator. Compared with mainstream inference frameworks, the VGG accelerator generated by NN-EdgeBuilder improves energy efficiency by 4.40 times and Digital Signal Processor (DSP) computing efficiency by 50.65%.
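
To make the design space exploration step concrete, below is a minimal Python sketch of the kind of multi-objective Bayesian optimization the abstract describes: each design point assigns a parallelism factor and a quantization bit width to every layer, and latency and power are minimized jointly. This is not NN-EdgeBuilder's actual implementation. The cost models estimate_latency and estimate_power are hypothetical placeholders, and ParEGO-style random scalarization with expected improvement stands in for whatever acquisition function the framework actually uses.

import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
N_LAYERS = 4
PARALLELISM = [1, 2, 4, 8, 16]   # candidate per-layer parallelism (unroll) factors
BIT_WIDTHS = [4, 6, 8]           # candidate per-layer quantization bit widths

def sample_design():
    # One design point: (parallelism, bit width) for every layer, flattened.
    return np.array([[rng.choice(PARALLELISM), rng.choice(BIT_WIDTHS)]
                     for _ in range(N_LAYERS)]).ravel()

def estimate_latency(x):
    # Hypothetical analytical model: more parallelism -> lower latency.
    p = x[0::2]
    return float(np.sum(1000.0 / p))

def estimate_power(x):
    # Hypothetical model: wider data paths and more parallelism cost power.
    p, w = x[0::2], x[1::2]
    return float(np.sum(0.1 * p * w))

# Initial random designs and their two objective values (latency, power).
X = np.array([sample_design() for _ in range(10)])
Y = np.array([[estimate_latency(x), estimate_power(x)] for x in X])

for it in range(30):
    # ParEGO-style random scalarization: collapse the two normalized
    # objectives into one scalar target for a Gaussian-process surrogate.
    weights = rng.dirichlet([1.0, 1.0])
    y = (Y / Y.max(axis=0)) @ weights
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)

    # Pick the candidate with the highest expected improvement (minimization).
    cand = np.array([sample_design() for _ in range(256)])
    mu, sd = gp.predict(cand, return_std=True)
    imp = y.min() - mu
    z = imp / np.maximum(sd, 1e-9)
    ei = imp * norm.cdf(z) + sd * norm.pdf(z)
    x_next = cand[np.argmax(ei)]

    X = np.vstack([X, x_next])
    Y = np.vstack([Y, [estimate_latency(x_next), estimate_power(x_next)]])

# Report the Pareto front of (latency, power) among all evaluated designs.
dominates = lambda a, b: np.all(a <= b) and np.any(a < b)
pareto = [i for i in range(len(Y))
          if not any(dominates(Y[j], Y[i]) for j in range(len(Y)))]
print("Pareto-optimal (latency, power):", [tuple(Y[i]) for i in pareto])

In a real flow, the two estimate_* placeholders would be replaced by the framework's performance and power models for the target FPGA, and each Pareto-optimal design point would be handed to the hardware operator generators described in the abstract.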