Volume 45, Issue 9, Sep. 2023
Citation: ZHANG Meng, ZHANG Yu, ZHANG Jingwei, CAO Xinye, LI He. NN-EdgeBuilder: High-performance Neural Network Inference Framework for Edge Devices[J]. Journal of Electronics & Information Technology, 2023, 45(9): 3132-3140. doi: 10.11999/JEIT230325

NN-EdgeBuilder: High-performance Neural Network Inference Framework for Edge Devices

doi: 10.11999/JEIT230325
Funds:  The Research and Development Program of Guangdong Province (2021B1101270006), The Natural Science Foundation of Jiangsu Province (BK20201145)
  • Received Date: 2023-04-26
  • Revised Date: 2023-08-23
  • Available Online: 2023-08-28
  • Publish Date: 2023-09-27
Rapidly developing neural networks have achieved great success in fields such as object detection. An important current research direction is to deploy network models efficiently and automatically on various edge devices through a neural network inference framework. To this end, this paper designs NN-EdgeBuilder, a neural network inference framework for edge FPGAs. NN-EdgeBuilder fully explores the parallelism factors and quantization bit widths of each network layer through a design space exploration algorithm based on multi-objective Bayesian optimization, and then invokes high-performance, general-purpose hardware acceleration operators to generate low-latency, low-power neural network accelerators. In this study, NN-EdgeBuilder is used to deploy the UltraNet and VGG networks on an Ultra96-V2 FPGA. The generated UltraNet-P1 accelerator reduces power consumption by 17.71% and improves energy efficiency by 21.54% compared with the state-of-the-art custom UltraNet accelerator. Compared with mainstream inference frameworks, the VGG accelerator generated by NN-EdgeBuilder improves energy efficiency by 4.40 times and Digital Signal Processor (DSP) computing efficiency by 50.65%.
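
To make the design space exploration step concrete, below is a minimal Python sketch of the kind of multi-objective Bayesian optimization the abstract describes: each design point assigns a parallelism factor and a quantization bit width to every layer, and latency and power are minimized jointly. This is not NN-EdgeBuilder's actual implementation. The cost models estimate_latency and estimate_power are hypothetical placeholders, and ParEGO-style random scalarization with expected improvement stands in for whatever acquisition function the framework actually uses.

import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
N_LAYERS = 4
PARALLELISM = [1, 2, 4, 8, 16]   # candidate per-layer parallelism (unroll) factors
BIT_WIDTHS = [4, 6, 8]           # candidate per-layer quantization bit widths

def sample_design():
    # One design point: (parallelism, bit width) for every layer, flattened.
    return np.array([[rng.choice(PARALLELISM), rng.choice(BIT_WIDTHS)]
                     for _ in range(N_LAYERS)]).ravel()

def estimate_latency(x):
    # Hypothetical analytical model: more parallelism -> lower latency.
    p = x[0::2]
    return float(np.sum(1000.0 / p))

def estimate_power(x):
    # Hypothetical model: wider data paths and more parallelism cost power.
    p, w = x[0::2], x[1::2]
    return float(np.sum(0.1 * p * w))

# Initial random designs and their two objective values (latency, power).
X = np.array([sample_design() for _ in range(10)])
Y = np.array([[estimate_latency(x), estimate_power(x)] for x in X])

for it in range(30):
    # ParEGO-style random scalarization: collapse the two normalized
    # objectives into one scalar target for a Gaussian-process surrogate.
    weights = rng.dirichlet([1.0, 1.0])
    y = (Y / Y.max(axis=0)) @ weights
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)

    # Pick the candidate with the highest expected improvement (minimization).
    cand = np.array([sample_design() for _ in range(256)])
    mu, sd = gp.predict(cand, return_std=True)
    imp = y.min() - mu
    z = imp / np.maximum(sd, 1e-9)
    ei = imp * norm.cdf(z) + sd * norm.pdf(z)
    x_next = cand[np.argmax(ei)]

    X = np.vstack([X, x_next])
    Y = np.vstack([Y, [estimate_latency(x_next), estimate_power(x_next)]])

# Report the Pareto front of (latency, power) among all evaluated designs.
dominates = lambda a, b: np.all(a <= b) and np.any(a < b)
pareto = [i for i in range(len(Y))
          if not any(dominates(Y[j], Y[i]) for j in range(len(Y)))]
print("Pareto-optimal (latency, power):", [tuple(Y[i]) for i in pareto])

In a real flow, the two estimate_* placeholders would be replaced by the framework's performance and power models for the target FPGA, and each Pareto-optimal design point would be handed to the hardware operator generators described in the abstract.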