Citation: LI Tianyang, ZHANG Fan, WANG Song, CAO Wei, CHEN Li. FPGA-Based Unified Accelerator for Convolutional Neural Network and Vision Transformer[J]. Journal of Electronics & Information Technology, 2024, 46(6): 2663-2672. doi: 10.11999/JEIT230713
[1] SIMONYAN K and ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C]. 3rd International Conference on Learning Representations, San Diego, USA, 2015.
[2] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770–778.
[3] SZEGEDY C, LIU Wei, JIA Yangqing, et al. Going deeper with convolutions[C]. 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 1–9.
[4] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]. The 31st International Conference on Neural Information Processing Systems, Long Beach, USA, 2017: 6000–6010.
[5] CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers[C]. The 16th European Conference on Computer Vision, Glasgow, UK, 2020: 213–229.
[6] CHEN Ying and KUANG Cheng. Pedestrian re-identification based on CNN and Transformer multi-scale learning[J]. Journal of Electronics & Information Technology, 2023, 45(6): 2256–2263. doi: 10.11999/JEIT220601.
[7] ZHAI Xiaohua, KOLESNIKOV A, HOULSBY N, et al. Scaling vision transformers[C]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 1204–1213.
[8] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale[C]. 9th International Conference on Learning Representations, 2021.
[9] WANG Teng, GONG Lei, WANG Chao, et al. ViA: A novel vision-transformer accelerator based on FPGA[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2022, 41(11): 4088–4099. doi: 10.1109/TCAD.2022.3197489.
[10] NAG S, DATTA G, KUNDU S, et al. ViTA: A vision transformer inference accelerator for edge applications[C]. 2023 IEEE International Symposium on Circuits and Systems, Monterey, USA, 2023: 1–5.
[11] LI Zhengang, SUN Mengshu, LU A, et al. Auto-ViT-Acc: An FPGA-aware automatic acceleration framework for vision transformer with mixed-scheme quantization[C]. 32nd International Conference on Field-Programmable Logic and Applications, Belfast, UK, 2022: 109–116.
[12] WU Ruidong, LIU Bing, FU Ping, et al. Convolutional neural network accelerator architecture design for ultimate edge computing scenario[J]. Journal of Electronics & Information Technology, 2023, 45(6): 1933–1943. doi: 10.11999/JEIT220130.
[13] NGUYEN D T, JE H, NGUYEN T N, et al. ShortcutFusion: From TensorFlow to FPGA-based accelerator with a reuse-aware memory allocation for shortcut data[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2022, 69(6): 2477–2489. doi: 10.1109/TCSI.2022.3153288.
[14] LI Tianyang, ZHANG Fan, FAN Xitian, et al. Unified accelerator for attention and convolution in inference based on FPGA[C]. 2023 IEEE International Symposium on Circuits and Systems, Monterey, USA, 2023: 1–5.
[15] LOMONT C. Fast inverse square root[EB/OL]. http://lomont.org/papers/2003/InvSqrt.pdf, 2023.
[16] WU E, ZHANG Xiaoqian, BERMAN D, et al. A high-throughput reconfigurable processing array for neural networks[C]. 27th International Conference on Field Programmable Logic and Applications, Ghent, Belgium, 2017: 1–4.
[17] FU Yao, WU E, SIRASAO A, et al. Deep learning with INT8 optimization on Xilinx devices[EB/OL]. Xilinx. https://www.origin.xilinx.com/content/dam/xilinx/support/documents/white_papers/wp486-deep-learning-int8.pdf, 2017.
[18] ZHU Feng, GONG Ruihao, YU Fengwei, et al. Towards unified INT8 training for convolutional neural network[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 1966–1976.
[19] JACOB B, KLIGYS S, CHEN Bo, et al. Quantization and training of neural networks for efficient integer-arithmetic-only inference[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 2704–2713.
[20] SUN Qiwei, DI Zhixiong, LV Zhengyang, et al. A high speed SoftMax VLSI architecture based on basic-split[C]. 14th IEEE International Conference on Solid-State and Integrated Circuit Technology, Qingdao, China, 2018: 1–3.
[21] WANG Meiqi, LU Siyuan, ZHU Danyang, et al. A high-speed and low-complexity architecture for softmax function in deep learning[C]. 2018 IEEE Asia Pacific Conference on Circuits and Systems, Chengdu, China, 2018: 223–226.
[22] GAO Yue, LIU Weiqiang, and LOMBARDI F. Design and implementation of an approximate softmax layer for deep neural networks[C]. 2020 IEEE International Symposium on Circuits and Systems, Seville, Spain, 2020: 1–5.
[23] LI Yue, CAO Wei, ZHOU Xuegong, et al. A low-cost reconfigurable nonlinear core for embedded DNN applications[C]. 2020 International Conference on Field-Programmable Technology, Maui, USA, 2020: 35–38.
[24] HADJIS S and OLUKOTUN K. TensorFlow to cloud FPGAs: Tradeoffs for accelerating deep neural networks[C]. 29th International Conference on Field Programmable Logic and Applications, Barcelona, Spain, 2019: 360–366.