Millimeter Wave Radar Gesture Recognition Algorithm Based on Spatio-temporal Compression Feature Representation Learning
Abstract: To address the data preprocessing and feature utilization problems in existing gesture recognition research based on radio-frequency signals, this paper proposes a gesture recognition algorithm for Frequency Modulated Continuous Wave (FMCW) millimeter wave radar based on spatio-temporal compressed feature representation learning. First, static interference removal and moving target point filtering are applied to the Range-Doppler (RD) map of the radar echo reflected by the hand, which reduces clutter interference with the gesture signal and also reduces the amount of data to be processed. Then, a compressed representation of the spatio-temporal features of the gesture is proposed: the dominant velocity of the moving target points is used to represent the motion of the gesture, giving a compressed mapping of the multidimensional features that retains the key information about the gesture motion. Finally, a single-channel Convolutional Neural Network (CNN) is designed to learn and classify the multidimensional gesture features, and it is applied to multi-user and multi-position gesture recognition. Experimental results show that, compared with other existing gesture recognition algorithms, the proposed method has clear advantages in recognition accuracy, real-time performance, and generalization ability.
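The preprocessing and compression steps named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify how static interference is removed, how moving target points are selected, or how the dominant velocity is defined. The sketch assumes that static clutter sits in the zero-Doppler column, that moving points are cells above a fixed magnitude threshold, and that the dominant velocity is the velocity of the strongest remaining cell. All function names, the threshold, and the toy frames are hypothetical; only the 4.4 cm/s velocity resolution comes from the radar parameter table.

```python
from typing import List, Optional, Tuple

Frame = List[List[float]]       # RD magnitude map: rows = range bins, columns = Doppler bins
Point = Tuple[int, int, float]  # (range bin, Doppler bin, magnitude)


def remove_static(frame: Frame, zero_bin: int, guard: int = 0) -> Frame:
    """Zero the Doppler columns at and around zero velocity (assumed clutter model)."""
    cleaned = [row[:] for row in frame]
    lo = max(0, zero_bin - guard)
    for row in cleaned:
        hi = min(len(row), zero_bin + guard + 1)
        for d in range(lo, hi):
            row[d] = 0.0
    return cleaned


def moving_points(frame: Frame, threshold: float) -> List[Point]:
    """Keep only cells whose magnitude exceeds the (hypothetical) threshold."""
    return [(r, d, mag)
            for r, row in enumerate(frame)
            for d, mag in enumerate(row)
            if mag > threshold]


def dominant_velocity(points: List[Point], zero_bin: int, v_res: float) -> Optional[float]:
    """Velocity of the strongest moving point, or None if the frame has none."""
    if not points:
        return None
    _, d, _ = max(points, key=lambda p: p[2])
    return (d - zero_bin) * v_res


def compress(frames: List[Frame], zero_bin: int, threshold: float,
             v_res: float) -> List[Optional[float]]:
    """Reduce each RD frame to one dominant velocity, giving a time series."""
    return [dominant_velocity(moving_points(remove_static(f, zero_bin), threshold),
                              zero_bin, v_res)
            for f in frames]


V_RES = 0.044  # m/s, velocity resolution from the radar parameter table

# Toy frames: 3 range bins x 5 Doppler bins, zero velocity at column 2.
frame_a = [[0.1, 0.2, 9.0, 0.1, 0.0],
           [0.0, 0.3, 8.0, 0.2, 5.0],   # one moving cell at Doppler bin 4
           [0.0, 0.1, 7.0, 0.0, 0.4]]
frame_b = [[0.0, 0.0, 9.0, 0.0, 0.0],   # static clutter only
           [0.0, 0.0, 8.0, 0.0, 0.0],
           [0.0, 0.0, 7.0, 0.0, 0.0]]

velocities = compress([frame_a, frame_b], zero_bin=2, threshold=1.0, v_res=V_RES)
print(velocities)
```

In this simplified form each frame collapses to a single number, so a 50-frame gesture becomes a short sequence rather than a stack of full RD maps. The representation in the paper presumably keeps more than velocity alone, since the abstract speaks of multidimensional features.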
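Table 1 Radar parameter settings in the gesture recognition system

| Parameter | Value | Parameter | Value |
| --- | --- | --- | --- |
| Sweep frequency range | 60~64 GHz | Frame rate | 25 frames/s |
| Bandwidth | 4 GHz | Range resolution | 3.75 cm |
| Sweep signal slope | 29.9 MHz/μs | Maximum detection range | 10.9 m |
| Sampling rate | 1 Msps | Range accuracy | 5.54 mm |
| Number of samples | 256 | Velocity resolution | 4.4 cm/s |
| Sampling interval | 100 μs | Maximum detection velocity | 6.08 m/s |
| Frame period | 40 ms | Velocity accuracy | 1.4 mm/s |
| Chirps | 32 | Number of transmit antennas | 3 |
| Number of frames | 50 | Number of receive antennas | 4 |

Table 2 Comparison of parameter model sizes of the RDI and RDTI network structures

| Gesture recognition method | Network model | Parameter model size |
| --- | --- | --- |
| RDI[20] | CNN+LSTM | 27952k |
| RDTI (this paper) | CNN | 6458k |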
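A few of the tabulated values can be cross-checked with simple arithmetic. The sketch below assumes the standard FMCW relation for range resolution, c / (2B), and takes the frame rate as the reciprocal of the frame period; the source does not state how its values were derived, and the other entries in Table 1 (maximum range, velocity figures, accuracies) depend on configuration details not given here, so they are not reproduced. Variable names are illustrative.

```python
C = 299_792_458.0  # speed of light in m/s

# Values taken from Table 1
bandwidth_hz = 4e9        # 60~64 GHz sweep
frame_period_s = 40e-3    # 40 ms

# Standard FMCW range resolution: c / (2 * B)
range_resolution_m = C / (2 * bandwidth_hz)

# Frame rate as the reciprocal of the frame period
frame_rate_hz = 1 / frame_period_s

# Values taken from Table 2 (parameter model sizes, in thousands)
rdi_size_k = 27952
rdti_size_k = 6458
size_ratio = rdi_size_k / rdti_size_k
size_reduction = 1 - rdti_size_k / rdi_size_k

print(f"range resolution: {range_resolution_m * 100:.2f} cm")
print(f"frame rate: {frame_rate_hz:.0f} frames/s")
print(f"RDI/RDTI size ratio: {size_ratio:.2f}, reduction: {size_reduction:.1%}")
```

Under these assumptions the range resolution and frame rate agree with the tabulated 3.75 cm and 25 frames/s, and the RDTI model comes out at a little under a quarter of the size of the RDI model.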
[1] CHENG Hong, DAI Zhongjun, and LIU Zicheng. Image-to-class dynamic time warping for 3D hand gesture recognition[C]. 2013 IEEE International Conference on Multimedia and Expo, San Jose, USA, 2013: 1–6.
[2] LI Yi. Hand gesture recognition using Kinect[C]. 2012 IEEE International Conference on Computer Science and Automation Engineering, Beijing, China, 2012: 196–199.
[3] REN Zhou, YUAN Junsong, MENG Jingjing, et al. Robust part-based hand gesture recognition using kinect sensor[J]. IEEE Transactions on Multimedia, 2013, 15(5): 1110–1120. doi: 10.1109/TMM.2013.2246148
[4] SAHA S, GHOSH S, KONAR A, et al. Gesture recognition from Indian classical dance using kinect sensor[C]. The 5th International Conference on Computational Intelligence, Communication Systems and Networks, Madrid, Spain, 2013: 3–8.
[5] LING Yu, CHEN Xiang, RUAN Yuwen, et al. Comparative study of gesture recognition based on accelerometer and photoplethysmography sensor for gesture interactions in wearable devices[J]. IEEE Sensors Journal, 2021, 21(15): 17107–17117. doi: 10.1109/JSEN.2021.3081714
[6] JIANG Xianta, MERHI L K, XIAO Zhengang, et al. Exploration of force myography and surface electromyography in hand gesture classification[J]. Medical Engineering & Physics, 2017, 41: 63–73. doi: 10.1016/J.MEDENGPHY.2017.01.015
[7] LIU Jingtao, GU Changzhan, ZHANG Yueping, et al. Analysis on a 77 GHz MIMO radar for touchless gesture sensing[J]. IEEE Sensors Letters, 2020, 4(5): 3500804. doi: 10.1109/LSENS.2020.2987814
[8] LI Gang, ZHANG Rui, RITCHIE M, et al. Sparsity-based dynamic hand gesture recognition using micro-Doppler signatures[C]. 2017 IEEE Radar Conference, Seattle, USA, 2017: 928–931.
[9] LIEN J, GILLIAN N, KARAGOZLER M E, et al. Soli: Ubiquitous gesture sensing with millimeter wave radar[J]. ACM Transactions on Graphics, 2016, 35(4): 142. doi: 10.1145/2897824.2925953
[10] ZHANG Jiajun, TAO Jinkun, and SHI Zhiguo. Doppler-radar based hand gesture recognition system using convolutional neural networks[C]. 2017 International Conference in Communications, Signal Processing, and Systems, Harbin, China, 2017: 1096–1113.
[11] MOLCHANOV P, GUPTA S, KIM K, et al. Multi-sensor system for driver's hand-gesture recognition[C]. The 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, Ljubljana, Slovenia, 2015: 1–8.
[12] ZHENG Qiangwen, YANG Lijie, XIE Yaping, et al. A target detection scheme with decreased complexity and enhanced performance for range-Doppler FMCW radar[J]. IEEE Transactions on Instrumentation and Measurement, 2020, 70: 8001113. doi: 10.1109/TIM.2020.3027407
[13] SUN Yuliang, FEI Tai, LI Xibo, et al. Multi-feature encoder for radar-based gesture recognition[C]. 2020 IEEE International Radar Conference, Washington, USA, 2020: 351–356.
[14] SUN Yuliang, FEI Tai, LI Xibo, et al. Real-time radar-based gesture detection and recognition built in an edge-computing platform[J]. IEEE Sensors Journal, 2020, 20(18): 10706–10716. doi: 10.1109/JSEN.2020.2994292
[15] XIA Zhaoyang, LUOMEI Yixiang, ZHOU Chenglong, et al. Multidimensional feature representation and learning for robust hand-gesture recognition on commercial millimeter-wave radar[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(6): 4749–4764. doi: 10.1109/TGRS.2020.3010880
[16] XIA Zhaoyang, ZHOU Chenglong, JIE Junyu, et al. Micro-motion gesture recognition based on multi-channel frequency modulated continuous wave millimeter wave radar[J]. Journal of Electronics & Information Technology, 2020, 42(1): 164–172. doi: 10.11999/JEIT190797
[17] KARPATHY A, TODERICI G, SHETTY S, et al. Large-scale video classification with convolutional neural networks[C]. 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, 2014: 1725–1732.
[18] TRAN D, BOURDEV L, FERGUS R, et al. Learning spatiotemporal features with 3D convolutional networks[C]. 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 4489–4497.
[19] WANG Xuanhan, GAO Lianli, SONG Jingkuan, et al. Beyond frame-level CNN: Saliency-aware 3-D CNN with LSTM for video action recognition[J]. IEEE Signal Processing Letters, 2017, 24(4): 510–514. doi: 10.1109/LSP.2016.2611485
[20] WANG Saiwen, SONG Jie, LIEN J, et al. Interacting with soli: Exploring fine-grained dynamic gesture recognition in the radio-frequency spectrum[C]. The 29th Annual Symposium on User Interface Software and Technology, Tokyo, Japan, 2016: 851–860.