基于调频连续波雷达的多维信息特征融合人体姿势识别方法

冯心欣; 李文龙; 何兆; 郑海峰

doi:10.11999/JEIT210696

基于调频连续波雷达的多维信息特征融合人体姿势识别方法

doi: 10.11999/JEIT210696

福州大学物理与信息工程学院福建省媒体信息智能处理与无线传输重点实验室福州 350108

基金项目: 国家自然科学基金(61971139, 61601126)，福建省自然科学基金(2021J01576)

详细信息

作者简介:
冯心欣：女，副教授，研究方向为智能物联网、车联网、张量理论及其应用、经济学理论及其应用等

李文龙：男，硕士生，研究方向为深度学习、数据分析和边缘计算系统架构

何兆：男，硕士生，研究方向为深度学习和雷达信号处理

郑海峰：男，教授，博士生导师，研究方向为边缘计算、无线感知、机器学习、张量理论及其应用等

通讯作者:
郑海峰　zhenghf@fzu.edu.cn

中图分类号: TN958
计量
- 文章访问数: 1266
- HTML全文浏览量: 781
- PDF下载量: 175
- 被引次数: 10
出版历程
- 收稿日期: 2021-07-12
- 修回日期: 2021-11-18
- 录用日期: 2021-11-25
- 网络出版日期: 2021-11-26
- 刊出日期: 2022-10-19

Human Posture Recognition Based on Multi-dimensional Information Feature Fusion of Frequency Modulated Continuous Wave Radar

College of Physics and Information Engineering, Fuzhou University, Fujian Key Lab for Intelligent Processing and Wireless Transmission of Media Information, Fuzhou 350108, China

Funds: The National Natural Science Foundation of China (61971139,61601126), The Natural Science Foundation of Fujian Province (2021J01576)

摘要

摘要: 为实现在复杂多样的环境下人体姿势的识别，该文提出一种基于调频连续波(FMCW)雷达的多维信息特征融合的人体姿势识别方法。该方法通过对FMCW雷达原始信号进行3维快速傅里叶变换得到目标距离、速度和角度的多维信息，在采用具有噪声的基于密度的聚类算法(DBSCAN)和 Hampel滤波算法解决运动范围内动态或静态目标的噪声干扰后使用卷积神经网络对多维信息进行特征提取，然后利用低秩多模态融合网络(LMF)充分融合多维信息的特征，并通过域鉴别器进一步获得与环境无关的特征，最终使用活动识别器获得姿势识别结果。为了实用性，在边缘计算平台上搭载预先设计的算法和训练好的网络模型进行实验验证。实验结果表明，在复杂的环境下该方法的识别精度可达到91.5%。
- 调频连续波雷达 /
- 姿势识别 /
- 背景消除 /
- 特征融合
Abstract: In order to realize the recognition of human posture in complex and diverse environments, a method based on Frequency Modulated Continuous Wave (FMCW) radar signal is proposed. This method obtains multi-dimensional information of distance, speed and angle by performing 3D fast Fourier transform on the original signal of FMCW radar. After using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm and the Hampel filter algorithm to solve the noise interference of dynamic or static targets in the range of motion, the convolutional neural network is used to extract the features of multi-dimensional information. Then Low-rank Multimodal Fusion network (LMF) is used to fully fuse the features of multi-dimensional information. Finally, the domain discriminator further obtains domain-independent features, and the activity recognizer obtains the result of gesture recognition. The pre-designed algorithm and the trained network model are carried out on the edge computing platform for experimental verification. Experimental results show that the recognition accuracy of this method can reach 91.5% in complex environments.
- Frequency Modulated Continuous Wave (FMCW) radar /
- Human posture recognition /
- Background noise elimination /
- Feature fusion

HTML全文

图 1 DV-EI-Net网络结构

下载: 全尺寸图片幻灯片

图 2 LMF结构图

下载: 全尺寸图片幻灯片

图 3 实验平台

下载: 全尺寸图片幻灯片

图 4 DBSCAN聚类效果

下载: 全尺寸图片幻灯片

图 5 Hampel滤波及线性插值效果

下载: 全尺寸图片幻灯片

图 6 4种姿势对应的数据集及识别结果

下载: 全尺寸图片幻灯片

表 1 DV-EI-Net单参数网络目标域分类精度(%)

模型	VTM	DTM
单参数网络(无域鉴别器)	81.5	78.0
单参数网络(有域鉴别器)	84.5	80.0

下载: 导出CSV

表 2 姿势分类混淆矩阵(%)

真实类别	预测类别
真实类别	挥拳	行走	站立	坐下
挥拳	91	9	0	0
行走	5	93	2	0
站立	3	0	94	3
坐下	0	5	7	88

下载: 导出CSV

表 3 DV-EI-Net多参数网络目标域分类精度(%)

模型	VTM+DTM
基于串联特征融合方式(无域鉴别器)	85.0
基于LMF融合方式(无域鉴别器)	86.5
基于串联特征融合方式(有域鉴别器)	87.5
基于LMF融合方式(有域鉴别器)	91.5

下载: 导出CSV

表 4 本文方法与其他方法平均精度的比较(%)

模型	数据集	源域精度	目标域精度
CNN^[21]单参数网络	VTM	86.5	82.0
VGG16^[22]单参数网络	VTM	87.0	83.5
RDA-T^[10]多参数网络	VTM, DTM	92.5	85.0
CNN^[21]单参数网络+域鉴别器	VTM	86.5	85.5
VGG16^[22]单参数网络+域鉴别器	VTM	87.5	86.5
RDA-T^[10]多参数网络+域鉴别器	VTM, DTM	92.5	88.5
DV-EI-Net	VTM, DTM	94.0	91.5

下载: 导出CSV

参考文献(22)

[1]	杨丽梅, 李致豪. 面向人机交互的手势识别系统设计[J]. 工业控制计算机, 2020, 33(3): 18–20,22. doi: 10.3969/j.issn.1001-182X.2020.03.007 YANG Limei and LI Zhihao. Design of gesture recognition system towards human computer interaction[J]. Industrial Control Computer, 2020, 33(3): 18–20,22. doi: 10.3969/j.issn.1001-182X.2020.03.007
[2]	AGGARWAL J K and XIA Lu. Human activity recognition from 3D data: A review[J]. Pattern Recognition Letters, 2014, 48: 70–80. doi: 10.1016/j.patrec.2014.04.011
[3]	TRAN D, WANG Heng, TORRESANI L, et al. A closer look at spatiotemporal convolutions for action recognition[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 6450–6459.
[4]	熊昕, 郑杨娇子, 张上. 基于长短时记忆网络及变体的跌倒检测和人体行为识别系统[J]. 信息通信, 2020(2): 65–67. doi: 10.3969/j.issn.1673-1131.2020.02.027 XIONG Xin, ZHENG Yangjiaozi, and ZHANG Shang. Fall detection and human behavior recognition system based on long and short time memory networks and variants[J]. Information &Communications, 2020(2): 65–67. doi: 10.3969/j.issn.1673-1131.2020.02.027
[5]	SABOKROU M, POURREZA M, FAYYAZ M, et al. AVID: Adversarial visual irregularity detection[C]. 14th Asian Conference on Computer Vision, Perth, Australia, 2019: 488–505.
[6]	刘天亮, 谯庆伟, 万俊伟, 等. 融合空间-时间双网络流和视觉注意的人体行为识别[J]. 电子与信息学报, 2018, 40(10): 2395–2401. doi: 10.11999/JEIT171116 LIU Tianliang, QIAO Qingwei, WAN Junwei, et al. Human action recognition via Spatio-temporal dual network flow and visual attention fusion[J]. Journal of Electronics &Information Technology, 2018, 40(10): 2395–2401. doi: 10.11999/JEIT171116
[7]	WANG Jie, ZHANG Xiao, GAO Qinhua, et al. Device-free wireless localization and activity recognition: A deep learning approach[J]. IEEE Transactions on Vehicular Technology, 2017, 66(7): 6258–6267. doi: 10.1109/TVT.2016.2635161
[8]	XU Shengzhi, KOOIJ B J, and YAROVOY A. Joint Doppler and DOA estimation using (Ultra-)Wideband FMCW signals[J]. Signal Processing, 2020, 168: 107259. doi: 10.1016/j.sigpro.2019.107259
[9]	LEE J, HWANG S, YOU S, et al. Joint angle, velocity, and range estimation using 2D MUSIC and successive interference cancellation in FMCW MIMO radar system[J]. IEICE Transactions on Communications, 2020, E103.B(3): 283–290. doi: 10.1587/transcom.2018EBP3330
[10]	王勇, 吴金君, 田增山, 等. 基于FMCW雷达的多维参数手势识别算法[J]. 电子与信息学报, 2019, 41(4): 822–829. doi: 10.11999/JEIT180485 WANG Yong, WU Jinjun, TIAN Zengshan, et al. Gesture recognition with multi-dimensional parameter using FMCW radar[J]. Journal of Electronics &Information Technology, 2019, 41(4): 822–829. doi: 10.11999/JEIT180485
[11]	ZHAO Yinan, ZHANG Zihao, and ZHANG Zhaolin. Multi-angle data cube action recognition based on millimeter wave radar[C]. 2020 Chinese Control and Decision Conference (CCDC), Hefei, China, 2020: 749–753.
[12]	ZHAO Mingmin, LI Tianhong, ABU ALSHEIKH M, et al. Through-wall human pose estimation using radio signals[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 7356–7365.
[13]	刘皓, 郭立, 易波, 等. 基于3D骨架和MCRF模型的行为识别[J]. 中国科学技术大学学报, 2014, 44(4): 285–291. doi: 10.3969/j.issn.0253-2778.2014.04.005 LIU Hao, GUO Li, YI Bo, et al. Human activity recognition based on 3D skeletons and MCRF model[J]. Journal of University of Science and Technology of China, 2014, 44(4): 285–291. doi: 10.3969/j.issn.0253-2778.2014.04.005
[14]	ATREY P K, HOSSAIN M A, EL SADDIK A, et al. Multimodal fusion for multimedia analysis: A survey[J]. Multimedia Systems, 2010, 16(6): 345–379. doi: 10.1007/s00530-010-0182-0
[15]	MORENCY L P, MIHALCEA R, and DOSHI P. Towards multimodal sentiment analysis: Harvesting opinions from the web[C]. The 13th International Conference on Multimodal Interfaces, Alicante, Spain, 2011: 169–176.
[16]	XUE Hongfei, JIANG Wenjun, MIAO Chenglin, et al. DeepFusion: A deep learning framework for the fusion of heterogeneous sensory data[C]. The Twentieth ACM International Symposium on Mobile Ad Hoc Networking and Computing, Catania, Italy, 2019: 151–160.
[17]	ZADEH A, CHEN Minghai, PORIA S, et al. Tensor fusion network for multimodal sentiment analysis[C]. The 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 2017: 1114–1125.
[18]	LIU Zhun, SHEN Ying, LAKSHMINARASIMHAN V B, et al. Efficient low-rank multimodal fusion with modality-specific factors[C]. The 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, 2018: 2247–2256.
[19]	GANIN Y and LEMPITSKY V. Unsupervised domain adaptation by backpropagation[C]. The 32nd International Conference on Machine Learning, Lille, France, 2015: 1180–1189.
[20]	SCITOVSKI R, MAJSTOROVIĆ S, and SABO K. A combination of RANSAC and DBSCAN methods for solving the multiple geometrical object detection problem[J]. Journal of Global Optimization, 2021, 79(3): 669–686. doi: 10.1007/s10898-020-00950-8
[21]	DEKKER B, JACOBS S, KOSSEN A S, et al. Gesture recognition with a low power FMCW radar and a deep convolutional neural network[C]. 2017 European Radar Conference (EURAD), Nuremberg, Germany, 2017: 163–166.
[22]	SIMONYAN K and ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C]. 3rd International Conference on Learning Representations, San Diego, USA, 2014.

施引文献

期刊类型引用(6)

1.	王昭, 娄昊, 范九伦. 一种基于模糊C均值的图像超像素分割方法. 微电子学与计算机. 2014(10): 18-21+27 . 百度学术
2.	吴涛. 图像阈值化的自适应粗糙熵方法. 中国图象图形学报. 2014(01): 1-10 . 百度学术
3.	罗亮, 冯象初, 张选德, 李小平. 基于非局部双边随机投影低秩逼近图像去噪算法. 电子与信息学报. 2013(01): 99-105 . 本站查看
4.	李伟斌, 高二, 宋松和. 一种全局最小化的图像分割方法. 电子与信息学报. 2013(04): 791-796 . 本站查看
5.	徐东, 彭真明. 基于C-V模型的改进快速水平集图像分割法. 强激光与粒子束. 2012(12): 2817-2821 . 百度学术
6.	董光辉, 席志红, 赵彦青. 基于文化算法的C-V水平集图像分割. 系统工程与电子技术. 2012(07): 1499-1504 . 百度学术