
Review of Sign Language Recognition Based on Deep Learning

Shujun ZHANG, Qun ZHANG, Hui LI

Citation: Shujun ZHANG, Qun ZHANG, Hui LI. Review of Sign Language Recognition Based on Deep Learning[J]. Journal of Electronics & Information Technology, 2020, 42(4): 1021-1032. doi: 10.11999/JEIT190416


doi: 10.11999/JEIT190416
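Funds: The National Natural Science Foundation of China (61702295, 61672305), The Key Research & Development Plan Project of Shandong Province (2017GGX10127)
Details
    Author biographies:

    Shujun ZHANG: female, born in 1980, associate professor; research interest: computer vision

    Qun ZHANG: female, born in 1994, master's student; research interest: computer vision

    Hui LI: male, born in 1984, associate professor; research interest: computer vision

    Corresponding author:

    Shujun ZHANG, lindazsj@163.com

  • CLC number: TP391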

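  • Abstract:

    Sign language recognition draws on computer vision, pattern recognition, human-computer interaction, and related fields, and has significant research value and practical applications. The rapid development of deep learning has opened new opportunities for more accurate, real-time sign language recognition. This paper reviews recent sign language recognition techniques based on deep learning, describing and analyzing the algorithms along two branches: isolated words and continuous sentences. Isolated-word recognition methods are divided into three architectures: the Convolutional Neural Network (CNN), the 3D Convolutional Neural Network (3D-CNN), and the Recurrent Neural Network (RNN). Models for continuous sentence recognition are more complex and usually need to be assisted by some long-term temporal modeling algorithm; according to their main structure, they are divided into bidirectional long short-term memory (BLSTM) network models, 3D convolutional network models, and hybrid models. The paper also summarizes the sign language datasets commonly used in China and abroad and discusses the research challenges and development trends of sign language recognition. Robustness and practical usability at high accuracy still need further progress.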

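  • Figure 1  Overall classification diagram

    Figure 2  Sample data from the RWTH German sign language dataset

    Figure 3  Sample data from the CSL Chinese sign language dataset

    Figure 4  Visual modalities of each frame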

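    Table 1  Deep-learning-based isolated-word sign language recognition techniques and representative work

    | Author / Affiliation | Year | Technique | Accuracy (%) | Dataset | Sample size |
    |---|---|---|---|---|---|
    | Tang Ao, Li HouQiang, Huang Jie, Li Xiaoxu, Huang Shiliang / University of Science and Technology of China | 2013 | CNN (RGB-D based, with hand segmentation and tracking) [4] | 98.12 | American Sign Language (ASL) | 50700 frames |
    | | 2015 | 3D-CNN (multimodal input) [17] | 94.20 | Chinese Sign Language (CSL) | 25 classes |
    | | 2016 | RNN (with trajectory data) [27] | 85.60 | | 500 classes |
    | | 2017 | LSTM (with hand-shape descriptor) [28] | 86.20 | | 100 classes |
    | | 2018 | RNN (key-frame video sequence selection) [29] | 91.18 | | 310 classes |
    | | | 3D-CNN (attention-based) [18] | 88.70 | | 500 classes |
    | Pigou L / Ghent University | 2014 | CNN [5] | 91.70 | Chalearn | 20 classes |
    | | 2016 | 3D-CNN (feature fusion of multimodal data) [16] | 81.00 | 2014 | |
    | Molchanov P, Garcia B, Hardie Cate / Stanford University | 2015 | 3D-CNN (multi-scale data) [15] | 77.50 | VIVA Dataset | |
    | | | RNN [25] | 90.80 | University of South Wales dataset | 95 classes |
    | | 2016 | CNN [9] | 91.63 | ASL fingerspelling | |
    | Kang B / University of California | 2015 | CNN [6] | 99.99 | ASL fingerspelling | 31 classes |
    | Miao Qiguang / Xidian University | 2016 | 3D-CNN (RGB-D based) [19] | 56.90 | Chalearn | |
    | | 2017 | (based on saliency features and RGB-D) [20] | 59.43 | | |
    | | | (based on multimodal data and hand feature enhancement) [21] | 67.71 | | |
    | Koller O / RWTH Aachen University | 2016 | CNN (focused on hand-shape changes) [8] | | Danish Sign Language | Resolution 4730×22 |
    | Chai Xiujuan / Institute of Computing Technology, Chinese Academy of Sciences | 2017 | Improved RNN (with hand segmentation and localization) [26] | 99.00 | Chinese Sign Language (CSL) | 40 classes |
    | Yang Su / Beijing University of Technology | 2017 | RNN combined with CNN [30] | 98.43 | CSL | 40 classes |
    | | | RNN (data preprocessing) [31] | 99.00 | CSL | 40 classes |
    | Hossen M A / 特斯瓦拉工程学院 | 2017 | CNN [7] | 100.00 | Recorded with Kinect | 10 classes |
    | ElBadawy M / Ain Shams University, Egypt | 2017 | 3D-CNN [22] | 98.00 | Arabic dataset | 25 classes |
    | Kim S / Seoul National University, Korea | 2017 | CNN (inter-frame sampling) [10] | 86.00 | Camera-captured | 20 classes |
    | | 2018 | CNN (hand segmentation) [11] | 98.00 | | 12 classes |
    | Kopuklu O / University of Munich, Germany | 2018 | CNN (spatio-temporal feature fusion) [12] | 96.28 | Jester | |
    | | | | 57.40 | Chalearn | |
    | Konstantinidis D / University of Greece | 2018 | CNN (RGB and skeleton data) [13] | 98.09 | Argentinian dataset LSA64 | |
    | | | RNN (multimodal data fusion) [36] | 89.50 | Indian sign language dataset (IIT) | |
    | Devineau G / PSL Research University, Paris | 2018 | CNN (skeleton data, with hand joint position sequences) [14] | 84.35 | DHG Dataset | 28 classes |
    | Ye Yuancheng / City University of New York | 2018 | 3D-CNN (feature fusion) [23] | 69.20 | American Sign Language | 27 classes |
    | Liang Zhijie / Central China Normal University | 2018 | 3D-CNN (skeleton, contour, and depth data) [24] | 83.60 | Chalearn | |
    | Lin Chi / Institute of Automation, Chinese Academy of Sciences | 2018 | Masked ResC3D network combined with RNN [32] | 68.42 | Chalearn | |
    | Halim K / University of Indonesia | 2018 | RNN (feature set based on SIBI inflectional gestures) [33] | 96.15 | Indonesian sign language dataset | |
    | Masood S / University of New Delhi | 2018 | RNN combined with CNN [34] | 95.20 | Argentinian dataset LSA64 | 46 classes |
    | Bantupalli K / Kennesaw State University, USA | 2018 | RNN combined with CNN [35] | 93.00 | American Sign Language (ASL) | 100 classes |
    | Hernandez V / Tokyo University of Agriculture | 2019 | CNN combined with LSTM [37] | 89.30 | American Sign Language (ASL) | 19 classes |
    | Liao YanQiu / Nanchang University | 2019 | RNN combined with 3D-CNN [38] | 86.90 | Chinese Sign Language (CSL) | 500 classes |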

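    Table 2  Deep-learning-based continuous-sentence sign language recognition techniques and representative work

    | Author / Affiliation | Year | Technique | Evaluation metric (%) | Dataset | Sample size |
    |---|---|---|---|---|---|
    | Camgoz NC, Koller O / RWTH Aachen University | 2016 | 3D-CNN (temporal features extracted from RGB data) [45] | Jaccard index: 26.9 | Chalearn | |
    | | 2016 | Hybrid model based on CNN and HMM [49] | WER: 39.7 | RWTH-PHOENIX-Weather | |
    | | 2017 | Based on CNN, HMM, and CTC [50] | WER: 38.8 | | |
    | | 2017 | BLSTM (CTC-based) [39] | WER: 43.1 | | Resolution: 5000×90 |
    | | 2018 | Hybrid model based on CNN, HMM, and RNN [51] | | | |
    | Pigou L / Ghent University | 2017 | Hybrid model based on a 3D network and LSTM (RGB-D) [52] | Jaccard index: 31.6 | Chalearn | |
    | Cui Runpeng / Tsinghua University | 2017 | Based on CNN and BLSTM (CTC-based) [53] | WER: 38.7 | RWTH-PHOENIX-Weather | Resolution: 16000×20 |
    | | 2018 | BLSTM (multimodal data) [40] | WER: 46.9 | | |
    | Shi B / University of Chicago, USA | 2018 | Attention-based LSTM [41] | WER: 41.9 | American Sign Language (ASL) | |
    | Ko S K / Korea Electronics Technology Institute (KETI) | 2018 | RNN (with skeleton joint data) [42] | Acc: 89.5 | KETI Korean sign language dataset | 100 classes |
    | Zhang Qian / Shanghai Jiao Tong University | 2018 | BLSTM [43] | Acc: 93.1 | American Sign Language (ASL) | 100 classes |
    | Li Houqiang, Huang Jie / University of Science and Technology of China | 2018 | 3D-CNN (alignment algorithm for temporal classification) [46] | WER: 37.3 | RWTH-PHOENIX-Weather | |
    | | | Two-stream 3D-CNN (with LSTM) [47] | Acc: 82.7 | Chinese Sign Language | 100 classes |
    | Guo Dan / Hefei University of Technology, University of Science and Technology of China | 2018 | 3D-CNN (temporal convolution, CTC algorithm, late fusion strategy) [48] | WER: 37.8 | RWTH-PHOENIX-Weather | |
    | | | 3D-CNN combined with RNN (adaptive variable-length online key-clip mining of key frames) [55] | Acc: 92.9 | Chinese Sign Language (CSL) | 100 classes |
    | Ariesta M C / University of Jakarta | 2018 | 3D-CNN combined with RNN (CTC-based) [54] | | SIBI | 30 classes |
    | Mittal A / Indian Institute of Technology | 2019 | Improved LSTM [44] | Acc: 72.3 | Indian sign language dataset (ISL) | 942 classes |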
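
    Most continuous-sentence results in Table 2 are reported as word error rate (WER). The page does not spell out the formula, so the following is only a minimal sketch assuming the usual definition: the number of substitutions, deletions, and insertions needed to turn the recognized gloss sequence into the reference, divided by the reference length. The function name `word_error_rate` and the example glosses are illustrative and do not come from the paper.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / len(reference).

    Both arguments are sequences of tokens (for example sign glosses).
    The edit count is the Levenshtein distance between the two sequences.
    """
    ref = list(reference)
    hyp = list(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the reference prefix seen
    # so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference token missing
                curr[j - 1] + 1,     # insertion: extra hypothesis token
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical gloss sequences, for illustration only.
    reference = ["TOMORROW", "RAIN", "NORTH", "WIND"]
    hypothesis = ["TOMORROW", "SNOW", "NORTH"]
    print(f"WER = {100 * word_error_rate(reference, hypothesis):.1f}%")
```

    Because insertions count as errors, WER computed this way can exceed 100% when the hypothesis is much longer than the reference, which is one reason it is not directly comparable with the accuracy (Acc) figures in the same table.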

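    Table 3  Classification of sign language datasets

    | Name | Country | Classes | Scenes | Samples | Modality | Data type | Availability |
    |---|---|---|---|---|---|---|---|
    | RWTH-PHOENIX-Weather [56] | Germany | 1200 | 9 | 45760 | RGB | Sentences | Public |
    | Chalearn [57] | USA | 249 | 7 | 50000 | RGB/depth | Words | Partially public |
    | DGS Kinect 40 [58] | Germany | 40 | 15 | 3000 | Multi-view | Isolated words | |
    | CSL [47] | China | 500/100 | | 125000 | Depth/skeleton/RGB | Isolated words/sentences | Public |
    | SIGNUM [59] | Germany | 450 | 25 | 33210 | RGB | Sentences | Public |
    | GSL 20 [60] | Greece | 20 | 6 | 840 | RGB | Words | |
    | Boston ASLLVD [61] | USA | 3300+ | 6 | 9800 | RGB | Words | Public |
    | PSL Kinect 30 [62] | Poland | 30 | 1 | 300 | RGB/depth | Words | Public |
    | LSA64 [63] | Argentina | 64 | 10 | 3200 | RGB | Words | Public |
    | DEVISIGN-G [64] | China | 36 | 8 | 432 | RGB | Words | |
    | DEVISIGN-D [64] | | 500 | 8 | 6000 | | | |
    | DEVISIGN-L [64] | | 2000 | 8 | 24000 | | | |
    | CUNY ASL [65] | USA | | 8 | | RGB | Sentences | |
    | SignsWorld Atlas [66] | Arabic | 32 | 10 | | RGB | Words | Public |
    | ASL Fingerspelling [67] | USA | 24 | 5 | 131000 | RGB/depth | Words | Public |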

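    Table 4  RWTH-PHOENIX-Weather parameters

    | Parameter | 2012 version | 2014 version |
    |---|---|---|
    | Number of signers | 7 | 9 |
    | Number of samples | 190 | 645 |
    | Number of frames | 293077 | 965940 |
    | Number of sentences | 1980 | 6861 |
    | Vocabulary size | 911 | 1558 |
    | Resolution | 210×260 | 720×576 |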

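    Table 5  CSL dataset parameters

    | Parameter | Value |
    |---|---|
    | RGB resolution | 1920×1080 |
    | Depth data resolution | 512×424 |
    | Video duration (s) | 10~14 |
    | Average number of samples | 7 |
    | Total samples | 25000 |
    | Number of signers | 50 |
    | Vocabulary size | 178 |
    | Number of skeleton joints | 21 |
    | fps | 25 |
    | Total duration | 100+ |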
  • References:

    [1] HINTON G E, OSINDERO S, and TEH Y W. A fast learning algorithm for deep belief nets[J]. Neural Computation, 2006, 18(7): 1527–1554. doi: 10.1162/neco.2006.18.7.1527
    [2] ZHOU Yu. Research on signer adaptation in Chinese sign language recognition[D]. [Ph.D. dissertation], Harbin Institute of Technology, 2009.
    [3] CHEOK M J, OMAR Z, and JAWARD M H. A review of hand gesture and sign language recognition techniques[J]. International Journal of Machine Learning and Cybernetics, 2019, 10(1): 131–153. doi: 10.1007/s13042-017-0705-5
    [4] TANG Ao, LU Ke, WANG Yufei, et al. A real-time hand posture recognition system using deep neural networks[J]. ACM Transactions on Intelligent Systems and Technology, 2015, 6(2): 1–23. doi: 10.1145/2735952
    [5] PIGOU L, DIELEMAN S, KINDERMANS P J, et al. Sign language recognition using convolutional neural networks[C]. European Conference on Computer Vision, Zurich, Switzerland, 2014: 572–578.
    [6] KANG B, TRIPATHI S, and NGUYEN T Q. Real-time sign language fingerspelling recognition using convolutional neural networks from depth map[C]. The 3rd IAPR Asian Conference on Pattern Recognition (ACPR), Kuala Lumpur, Malaysia, 2015: 136–140.
    [7] HOSSEN M A, GOVINDAIAH A, SULTANA S, et al. Bengali sign language recognition using Deep Convolutional Neural Network[C]. The 7th Joint International Conference on Informatics, Electronics & Vision (ICIEV) and 2018 2nd International Conference on Imaging, Vision & Pattern Recognition (icIVPR), Kitakyushu, Japan, 2018: 369–373.
    [8] KOLLER O, BOWDEN R, and NEY H. Automatic alignment of HamNoSys subunits for continuous sign language recognition[C]. The 10th Edition of the Language Resources and Evaluation Conference, Portorož, Slovenia, 2016: 121–128.
    [9] GARCIA B and VIESCA S A. Real-time American sign language recognition with convolutional neural networks[J]. Convolutional Neural Networks for Visual Recognition, 2016, 2: 225–232.
    [10] JI Y, KIM S, and LEE K B. Sign language learning system with image sampling and convolutional neural network[C]. The 1st IEEE International Conference on Robotic Computing (IRC), Taichung, China, 2017: 371–375.
    [11] KIM S, JI Y, and LEE K B. An effective sign language learning with object detection based ROI segmentation[C]. The 2nd IEEE International Conference on Robotic Computing (IRC), Laguna Hills, USA, 2018: 330–333.
    [12] KÖPÜKLÜ O, KÖSE N, and RIGOLL G. Motion fused frames: Data level fusion strategy for hand gesture recognition[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, USA, 2018: 2103–2111.
    [13] KONSTANTINIDIS D, DIMITROPOULOS K, and DARAS P. Sign language recognition based on hand and body skeletal data[C]. 2018-3DTV-Conference: The True Vision-Capture, Transmission and Display of 3D Video (3DTV-CON), Helsinki, Finland, 2018: 1–4.
    [14] DEVINEAU G, MOUTARDE F, WANG Xi, et al. Deep learning for hand gesture recognition on skeletal data[C]. The 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi’an, China, 2018: 106–113.
    [15] MOLCHANOV P, GUPTA S, KIM K, et al. Hand gesture recognition with 3D convolutional neural networks[C]. 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops, Boston, USA, 2015: 1–7.
    [16] WU Di, PIGOU L, KINDERMANS P J, et al. Deep dynamic neural networks for multimodal gesture segmentation and recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(8): 1583–1597. doi: 10.1109/TPAMI.2016.2537340
    [17] HUANG Jie, ZHOU Wengang, LI Houqiang, et al. Sign language recognition using 3D convolutional neural networks[C]. 2015 IEEE International Conference on Multimedia and Expo (ICME), Turin, Italy, 2015: 1–6.
    [18] HUANG Jie, ZHOU Wengang, LI Houqiang, et al. Attention-based 3D-CNNs for large-vocabulary sign language recognition[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2019, 29(9): 2822–2832. doi: 10.1109/TCSVT.2018.2870740
    [19] LI Yunan, MIAO Qiguang, TIAN Kuan, et al. Large-scale gesture recognition with a fusion of RGB-D data based on the C3D model[C]. The 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 2016: 25–30.
    [20] LI Yunan, MIAO Qiguang, TIAN Kuan, et al. Large-scale gesture recognition with a fusion of RGB-D data based on saliency theory and C3D model[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2018, 28(10): 2956–2964. doi: 10.1109/TCSVT.2017.2749509
    [21] MIAO Qiguang, LI Yunan, OUYANG Wanli, et al. Multimodal gesture recognition based on the ResC3D network[C]. 2017 IEEE International Conference on Computer Vision Workshops, Venice, Italy, 2017: 3047–3055.
    [22] ELBADAWY M, ELONS A S, SHEDEED H A, et al. Arabic sign language recognition with 3D convolutional neural networks[C]. The 8th International Conference on Intelligent Computing and Information Systems (ICICIS), Cairo, Egypt, 2017: 66–71.
    [23] YE Yuancheng, TIAN Yingli, HUENERFAUTH M, et al. Recognizing American sign language gestures from within continuous videos[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, USA, 2018: 2064–2073.
    [24] LIANG Zhijie, LIAO Shengbin, and HU Bingzhang. 3D convolutional neural networks for dynamic sign language recognition[J]. The Computer Journal, 2018, 61(11): 1724–1736. doi: 10.1093/comjnl/bxy049
    [25] CATE H, DALVI F, and HUSSAIN Z. Sign language recognition using temporal classification[EB/OL]. http://arxiv.org/abs/1701.01875v1, 2017.
    [26] CHAI Xiujuan, LIU Zhipeng, YIN Fang, et al. Two streams recurrent neural networks for large-scale continuous gesture recognition[C]. The 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 2016: 31–36.
    [27] LIU Tao, ZHOU Wengang, and LI Houqiang. Sign language recognition with long short-term memory[C]. 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, USA, 2016: 2871–2875.
    [28] LI Xiaoxu, MAO Chensi, HUANG Shiliang, et al. Chinese sign language recognition based on SHS descriptor and encoder-decoder LSTM model[C]. The 12th Chinese Conference on Biometric Recognition, Shenzhen, China, 2017: 719–728.
    [29] HUANG Shiliang, MAO Chensi, TAO Jinxu, et al. A novel Chinese sign language recognition method based on keyframe-centered clips[J]. IEEE Signal Processing Letters, 2018, 25(3): 442–446. doi: 10.1109/LSP.2018.2797228
    [30] YANG Su and ZHU Qing. Continuous Chinese sign language recognition with CNN-LSTM[J]. SPIE, 2017, 10420.
    [31] YANG Su and ZHU Qing. Video-based Chinese sign language recognition using convolutional neural network[C]. The 9th IEEE International Conference on Communication Software and Networks (ICCSN), Guangzhou, China, 2017: 929–934.
    [32] LIN Chi, WAN Jun, LIANG Yanyan, et al. Large-scale isolated gesture recognition using a refined fused model based on masked Res-C3D network and skeleton LSTM[C]. The 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi’an, China, 2018: 52–58.
    [33] HALIM K and RAKUN E. Sign language system for Bahasa Indonesia (Known as SIBI) recognizer using TensorFlow and Long Short-Term Memory[C]. 2018 International Conference on Advanced Computer Science and Information Systems (ICACSIS), Yogyakarta, Indonesia, 2018: 403–407.
    [34] BHATEJA V, COELLO C A C, and SATAPATHY S C. Intelligent Engineering Informatics[C]. The 6th International Conference on FICTA, Singapore, 2018: 623–632.
    [35] BANTUPALLI K and XIE Ying. American Sign Language recognition using deep learning and computer vision[C]. 2018 IEEE International Conference on Big Data (Big Data), Seattle, USA, 2018: 4896–4899.
    [36] KONSTANTINIDIS D, DIMITROPOULOS K, and DARAS P. A deep learning approach for analyzing video and skeletal features in sign language recognition[C]. 2018 IEEE International Conference on Imaging Systems and Techniques (IST), Krakow, Poland, 2018: 1–6.
    [37] VINCENT H, TOMOYA S, and GENTIANE V. Convolutional and recurrent neural network for human action recognition: Application on American sign language[EB/OL]. http://biorxiv.org/content/10.1101/535492v1, 2019.
    [38] LIAO Yanqiu, XIONG Pengwen, MIN Weidong, et al. Dynamic sign language recognition based on video sequence with BLSTM-3D residual networks[J]. IEEE Access, 2019, 7: 38044–38054. doi: 10.1109/ACCESS.2019.2904749
    [39] CAMGOZ N C, HADFIELD S, KOLLER O, et al. SubUNets: End-to-end hand shape and continuous sign language recognition[C]. 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017: 3075–3084.
    [40] CUI Runpeng, LIU Hu, and ZHANG Changshui. A deep neural framework for continuous sign language recognition by iterative training[J]. IEEE Transactions on Multimedia, 2019, 21(7): 1880–1891. doi: 10.1109/TMM.2018.2889563
    [41] SHI Bowen, DEL RIO A M, KEANE J, et al. American Sign Language fingerspelling recognition in the wild[C]. 2018 IEEE Spoken Language Technology Workshop (SLT), Athens, Greece, 2018: 145–152.
    [42] KO S K, SON J G, and JUNG H. Sign language recognition with recurrent neural network using human keypoint detection[C]. 2018 Conference on Research in Adaptive and Convergent Systems, Honolulu, USA, 2018: 326–328.
    [43] ZHANG Qian, WANG Dong, ZHAO Run, et al. MyoSign: Enabling end-to-end sign language recognition with wearables[C]. The 24th International Conference on Intelligent User Interfaces, Marina del Ray, USA, 2019: 650–660.
    [44] MITTAL A, KUMAR P, ROY P P, et al. A modified LSTM model for continuous sign language recognition using leap motion[J]. IEEE Sensors Journal, 2019, 19(16): 7056–7063. doi: 10.1109/JSEN.2019.2909837
    [45] CAMGOZ N C, HADFIELD S, KOLLER O, et al. Using convolutional 3D neural networks for user-independent continuous gesture recognition[C]. The 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 2016: 49–54.
    [46] PU Junfu, ZHOU Wengang, and LI Houqiang. Dilated convolutional network with iterative optimization for continuous sign language recognition[C]. The 27th International Joint Conference on Artificial Intelligence, Wellington, New Zealand, 2018: 885–891.
    [47] HUANG Jie, ZHOU Wengang, ZHANG Qilin, et al. Video-based sign language recognition without temporal segmentation[C]. The 32nd AAAI Conference on Artificial Intelligence, New Orleans, USA, 2018: 2257–2264.
    [48] WANG Shuo, GUO Dan, ZHOU Wengang, et al. Connectionist temporal fusion for sign language translation[C]. The 26th ACM International Conference on Multimedia, Seoul, Korea, 2018: 1483–1491.
    [49] KOLLER O, ZARGARAN S, NEY H, et al. Deep sign: Hybrid CNN-HMM for continuous sign language recognition[C]. 2016 British Machine Vision Conference, York, UK, 2016: 1–2.
    [50] KOLLER O, ZARGARAN S, and NEY H. Re-sign: Re-aligned end-to-end sequence modelling with deep recurrent CNN-HMMs[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Hawaii, USA, 2017: 4297–4305.
    [51] KOLLER O, ZARGARAN S, NEY H, et al. Deep sign: Enabling robust statistical continuous sign language recognition via hybrid CNN-HMMs[J]. International Journal of Computer Vision, 2018, 126(12): 1311–1325. doi: 10.1007/s11263-018-1121-3
    [52] PIGOU L, VAN HERREWEGHE M, and DAMBRE J. Gesture and sign language recognition with temporal residual networks[C]. 2017 IEEE International Conference on Computer Vision Workshops, Venice, Italy, 2017: 3086–3093.
    [53] CUI Runpeng, LIU Hu, and ZHANG Changshui. Recurrent convolutional neural networks for continuous sign language recognition by staged optimization[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 7361–7369.
    [54] ARIESTA M C, WIRYANA F, SUHARJITO, et al. Sentence level Indonesian sign language recognition using 3D convolutional neural network and bidirectional recurrent neural network[C]. 2018 Indonesian Association for Pattern Recognition International Conference (INAPR), Jakarta, Indonesia, 2018: 16–22.
    [55] GUO Dan, ZHOU Wengang, LI Houqiang, et al. Hierarchical LSTM for sign language translation[C]. The 32nd AAAI Conference on Artificial Intelligence, the 30th Innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence, New Orleans, USA, 2018: 6845–6852.
    [56] FORSTER J, SCHMIDT C, HOYOUX T, et al. RWTH-PHOENIX-Weather: A large vocabulary sign language recognition and translation corpus[C]. The 8th International Conference on Language Resources and Evaluation, Istanbul, Turkey, 2012: 3785–3789.
    [57] ESCALERA S, BARÓ X, GONZÀLEZ J, et al. Chalearn looking at people challenge 2014: Dataset and results[C]. European Conference on Computer Vision, Zurich, Switzerland, 2014: 459–473.
    [58] ONG E J, COOPER H, PUGEAULT N, et al. Sign language recognition using sequential pattern trees[C]. 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, USA, 2012: 2200–2207.
    [59] VON AGRIS U, ZIEREN J, CANZLER U, et al. Recent developments in visual sign language recognition[J]. Universal Access in the Information Society, 2008, 6(4): 323–362. doi: 10.1007/s10209-007-0104-x
    [60] EFTHIMIOU E and FOTINEA S E. GSLC: Creation and annotation of a Greek sign language corpus for HCI[C]. The 4th International Conference on Universal Access in Human-Computer Interaction, Beijing, China, 2007: 657–666.
    [61] NEIDLE C, THANGALI A, and SCLAROFF S. Challenges in development of the American Sign Language lexicon video dataset (ASLLVD) corpus[C]. The 5th Workshop on the Representation and Processing of Sign Languages: Interactions between Corpus and Lexicon, Istanbul, Turkey, 2012: 1–8.
    [62] OSZUST M and WYSOCKI M. Polish sign language words recognition with Kinect[C]. The 6th International Conference on Human System Interactions (HSI), Sopot, Poland, 2013: 219–226.
    [63] RONCHETTI F, QUIROGA F, ESTREBOU C A, et al. LSA64: An Argentinian sign language dataset[C]. The 22nd Congreso Argentino de Ciencias de la Computación (CACIC 2016), San Luis, Argentina, 2016: 794–803.
    [64] CHAI Xiujuan, WANG Hanjie, and CHEN Xilin. The DEVISIGN large vocabulary of Chinese sign language database and baseline evaluations[R]. Technical Report VIPL-TR-14-SLR-001, 2014.
    [65] LU Pengfei and HUENERFAUTH M. Collecting and evaluating the CUNY ASL corpus for research on American sign language animation[J]. Computer Speech & Language, 2014, 28(3): 812–831. doi: 10.1016/j.csl.2013.10.004
    [66] SHOHIEB S M, ELMINIR H K, and RIAD A M. Signsworld atlas; a benchmark Arabic sign language database[J]. Journal of King Saud University-Computer and Information Sciences, 2015, 27(1): 68–76. doi: 10.1016/j.jksuci.2014.03.011
    [67] PUGEAULT N and BOWDEN R. Spelling it out: Real-time ASL fingerspelling recognition[C]. 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), Barcelona, Spain, 2011: 1114–1119.
    [68] PRABHAVALKAR R, SAINATH T N, WU Yonghui, et al. Minimum word error rate training for attention-based sequence-to-sequence models[C]. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 2018: 4839–4843.
    [69] KOLLER O, FORSTER J, and NEY H. Continuous sign language recognition: Towards large vocabulary statistical recognition systems handling multiple signers[J]. Computer Vision and Image Understanding, 2015, 141: 108–125. doi: 10.1016/j.cviu.2015.09.013
Publication history
  • Received: 2019-06-06
  • Revised: 2019-11-20
  • Published online: 2020-01-18
  • Issue date: 2020-06-04
