Deepfake Videos Detection Based on Image Segmentation with Deep Neural Networks

Yongjian HU, Yifei GAO, Beibei LIU, Guangjun LIAO

Citation: Yongjian HU, Yifei GAO, Beibei LIU, Guangjun LIAO. Deepfake Videos Detection Based on Image Segmentation with Deep Neural Networks[J]. Journal of Electronics & Information Technology, 2021, 43(1): 162-170. doi: 10.11999/JEIT200077


doi: 10.11999/JEIT200077
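Funds: The National Key R&D Program (2019QY2202), The International Cooperation Project of Guangzhou Development Zone (201902010028), The Sino-Singapore International Joint Research Institute Project (206-A017023, 206-A018001), The Doctoral Research Start-up Project of the Natural Science Foundation of Guangdong Province (2017A030310320), The Fundamental Research Funds for the Central Universities (2019MS025), The Characteristic Innovation Project of the Department of Education of Guangdong Province (2017KTSCX132)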
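Details
    Author biographies:

    Yongjian HU: male, born in 1962, professor and doctoral supervisor. Research interests: multimedia information security, image processing, artificial intelligence and its applications

    Yifei GAO: male, born in 1996, master's student. Research interests: multimedia information security, image processing and machine learning

    Beibei LIU: female, born in 1980, lecturer. Research interests: multimedia information security, image processing and machine learning

    Guangjun LIAO: male, born in 1981, associate professor. Research interests: multimedia information security, image processing and machine learning

    Corresponding author:

    Yongjian HU eeyjhu@scut.edu.cn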

  • 1) DeepFake Detection Challenge: <https://www.kaggle.com/c/deepfake-detection-challenge>
  • CLC number: TN911.73

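  • Abstract:

    With the rapid development of deep learning, deepfake videos forged with deep neural network models are becoming increasingly realistic, and the threat they pose keeps growing. Several face-swap video detection algorithms based on convolutional neural networks have appeared in the literature. They detect well within a single database, but their performance drops sharply in cross-database detection, which points to insufficient generalization. Starting from the mechanism of face tampering, this paper treats video face swapping as a special splicing-tampering problem. A popular neural segmentation network first predicts the tampered region and outputs a predicted mask probability map, which is then denoised and binarized. On the premise that face swapping occurs mainly in the face region, a new method of computing the face intersection over union (Face-IoU) is proposed. The computation is further refined using prior knowledge of the face-swap process, and the result serves as the classification criterion for tampering detection. The proposed method is implemented on 3 different base segmentation networks and evaluated on the TIMIT, FaceForensics++ and FFW databases. Compared with popular methods of the same kind in the literature, it keeps the high accuracy of within-database detection while significantly lowering the average error rate of cross-database detection. It also performs well on the recently released DFD database, whose synthesis quality is higher, which demonstrates the effectiveness and generality of the proposed method.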
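    The mask post-processing described in the abstract ends with an overlap score between the binarized predicted mask and the face region. This page does not give the paper's Face-IoU formula or say how prior knowledge modifies it, so the sketch below falls back on the standard intersection over union. The names `binarize` and `iou`, the threshold 0.5 and the toy 4×4 data are illustrative assumptions, not taken from the paper.

```python
from typing import List

Mask = List[List[int]]


def binarize(prob_map: List[List[float]], threshold: float) -> Mask:
    """Turn a mask probability map into a 0/1 mask (1 = predicted tampered)."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


def iou(mask_a: Mask, mask_b: Mask) -> float:
    """Standard intersection over union of two equally sized binary masks."""
    inter = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    # Both masks empty: define the score as 0.0 to avoid dividing by zero.
    return inter / union if union else 0.0


# Toy data (assumed): a 4x4 probability map and a face region
# covering the central 2x2 block.
prob_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.8, 0.1],
    [0.1, 0.7, 0.6, 0.6],
    [0.0, 0.1, 0.2, 0.1],
]
face_region = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

predicted = binarize(prob_map, threshold=0.5)
score = iou(predicted, face_region)
print(f"IoU = {score:.2f}")
```

    In this toy case the predicted mask covers the face block plus one extra pixel, so the intersection is 4 pixels and the union is 5.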

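  • Figure 1  Examples of the region to be examined, the actual tampered region and the predicted tampered region, with a generalized schematic

    Figure 2  Example detection results for videos from the FaceForensics++ database

    Figure 3  Example detection heat maps for frames containing both real and fake faces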

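    Table 1  Network training

     Input: training data $X$, validation data $Z$
     Output: trained segmentation network $\mathrm{Model}$, binarization threshold $T_1^*$ and decision threshold $T_2^*$
     (1) Begin
     (2) Initialize the weights of $\mathrm{Model}$
     (3) Feed $X$ into $\mathrm{Model}$ for training and update the weights to obtain the trained model
     (4) Feed $Z$ into $\mathrm{Model}$ and compute $T_1^*$ and $T_2^*$ at the minimum equal error rate
     (5) End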
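    Step (4) of Table 1 picks thresholds at the equal error rate on the validation set. How the paper searches $T_1^*$ and $T_2^*$ together is not stated on this page, so the sketch below only shows the generic idea for a single decision threshold: sweep candidate values and keep the one where the false-alarm rate and the miss rate are closest. The function name `eer_threshold`, the rule "fake when score >= threshold" and the sample scores are assumptions for illustration, and both score lists are assumed non-empty.

```python
from typing import Sequence, Tuple


def eer_threshold(real_scores: Sequence[float],
                  fake_scores: Sequence[float]) -> Tuple[float, float]:
    """Return (threshold, eer) where false-alarm and miss rates are closest.

    Convention assumed here: a sample is called fake when score >= threshold.
    """
    candidates = sorted(set(real_scores) | set(fake_scores))
    best_gap = float("inf")
    best_threshold = candidates[0]
    best_eer = 1.0
    for t in candidates:
        # Real samples wrongly flagged as fake.
        false_alarm = sum(s >= t for s in real_scores) / len(real_scores)
        # Fake samples that slip through as real.
        miss = sum(s < t for s in fake_scores) / len(fake_scores)
        gap = abs(false_alarm - miss)
        if gap < best_gap:
            best_gap = gap
            best_threshold = t
            best_eer = (false_alarm + miss) / 2
    return best_threshold, best_eer


# Made-up validation scores, one list per class.
real = [0.05, 0.10, 0.20, 0.40]
fake = [0.30, 0.60, 0.70, 0.90]
threshold, eer = eer_threshold(real, fake)
print(threshold, eer)
```

    With these made-up scores, one real sample lies at the chosen threshold and one fake sample lies below it, so both error rates are 1/4 at that operating point.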

    Table 2  Sample testing

     Input: test video $V$, trained segmentation network $\mathrm{Model}$, binarization threshold $T_1^*$ and decision threshold $T_2^*$
     Output: detection result $Y$
     (1) Begin
     (2) Split $V$ into frames, then locate and crop the face regions to obtain $I = \{ I_1, I_2, \cdots, I_Q \}$
     (3) For $q = 1$ to $Q$ do:
     (4)  Feed $I_q$ into $\mathrm{Model}$ to obtain the predicted tampered-region mask $M_q$
     (5)  Filter $M_q$ to obtain $\mathrm{MF}_q$
     (6)  Binarize $\mathrm{MF}_q$ with $T_1^*$ to obtain $\mathrm{MB}_q$
     (7)  Compute $\mathrm{Face\text{-}IoUP}_q$ from $\mathrm{MB}_q$
     (8)  Classify $\mathrm{Face\text{-}IoUP}_q$ against $T_2^*$ to obtain the binary decision $y_q$
     (9) End For
     (10) End
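    The per-frame loop of Table 2 can be sketched as below, under several stated assumptions. `model` is a hypothetical callable standing in for the trained segmentation network. The filter is a 3×3 mean filter, one of the filter types compared in Table 3. The Face-IoUP score is not defined on this page, so `score_fn` is a placeholder supplied by the caller. The decision rule "fake when score >= $T_2^*$" is also assumed. Table 2 does not say how the per-frame decisions $y_q$ are combined into $Y$, so the sketch simply returns the per-frame list.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]
Mask = List[List[int]]


def mean_filter_3x3(prob_map: Image) -> Image:
    """Smooth a probability map with a 3x3 mean filter.

    Border pixels average only the neighbours that exist.
    """
    h, w = len(prob_map), len(prob_map[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [prob_map[i][j]
                      for i in range(max(0, r - 1), min(h, r + 2))
                      for j in range(max(0, c - 1), min(w, c + 2))]
            out[r][c] = sum(window) / len(window)
    return out


def detect_frames(faces: Sequence[Image],
                  model: Callable[[Image], Image],
                  score_fn: Callable[[Mask], float],
                  t1: float,
                  t2: float) -> List[int]:
    """Run steps (3) to (9) of Table 2 and return one 0/1 decision per frame."""
    decisions = []
    for face in faces:                               # (3) cropped faces I_q
        mask = model(face)                           # (4) predicted mask M_q
        filtered = mean_filter_3x3(mask)             # (5) MF_q
        binary = [[1 if p >= t1 else 0 for p in row]
                  for row in filtered]               # (6) MB_q
        score = score_fn(binary)                     # (7) stand-in for Face-IoUP_q
        decisions.append(1 if score >= t2 else 0)    # (8) y_q, 1 = fake (assumed)
    return decisions


# Stand-ins so the sketch runs: an identity "model" and an area-fraction score.
def identity_model(face: Image) -> Image:
    return face


def tampered_fraction(binary: Mask) -> float:
    total = sum(len(row) for row in binary)
    return sum(map(sum, binary)) / total


frames = [
    [[0.9, 0.9, 0.9], [0.9, 0.9, 0.9], [0.9, 0.9, 0.9]],  # strongly flagged
    [[0.1, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.1]],  # clean
]
decisions = detect_frames(frames, identity_model, tampered_fraction,
                          t1=0.5, t2=0.5)
print(decisions)
```

    Passing the score as a function keeps the loop independent of the exact Face-IoUP definition, so a faithful implementation of the paper's score could be dropped in without changing the rest.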

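    Table 3  Average error rate (%) of the detection model under different filters ($p$ = 1)

    Network | Filter | Kernel size | Train TIMIT, test TIMIT (within-database) | Train TIMIT, test FaceForensics++ (cross-database) | Train TIMIT, test FFW (cross-database) | Train FaceForensics++, test FaceForensics++ (within-database) | Train FaceForensics++, test TIMIT (cross-database)
    FCN-8s | - | - | 2.5 | 23.2 | 24.4 | 2.2 | 24.1
    FCN-8s | Mean | 3×3 | 2.6 | 23.4 | 23.0 | 2.1 | 24.8
    FCN-8s | Mean | 5×5 | 2.5 | 23.2 | 22.9 | 2.3 | 25.2
    FCN-8s | Median | 3×3 | 2.6 | 23.1 | 22.9 | 2.1 | 25.3
    FCN-8s | Median | 5×5 | 2.6 | 22.9 | 23.4 | 2.2 | 24.0
    FCN-8s | Gaussian | 3×3 | 2.4 | 22.7 | 22.9 | 1.8 | 22.6
    FCN-8s | Gaussian | 5×5 | 2.5 | 23.2 | 22.9 | 1.9 | 24.8
    FCN-32s | - | - | 5.8 | 27.2 | 20.8 | 1.9 | 29.2
    FCN-32s | Mean | 3×3 | 5.8 | 26.0 | 20.1 | 1.9 | 29.6
    FCN-32s | Mean | 5×5 | 5.2 | 26.3 | 20.4 | 1.9 | 29.7
    FCN-32s | Median | 3×3 | 5.9 | 27.4 | 20.7 | 1.9 | 30.4
    FCN-32s | Median | 5×5 | 5.6 | 27.0 | 20.3 | 1.8 | 29.7
    FCN-32s | Gaussian | 3×3 | 5.7 | 26.8 | 20.5 | 1.8 | 27.5
    FCN-32s | Gaussian | 5×5 | 6.0 | 27.0 | 20.7 | 1.7 | 30.4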

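    Table 4  Average error rate (%) of the detection model under different penalty factors

    Network | Penalty factor | Train TIMIT, test TIMIT (within-database) | Train TIMIT, test FaceForensics++ (cross-database) | Train TIMIT, test FFW (cross-database) | Train FaceForensics++, test FaceForensics++ (within-database) | Train FaceForensics++, test TIMIT (cross-database)
    FCN-8s | 0 | 2.5 | 23.6 | 23.3 | 1.9 | 24.3
    FCN-8s | 0.5 | 2.5 | 23.1 | 23.5 | 1.8 | 24.5
    FCN-8s | 1.0 | 2.4 | 22.7 | 22.9 | 1.8 | 22.6
    FCN-8s | 1.5 | 2.5 | 22.7 | 23.1 | 1.9 | 23.7
    FCN-32s | 0 | 6.0 | 27.2 | 20.5 | 2.2 | 29.8
    FCN-32s | 0.5 | 5.8 | 27.0 | 20.8 | 1.9 | 30.6
    FCN-32s | 1.0 | 5.7 | 26.8 | 20.5 | 1.8 | 27.5
    FCN-32s | 1.5 | 5.9 | 27.2 | 20.6 | 1.8 | 29.6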

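    Table 5  Test results (%) of models trained on the TIMIT database

    Network | TIMIT (within-database) equal error rate | TIMIT (within-database) average error rate | TIMIT (within-database) accuracy | FaceForensics++ (cross-database) average error rate | FFW (cross-database) average error rate
    MesoInception-4[10] | 11.2 | 14.4 | 86.1 | 37.7 | 40.1
    ShallowNetV1[11] | 1.4 | 4.3 | 95.8 | 38.2 | 42.3
    MISLnet[12] | 5.4 | 5.2 | 94.8 | 30.3 | 41.0
    ResNet-50[8,14] | 0.8 | 2.5 | 97.6 | 44.9 | 45.7
    Xception[9] | 1.6 | 2.4 | 97.8 | 35.4 | 35.7
    FCN-8s (proposed) | 4.0 | 2.4 | 97.7 | 22.7 | 22.9
    FCN-32s (proposed) | 6.2 | 5.7 | 94.4 | 26.8 | 20.5
    DeepLabv3 (proposed) | 1.1 | 3.7 | 96.4 | 30.0 | 25.0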

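    Table 6  Test results (%) of models trained on the FaceForensics++ database

    Network | FaceForensics++ (within-database) equal error rate | FaceForensics++ (within-database) average error rate | FaceForensics++ (within-database) accuracy | TIMIT (cross-database) average error rate
    MesoInception-4[10] | 4.6 | 5.6 | 94.4 | 28.2
    ShallowNetV1[11] | 0.8 | 2.1 | 96.4 | 35.1
    MISLnet[12] | 3.0 | 3.5 | 96.4 | 19.3
    ResNet-50[8,14] | 2.8 | 3.5 | 96.4 | 38.3
    FCN-8s (proposed) | 2.1 | 1.8 | 98.2 | 22.6
    FCN-32s (proposed) | 1.0 | 1.8 | 98.3 | 27.5
    DeepLabv3 (proposed) | 0.8 | 1.0 | 99.0 | 22.5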

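    Table 7  Average error rate (%) of models trained on the DFD (C23) database

    Network | DFD (C23) (within-database) | TIMIT (cross-database) | FaceForensics++ (C0) (cross-database) | FaceForensics++ (C23) (cross-database) | FFW (cross-database)
    FCN-8s (proposed) | 1.7 | 15.9 | 14.2 | 16.9 | 21.5
    FCN-32s (proposed) | 1.9 | 17.9 | 7.9 | 11.4 | 20.2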

    Table 8  Comparison of algorithm complexity and running time

    Network | FLOPs (M) | Time (s per 100 videos)
    MesoInception-4[10] | 0.5 | 2402.6
    ShallowNetV1[11] | 65.1 | 2535.1
    MISLnet[12] | 31.5 | 2351.8
    ResNet-50[8,14] | 47.3 | 2680.5
    Xception[9] | 41.9 | 2564.3
    FCN-8s (proposed) | 268.5 | 3780.0
    FCN-32s (proposed) | 268.5 | 3782.9
    DeepLabv3 (proposed) | 37.7 | 2510.1
  • [1] RÖSSLER A, COZZOLINO D, VERDOLIVA L, et al. FaceForensics++: Learning to detect manipulated facial images[C]. 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 2019: 1–11. doi: 10.1109/iccv.2019.00009.
    [2] KORSHUNOV P and MARCEL S. DeepFakes: A new threat to face recognition? Assessment and detection[EB/OL]. https://arxiv.org/abs/1812.08685, 2018.
    [3] KHODABAKHSH A, RAMACHANDRA R, RAJA K, et al. Fake face detection methods: Can they be generalized?[C]. 2018 International Conference of the Biometrics Special Interest Group, Darmstadt, Germany, 2018: 1–6. doi: 10.23919/BIOSIG.2018.8553251.
    [4] YANG Xin, LI Yuezun, and LÜ Siwei. Exposing deep fakes using inconsistent head poses[C]. ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing, Brighton, England, 2019: 8261–8265. doi: 10.1109/icassp.2019.8683164.
    [5] MATERN F, RIESS C, and STAMMINGER M. Exploiting visual artifacts to expose deepfakes and face manipulations[C]. 2019 IEEE Winter Applications of Computer Vision Workshops, Waikoloa Village, USA, 2019: 83–92. doi: 10.1109/WACVW.2019.00020.
    [6] KORSHUNOV P and MARCEL S. Speaker inconsistency detection in tampered video[C]. The 26th European Signal Processing Conference, Rome, Italy, 2018: 2375–2379. doi: 10.23919/eusipco.2018.8553270.
    [7] AGARWAL S, FARID H, GU Yuming, et al. Protecting world leaders against deep fakes[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, California, USA, 2019: 38–45.
    [8] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770–778. doi: 10.1109/CVPR.2016.90.
    [9] CHOLLET F. Xception: Deep learning with depthwise separable convolutions[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 1251–1258. doi: 10.1109/CVPR.2017.195.
    [10] AFCHAR D, NOZICK V, YAMAGISHI J, et al. MesoNet: A compact facial video forgery detection network[C]. 2018 IEEE International Workshop on Information Forensics and Security, Hong Kong, China, 2018: 1–7. doi: 10.1109/WIFS.2018.8630761.
    [11] TARIQ S, LEE S, KIM H, et al. Detecting both machine and human created fake face images in the wild[C]. The 2nd International Workshop on Multimedia Privacy and Security, Toronto, Canada, 2018: 81–87. doi: 10.1145/3267357.3267367.
    [12] BAYAR B and STAMM M C. Constrained convolutional neural networks: A new approach towards general purpose image manipulation detection[J]. IEEE Transactions on Information Forensics and Security, 2018, 13(11): 2691–2706. doi: 10.1109/TIFS.2018.2825953.
    [13] GÜERA D and DELP E J. Deepfake video detection using recurrent neural networks[C]. The 15th IEEE International Conference on Advanced Video and Signal Based Surveillance, Auckland, New Zealand, 2018: 1–6. doi: 10.1109/AVSS.2018.8639163.
    [14] WANG Shengyu, WANG O, ZHANG R, et al. CNN-generated images are surprisingly easy to spot... for now[EB/OL]. https://arxiv.org/abs/1912.11035, 2019.
    [15] SHELHAMER E, LONG J, and DARRELL T. Fully convolutional networks for semantic segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4): 640–651. doi: 10.1109/TPAMI.2016.2572683.
    [16] CHEN L C, PAPANDREOU G, SCHROFF F, et al. Rethinking atrous convolution for semantic image segmentation[EB/OL]. https://arxiv.org/abs/1706.05587, 2017.
    [17] BI Xiuli, WEI Yang, XIAO Bin, et al. Image forgery detection algorithm based on cascaded convolutional neural network[J]. Journal of Electronics & Information Technology, 2019, 41(12): 2987–2994. doi: 10.11999/JEIT190043.
    [18] LI Haodong, LI Bin, TAN Shunquan, et al. Detection of deep network generated images using disparities in color components[EB/OL]. https://arxiv.org/abs/1808.07276, 2018.
    [19] NATARAJ L, MOHAMMED T M, MANJUNATH B S, et al. Detecting GAN generated fake images using co-occurrence matrices[J]. Electronic Imaging, 2019(5): 532-1–532-7. doi: 10.2352/ISSN.2470-1173.2019.5.MWSF-532.
    [20] YANG Hongyu and WANG Fengyan. Meteorological radar noise image semantic segmentation method based on deep convolutional neural network[J]. Journal of Electronics & Information Technology, 2019, 41(10): 2373–2381. doi: 10.11999/JEIT190098.
    [21] GAO Yifei, HU Yongjian, YU Zeqiong, et al. Evaluation and comparison of five popular fake face detection networks[J]. Journal of Applied Sciences, 2019, 37(5): 590–608. doi: 10.3969/j.issn.0255-8297.2019.05.002.
Publication history
  • Received: 2020-01-17
  • Revised: 2020-07-10
  • Published online: 2020-07-22
  • Issue date: 2021-01-15
