FastProtector: An Efficient Federated Learning Method Supporting Gradient Privacy Protection

LIN Li, ZHANG Xiaoying, SHEN Wei, WANG Wanxiang

Citation: LIN Li, ZHANG Xiaoying, SHEN Wei, WANG Wanxiang. FastProtector: An Efficient Federated Learning Method Supporting Gradient Privacy Protection[J]. Journal of Electronics & Information Technology, 2023, 45(4): 1356-1365. doi: 10.11999/JEIT220161

doi: 10.11999/JEIT220161
Funds: The National Natural Science Foundation of China (61502017); The Scientific Research Common Program of Beijing Municipal Commission of Education (KM201710005024)
Article information
    Author biographies:

    LIN Li: female, Ph.D., associate professor; her research interests include cloud and edge computing security, privacy protection, and AI security

    ZHANG Xiaoying: male, M.S. candidate; his research interests include privacy protection and federated learning

    SHEN Wei: female, M.S. candidate; her research interests include federated learning security and applications

    WANG Wanxiang: male, M.S. candidate; his research interest is federated learning security

    Corresponding author:

    LIN Li, linli_2009@bjut.edu.cn

  • CLC number: TN918; TP181

  • Abstract: Federated learning is vulnerable to participant privacy leakage through shared gradients. Existing gradient-protection schemes based on homomorphic encryption incur high time overhead and still risk gradient exposure when participants collude with the aggregation server. To address this, this paper proposes FastProtector, a new federated learning method. While protecting participants' gradients with homomorphic encryption, it introduces the idea of Sign Stochastic Gradient Descent (SignSGD): since aggregation decided by the majority of positive and negative gradient signs still allows the model to converge, gradients are quantized and the gradient-update mechanism is improved, reducing the cost of gradient encryption. An additive secret-sharing scheme is also given to protect gradient ciphertexts against collusion attacks between a malicious aggregation server and participants. Experiments on the MNIST and CIFAR-10 datasets show that the proposed method reduces total encryption/decryption time by about 80% while maintaining high model accuracy.
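The sign-quantization step described in the abstract replaces each gradient entry by one of two constants according to its sign. A minimal sketch (the helper name `quantize_gradient` is hypothetical; `pg` and `ng` are the quantization constants from the paper's notation):

```python
import numpy as np

def quantize_gradient(grad, pg=1.0, ng=-1.0):
    # Keep only the sign information of the gradient (SignSGD idea):
    # positive entries become pg, all other entries become ng.
    return np.where(grad > 0, pg, ng)

g = np.array([0.37, -0.02, 1.4, -0.9])
quantize_gradient(g)  # [pg, ng, pg, ng]
```

Because only two distinct values remain, each participant needs to encrypt only the shares of `pg` and `ng` rather than every gradient entry, which is where the encryption savings come from.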
  • Figure 1. Federated learning scenario

    Figure 2. Architecture of the FastProtector method

    Figure 3. Workflow of FastProtector

    Figure 4. Neural network structures for the training datasets

    Figure 5. Model accuracy of different methods on the MNIST dataset

    Figure 6. Model accuracy of different methods on the CIFAR-10 dataset

    Figure 7. Encryption overhead of different methods on the MNIST dataset

    Figure 8. Encryption overhead of different methods on the CIFAR-10 dataset

    Figure 9. Encryption overhead for different numbers of gradients

    Figure 10. Encryption overhead for different key lengths

    Figure 11. Decryption overhead of different methods on the MNIST dataset

    Figure 12. Decryption overhead of different methods on the CIFAR-10 dataset

    Figure 13. Per-round training time of different methods on the MNIST dataset

    Table 1. Notation

    Symbol           Type     Meaning
    n                scalar   number of participants
    m                scalar   number of aggregation servers
    M                tensor   training model
    α                scalar   learning rate
    epoch            scalar   number of training rounds
    pk               scalar   public key
    sk               scalar   private key
    D^i              tensor   dataset of participant i
    D^i_sub          tensor   subset of participant i's dataset
    G^i              vector   gradient of participant i, after SignSGD processing
    G^i_j            vector   j-th gradient share of participant i
    [[G^i_j]]_pk     list     ciphertext of the j-th gradient share of participant i
    [[G^j_agg]]_pk   list     summed shares held by aggregation server j
    [[G_agg]]_pk     list     aggregated gradient ciphertext sent down by the designated aggregation server
    pg               scalar   quantized value for positive gradients
    ng               scalar   quantized value for negative gradients
     Algorithm 1: Model training and SignSGD-based gradient share encryption
     Input: dataset D^i, model M, number of rounds epoch, positive quantization value pg, negative quantization value ng, public key pk
     Output: gradient share ciphertexts [[G^i_1]]_pk, [[G^i_2]]_pk, ..., [[G^i_m]]_pk
     (1) for ep = 1 to epoch
     (2)   D^i_sub = GetSubset(D^i)  /* draw a random subset of D^i */
     (3)   F^i_loss = (1/b) Σ_{(x_l, y_l) ∈ D^i_sub} f(x_l, M, y_l)  /* compute the loss, where b = |D^i_sub| */
     (4)   G^i = ∂F^i_loss / ∂M  /* compute the gradient */
     (5)   G^i = Replace(G^i)  /* replace positive entries of G^i with pg, negative entries with ng */
     (6)   pg_share = GetAdditiveShares(pg)  /* split pg into m additive shares */
     (7)   ng_share = GetAdditiveShares(ng)  /* split ng into m additive shares */
     (8)   [[pg_share]]_pk = Encrypt(pk, pg_share)  /* encrypt the m shares of pg */
     (9)   [[ng_share]]_pk = Encrypt(pk, ng_share)  /* encrypt the m shares of ng */
     (10)  G^i_1 = G^i, G^i_2 = G^i, ..., G^i_m = G^i  /* make m identical copies of G^i */
     (11)  G^i_1' = G^i_1.tolist(), G^i_2' = G^i_2.tolist(), ..., G^i_m' = G^i_m.tolist()  /* convert the copies to lists */
     (12)  datanum = len(G^i)  /* number of elements in G^i */
     (13)  for num = 0 to datanum − 1  /* place the share ciphertexts according to the sign of each entry of G^i */
     (14)    if G^i[num] > 0
     (15)      G^i_1'[num] = [[pg_share]]_pk[0], G^i_2'[num] = [[pg_share]]_pk[1], ..., G^i_m'[num] = [[pg_share]]_pk[m−1]
     (16)    else
     (17)      G^i_1'[num] = [[ng_share]]_pk[0], G^i_2'[num] = [[ng_share]]_pk[1], ..., G^i_m'[num] = [[ng_share]]_pk[m−1]
     (18)    end if
     (19)  end for
     (20)  [[G^i_1]]_pk = G^i_1', [[G^i_2]]_pk = G^i_2', ..., [[G^i_m]]_pk = G^i_m'
     (21)  Return [[G^i_1]]_pk, [[G^i_2]]_pk, ..., [[G^i_m]]_pk
     (22) end for
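Steps (6)–(7) rely on additive secret sharing: a value is split into m shares that are individually random but sum back to the secret, so no single aggregation server learns anything from its share. A minimal sketch of how `GetAdditiveShares` might work (the function name follows the pseudocode; the modulus and share range are assumptions, not taken from the paper):

```python
import random

MOD = 2**31  # assumed share modulus

def get_additive_shares(value, m):
    # Split value into m shares: m-1 uniformly random values,
    # plus one correction share so that the sum is value mod MOD.
    shares = [random.randrange(MOD) for _ in range(m - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

def reconstruct(shares):
    # Adding all m shares recovers the secret; any m-1 of them
    # are statistically independent of it.
    return sum(shares) % MOD

shares = get_additive_shares(7, 3)
reconstruct(shares)  # 7
```

In the protocol the shares are additionally encrypted under pk before being distributed, so a server colluding with participants still only sees ciphertexts of random-looking shares.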
     Algorithm 2: Decryption and update
     Input: aggregated gradient ciphertext [[G_agg]]_pk, public key pk, private key sk, model M, learning rate α, number of rounds epoch
     Output: updated model M
     (1) for ep = 1 to epoch
     (2)   G'_agg = Decrypt(pk, sk, [[G_agg]]_pk)  /* decrypt the aggregated gradient ciphertext */
     (3)   G_agg = torch.tensor(G'_agg)  /* convert G'_agg to a tensor */
     (4)   M = M − α · G_agg / n  /* update the model */
     (5)   Return M
     (6) end for
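Step (4) of Algorithm 2 is a plain averaged gradient descent step: the decrypted aggregate is divided by the number of participants n and scaled by the learning rate. A sketch using plain Python lists to stay self-contained (variable names follow Table 1; `update_model` is a hypothetical helper):

```python
def update_model(M, G_agg, alpha, n):
    # M = M - alpha * G_agg / n : subtract the aggregated gradient,
    # averaged over the n participants and scaled by the learning rate.
    return [w - alpha * g / n for w, g in zip(M, G_agg)]

M = [1.0, 2.0]
update_model(M, [10.0, -10.0], alpha=0.1, n=2)  # [0.5, 2.5]
```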

    Table 2. Final model accuracy on the MNIST and CIFAR-10 datasets (%)

    Method            MNIST    CIFAR-10
    Original-FL       99       84
    Paillier          99       84
    FastProtector     99       84
  • [1] JEON J, KIM J, KIM J, et al. Privacy-preserving deep learning computation for geo-distributed medical big-data platforms[C]. 2019 49th IEEE/IFIP International Conference on Dependable Systems and Networks – Supplemental Volume, Portland, USA, 2019: 3–4.
    [2] LIU Yang, MA Zhuo, LIU Ximeng, et al. Privacy-preserving object detection for medical images with faster R-CNN[J]. IEEE Transactions on Information Forensics and Security, 2022, 17: 69–84. doi: 10.1109/TIFS.2019.2946476
    [3] VIZITIU A, NIŢĂ C I, PUIU A, et al. Towards privacy-preserving deep learning based medical imaging applications[C]. 2019 IEEE International Symposium on Medical Measurements and Applications, Istanbul, Turkey, 2019: 1–6.
    [4] Intersoft Consulting. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data and repealing Directive 95/46/EC (General Data Protection Regulation)[EB/OL]. https://gdpr-info.eu, 2020.
    [5] DLA Piper. Data protection laws of the world: Full handbook[EB/OL]. https://www.dlapiperdataprotection.com, 2021.
    [6] Cybersecurity law of the People's Republic of China[EB/OL]. http://www.zgyq.gov.cn/zwzxrdzt/xfzl/202208/t20220819_76128304.html, 2022.
    [7] Data security law of the People's Republic of China[EB/OL]. http://www.npc.gov.cn/npc/c30834/202106/7c9af12f51334a73b56d7938f99a788a.shtml, 2021.
    [8] Personal information protection law of the People's Republic of China[EB/OL]. http://www.npc.gov.cn/npc/c30834/202108/a8c4e3672c74491a80b53a172bb753fe.shtml, 2021.
    [9] MCMAHAN H B, MOORE E, RAMAGE D, et al. Federated learning of deep networks using model averaging[EB/OL]. https://arxiv.org/abs/1602.05629v1, 2016.
    [10] ZHU Ligeng, LIU Zhijian, and HAN Song. Deep leakage from gradients[EB/OL]. https://arxiv.org/abs/1906.08935, 2019.
    [11] MA Chuan, LI Jun, DING Ming, et al. On safeguarding privacy and security in the framework of federated learning[EB/OL]. https://arxiv.org/abs/1909.06512, 2019.
    [12] ZHOU Chunyi, FU Anmin, YU Shui, et al. Privacy-preserving federated learning in fog computing[J]. IEEE Internet of Things Journal, 2020, 7(11): 10782–10793. doi: 10.1109/JIOT.2020.2987958
    [13] PHONG L T, AONO Y, HAYASHI T, et al. Privacy-preserving deep learning via additively homomorphic encryption[J]. IEEE Transactions on Information Forensics and Security, 2018, 13(5): 1333–1345. doi: 10.1109/TIFS.2017.2787987
    [14] ZHANG Xianglong, FU Anmin, WANG Huaqun, et al. A privacy-preserving and verifiable federated learning scheme[C]. 2020 IEEE International Conference on Communications, Dublin, Ireland, 2020: 1–6.
    [15] LOHANA A, RUPANI A, RAI S, et al. Efficient privacy-aware federated learning by elimination of downstream redundancy[J]. IEEE Design & Test, 2022, 39(3): 73–81. doi: 10.1109/MDAT.2021.3063373
    [16] MENG Dan, LI Hongyu, ZHU Fan, et al. FedMONN: Meta operation neural network for secure federated aggregation[C]. 2020 IEEE 22nd International Conference on High Performance Computing and Communications; IEEE 18th International Conference on Smart City; IEEE 6th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), Yanuca Island, Fiji, 2020: 579–584.
    [17] DONG Ye, HOU Wei, CHEN Xiaojun, et al. Efficient and secure federated learning based on secret sharing and gradients selection[J]. Journal of Computer Research and Development, 2020, 57(10): 2241–2250. doi: 10.7544/issn1000-1239.2020.20200463
    [18] FANG Minghong, CAO Xiaoyu, JIA Jinyuan, et al. Local model poisoning attacks to Byzantine-robust federated learning[EB/OL]. https://arxiv.org/abs/1911.11815, 2021.
    [19] XIA Jiajun, LU Ying, ZHANG Ziyang, et al. Research on vertical federated learning based on secret sharing and homomorphic encryption[J]. Information and Communications Technology and Policy, 2021, 47(6): 19–26. doi: 10.12267/j.issn.2096-5931.2021.06.003
    [20] HAO Meng, LI Hongwei, XU Guowen, et al. Towards efficient and privacy-preserving federated deep learning[C]. 2019 IEEE International Conference on Communications, Shanghai, China, 2019: 1–6.
    [21] XIANG Liyao, YANG Jingbo, and LI Baochun. Differentially-private deep learning from an optimization perspective[C]. IEEE INFOCOM 2019 – IEEE Conference on Computer Communications, Paris, France, 2019: 559–567.
    [22] BERNSTEIN J, ZHAO J W, AZIZZADENESHELI K, et al. SignSGD with majority vote is communication efficient and fault tolerant[C]. The 7th International Conference on Learning Representations, New Orleans, USA, 2019: 1–20.
Publication history
  • Received: 2022-02-22
  • Revised: 2022-11-16
  • Available online: 2022-11-21
  • Published: 2023-04-10
