隐私保护机器学习的密码学方法

蒋瀚; 刘怡然; 宋祥福; 王皓; 郑志华; 徐秋亮

doi:10.11999/JEIT190887

隐私保护机器学习的密码学方法

doi: 10.11999/JEIT190887 cstr: 32379.14.JEIT190887

1.
山东大学软件学院济南 250101
2.
山东师范大学信息科学与工程学院济南 250358

基金项目: 国家自然科学基金(61632020, 61572294)；山东省自然科学基金(ZR2017MF021)；山东省科技重大创新工程项目(2018CXGC0702)；山东半岛国家自主创新示范区发展建设项目(S190101010001)

详细信息

作者简介:
蒋瀚：男，1974年生，讲师，研究方向为密码学与信息安全

刘怡然：女，1996年生，博士生，研究方向为密码学与信息安全

宋祥福：男，1992年生，博士生，研究方向为密码学与信息安全

王皓：男，1984年生，副教授，研究方向为密码学与信息安全

郑志华：女，1962年生，副教授，研究方向为密码学与信息安全

徐秋亮：男，1960年生，教授，研究方向为密码学与信息安全

通讯作者:
徐秋亮　xql@sdu.edu.cn

中图分类号: TN918; TP309
计量
- 文章访问数: 6510
- HTML全文浏览量: 4427
- PDF下载量: 545
- 被引次数: 0
出版历程
- 收稿日期: 2019-11-06
- 修回日期: 2020-03-08
- 网络出版日期: 2020-04-03
- 刊出日期: 2020-06-04

Cryptographic Approaches for Privacy-Preserving Machine Learning

1.
School of Software, Shandong University, Jinan 250101, China
2.
School of Information Science and Technology, Shandong Normal University, Jinan 250358, China

Funds: The National Natural Science Foundation of China (61632020, 61572294); The Natural Science Foundation of Shandong Province (ZR2017MF021); The Major Innovation Project of Science and Technology of Shandong Province (2018CXGC0702); The Funds Project of National Independent Innovation Demonstration Zone in Shandong Peninsula (S190101010001)

摘要

摘要: 新一代人工智能技术的特征，表现为借助GPU计算、云计算等高性能分布式计算能力，使用以深度学习算法为代表的机器学习算法，在大数据上进行学习训练，来模拟、延伸和扩展人的智能。不同数据来源、不同的计算物理位置，使得目前的机器学习面临严重的隐私泄露问题，因此隐私保护机器学习(PPM)成为目前广受关注的研究领域。采用密码学工具来解决机器学习中的隐私问题，是隐私保护机器学习重要的技术。该文介绍隐私保护机器学习中常用的密码学工具，包括通用安全多方计算(SMPC)、隐私保护集合运算、同态加密(HE)等，以及应用它们来解决机器学习中数据整理、模型训练、模型测试、数据预测等各个阶段中存在的隐私保护问题的研究方法与研究现状。
- 隐私保护机器学习 /
- 安全多方计算 /
- 同态加密 /
- 隐私保护集合求交
Abstract: The characteristics of the new generation of artificial intelligence technology are shown as follows: with the help of GPU computing, cloud computing and other high-performance distributed computing capabilities, machine learning algorithms represented by deep learning algorithms are used for learning and training on big data to simulate, extend and expand human intelligence. Different data sources and computing physical locations make the current machine learning face serious privacy leakage problem, so the Privacy Protection of Machine (PPM) Learning has become a widely concerned research area. Using cryptography technology to solve the problem of privacy in machine learning is an important technology to protect the privacy of machine learning. Cryptographic tools used in privacy-preserving machine learning are introduced, such as general Secure Multi-Party Computing (SMPC), privacy protection set operation and Homomorphic Encryption (HE), describes the status and developments applying the tools to solving the problems of privacy protection in various stages of machine learning, such as data processing, model training, model testing, and data prediction.
- Privacy-Preserving Machine (PPM) learning /
- Secure MultiParty Computation(SMPC) /
- Homomorphic Encryption(HE) /
- Private Set Intersection(PSI)

HTML全文

图 1 机器学习的一般过程

下载: 全尺寸图片幻灯片

图 2 Sigmoid函数与分段激活函数图像

下载: 全尺寸图片幻灯片

图 3 联邦学习

下载: 全尺寸图片幻灯片

图 4 安全数据聚合协议

下载: 全尺寸图片幻灯片

表 1 MiniONN效率实验结果

数据集	MNIST	CIFAR10
精确度(%)	99.52	91.5
运行时间(s)	320	11686
数据传输(MB)	336.7	1803
#p/h	163840	2524

下载: 导出CSV

表 2 文献[42]效率分析

	通信	计算	存储
用户端	$O\left( {\left( {\lambda + \mu } \right)m + nr} \right)$	$O\left( {m{ {\left( {\lg\left( m \right)} \right)}^2} + \left( {l + 1} \right) \cdot c\left( {r,n} \right)} \right)$	$O\left( {4k\lambda + \mu \left( {m + 3\left\lceil {\dfrac{l}{2} } \right\rceil } \right) + nr} \right)$
服务器端	$O\left( { {m^2}\mu + nmr + \dfrac{ {ml} }{2} } \right)$	$O\left( {{m^2} + \left( {m - \zeta } \right) \cdot l \cdot c\left( {r,n} \right)} \right)$	$O\left( {mnr + {m^2}\mu + \dfrac{ {ml\mu } }{2} } \right)ght)$

下载: 导出CSV

参考文献(43)

YAO A C. Protocols for secure computations[C]. The 23rd Annual Symposium on Foundations of Computer Science, Chicago, USA, 1982: 160–164.

YAO A C C. How to generate and exchange secrets[C]. The 27th Annual Symposium on Foundations of Computer Science, Toronto, Canada, 1986: 162–167.

GOLDREICH O, MICALI S, and WIGDERSON A. How to play ANY mental game[C]. The 19th Annual ACM Symposium on Theory of Computing, New York, USA, 1987: 218–229.

BEN-OR M, GOLDWASSER S, and WIGDERSON A. Completeness theorems for non-cryptographic fault-tolerant distributed computation[C]. The 20th Annual ACM Symposium on Theory of Computing, Chicago, USA, 1988: 1–10.

蒋瀚, 徐秋亮. 实用安全多方计算协议关键技术研究进展[J]. 计算机研究与发展, 2015, 52(10): 2247–2257. doi: 10.7544/issn1000-1239.2015.20150763

JIANG Han and XU Qiuliang. Advances in key techniques of practical secure multi-party computation[J]. Journal of Computer Research and Development, 2015, 52(10): 2247–2257. doi: 10.7544/issn1000-1239.2015.20150763

BEAVER D. Efficient multiparty protocols using circuit randomization[C]. Annual International Cryptology Conference, Santa Barbara, USA, 1992: 420–432.

BENDLIN R, DAMGÅRD I, ORLANDI C, et al. Semi-homomorphic encryption and multiparty computation[C]. The 30th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Tallinn, Estonia, 2011: 169–188.

DAMGÅRD I, PASTRO V, SMART N, et al. Multiparty computation from somewhat homomorphic encryption[C]. The 32nd Annual International Cryptology Conference, Santa Barbara, USA, 2012: 643–662.

PAILLIER P. Public-key cryptosystems based on composite degree residuosity classes[C]. International Conference on the Theory and Applications of Cryptographic Techniques, Prague, Czech Republic, 1999: 223–238.

GENTRY C. A fully homomorphic encryption scheme[D]. [Ph.D. dissertation], Stanford University, 2009.

VAN DIJK M, GENTRY C, HALEVI S, et al. Fully homomorphic encryption over the integers[C]. The 29th Annual International Conference on the Theory and Applications of Cryptographic Techniques, French Riviera, France, 2010: 24–43.

BRAKERSKI Z, GENTRY C, and VAIKUNTANATHAN V. (Leveled) fully homomorphic encryption without bootstrapping[J]. ACM Transactions on Computation Theory, 2014, 6(3): No.13.

DUCAS L and MICCIANCIO D. FHEW: Bootstrapping homomorphic encryption in less than a second[C]. The 34th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Sofia, Bulgaria, 2015: 617–640.

孙茂华, 胡磊, 朱洪亮, 等. 布尔电路上保护隐私集合并集运算的研究与实现[J]. 电子与信息学报, 2016, 38(6): 1412–1418. doi: 10.11999/JEIT150911

SUN Maohua, HU Lei, ZHU Hongliang, et al. Research and implementation of privacy preserving set union in Boolean circuits[J]. Journal of Electronics &Information Technology, 2016, 38(6): 1412–1418. doi: 10.11999/JEIT150911

KOLESNIKOV V, KUMARESAN R, ROSULEK M, et al. Efficient batched oblivious PRF with applications to private set intersection[C]. 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 2016: 818–829.

KOLESNIKOV V, ROSULEK M, TRIEU N, et al. Scalable private set union from symmetric-key techniques[EB/OL]. https://eprint.iacr.org/2019/776, 2019.

BARNI M, ORLANDI C, and PIVA A. A privacy-preserving protocol for neural-network-based computation[C]. The 8th Workshop on Multimedia and Security, Geneva, Switzerland, 2006: 146–151.

ORLANDI C, PIVA A, and BARNI M. Oblivious neural network computing via homomorphic encryption[J]. EURASIP Journal on Information Security, 2007, 2007(1): 037343. doi: 10.1186/1687-417X-2007-037343

GRAEPEL T, LAUTER K, and NAEHRIG M. ML confidential: Machine learning on encrypted data[C]. The 15th International Conference on Information Security and Cryptology, Seoul, Korea, 2012: 1–21.

ZHANG Qingchen, YANG L T, and CHEN Zhikui. Privacy preserving deep computation model on cloud for big data feature learning[J]. IEEE Transactions on Computers, 2016, 65(5): 1351–1362. doi: 10.1109/TC.2015.2470255

HESAMIFARD E, TAKABI H, GHASEMI M, et al. Privacy-preserving machine learning in cloud[C]. 2017 ACM Cloud Computing Security Workshop, Dallas, USA, 2017: 39–43.

ROUHANI B D, RIAZI M S, and KOUSHANFAR F. DeepSecure: Scalable provably-secure deep learning[C]. The 55th ACM/ESDA/IEEE Design Automation Conference, San Francisco, USA, 2018: 1–6.

NIKOLAENKO V, WEINSBERG U, IOANNIDIS S, et al. Privacy-preserving ridge regression on hundreds of millions of records[C]. 2013 IEEE Symposium on Security and Privacy, Berkeley, USA, 2013: 334–348.

MOHASSEL P and ZHANG Yupeng. SecureML: A system for scalable privacy-preserving machine learning[C]. 2017 IEEE Symposium on Security and Privacy, San Jose, USA, 2017: 19–38.

CHANDRAN N, GUPTA D, RASTOGI A, et al. EzPC: Programmable, efficient, and scalable secure two-party computation for machine learning[EB/OL]. https://eprint.iacr.org/2017/1109, 2017.

DEMMLER D, SCHNEIDER T, and ZOHNER M. ABY-A framework for efficient mixed-protocol secure two-party computation[C]. 2015 Network and Distributed System Security, San Diego, USA, 2015: 1–15.

XIE Pengtao, BILENKO M, FINLEY T, et al. Crypto-nets: Neural networks over encrypted data[EB/OL]. https://arxiv.org/abs/1412.6181, 2014.

DOWLIN N, GILAD-BACHRACH R, LAINE K, et al. CryptoNets: Applying neural networks to encrypted data with high throughput and accuracy[C]. The 33rd International Conference on Machine Learning, New York, USA, 2016: 201–210.

BOS J W, LAUTER K, LOFTUS J, et al. Improved security for a ring-based fully homomorphic encryption scheme[C]. The 14th IMA International Conference on Cryptography and Coding, Oxford, England, 2013: 45–64.

CHABANNE H, DE WARGNY A, MILGRAM J, et al. Privacy-preserving classification on deep neural network[EB/OL]. https://eprint.iacr.org/2017/035, 2017.

HESAMIFARD E, TAKABI H, and GHASEMI M. CryptoDL: Deep neural networks over encrypted data[EB/OL]. https://arxiv.org/abs/1711.05189, 2017.

LIU Jian, JUUTI M, LU Yao, et al. Oblivious neural network predictions via MiniONN transformations[C]. 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, USA, 2017: 619–631.

COURBARIAUX M, HUBARA I, SOUDRY D, et al. Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or –1[EB/OL]. https://arxiv.org/abs/1602.02830, 2016.

BOURSE F, MINELLI M, MINIHOLD M, et al. Fast homomorphic evaluation of deep discretized neural networks[C]. The 38th Annual International Cryptology Conference, Santa Barbara, USA, 2018: 483–512.

SANYAL A, KUSNER M J, GASCÓN A, et al. TAPAS: Tricks to accelerate (encrypted) prediction as a service[EB/OL]. https://arxiv.org/abs/1806.03461, 2018.

SADEGH RIAZI M, SAMRAGH M, CHEN Hao, et al. XONN: XNOR-based oblivious deep neural network inference[C]. The 28th USENIX Conference on Security Symposium, Santa Clara, USA, 2019: 1501–1518.

MCMAHAN H B, MOORE E, RAMAGE D, et al. Communication-efficient learning of deep networks from decentralized data[C]. The 20th International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, USA, 2017: 1272–1282.

KONEČNÝ J, MCMAHAN B, and RAMAGE D. Federated optimization: Distributed optimization beyond the datacenter[EB/OL]. https://arxiv.org/abs/1511.03575, 2015,

KONEČNÝ J, MCMAHAN H B, YU F X, et al. Federated learning: Strategies for improving communication efficiency[EB/OL]. https://arxiv.org/abs/1610.05492, 2016.

杨立君, 丁超, 吴蒙. 一种同时保障隐私性与完整性的无线传感器网络可恢复数据聚合方案[J]. 电子与信息学报, 2015, 37(12): 2808–2814. doi: 10.11999/JEIT150208

YANG Lijun, DING Chao, and WU Meng. A recoverable privacy-preserving integrity-assured data aggregation scheme for wireless sensor networks[J]. Journal of Electronics &Information Technology, 2015, 37(12): 2808–2814. doi: 10.11999/JEIT150208

BONAWITZ K, IVANOV V, KREUTER B, et al. Practical secure aggregation for privacy-preserving machine learning[C]. 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, USA, 2017: 1175–1191.

MANDAL K, GONG Guang, and LIU Chuyi. NIKE-based fast privacy-preserving high-dimensional data aggregation for mobile devices[R]. CACR Technical Report, CACR 2018–10, 2018.

MANDAL K and GONG Guang. PrivFL: Practical privacy-preserving federated regressions on high-dimensional data over mobile networks[EB/OL]. https://eprint.iacr.org/2019/979, 2019.

施引文献

资源附件(0)

访问统计