高级搜索

隐私保护机器学习的密码学方法

蒋瀚 刘怡然 宋祥福 王皓 郑志华 徐秋亮

引用本文: 蒋瀚, 刘怡然, 宋祥福, 王皓, 郑志华, 徐秋亮. 隐私保护机器学习的密码学方法[J]. 电子与信息学报, 2020, 42(5): 1068-1078. doi: 10.11999/JEIT190887 shu
Citation:  Han JIANG, Yiran LIU, Xiangfu SONG, Hao WANG, Zhihua ZHENG, Qiuliang XU. Cryptographic Approaches for Privacy-Preserving Machine Learning[J]. Journal of Electronics and Information Technology, 2020, 42(5): 1068-1078. doi: 10.11999/JEIT190887 shu

隐私保护机器学习的密码学方法

    作者简介: 蒋瀚: 男,1974年生,讲师,研究方向为密码学与信息安全;
    刘怡然: 女,1996年生,博士生,研究方向为密码学与信息安全;
    宋祥福: 男,1992年生,博士生,研究方向为密码学与信息安全;
    王皓: 男,1984年生,副教授,研究方向为密码学与信息安全;
    郑志华: 女,1962年生,副教授,研究方向为密码学与信息安全;
    徐秋亮: 男,1960年生,教授,研究方向为密码学与信息安全
    通讯作者: 徐秋亮,xql@sdu.edu.cn
  • 基金项目: 国家自然科学基金(61632020, 61572294);山东省自然科学基金(ZR2017MF021);山东省科技重大创新工程项目(2018CXGC0702);山东半岛国家自主创新示范区发展建设项目(S190101010001)

摘要: 新一代人工智能技术的特征,表现为借助GPU计算、云计算等高性能分布式计算能力,使用以深度学习算法为代表的机器学习算法,在大数据上进行学习训练,来模拟、延伸和扩展人的智能。不同数据来源、不同的计算物理位置,使得目前的机器学习面临严重的隐私泄露问题,因此隐私保护机器学习(PPM)成为目前广受关注的研究领域。采用密码学工具来解决机器学习中的隐私问题,是隐私保护机器学习重要的技术。该文介绍隐私保护机器学习中常用的密码学工具,包括通用安全多方计算(SMPC)、隐私保护集合运算、同态加密(HE)等,以及应用它们来解决机器学习中数据整理、模型训练、模型测试、数据预测等各个阶段中存在的隐私保护问题的研究方法与研究现状。

English

    1. [1]

      YAO A C. Protocols for secure computations[C]. The 23rd Annual Symposium on Foundations of Computer Science, Chicago, USA, 1982: 160–164.

    2. [2]

      YAO A C C. How to generate and exchange secrets[C]. The 27th Annual Symposium on Foundations of Computer Science, Toronto, Canada, 1986: 162–167.

    3. [3]

      GOLDREICH O, MICALI S, and WIGDERSON A. How to play ANY mental game[C]. The 19th Annual ACM Symposium on Theory of Computing, New York, USA, 1987: 218–229.

    4. [4]

      BEN-OR M, GOLDWASSER S, and WIGDERSON A. Completeness theorems for non-cryptographic fault-tolerant distributed computation[C]. The 20th Annual ACM Symposium on Theory of Computing, Chicago, USA, 1988: 1–10.

    5. [5]

      蒋瀚, 徐秋亮. 实用安全多方计算协议关键技术研究进展[J]. 计算机研究与发展, 2015, 52(10): 2247–2257. doi: 10.7544/issn1000-1239.2015.20150763
      JIANG Han and XU Qiuliang. Advances in key techniques of practical secure multi-party computation[J]. Journal of Computer Research and Development, 2015, 52(10): 2247–2257. doi: 10.7544/issn1000-1239.2015.20150763

    6. [6]

      BEAVER D. Efficient multiparty protocols using circuit randomization[C]. Annual International Cryptology Conference, Santa Barbara, USA, 1992: 420–432.

    7. [7]

      BENDLIN R, DAMGÅRD I, ORLANDI C, et al. Semi-homomorphic encryption and multiparty computation[C]. The 30th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Tallinn, Estonia, 2011: 169–188.

    8. [8]

      DAMGÅRD I, PASTRO V, SMART N, et al. Multiparty computation from somewhat homomorphic encryption[C]. The 32nd Annual International Cryptology Conference, Santa Barbara, USA, 2012: 643–662.

    9. [9]

      PAILLIER P. Public-key cryptosystems based on composite degree residuosity classes[C]. International Conference on the Theory and Applications of Cryptographic Techniques, Prague, Czech Republic, 1999: 223–238.

    10. [10]

      GENTRY C. A fully homomorphic encryption scheme[D]. [Ph.D. dissertation], Stanford University, 2009.

    11. [11]

      VAN DIJK M, GENTRY C, HALEVI S, et al. Fully homomorphic encryption over the integers[C]. The 29th Annual International Conference on the Theory and Applications of Cryptographic Techniques, French Riviera, France, 2010: 24–43.

    12. [12]

      BRAKERSKI Z, GENTRY C, and VAIKUNTANATHAN V. (Leveled) fully homomorphic encryption without bootstrapping[J]. ACM Transactions on Computation Theory, 2014, 6(3): No.13.

    13. [13]

      DUCAS L and MICCIANCIO D. FHEW: Bootstrapping homomorphic encryption in less than a second[C]. The 34th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Sofia, Bulgaria, 2015: 617–640.

    14. [14]

      孙茂华, 胡磊, 朱洪亮, 等. 布尔电路上保护隐私集合并集运算的研究与实现[J]. 电子与信息学报, 2016, 38(6): 1412–1418. doi: 10.11999/JEIT150911
      SUN Maohua, HU Lei, ZHU Hongliang, et al. Research and implementation of privacy preserving set union in Boolean circuits[J]. Journal of Electronics &Information Technology, 2016, 38(6): 1412–1418. doi: 10.11999/JEIT150911

    15. [15]

      KOLESNIKOV V, KUMARESAN R, ROSULEK M, et al. Efficient batched oblivious PRF with applications to private set intersection[C]. 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 2016: 818–829.

    16. [16]

      KOLESNIKOV V, ROSULEK M, TRIEU N, et al. Scalable private set union from symmetric-key techniques[EB/OL]. https://eprint.iacr.org/2019/776, 2019.

    17. [17]

      BARNI M, ORLANDI C, and PIVA A. A privacy-preserving protocol for neural-network-based computation[C]. The 8th Workshop on Multimedia and Security, Geneva, Switzerland, 2006: 146–151.

    18. [18]

      ORLANDI C, PIVA A, and BARNI M. Oblivious neural network computing via homomorphic encryption[J]. EURASIP Journal on Information Security, 2007, 2007(1): 037343. doi: 10.1186/1687-417X-2007-037343

    19. [19]

      GRAEPEL T, LAUTER K, and NAEHRIG M. ML confidential: Machine learning on encrypted data[C]. The 15th International Conference on Information Security and Cryptology, Seoul, Korea, 2012: 1–21.

    20. [20]

      ZHANG Qingchen, YANG L T, and CHEN Zhikui. Privacy preserving deep computation model on cloud for big data feature learning[J]. IEEE Transactions on Computers, 2016, 65(5): 1351–1362. doi: 10.1109/TC.2015.2470255

    21. [21]

      HESAMIFARD E, TAKABI H, GHASEMI M, et al. Privacy-preserving machine learning in cloud[C]. 2017 ACM Cloud Computing Security Workshop, Dallas, USA, 2017: 39–43.

    22. [22]

      ROUHANI B D, RIAZI M S, and KOUSHANFAR F. DeepSecure: Scalable provably-secure deep learning[C]. The 55th ACM/ESDA/IEEE Design Automation Conference, San Francisco, USA, 2018: 1–6.

    23. [23]

      NIKOLAENKO V, WEINSBERG U, IOANNIDIS S, et al. Privacy-preserving ridge regression on hundreds of millions of records[C]. 2013 IEEE Symposium on Security and Privacy, Berkeley, USA, 2013: 334–348.

    24. [24]

      MOHASSEL P and ZHANG Yupeng. SecureML: A system for scalable privacy-preserving machine learning[C]. 2017 IEEE Symposium on Security and Privacy, San Jose, USA, 2017: 19–38.

    25. [25]

      CHANDRAN N, GUPTA D, RASTOGI A, et al. EzPC: Programmable, efficient, and scalable secure two-party computation for machine learning[EB/OL]. https://eprint.iacr.org/2017/1109, 2017.

    26. [26]

      DEMMLER D, SCHNEIDER T, and ZOHNER M. ABY-A framework for efficient mixed-protocol secure two-party computation[C]. 2015 Network and Distributed System Security, San Diego, USA, 2015: 1–15.

    27. [27]

      XIE Pengtao, BILENKO M, FINLEY T, et al. Crypto-nets: Neural networks over encrypted data[EB/OL]. https://arxiv.org/abs/1412.6181, 2014.

    28. [28]

      DOWLIN N, GILAD-BACHRACH R, LAINE K, et al. CryptoNets: Applying neural networks to encrypted data with high throughput and accuracy[C]. The 33rd International Conference on Machine Learning, New York, USA, 2016: 201–210.

    29. [29]

      BOS J W, LAUTER K, LOFTUS J, et al. Improved security for a ring-based fully homomorphic encryption scheme[C]. The 14th IMA International Conference on Cryptography and Coding, Oxford, England, 2013: 45–64.

    30. [30]

      CHABANNE H, DE WARGNY A, MILGRAM J, et al. Privacy-preserving classification on deep neural network[EB/OL]. https://eprint.iacr.org/2017/035, 2017.

    31. [31]

      HESAMIFARD E, TAKABI H, and GHASEMI M. CryptoDL: Deep neural networks over encrypted data[EB/OL]. https://arxiv.org/abs/1711.05189, 2017.

    32. [32]

      LIU Jian, JUUTI M, LU Yao, et al. Oblivious neural network predictions via MiniONN transformations[C]. 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, USA, 2017: 619–631.

    33. [33]

      COURBARIAUX M, HUBARA I, SOUDRY D, et al. Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or –1[EB/OL]. https://arxiv.org/abs/1602.02830, 2016.

    34. [34]

      BOURSE F, MINELLI M, MINIHOLD M, et al. Fast homomorphic evaluation of deep discretized neural networks[C]. The 38th Annual International Cryptology Conference, Santa Barbara, USA, 2018: 483–512.

    35. [35]

      SANYAL A, KUSNER M J, GASCÓN A, et al. TAPAS: Tricks to accelerate (encrypted) prediction as a service[EB/OL]. https://arxiv.org/abs/1806.03461, 2018.

    36. [36]

      SADEGH RIAZI M, SAMRAGH M, CHEN Hao, et al. XONN: XNOR-based oblivious deep neural network inference[C]. The 28th USENIX Conference on Security Symposium, Santa Clara, USA, 2019: 1501–1518.

    37. [37]

      MCMAHAN H B, MOORE E, RAMAGE D, et al. Communication-efficient learning of deep networks from decentralized data[C]. The 20th International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, USA, 2017: 1272–1282.

    38. [38]

      KONEČNÝ J, MCMAHAN B, and RAMAGE D. Federated optimization: Distributed optimization beyond the datacenter[EB/OL]. https://arxiv.org/abs/1511.03575, 2015,

    39. [39]

      KONEČNÝ J, MCMAHAN H B, YU F X, et al. Federated learning: Strategies for improving communication efficiency[EB/OL]. https://arxiv.org/abs/1610.05492, 2016.

    40. [40]

      杨立君, 丁超, 吴蒙. 一种同时保障隐私性与完整性的无线传感器网络可恢复数据聚合方案[J]. 电子与信息学报, 2015, 37(12): 2808–2814. doi: 10.11999/JEIT150208
      YANG Lijun, DING Chao, and WU Meng. A recoverable privacy-preserving integrity-assured data aggregation scheme for wireless sensor networks[J]. Journal of Electronics &Information Technology, 2015, 37(12): 2808–2814. doi: 10.11999/JEIT150208

    41. [41]

      BONAWITZ K, IVANOV V, KREUTER B, et al. Practical secure aggregation for privacy-preserving machine learning[C]. 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, USA, 2017: 1175–1191.

    42. [42]

      MANDAL K, GONG Guang, and LIU Chuyi. NIKE-based fast privacy-preserving high-dimensional data aggregation for mobile devices[R]. CACR Technical Report, CACR 2018–10, 2018.

    43. [43]

      MANDAL K and GONG Guang. PrivFL: Practical privacy-preserving federated regressions on high-dimensional data over mobile networks[EB/OL]. https://eprint.iacr.org/2019/979, 2019.

    1. [1]

      杨亚涛, 赵阳, 张卷美, 黄洁润, 高原. 同态密码理论与应用进展. 电子与信息学报, 2020, 42(0): 1-13.

    2. [2]

      李攀攀, 谢正霞, 周志刚, 乐光学, 郑仕链, 杨小牛. 基于Hilbert填充曲线的海洋无线传感网源节点位置隐私保护方法. 电子与信息学报, 2020, 42(6): 1510-1518.

    3. [3]

      刘雪艳, 芦婷婷, 杨晓涛. 具有隐私保护的完整性可验证的关键字搜索方案. 电子与信息学报, 2020, 41(0): 1-8.

    4. [4]

      方维维, 刘梦然, 王云鹏, 李阳阳, 安竹林. 面向物联网隐私数据分析的分布式弹性网络回归学习算法. 电子与信息学报, 2020, 42(0): 1-9.

    5. [5]

      牛淑芬, 谢亚亚, 杨平平, 王彩芬, 杜小妮. 加密邮件系统中基于身份的可搜索加密方案. 电子与信息学报, 2020, 42(7): 1803-1810.

    6. [6]

      田俊峰, 井宣. 多方参与高效撤销组成员的共享数据审计方案. 电子与信息学报, 2020, 42(6): 1534-1541.

    7. [7]

      张玉磊, 陈文娟, 张永洁, 张雪微, 王彩芬. 支持关键字搜索的无证书密文等值测试加密方案. 电子与信息学报, 2020, 41(0): 1-7.

    8. [8]

      张玉磊, 文龙, 王浩浩, 张永洁, 王彩芬. 多用户环境下无证书认证可搜索加密方案. 电子与信息学报, 2020, 42(5): 1094-1101.

    9. [9]

      牛莹, 张勋才. 基于变步长约瑟夫遍历和DNA动态编码的图像加密算法. 电子与信息学报, 2020, 42(6): 1383-1391.

    10. [10]

      王立辉, 闫守礼, 李清. 一种轻量级数据加密标准循环掩码实现方案. 电子与信息学报, 2020, 41(0): 1-8.

    11. [11]

      高巍, 蒋刚毅, 郁梅, 骆挺. 基于熵编码的立体视频加密与信息隐藏算法. 电子与信息学报, 2020, 41(0): 1-8.

    12. [12]

      王君珂, 印珏, 牛人杰, 任少康, 晁洁. DNA计算与DNA纳米技术. 电子与信息学报, 2020, 42(6): 1313-1325.

    13. [13]

      徐瑨, 吴慧慈, 陶小峰. 5G网络空间安全对抗博弈. 电子与信息学报, 2020, 41(0): 1-11.

    14. [14]

      夏晓峰, 向宏, 肖震宇, 蔡挺. 基于国产密码算法的数控网络的双层安全防护模型研究及安全评估. 电子与信息学报, 2020, 42(0): 1-7.

    15. [15]

      殷志祥, 唐震, 张强, 崔建中, 杨静, 王日晟, 赵寿为, 张居丽. 基于DNA折纸基底的与非门计算模型. 电子与信息学报, 2020, 42(6): 1355-1364.

    16. [16]

      谢永, 李香, 张松松, 吴黎兵. 一种可证安全的车联网无证书聚合签名改进方案. 电子与信息学报, 2020, 42(5): 1125-1131.

    17. [17]

      欧静兰, 余欢欢, 吴皓威, 马锐, 王柳彬. 基于携能通信的非信任双向中继网络安全传输方案. 电子与信息学报, 2020, 42(0): 1-7.

    18. [18]

      李建华. 能源关键基础设施网络安全威胁与防御技术综述. 电子与信息学报, 2020, 41(0): 1-17.

    19. [19]

      申滨, 吴和彪, 赵书锋, 崔太平. 基于稀疏感知有序干扰消除的大规模机器类通信系统多用户检测. 电子与信息学报, 2020, 41(0): 1-9.

    20. [20]

      陈容, 陈岚, WAHLAArfan Haider. 基于公式递推法的可变计算位宽的循环冗余校验设计与实现. 电子与信息学报, 2020, 42(5): 1261-1267.

  • 图 1  机器学习的一般过程

    图 2  Sigmoid函数与分段激活函数图像

    图 3  联邦学习

    图 4  安全数据聚合协议

    表 1  MiniONN效率实验结果

    数据集MNISTCIFAR10
    精确度(%)99.5291.5
    运行时间(s)32011686
    数据传输(MB)336.71803
    #p/h1638402524
    下载: 导出CSV

    表 2  文献[42]效率分析

    通信计算存储
    用户端$O\left( {\left( {\lambda + \mu } \right)m + nr} \right)$$O\left( {m{ {\left( {\lg\left( m \right)} \right)}^2} + \left( {l + 1} \right) \cdot c\left( {r,n} \right)} \right)$$O\left( {4k\lambda + \mu \left( {m + 3\left\lceil {\dfrac{l}{2} } \right\rceil } \right) + nr} \right)$
    服务器端$O\left( { {m^2}\mu + nmr + \dfrac{ {ml} }{2} } \right)$$O\left( {{m^2} + \left( {m - \zeta } \right) \cdot l \cdot c\left( {r,n} \right)} \right)$$O\left( {mnr + {m^2}\mu + \dfrac{ {ml\mu } }{2} } \right)ght)$
    下载: 导出CSV
  • 加载中
图(4)表(2)
计量
  • PDF下载量:  105
  • 文章访问数:  2614
  • HTML全文浏览量:  1380
文章相关
  • 通讯作者:  徐秋亮, xql@sdu.edu.cn
  • 收稿日期:  2019-11-06
  • 录用日期:  2020-03-08
  • 网络出版日期:  2020-04-03
  • 刊出日期:  2020-05-01
通讯作者: 陈斌, bchen63@163.com
  • 1. 

    沈阳化工大学材料科学与工程学院 沈阳 110142

  1. 本站搜索
  2. 百度学术搜索
  3. 万方数据库搜索
  4. CNKI搜索

/

返回文章