高级搜索

基于差分隐私模型的位置轨迹发布技术研究

冯登国 张敏 叶宇桐

引用本文: 冯登国, 张敏, 叶宇桐. 基于差分隐私模型的位置轨迹发布技术研究[J]. 电子与信息学报, 2020, 42(1): 74-88. doi: 10.11999/JEIT190632 shu
Citation:  Dengguo FENG, Min ZHANG, Yutong YE. Research on Differentially Private Trajectory Data Publishing[J]. Journal of Electronics and Information Technology, 2020, 42(1): 74-88. doi: 10.11999/JEIT190632 shu

基于差分隐私模型的位置轨迹发布技术研究

    作者简介: 冯登国: 男,1965年生,中国科学院院士,研究员,研究方向为网络与信息安全;
    张敏: 女,1975年生,研究员,研究方向为数据安全与隐私保护;
    叶宇桐: 男,1993年生,博士生,研究方向为差分隐私保护技术
    通讯作者: 冯登国,feng@is.iscas.ac.cn
  • 基金项目: 国家自然科学基金(U1636216)

摘要: 位置轨迹大数据的安全分享、发布需求离不开位置轨迹隐私保护技术支持。在差分隐私出现之前,K-匿名及其衍生模型为位置轨迹隐私保护提供了一种量化评估的手段,但其安全性严重依赖于攻击者所掌握的背景知识,当有新的攻击出现时模型无法提供完善的隐私保护。差分隐私技术的出现有效地弥补了上述问题,越来越多地应用于轨迹数据隐私发布领域中。该文对基于差分隐私理论的轨迹隐私保护技术进行了研究与分析,重点介绍了差分隐私模型下位置直方图、轨迹直方图等空间统计数据发布方法,差分隐私模型下轨迹数据集发布方法,以及连续轨迹实时发布隐私保护模型。与此同时,在对现有方法对比分析的基础上,提出了未来的重点发展方向。

English

    1. [1]

      BAO Jie, HE Tianfu, RUAN Sijie, et al. Planning bike lanes based on sharing-bikes’ trajectories[C]. The 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, Canada, 2017: 1377–1386.

    2. [2]

      YUAN Jing, ZHENG Yu, ZHANG Chengyang, et al. T-drive: Driving directions based on taxi trajectories[C]. The 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, San Jose, USA, 2010: 99–108.

    3. [3]

      TOCKAR A. Riding with the stars: Passenger privacy in the NYC Taxicab dataset[EB/OL]. https://research.neustar.biz/author/atockar/, 2018.

    4. [4]

      W-Pwn. 健身APP泄露军事机密, 包括中国南海[EB/OL]. https://zhuanlan.zhihu.com/p/33405626, 2018.

    5. [5]

      CHEN Zhenyu, FU Yanyan, ZHANG Min, et al. The de-anonymization method based on user spatio-temporal mobility trace[C]. The 19th International Conference on Information and Communications Security, Beijing, China, 2017: 459–471.

    6. [6]

      ABUL O, BONCHI F, and NANNI M. Never walk alone: Uncertainty for anonymity in moving objects databases[C]. The 24th IEEE International Conference on Data Engineering, Cancun, Mexico, 2008: 376–385.

    7. [7]

      TERROVITIS M, POULIS G, MAMOULIS N, et al. Local suppression and splitting techniques for privacy preserving publication of trajectories[J]. IEEE Transactions on Knowledge and Data Engineering, 2017, 29(7): 1466–1479. doi: 10.1109/TKDE.2017.2675420

    8. [8]

      YAO Lin, WANG Xinyu, WANG Xin, et al. Publishing sensitive trajectory data under enhanced l-diversity model[C]. The 20th IEEE International Conference on Mobile Data Management, Hong Kong, China, 2019: 160–169.

    9. [9]

      冯登国. 大数据安全与隐私保护[M]. 北京: 清华大学出版社, 2018: 194–219.

    10. [10]

      DWORK C. Differential privacy[C]. The 33rd International Colloquium on Automata, Languages and Programming, Venice, Italy, 2006: 1–12. doi: 10.1007/11787006_1.

    11. [11]

      ZHU Tianqing, LI Gang, ZHOU Wanlei, et al. Differentially private data publishing and analysis: A survey[J]. IEEE Transactions on Knowledge and Data Engineering, 2017, 29(8): 1619–1638. doi: 10.1109/TKDE.2017.2697856

    12. [12]

      MCSHERRY F and TALWAR K. Mechanism design via differential privacy[C]. The 48th Annual IEEE Symposium on Foundations of Computer Science, Providence, USA, 2007: 94–103.

    13. [13]

      MCSHERRY F D. Privacy integrated queries: An extensible platform for privacy-preserving data analysis[C]. The 2009 ACM SIGMOD International Conference on Management of Data, Rhode Island, USA, 2009: 19–30.

    14. [14]

      DWORK C, KENTHAPADI K, MCSHERRY F, et al. Our data, ourselves: Privacy via distributed noise generation[C]. The 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Peterbury, Russia, 2006: 486–503.

    15. [15]

      KIFER D and MACHANAVAJJHALA A. No free lunch in data privacy[C]. The 2011 ACM SIGMOD International Conference on Management of Data, Athens, Greece, 2011: 193–204.

    16. [16]

      GEHRKE J, LUI E, and PASS R. Towards privacy for social networks: A zero-knowledge based definition of privacy[C]. The 8th Conference on Theory of Cryptography, Providence, USA, 2011: 432–449.

    17. [17]

      KIFER D and MACHANAVAJJHALA A. Pufferfish: A framework for mathematical privacy definitions[J]. ACM Transactions on Database Systems, 2014, 39(1): Article No.3. doi: 10.1145/2514689

    18. [18]

      DWORK C. A firm foundation for private data analysis[J]. Communications of the ACM, 2011, 54(1): 86–95. doi: 10.1145/1866739.1866758

    19. [19]

      DWORK C, MCSHERRY F, NISSIM K, et al. Calibrating noise to sensitivity in private data analysis[C]. The 3rd Theory of Cryptography Conference, New York, USA, 2006: 265–284.

    20. [20]

      QARDAJI W, YANG Weining, and LI Ninghui. Differentially private grids for geospatial data[C]. The 29th IEEE International Conference on Data Engineering, Brisbane, Australia, 2013: 757–768.

    21. [21]

      CORMODE G, PROCOPIUC C, SRIVASTAVA D, et al. Differentially private spatial decompositions[C]. The 28th IEEE International Conference on Data Engineering, Washington, USA, 2012: 20–31.

    22. [22]

      ZHANG Jun, XIAO Xiaokui, and XIE Xing. PrivTree: A differentially private algorithm for hierarchical decompositions[C]. The 2016 International Conference on Management of Data, San Francisco, USA, 2016: 155–170.

    23. [23]

      HAY M, RASTOGI V, MIKLAU G, et al. Boosting the accuracy of differentially private histograms through consistency[J]. Proceedings of the VLDB Endowment, 2010, 3(1/2): 1021–1032.

    24. [24]

      XIAO Yonghui, XIONG Li, and YUAN Chun. Differentially private data release through multidimensional partitioning[C]. The 7th Workshop on Secure Data Management, Singapore, 2010: 150–168.

    25. [25]

      XIE Hairuo, TANIN E, and KULIK L. Distributed histograms for processing aggregate data from moving objects[C]. 2007 International Conference on Mobile Data Management, Mannheim, Germany, 2007: 152–157.

    26. [26]

      XIE Hairuo, TANIN E, KULIK L, et al. Euler histogram tree: A spatial data structure for aggregate range queries on vehicle trajectories[C]. The 7th ACM SIGSPATIAL International Workshop on Computational Transportation Science, Dallas/Fort Worth, USA, 2014: 18–24.

    27. [27]

      GHANE S, KULIK L, and RAMAMOHANARAO K. Publishing spatial histograms under differential privacy[C]. The 30th International Conference on Scientific and Statistical Database Management, Bozen-Bolzano, Italy, 2018: Article No.14.

    28. [28]

      HARDT M, LIGETT K, and MCSHERRY F. A simple and practical algorithm for differentially private data release[C]. Advances in Neural Information Processing Systems, Lake Tahoe, USA, 2012: 2339–2347.

    29. [29]

      TO H, NGUYEN K, and SHAHABI C. Differentially private publication of location entropy[C]. The 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Burlingame, USA, 2016: Article No. 35.

    30. [30]

      CHEN Rui, FUNG B C M, DESAI B C, et al. Differentially private transit data publication: A case study on the montreal transportation system[C]. The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Beijing, China, 2012: 213–221.

    31. [31]

      CHEN Rui, ACS G, and CASTELLUCCIA C. Differentially private sequential data publication via variable-length n-grams[C]. The 2012 ACM Conference on Computer and Communications Security, Raleigh, USA, 2012: 638–649.

    32. [32]

      HE Xi, CORMODE G, MACHANAVAJJHALA A, et al. DPT: Differentially private trajectory synthesis using hierarchical reference systems[J]. The VLDB Endowment, 2015, 8(11): 1154–1165. doi: 10.14778/2809974.2809978

    33. [33]

      HE Xi, RAVAL N, and MACHANAVAJJHALA A. A demonstration of VisDPT: Visual exploration of differentially private trajectories[J]. The VLDB Endowment, 2016, 9(13): 1489–1492. doi: 10.14778/3007263.3007291

    34. [34]

      HUA Jingyu, GAO Yue, and ZHONG Sheng. Differentially private publication of general time-serial trajectory data[C]. 2015 IEEE Conference on Computer Communications, Hong Kong, China, 2015: 549–557. doi: 10.1109/INFOCOM.2015.7218422.

    35. [35]

      LI Meng, ZHU Liehuang, ZHANG ZIjian, et al. Achieving differential privacy of trajectory data publishing in participatory sensing[J]. Information Sciences, 2017, 400/401: 1–13. doi: 10.1016/j.ins.2017.03.015

    36. [36]

      ERLINGSSON Ú, PIHUR V, and KOROLOVA A. Rappor: Randomized aggregatable privacy-preserving ordinal response[C]. The 2014 ACM SIGSAC Conference on Computer and Communications Security, Scottsdale, USA, 2014: 1054–1067.

    37. [37]

      BASSILY R and SMITH A. Local, private, efficient protocols for succinct histograms[C]. The 47th Annual ACM Symposium on Theory of Computing, Portland, USA, 2015: 127–135.

    38. [38]

      XIAO Yonghui and XIONG Li. Protecting locations with differential privacy under temporal correlations[C]. The 22nd ACM SIGSAC Conference on Computer and Communications Security, Denver, USA, 2015: 1298–1309.

    39. [39]

      WANG Shuo, SINNOTT R, and NEPAL S. Protecting the location privacy of mobile social media users[C]. IEEE International Conference on Big Data, Washington, USA, 2016: 1143–1150.

    40. [40]

      HARDT M and TALWAR K. On the geometry of differential privacy[C]. The 42nd ACM Symposium on Theory of Computing, Cambridge, UK, 2010: 705–714.

    41. [41]

      SONG Shuang, WANG Yizhen, and CHAUDHURI K. Pufferfish privacy mechanisms for correlated data[C]. The 2017 ACM International Conference on Management of Data, Chicago, USA, 2017: 1291–1306.

    42. [42]

      OU Lu, QIN Zheng, LIAO Shaolin, et al. An optimal pufferfish privacy mechanism for temporally correlated trajectories[J]. IEEE Access, 2018, 6: 37150–37165. doi: 10.1109/ACCESS.2018.2847720

    43. [43]

      DWORK C and ROTH A. The algorithmic foundations of differential privacy[J]. Foundations and Trends® in Theoretical Computer Science, 2014, 9(3/4): 211–407. doi: 10.1561/0400000042

    44. [44]

      KELLARIS G, PAPADOPOULOS S, XIAO Xiaokui, et al. Differentially private event sequences over infinite streams[J]. Proceedings of the VLDB Endowment, 2014, 7(12): 1155–1166. doi: 10.14778/2732977.2732989

    45. [45]

      ABADI M, CHU A, GOODFELLOW I, et al. Deep learning with differential privacy[C]. The 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 2016: 308–318.

    46. [46]

      GAO Qiang, ZHOU Fan, ZHANG Kunpeng, et al. Identifying human mobility via trajectory embeddings[C]. The 26th International Joint Conference on Artificial Intelligence, Macao, China, 2017: 1689–1695.

    47. [47]

      WANG Guowei, LIAO Dongliang, and LI Jing. Complete user mobility via user and trajectory embeddings[J]. IEEE Access, 2018, 6: 72125–72136. doi: 10.1109/ACCESS.2018.2881457

    48. [48]

      ZHOU Fan, GAO Qiang, TRAJCEVSKI G, et al. Trajectory-user linking via variational autoencoder[C]. The 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 2018: 3212–3218.

    49. [49]

      LI Xiucheng, ZHAO Kaiqi, CONG Gao, et al. Deep representation learning for trajectory similarity computation[C]. The 34th IEEE International Conference on Data Engineering, Paris, France, 2018: 617–628.

    50. [50]

      JORGENSEN Z, YU Ting, and CORMODE G. Conservative or liberal? Personalized differential privacy[C]. The 31st IEEE International Conference on Data Engineering, Seoul, South Korea, 2015: 1023–1034.

    51. [51]

      LI Haoran, XIONG Li, JI Zhanglong, et al. Partitioning-based mechanisms under personalized differential privacy[C]. The 21st Pacific-Asia Conference on Knowledge Discovery and Data Mining, Jeju, South Korea, 2017: 615–627.

    52. [52]

      YE Yutong, ZHANG Min, FENG Dengguo, et al. Multiple privacy regimes mechanism for local differential privacy[C]. The 24th International Conference on Database Systems for Advanced Applications, Chiang Mai, Thailand, 2019: 247–263.

    53. [53]

      CHEN Rui, LI Haoran, QIN A K, et al. Private spatial data aggregation in the local setting[C]. The 32nd IEEE International Conference on Data Engineering, Helsinki, Finland, 2016: 289–300.

    1. [1]

      方维维, 刘梦然, 王云鹏, 李阳阳, 安竹林. 面向物联网隐私数据分析的分布式弹性网络回归学习算法. 电子与信息学报, 2020, 42(0): 1-9.

    2. [2]

      杨亚涛, 赵阳, 张卷美, 黄洁润, 高原. 同态密码理论与应用进展. 电子与信息学报, 2020, 42(0): 1-13.

    3. [3]

      李攀攀, 谢正霞, 周志刚, 乐光学, 郑仕链, 杨小牛. 基于Hilbert填充曲线的海洋无线传感网源节点位置隐私保护方法. 电子与信息学报, 2020, 42(6): 1510-1518.

    4. [4]

      蒋瀚, 刘怡然, 宋祥福, 王皓, 郑志华, 徐秋亮. 隐私保护机器学习的密码学方法. 电子与信息学报, 2020, 42(5): 1068-1078.

    5. [5]

      李万林, 王超, 许国良, 雒江涛, 张轩. 基于信令数据的轨迹驻留点识别算法研究. 电子与信息学报, 2020, 42(0): 1-8.

    6. [6]

      毛秀海, 李凡, 左小磊. DNA数据存储. 电子与信息学报, 2020, 42(6): 1303-1312.

    7. [7]

      刘雪艳, 芦婷婷, 杨晓涛. 具有隐私保护的完整性可验证的关键字搜索方案. 电子与信息学报, 2020, 41(0): 1-8.

    8. [8]

      王刚, 靳彦青, 彭华, 张光伟. Lempel-Ziv-Welch压缩数据的误码纠正. 电子与信息学报, 2020, 42(6): 1436-1443.

    9. [9]

      李腾耀, 王布宏, 尚福特, 田继伟, 曹堃锐. ADS-B攻击数据弹性恢复方法. 电子与信息学报, 2020, 41(0): 1-9.

    10. [10]

      田俊峰, 井宣. 多方参与高效撤销组成员的共享数据审计方案. 电子与信息学报, 2020, 42(6): 1534-1541.

    11. [11]

      王立辉, 闫守礼, 李清. 一种轻量级数据加密标准循环掩码实现方案. 电子与信息学报, 2020, 41(0): 1-8.

    12. [12]

      左志斌, 常朝稳, 祝现威. 一种基于数据平面可编程的软件定义网络报文转发验证机制. 电子与信息学报, 2020, 42(5): 1110-1117.

    13. [13]

      苗美媛, 宋丹, 徐位凯, 湛佳, 王琳. 非平稳信道下的鲁棒数据链优化设计综述——带限环境下的混沌传输系统. 电子与信息学报, 2020, 42(0): 1-12.

    14. [14]

      袁庆军, 张勋成, 高杨, 王永娟. 轻量级分组密码PUFFIN的差分故障攻击. 电子与信息学报, 2020, 42(6): 1519-1525.

    15. [15]

      贺利芳, 吴雪霜, 张天骐. 正交多载波降噪差分混沌键控通信系统. 电子与信息学报, 2020, 41(0): 1-9.

    16. [16]

      贺利芳, 陈俊, 张天骐. 短参考多用户差分混沌移位键控通信系统性能分析. 电子与信息学报, 2020, 42(0): 1-8.

    17. [17]

      贺利芳, 吴雪霜, 张天骐. 正交多用户短参考差分混沌移位键控通信系统性能分析. 电子与信息学报, 2020, 42(0): 1-9.

    18. [18]

      张琳, 陈炳均, 吴志强. 基于矩阵低秩估计的可靠多载波差分混沌键控接收机. 电子与信息学报, 2020, 42(0): 1-8.

    19. [19]

      李根, 马彦恒, 侯建强, 徐公国. 基于子孔径Keystone变换的曲线轨迹大斜视SAR回波模拟. 电子与信息学报, 2020, 41(0): 1-8.

    20. [20]

      马彬, 王梦雪, 谢显中. 超密集异构无线网络中基于位置预测的切换算法. 电子与信息学报, 2020, 42(0): 1-9.

  • 图 1  基于四叉树的差分隐私位置计数查询方法

    图 2  EH模型功能示意图

    图 3  轨迹序列的前缀树表示

    图 4  基于N-gram的搜索树方法

    图 5  轨迹片段在RS2前缀树中的表达示例

    图 6  轨迹聚类方法过程示意图

    表 1  轨迹序列数据集

    序号路径
    1L1→L2→L3
    2L1→L2
    3L3→L2→L1
    4L1→L2→L4
    5L1→L2→L3
    6L3→L2
    7L1→L2→L4→L1
    8L3→L1
    下载: 导出CSV

    表 2  基于差分隐私保护的轨迹信息发布方法比较

    类型数据发布方法隐私保护模型隐私保护机制发布数据类型时间复杂度分析
    空间
    直方图
    四叉索引树,KD-索引树,K叉平均树,PrivTree$ \left( {\varepsilon ,\delta } \right) - {\rm{DP}}$Laplace机制位置直方图噪音的全局敏感度和树高相关,构建四叉树结构的复杂度为O(n·lgn)
    EH-DP, privSHε-DP ε-DPLaplace机制和指数机制轨迹直方图构建EH数据结构的复杂度为O (mn)
    位置熵(LE)$ \left( {\varepsilon ,\delta } \right) - {\rm{DP}}$Laplace机制位置熵直方图用到了平滑敏感度替代全局敏感度,添加噪音较少,计算所有地理位置的熵需要遍历用户的地理位置,复杂度为O(mn)
    轨迹
    集合
    发布
    前缀树,n-gram, DPT, PSTε-DPLaplace机制移动轨迹数据集构建前缀树的复杂度为O(mn·lgm)
    K-means聚类,OPT K-means聚类ε-DPLaplace机制和指数机制移动轨迹数据集若聚类为k,迭代次数平均为t,由于采用差分隐私指数机制,复杂度为O (mktnk)
    连续
    轨迹
    发布
    δ-位置集合,(δ, r)-位置集合ε-DPLaplace机制单个轨迹点假设攻击者知道用户的移动模式,使噪音添加的更合理。时间复杂为O(m|$\Delta $X|) (其中|$\Delta $X|为构建位置集合大小)
    Wasserstein机制,MQM机制,FGS-PufferFish机制ε-PufferFish隐私Laplace机制单条轨迹时间复杂度为O (nkm2·lgm) (其中k为PufferFish定义的空间关联的最大区域)
    下载: 导出CSV
  • 加载中
图(6)表(2)
计量
  • PDF下载量:  62
  • 文章访问数:  4318
  • HTML全文浏览量:  793
文章相关
  • 通讯作者:  冯登国, feng@is.iscas.ac.cn
  • 收稿日期:  2019-08-26
  • 录用日期:  2019-11-30
  • 网络出版日期:  2019-12-05
  • 刊出日期:  2020-01-01
通讯作者: 陈斌, bchen63@163.com
  • 1. 

    沈阳化工大学材料科学与工程学院 沈阳 110142

  1. 本站搜索
  2. 百度学术搜索
  3. 万方数据库搜索
  4. CNKI搜索

/

返回文章