Journal of Geo-information Science ›› 2015, Vol. 17 ›› Issue (4): 391-400. doi: 10.3724/SP.J.1047.2015.00391


Spatial Prediction and Experiments Based on Sequential Rules of Anonymity Datasets and Transition Probability Matrices

ZHANG Haitao, GE Guodong, HUANG Huihui, XU Liang

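  1. College of Geographic and Biologic Information, Nanjing University of Posts & Telecommunications, Nanjing 210003
  • Received: 2014-10-11 Revised: 2014-10-31 Published: 2015-04-10 Online: 2015-04-10
  • About the author:

    ZHANG Haitao (b. 1978), male, Ph.D., associate professor. Research interests: theory, methods, and key technologies of mobile GIS; spatial-temporal data mining; LBS privacy protection. E-mail: zhanghaitao@njupt.edu.cn

  • Supported by:
    2010 Jiangsu Government Scholarship for Overseas Studies; National Natural Science Foundation of China project "Inference Attacks and Privacy Protection Based on LBS Anonymity Sets over Large Spatial-temporal Ranges" (41201465); Natural Science Foundation of Jiangsu Province project "LBS Privacy Protection against Inference Attacks Based on Spatial-temporal Association Rules" (BK2012439)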

A Method of Spatial Prediction Based on Transition Probability Matrix and Sequential Rules of Spatial-temporal K-anonymity Datasets

ZHANG Haitao*, GE Guodong, HUANG Huihui, XU Liang

  1. College of Geographic and Biologic Information, Nanjing University of Posts & Telecommunications, Nanjing 210003, China
  • Received: 2014-10-11 Revised: 2014-10-31 Online: 2015-04-10 Published: 2015-04-10
  • Contact: ZHANG Haitao, E-mail: zhanghaitao@njupt.edu.cn
  • About author:

    *The author: SHEN Jingwei, E-mail:jingweigis@163.com

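Abstract (translated from the Chinese):

With the widespread adoption of Location Based Services (LBS), privacy protection has become an urgent problem for the further development of LBS, and spatial-temporal K-anonymity has become a mainstream approach. LBS application servers store the historical anonymity datasets generated when users run continuous queries. By analyzing these historical anonymity datasets over large spatial-temporal scales, spatial prediction can support personalized LBS services. This paper proposes a method that combines two typical techniques, Markov chains from probability and statistics and sequential rules from data mining, to predict the specific spatial regions contained in anonymity datasets. The method comprises four steps: (1) analyzing the characteristics of prediction with sequential rules and with Markov processes; (2) taking the normalized confidence of the sequential rules of the anonymity datasets as the initial transition probabilities and constructing the n-step transition probability matrices; (3) designing a coarse spatial prediction method based on the n-step transition probability matrices, together with an improved spatial prediction method for a specified exact path; and (4) verifying the performance of the method experimentally. The results show that the method builds its model structure quickly and allows the closeness between the precise spatial prediction probability and the true probability to be adjusted flexibly, which makes it usable in practice.

Keywords: spatial-temporal K-anonymity, sequential rules, Markov chain, transition probability matrix, spatial prediction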
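Step (2) above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that "normalized confidence" means dividing each rule's confidence by the sum of confidences of all rules sharing the same antecedent region, and that a region with no outgoing rule is treated as absorbing. The function names and the example rules are hypothetical.

```python
def normalize_confidences(rules):
    """Build a one-step transition matrix from sequential-rule confidences.

    rules maps (source_region, target_region) -> confidence.
    Assumption: each row is normalized to sum to 1; a region without
    outgoing rules is made absorbing (it transitions to itself).
    """
    states = sorted({region for pair in rules for region in pair})
    index = {region: i for i, region in enumerate(states)}
    size = len(states)
    matrix = [[0.0] * size for _ in range(size)]
    for (src, dst), confidence in rules.items():
        matrix[index[src]][index[dst]] += confidence
    for i, row in enumerate(matrix):
        total = sum(row)
        if total > 0:
            matrix[i] = [value / total for value in row]
        else:
            matrix[i][i] = 1.0  # assumption: absorbing region
    return states, matrix


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def n_step(matrix, n):
    """Return the n-step transition matrix, i.e. matrix raised to the power n."""
    size = len(matrix)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, matrix)
    return result


# Hypothetical rules over three regions A, B, C.
example_rules = {("A", "B"): 0.6, ("A", "C"): 0.2, ("B", "C"): 0.5, ("C", "A"): 0.4}
states, one_step = normalize_confidences(example_rules)
two_step = n_step(one_step, 2)
for state, row in zip(states, two_step):
    print(state, [round(value, 3) for value in row])
```

Entry (i, j) of the n-step matrix gives the probability of being in region j after n transitions from region i, which corresponds to the coarse prediction mentioned in step (3).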

Abstract:

Spatial-temporal K-anonymity has recently become a prominent technique for protecting user privacy in Location Based Services (LBS) applications, owing to its easy implementation and broad applicability. Analyzing spatial prediction scenarios over spatial-temporal K-anonymity datasets is important for making better use of LBS anonymity datasets in individualized services. In this paper, we present a spatial prediction method that combines the advantages of probabilistic statistics and data mining techniques. The process is divided into four phases. In Phase 1, the predictive characteristics of sequential rules and of Markov chains are studied, and an algorithm is designed to compute the n-step transition probability matrices from the normalized sequential rules mined from sequences of spatial-temporal K-anonymity datasets. In Phase 2, simple predictions are performed by directly applying the n-step transition probability matrices of example datasets. This reveals a drawback: the full path behind a simple prediction cannot be recovered, although the full path is very important for analyzing the behavior patterns of LBS users. Therefore, in Phase 3, a precise prediction algorithm is designed that recursively derives each detailed k-step path and its transition probability from the detailed (k-1)-step paths and the simple k-step prediction, which includes only the start and stop nodes. In Phase 4, simulation experiments are conducted. The results show that the proposed approach builds the predictive model faster than traditional methods and can flexibly adjust the accuracy of the predictions by setting different confidence thresholds for the sequential rules of the datasets.

Key words: spatial-temporal K-anonymity, sequential rules, Markov chain, transition probability matrix, spatial prediction.
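The difference between the simple and the precise prediction described in the abstract can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than the authors' algorithm: it extends every (k-1)-step path by one transition to obtain the k-step paths, and checks that the path probabilities sum to the simple k-step probability for a given stop node. The matrix values and function names are hypothetical.

```python
def k_step_paths(matrix, start, k):
    """Enumerate every k-step path from `start` that has nonzero probability.

    Returns a dict mapping each path (a tuple of state indices) to its
    probability. Paths of length k are built from paths of length k-1
    by appending one more transition.
    """
    paths = {(start,): 1.0}
    for _ in range(k):
        extended = {}
        for path, probability in paths.items():
            last = path[-1]
            for nxt, step_probability in enumerate(matrix[last]):
                if step_probability > 0:
                    extended[path + (nxt,)] = probability * step_probability
        paths = extended
    return paths


def simple_probability(paths, stop):
    """Sum the probabilities of all enumerated paths that end at `stop`.

    This equals the (start, stop) entry of the k-step transition matrix.
    """
    return sum(p for path, p in paths.items() if path[-1] == stop)


# Hypothetical one-step transition matrix over three regions 0, 1, 2.
transition = [
    [0.0, 0.75, 0.25],
    [0.5, 0.0, 0.5],
    [1.0, 0.0, 0.0],
]

two_step_paths = k_step_paths(transition, start=0, k=2)
for path, probability in sorted(two_step_paths.items()):
    print(path, probability)
print("simple 2-step probability 0 -> 0:", simple_probability(two_step_paths, 0))
```

Exhaustive enumeration grows exponentially with k, so a practical version would prune low-probability paths or restrict the search to a specified stop node.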