Journal of Geo-information Science ›› 2018, Vol. 20 ›› Issue (6): 762-771. doi: 10.12082/dqxxkx.2018.180087

• Special Issue of Outstanding Papers from the 2017 Annual Conference on Theory and Methods of Geographic Information Science in China

A Moving-Window-Based Algorithm for Identifying Individual Stay Areas from Mobile Phone Location Data

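LIN Nan (林楠)1,2, YIN Ling (尹凌)1,*, ZHAO Zhiyuan (赵志远)1,3

  1. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055
  2. University of Chinese Academy of Sciences, Beijing 100049
  3. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079
  • Received: 2018-01-31  Revised: 2018-03-09  Publication date: 2018-06-20  Release date: 2018-07-12
  • Corresponding author: YIN Ling (尹凌)  E-mail: nan.lin@siat.ac.cn; yinling@siat.ac.cn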
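  • About the author:

    LIN Nan (1993-), male, master's student. His research focuses on mining and simulating activity characteristics from mobile phone location data. E-mail: nan.lin@siat.ac.cn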

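  • Funding:
    National Natural Science Foundation of China (41771441); Basic Research Project of the Shenzhen Science and Technology Innovation Committee (JCYJ20170307164104491); Natural Science Foundation of Guangdong Province (2016A050503035)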

Detecting Individual Stay Areas from Mobile Phone Location Data Based on Moving Windows

LIN Nan1,2, YIN Ling1,*, ZHAO Zhiyuan1,3

  1. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  3. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing of Wuhan University, Wuhan 430079, China
  • Received: 2018-01-31  Revised: 2018-03-09  Online: 2018-06-20  Published: 2018-07-12
  • Contact: YIN Ling  E-mail: nan.lin@siat.ac.cn; yinling@siat.ac.cn
  • Supported by:
    National Natural Science Foundation of China, No. 41771441; Basic Research Project of Shenzhen City, No. JCYJ20170307164104491; Natural Science Foundation of Guangdong Province, No. 2016A050503035

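Abstract (from the Chinese):

The widespread adoption of mobile phones has made mobile phone location data an important emerging data source for analyzing individual spatiotemporal behavior, and such data are increasingly applied in research on population management, urban planning, transportation analysis, and epidemic prevention and control. Identifying an individual's stay areas from mobile phone location data is a fundamental step in many studies based on such data. However, commonly used mobile phone location data have relatively low positioning accuracy and often contain noise caused by location oscillation and location drift, which makes stay areas harder to identify. To improve the accuracy of identifying individual stay areas from mobile phone location data, this study draws on the spatiotemporal continuity of individual behavior and proposes an incremental clustering algorithm based on a moving window. Experimental results show that, compared with the commonly used ST-DBSCAN and SMoT algorithms, the proposed moving-window clustering algorithm improves accuracy by up to 35% on mobile phone location data with sparse sampling intervals. Because of privacy concerns, the large-scale mobile phone location datasets used in current research and applications tend to have low temporal resolution. The proposed algorithm therefore has a fairly wide range of application scenarios, can strengthen the reliability of the many research results that rely on mobile phone users' stay areas, and provides key technical support for the broad and sound use of mobile phone location data.

Keywords (from the Chinese): mobile phone location data, data noise, trajectory analysis, clustering, stay area identification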

Abstract:

With the development and popularization of mobile phones, mobile phone location data have become an important data source for analyzing individual mobility. These data support studies at a fine spatiotemporal scale in fields such as population management, urban planning, transportation analysis, and health intervention. Detecting individual stay areas is a basic and important step in many such studies. However, the coarse spatial and temporal resolution of raw mobile phone location data, together with noise caused by location oscillation and location drift, makes it difficult to detect stay areas effectively. Considering the spatiotemporal continuity of individual behavior, this study proposes an incremental clustering algorithm based on a moving window to improve the accuracy of detecting individual stay areas from mobile phone location data. Specifically, the algorithm first sorts the raw records in chronological order. It then examines adjacent records consecutively against a given distance threshold and adds records that satisfy the threshold to the current cluster. For each unqualified record, the algorithm extracts the series of records that fall within a moving window and uses the spatial distances of these records as the clustering criterion. The time interval between the unqualified record and the selected records must be less than a given time threshold, which is also the width of the moving window. In this step, unqualified records that match the detection rules are treated as location drift or location oscillation records and aggregated into the current cluster; unqualified records that do not match the rules are excluded from the current cluster, and the algorithm starts a new cluster for them. Finally, the algorithm computes the location and temporal information of each valid cluster as the parameters of the corresponding stay area and constructs a stay area sequence for each mobile phone user. We compared the proposed algorithm with the ST-DBSCAN and SMoT algorithms on a mobile phone location dataset from Shenzhen, which is a type of Call Detail Records (CDR). The results show that, compared with the other two algorithms, the proposed algorithm improves the accuracy of detecting individual stay areas from sparse mobile phone location data by up to 35%. Because of privacy concerns involving governments and telecom operators, the large-scale mobile phone location data used in recent research usually have a sparse temporal resolution. The proposed algorithm can therefore improve the effectiveness of stay area detection and provide reliable results for many studies based on mobile phone location data.

Key words: mobile phone location data, data noise, trajectory analysis, incremental clustering, stay area detection
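
The abstract describes the algorithm only at a high level, so the Python sketch below is one possible reading of those steps, not the authors' implementation. Everything specific in it is an assumption made for illustration: the `Record` type, the function and parameter names, the default thresholds, the choice to compare each record with the centroid of the current cluster's qualified records, the rule that an unqualified record counts as drift or oscillation when a later record inside the time window returns to the cluster, and the minimum stay duration that makes a cluster "valid".

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class Record:
    """Hypothetical location record: timestamp in seconds, lon/lat in degrees."""
    t: float
    lon: float
    lat: float


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(a))


def centroid(records):
    """Mean lon/lat of a non-empty list of records."""
    n = len(records)
    return sum(r.lon for r in records) / n, sum(r.lat for r in records) / n


def returns_within_window(records, i, c_lon, c_lat, dist_m, window_s):
    """Assumed detection rule: does any record after index i, less than
    window_s seconds later, fall back within dist_m of the cluster centroid?"""
    for nxt in records[i + 1:]:
        if nxt.t - records[i].t >= window_s:
            break
        if haversine_m(nxt.lon, nxt.lat, c_lon, c_lat) <= dist_m:
            return True
    return False


def detect_stay_areas(records, dist_m=500.0, window_s=1800.0, min_stay_s=1800.0):
    """Return a chronological list of stay areas (dicts) for one user."""
    records = sorted(records, key=lambda r: r.t)  # step 1: chronological order
    clusters = []                                 # list of (core, members) pairs
    core, members = [], []                        # core = qualified records only
    for i, rec in enumerate(records):
        if not core:
            core, members = [rec], [rec]
            continue
        c_lon, c_lat = centroid(core)
        if haversine_m(rec.lon, rec.lat, c_lon, c_lat) <= dist_m:
            core.append(rec)                      # step 2: qualified record
            members.append(rec)
        elif returns_within_window(records, i, c_lon, c_lat, dist_m, window_s):
            members.append(rec)                   # step 3: absorbed as noise
        else:
            clusters.append((core, members))      # step 3: start a new cluster
            core, members = [rec], [rec]
    if core:
        clusters.append((core, members))

    stays = []                                    # step 4: summarize valid clusters
    for core, members in clusters:
        start, end = members[0].t, members[-1].t
        if end - start >= min_stay_s:
            lon, lat = centroid(core)
            stays.append({"lon": lon, "lat": lat, "start": start, "end": end})
    return stays
```

In this sketch, `window_s` plays the role of the moving-window width: a wider window lets more drift or oscillation records be absorbed into the current cluster, at the risk of merging a short genuine trip into a stay. Noise records extend a stay's time span but are left out of its centroid, which is a design choice of the sketch rather than something the abstract specifies.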