Journal of Geo-information Science ›› 2017, Vol. 19 ›› Issue (6): 763-771. doi: 10.3724/SP.J.1047.2017.00763

• Theory and Methods of Geo-information Science •

Errors in Estimating Urban 24-h Population Distribution Using Mobile Phone Call Location Data

YIN Ling1,2, JIANG Renrong1,3, ZHAO Zhiyuan2,4, SONG Xiaoqing2,4, LI Xiaoming1,3

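    1. Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Land and Resources, Shenzhen 518034
    2. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055
    3. Shenzhen Digital City Engineering Research Center, Shenzhen 518034
    4. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079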
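  • Received: 2016-09-11 Revised: 2016-10-16 Published: 2017-06-20 Online: 2017-06-20
  • About the author:
    YIN Ling (1981-), female, from Chongqing, Ph.D., associate research fellow; research interest: spatiotemporal data analysis. E-mail: yinling@siat.ac.cn

  • Supported by:
    Open Fund of the Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Land and Resources (KF-2015-01-052); National Natural Science Foundation of China (41301440); Basic Research Program of the Shenzhen Science and Technology Innovation Commission (JCYJ20140610151856728, JCYJ20140610151856729)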

Exploring the Bias of Estimating 24-hour Population Distributions Using Call Detail Records

YIN Ling1,2,*, JIANG Renrong1,3, ZHAO Zhiyuan2,4, SONG Xiaoqing2,4, LI Xiaoming1,3

    1. Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Land and Resources of China, Shenzhen 518034, China
    2. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
    3. Shenzhen Digital City Engineering Research Center, Shenzhen 518034, China
    4. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
  • Received: 2016-09-11 Revised: 2016-10-16 Online: 2017-06-20 Published: 2017-06-20
  • Contact: YIN Ling E-mail: yinling@siat.ac.cn

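Abstract:

Mobile phone call location data have become a widely used data source for studying human activity around the world, and they offer a new way to estimate population distribution at high spatiotemporal resolution. However, such data are sampled irregularly and sparsely, so the population distribution they reflect may contain errors. Taking Shenzhen as a case study, this study is the first to use mobile signaling data with continuous 24-h positioning to quantify, along both the temporal and the geographic dimension, the bias in 24-h population distributions estimated from mobile phone call location data. It also examines how excluding low-frequency users affects this bias. The study shows that, during the hours when residents are active, the median relative error lies between 25% and 30% when the distribution of calling users is used to estimate the population distribution; that land use type within the city is significantly associated with the bias of caller-based population estimates; and that excluding low-frequency users slightly reduces the influence of land use on the bias. These results help clarify the limitations and applicability of mobile phone call location data for estimating population distribution at high spatiotemporal resolution, and they provide a scientific basis for using such data appropriately in related research and applications.

Keywords: mobile phone data, population distribution, bias, land use, Shenzhen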

Abstract:

Call detail records (CDRs) have been widely used to study human activities around the world. They offer a new way to estimate population distribution at high spatiotemporal resolution. However, CDR samples are irregular and sparse, which can bias the derived population distribution. This study is the first to use a mobile signaling dataset of users tracked continuously over 24 hours as a benchmark for evaluating the bias in population distributions derived from CDRs. Taking Shenzhen City as an example, it quantifies the relative errors of 24-hour population distributions along both the temporal and the geographical dimension, and it discusses how excluding low-frequency call users affects these errors. The study found that the median relative error lies between 25% and 30% when caller volume is used to estimate population distribution during the hours when people are active, and that the errors increase during sleeping hours. Researchers and practitioners should be aware of this bias. The study also demonstrated that urban land use types are strongly related to the estimation errors of population distributions derived from CDRs. In particular, the population on rural residential land and industrial land is significantly underestimated, while that on transportation land is highly overestimated. For applications such as emergency evacuation or facility allocation that rely on populations derived from CDRs, these results can support the correction of estimation errors and help allocate rescue support or public resources more appropriately. Finally, the study showed that excluding low-frequency call users slightly mitigates the impact of land use on the estimation errors, which suggests that low-frequency users should be excluded in a rigorous way.
Overall, the findings of this study help clarify the limitations and suitability of applying CDRs to estimate population distribution at high spatiotemporal resolution, and they offer scientific support for using CDRs appropriately in research and applications.

Key words: mobile phone data, population distribution, bias, land use, Shenzhen City
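
The abstract summarizes its comparison as a median of relative errors between a CDR-based estimate and a signaling-based benchmark, but it does not give the formula. The sketch below is one plausible reading, not the authors' method: it assumes that counts in each spatial unit are first normalized to shares of the total, and that the relative error of a unit is the difference between the CDR share and the benchmark share, divided by the benchmark share. The function names, the unit labels, and all numbers are hypothetical and chosen only for illustration.

```python
from statistics import median


def relative_errors(cdr_counts, benchmark_counts):
    """Per-unit relative error of CDR-based population shares.

    Assumption (not from the paper): counts are normalized to shares,
    and error = (cdr_share - benchmark_share) / benchmark_share.
    """
    cdr_total = sum(cdr_counts.values())
    bench_total = sum(benchmark_counts.values())
    if cdr_total == 0 or bench_total == 0:
        raise ValueError("both datasets need at least one observed user")

    errors = {}
    for unit, bench in benchmark_counts.items():
        if bench == 0:
            continue  # relative error is undefined for an empty benchmark unit
        bench_share = bench / bench_total
        cdr_share = cdr_counts.get(unit, 0) / cdr_total
        errors[unit] = (cdr_share - bench_share) / bench_share
    return errors


def median_abs_relative_error(errors):
    """Median of the absolute relative errors across spatial units."""
    return median(abs(e) for e in errors.values())


# Made-up counts for a single hour; these are not results from the study.
benchmark = {"residential": 400, "industrial": 300, "transport": 100, "commercial": 200}
callers = {"residential": 30, "industrial": 20, "transport": 20, "commercial": 30}

errors = relative_errors(callers, benchmark)
summary = median_abs_relative_error(errors)
```

Under this definition a negative error means the unit is underestimated by the caller distribution and a positive error means it is overestimated. Repeating the calculation for each hour of the day would give an hourly series of median errors, and grouping the units by land use type would give the geographic comparison.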