Analysis of Classification Methods and Activity Characteristics of Urban Population Based on Social Media Data
About the author: ZHOU Yan (1976-), female, from Xi'an, Shaanxi; Ph.D., associate professor; her research focuses on geographic information system applications and spatial big data analysis. E-mail: zhouyan_gis@uestc.edu.cn
Received date: 2017-04-30
Revision requested date: 2017-07-20
Online published: 2017-10-09
Funding
National Key Research and Development Program of China (2016YFB0502300)
National Natural Science Foundation of China (41471332, 41571392)
Fundamental Research Funds for the Central Universities (ZYGX2015J113)
ZHOU Yan, LI Yanxi, HUANG Yueying, GENG Erhui. Analysis of Classification Methods and Activity Characteristics of Urban Population Based on Social Media Data[J]. Journal of Geo-information Science, 2017, 19(9): 1238-1244. DOI: 10.3724/SP.J.1047.2017.01238
Spatial information technology is entering the development stage of the Pan-spatial Information System, which extends the scope of spatial information systems from traditional surveying and mapping space to outer space, indoor space, microscopic space and other measurable spaces. Location big data is one of the important research objects of the Pan-spatial Information System, and it has become an effective way to understand people's lifestyles and urban dynamics. In this paper, we propose an urban population classification method based on location check-in data from social media, which differs from traditional methods based on socioeconomic attributes. First, we build a matrix model from the time series of the check-in data. Then, we analyze the temporal characteristics of users' check-in activities. The analytical process starts from spatiotemporal profiles, learns the different behaviors, and returns annotated profiles: we use the K-means clustering algorithm and the K-nearest neighbor (K-NN) algorithm to annotate each profile with a city user category (static resident, dynamic resident, commuter, or visitor). Finally, based on the classification results, we analyze the spatiotemporal behavior of the different city user categories and identify their differences and potential regularities. The method offers a new perspective for characterizing the composition and characteristics of the urban population and for studying urban spatiotemporal structure.
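The pipeline outlined in the abstract (time-series matrix, K-means clustering, K-NN annotation) can be illustrated with a minimal, standard-library sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the matrix layout (weekday/weekend rows by four 6-hour slots), the helper names `profile`, `kmeans` and `knn_label`, the toy check-in times, and the labels attached to the two example profiles.

```python
from collections import Counter
from datetime import datetime
import math

SLOTS = 4  # assumed layout: four 6-hour slots per day


def profile(checkins):
    """Flattened 2 x SLOTS matrix (weekday/weekend x time slot) of check-in shares."""
    m = [0.0] * (2 * SLOTS)
    for t in checkins:
        row = 1 if t.weekday() >= 5 else 0   # Saturday and Sunday form the weekend row
        col = t.hour * SLOTS // 24           # index of the 6-hour slot
        m[row * SLOTS + col] += 1
    total = sum(m)
    return [v / total for v in m] if total else m


def dist(a, b):
    """Euclidean distance between two flattened profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm, seeded with the first k points (deterministic)."""
    centers = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return assign, centers


def knn_label(x, labeled, k=3):
    """Majority vote among the k labeled profiles nearest to x."""
    nearest = sorted(labeled, key=lambda item: dist(x, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy data (hypothetical): weekday-evening users versus weekend-midday users.
evening = [datetime(2017, 4, d, 20) for d in (3, 4, 5)]   # Mon-Wed, 20:00
weekend = [datetime(2017, 4, d, 13) for d in (1, 2, 8)]   # Sat, Sun, Sat, 13:00
points = [
    profile(evening),
    profile(weekend),
    profile(evening + [datetime(2017, 4, 6, 9)]),    # plus one Thursday morning
    profile(weekend + [datetime(2017, 4, 9, 10)]),   # plus one Sunday morning
]
assign, centers = kmeans(points, k=2)

# Labels are arbitrary placeholders, standing in for an analyst's manual annotation.
labeled = [(points[0], "commuter"), (points[1], "visitor")]
predicted = knn_label(points[2], labeled, k=1)
```

In this sketch K-means only groups similar profiles; the category names enter through the small hand-labeled set that K-NN then propagates, which is one plausible reading of a semi-automatic labeling workflow rather than a reproduction of the authors' procedure.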
Fig. 1 Construction method of the time series matrix
Fig. 2 Examples of the time series matrix for each urban population category
Fig. 3 Individual check-in profile categories and classification quality: results of the semi-automatic labeling over reduced time windows of one, two, three and four months, respectively
Fig. 4 Check-in variation patterns of different population types across time periods
Fig. 5 Community detection results for the check-in trajectory networks of different population types
The authors have declared that no competing interests exist.