Assessing the Effectiveness of Social Media Data in Mapping the Distribution of Typhoon Disasters
About the author: LIANG Chunyang (梁春阳, born 1993), male, master's student; research interests: volunteered geographic information and emergency management. E-mail: pepsi8696@163.com
Received: 2018-01-02
Revision requested: 2018-04-08
Published online: 2018-06-20
Supported by
National Key Research and Development Program of China, No. 2016YFC0502905
Non-profit Research Projects of Fujian Province, No. 2015R1034-1
Development Foundation of Surveying, Mapping and Geoinformatics of Fujian Province, No. 2017JX03
LIANG Chunyang, LIN Guangfa, ZHANG Mingfeng, WANG Weiyang, ZHANG Wenfu, LIN Jinhuang, DENG Chao. Assessing the effectiveness of social media data in mapping the distribution of typhoon disasters[J]. Journal of Geo-information Science, 2018, 20(6): 807-816. DOI: 10.12082/dqxxkx.2018.180022.
When a disaster strikes, related social media data are generated continuously. These data carry rich disaster information and check-in locations, and so offer a new data source for timely awareness of the disaster situation. However, regional differences in the number of social media users, together with the way information spreads in cyberspace, raise new problems for analyzing the spatial point process that check-in data represent. Examples include the correspondence between check-in point density and the density of actual disaster events, the spatial relationships among check-in points, and the spatial heterogeneity of the point pattern and its influencing factors. Taking Typhoon Meranti (No. 14 of 2016) as a case study, we collected Sina Weibo posts from September 14 to 17, 2016 using the keywords "台风" (typhoon) and "莫兰蒂" (Meranti). We classified the post texts with Latent Dirichlet Allocation (LDA) and a Support Vector Machine (SVM), and built a database of disaster point events with check-in locations. On this basis, to address the spatial heterogeneity of the user distribution, we propose a weighting model based on user activity at the check-in points. Using the global spatial autocorrelation statistic Moran's I as an indicator, we compared the check-in data before and after weighting and found that check-in posts generated in the social network show clear spatial autocorrelation in real geographic space. Finally, using keywords such as "雨" (rain) and "停电" (power failure), we mapped the disaster from the weighted database and compared the maps with actual disaster records in space and time. The results show that the resulting map series can reflect the spatio-temporal trend of the typhoon disaster.
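The abstract uses the global Moran's I statistic as its indicator of spatial autocorrelation. As a reminder of what that statistic measures, here is a minimal sketch of the standard textbook formula, I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where S0 is the sum of all weights. The function name `morans_i`, the four-unit toy layout, and the binary adjacency weights are assumptions made for illustration; they are not the paper's data or its user-activity weighting model.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    den = sum(d * d for d in dev)
    if s0 == 0 or den == 0:
        raise ValueError("Moran's I is undefined for zero weights or constant values")
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    return (n / s0) * (num / den)


# Hypothetical toy layout: four units on a line; adjacent units are neighbors.
line_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([1, 2, 3, 4], line_weights)    # similar values are adjacent
alternating = morans_i([1, 0, 1, 0], line_weights)  # dissimilar values are adjacent
print(clustered, alternating)
```

In this toy case the smoothly increasing values give a positive I (about 0.33) and the alternating values give I = -1, which is the sense in which a positive statistic indicates clustering of similar values.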
Fig. 1 Daily check-in counts in selected cities
Fig. 2 Distribution of microblog user check-in activity
Fig. 3 Distribution of disaster-related microblog records with location information
Tab. 1 Average number of neighbors and threshold distance under inverse distance weighting
Minimum neighbors | Average neighbors | Threshold distance (km) |
---|---|---|
1 | 5.87 | 178.6 |
5 | 6.41 | 369.1 |
10 | 10.04 | 488.3 |
15 | 15 | 637.8 |
20 | 20 | 750.5 |
… | … | … |
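One plausible reading of the columns in Tab. 1 is that the threshold distance is the smallest band radius at which every point has at least the stated minimum number of neighbors, and that the average neighbor count is then measured at that radius. The sketch below illustrates that reading only; the page does not describe the authors' exact procedure. All function names and the four toy points (planar coordinates in km) are hypothetical.

```python
from math import hypot


def pairwise_distances(points):
    """Euclidean distance matrix for planar (x, y) points."""
    n = len(points)
    return [[hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
             for j in range(n)] for i in range(n)]


def threshold_for_min_neighbors(dist, k):
    """Smallest band radius at which every point has at least k neighbors."""
    kth_nearest = []
    for i, row in enumerate(dist):
        others = sorted(d for j, d in enumerate(row) if j != i)
        kth_nearest.append(others[k - 1])
    return max(kth_nearest)


def average_neighbors(dist, threshold):
    """Mean number of other points within `threshold` of each point."""
    n = len(dist)
    total = sum(1 for i in range(n) for j in range(n)
                if i != j and dist[i][j] <= threshold)
    return total / n


def inverse_distance_weights(dist, threshold):
    """w_ij = 1 / d_ij inside the band, 0 outside it and on the diagonal."""
    n = len(dist)
    return [[1.0 / dist[i][j] if i != j and 0 < dist[i][j] <= threshold else 0.0
             for j in range(n)] for i in range(n)]


# Hypothetical toy points in km: three close together and one outlier.
toy_points = [(0, 0), (1, 0), (2, 0), (10, 0)]
toy_dist = pairwise_distances(toy_points)
toy_threshold = threshold_for_min_neighbors(toy_dist, k=1)
toy_average = average_neighbors(toy_dist, toy_threshold)
toy_weights = inverse_distance_weights(toy_dist, toy_threshold)
print(toy_threshold, toy_average)
```

Under this reading a single isolated point forces a large threshold, so the average neighbor count can sit well above the required minimum, which is the same qualitative pattern as the first row of the table.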
Fig. 4 Average number of neighbors and global Moran's I under inverse distance weighting
Fig. 5 Average number of neighbors and global Moran's I under K-nearest-neighbor weighting
Fig. 6 Time series of maps from a fuzzy query with the keyword "rain"
Fig. 7 Time series of maps from a fuzzy query with the keyword "power failure"
Fig. 8 Moran's I of microblog records containing different disaster keywords in different time periods
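Figs. 6 and 7 are built from fuzzy keyword queries grouped into time slices. A minimal sketch of that kind of query follows, assuming a fuzzy query means a simple substring match (as with SQL `LIKE '%keyword%'`) and assuming fixed-length time windows. The record fields (`time`, `text`), the window length, and the four sample posts are invented for illustration and are not taken from the paper's database.

```python
from collections import Counter
from datetime import datetime, timedelta


def fuzzy_query(records, keyword):
    """Keep records whose text contains the keyword anywhere."""
    return [r for r in records if keyword in r["text"]]


def count_by_window(records, start, window_hours):
    """Count records per fixed-length window, indexed from `start`."""
    window = timedelta(hours=window_hours)
    counts = Counter()
    for r in records:
        if r["time"] < start:
            continue  # ignore posts before the study period
        counts[(r["time"] - start) // window] += 1
    return dict(counts)


# Hypothetical sample posts (invented for this sketch).
sample_posts = [
    {"time": datetime(2016, 9, 14, 22, 0), "text": "厦门开始下雨了"},
    {"time": datetime(2016, 9, 15, 3, 30), "text": "暴雨加大风,整夜没睡"},
    {"time": datetime(2016, 9, 15, 4, 0), "text": "小区停电了"},
    {"time": datetime(2016, 9, 15, 13, 0), "text": "雨停了,还在停电"},
]

rain_posts = fuzzy_query(sample_posts, "雨")
power_posts = fuzzy_query(sample_posts, "停电")
rain_counts = count_by_window(rain_posts, datetime(2016, 9, 14), window_hours=12)
print(len(rain_posts), len(power_posts), rain_counts)
```

Each window's matching records could then be mapped separately to produce a time series of maps. Note that a substring match is deliberately loose: the last sample post matches both keywords.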
The authors have declared that no competing interests exist.