Analysis and Comparative Study of the Evolution of Public Opinion on Social Media during Typhoon for Different User Groups
Jin Cheng (born 1995), male, from Huzhou, Zhejiang; master's student whose research focuses on coastal-zone disaster risk management. E-mail: jincheng95@zju.edu.cn
Received date: 2021-02-04
Revision requested date: 2021-07-13
Online published: 2022-02-25
Supported by
National Natural Science Foundation of China (41971019)
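Social media data can provide timely and effective information for typhoon tracking, rescue during disasters, and damage assessment. Existing studies often use techniques such as topic modeling and sentiment analysis to study opinion topics and sentiment changes on social media platforms (e.g., Sina Weibo) during typhoons. However, multi-dimensional validation of effectiveness at the provincial scale with an hourly time granularity is still lacking, and public opinion analyses have not distinguished between user groups. Taking Typhoon Lekima as an example and Sina Weibo data within Zhejiang Province as the research object, this paper first examines how effectively microblog data respond to typhoon disasters from three perspectives: word frequency analysis, spatiotemporal changes in attention to the typhoon, and responses to specific disaster events. Second, the Latent Dirichlet Allocation (LDA) topic model is used to mine topic information from microblog texts, and the topics are partitioned into communities with the Louvain algorithm. Third, a sentiment analysis method based on a custom sentiment dictionary is developed to compute a sentiment index; compared with SnowNLP, it predicts sentiment polarity more accurately. Finally, the differences between official and public users in topic focus and sentiment evolution on Sina Weibo during the typhoon are analyzed. The results show that: (1) at the provincial scale, microblog data effectively reflect typhoon dynamics and the spatiotemporal distribution of disasters; (2) topic changes in typhoon-related microblog texts reflect the shifting focus of public opinion across the stages of the disaster; (3) official microblog texts have a clearer topic community structure than public microblog texts; (4) negative sentiment in typhoon-related microblog texts increased markedly after the typhoon made landfall, with public microblogs responding emotionally to the disaster more promptly, while the sentiment expressed in official microblogs remained relatively positive throughout.
Jin Cheng, Wu Wenyuan, Chen Bairu, Yang Xuchao. Analysis and Comparative Study of the Evolution of Public Opinion on Social Media during Typhoon for Different User Groups[J]. Journal of Geo-information Science, 2021, 23(12): 2174-2186. DOI: 10.12082/dqxxkx.2021.210065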
Social media has been successfully applied to typhoon monitoring, on-site rescue, and disaster loss assessment. Previous studies mostly used topic modeling and sentiment analysis techniques to analyze the focus of public opinion and the evolution of sentiment on social media platforms during typhoons. However, these studies were usually conducted at large spatial scales and over long time spans, and they ignored differences in behavior patterns among user groups. To address these gaps, a case study of Typhoon Lekima was first carried out to verify how effectively microblogs respond to typhoon disasters in Zhejiang province from three perspectives: word frequency, spatiotemporal change of public attention to the typhoon, and public response to specific events. Secondly, the Latent Dirichlet Allocation (LDA) topic model was adopted to mine the text topics, whose community structure was partitioned by the Louvain algorithm. Thirdly, a custom sentiment dictionary was developed to calculate the sentiment index, and the resulting model was compared with SnowNLP in sentiment polarity prediction. Finally, we investigated the differences between official microblogs and public microblogs in topic focus and sentiment evolution. The results indicated that microblogs were capable of tracking typhoon dynamics and reflecting the spatiotemporal distribution of hazards within the provincial region. The LDA results showed that the percentage of microblogs on the public dynamics topic was high during the day and low at night; the percentage on the warning topic followed a downward trend; the percentage on disaster events rose significantly after the typhoon made landfall; and the percentage on rescue activities peaked in the late period of the typhoon. The topics of official microblogs had a clearer community structure than those of public microblogs, but this characteristic may blur when microblogs from the two groups are mixed. Negative sentiment on Sina Weibo intensified significantly during the landfall period, and the public responded emotionally to typhoon disasters more promptly, while the sentiment index of official microblogs was always higher.
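As an illustration of what a dictionary-based sentiment index can look like, the following is a minimal sketch only. The lexicon entries, the one-token negation rule, and the `threshold` parameter are hypothetical choices made for this example; the abstract does not describe the authors' custom dictionary or scoring rules, so this should not be read as their method.

```python
# Toy lexicons: hypothetical entries, NOT the authors' custom dictionary.
POSITIVE = {"safe": 1.0, "rescued": 1.0, "thanks": 1.0}
NEGATIVE = {"flooded": -1.0, "trapped": -1.0, "afraid": -1.0}
NEGATORS = {"not", "no"}


def sentiment_index(tokens):
    """Sum lexicon weights over tokens; a negator flips only the next token."""
    score = 0.0
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        weight = POSITIVE.get(tok, 0.0) + NEGATIVE.get(tok, 0.0)
        score += -weight if negate else weight
        negate = False  # negation applies to the immediately following token only
    return score


def polarity(score, threshold=0.5):
    """Map a score to one of three classes (threshold is an assumed value)."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    post = ["roads", "flooded", "people", "trapped"]
    print(sentiment_index(post), polarity(sentiment_index(post)))
```

Averaging such scores over all microblogs posted in each hour would give one possible hourly sentiment curve per user group, which is the kind of series the official-versus-public comparison relies on.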
Tab. 1 Number of typhoon-related microblogs crawled
Typhoon number | Typhoon name | Collection period | Number of valid microblogs |
---|---|---|---|
1904 | Mun | 2019-07-02 00:00 to 2019-07-05 00:00 | 303 |
1907 | Wipha | 2019-07-31 00:00 to 2019-08-03 00:00 | 369 |
1909 | Lekima | 2019-08-09 00:00 to 2019-08-12 00:00 | 72 514 |
1911 | Bailu | 2019-08-24 00:00 to 2019-08-27 00:00 | 1404 |
1914 | Kajiki | 2019-09-01 00:00 to 2019-09-04 00:00 | 396 |
1919 | Hagibis | 2019-10-11 00:00 to 2019-10-14 00:00 | 784 |
Fig. 3 Changes in the number and relative number of typhoon-related microblogs in the prefecture-level cities of Zhejiang province during Typhoon Lekima
Tab. 2 Sentiment polarity assessment results of the sentiment dictionary-based model and SnowNLP
Metric | Dictionary-based, positive | Dictionary-based, neutral | Dictionary-based, negative | SnowNLP, positive | SnowNLP, neutral | SnowNLP, negative |
---|---|---|---|---|---|---|
Precision | 0.873 | 0.713 | 0.835 | 0.472 | 0.348 | 0.547 |
Recall | 0.785 | 0.760 | 0.881 | 0.608 | 0.204 | 0.517 |
F-measure | 0.826 | 0.736 | 0.857 | 0.532 | 0.257 | 0.532 |
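The F-measure row of Tab. 2 can be checked against the precision and recall rows. The sketch below assumes the F-measure is the usual balanced F1 score (the harmonic mean of precision and recall); the page does not state the formula, so that is an assumption. The function name `f_measure` and the dictionary layout are chosen for this example.

```python
def f_measure(precision, recall):
    """Balanced F1: harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) copied from Tab. 2.
TABLE_2 = {
    ("dictionary", "positive"): (0.873, 0.785, 0.826),
    ("dictionary", "neutral"): (0.713, 0.760, 0.736),
    ("dictionary", "negative"): (0.835, 0.881, 0.857),
    ("snownlp", "positive"): (0.472, 0.608, 0.532),
    ("snownlp", "neutral"): (0.348, 0.204, 0.257),
    ("snownlp", "negative"): (0.547, 0.517, 0.532),
}

if __name__ == "__main__":
    for (model, label), (p, r, reported) in TABLE_2.items():
        # Recomputed values use rounded inputs, so small differences are expected.
        print(model, label, round(f_measure(p, r), 3), reported)
```

Under that assumption, the recomputed values agree with the reported ones to within about 0.001, which is consistent with the precision and recall entries having been rounded to three decimals.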