Journal of Geo-information Science, 2021, Vol. 23, Issue (2): 331-340. doi: 10.12082/dqxxkx.2021.200226

• Dynamic Analysis of Epidemic Public Opinion •

Research on Public Opinion Analysis Methods in Major Public Health Events: Taking the COVID-19 Epidemic as an Example

HAN Keke1,2, XING Ziyao1,2, LIU Zhe1,2, LIU Junming1,2, ZHANG Xiaodong1,2,*

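  1. College of Land Science and Technology, China Agricultural University, Beijing 100083
  2. Key Laboratory of Remote Sensing for Agri-Hazards, Ministry of Agriculture and Rural Affairs, China Agricultural University, Beijing 100083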
  • Received: 2020-05-07 Revised: 2020-07-20 Online: 2021-02-25 Published: 2021-04-25
  • Corresponding author: ZHANG Xiaodong, E-mail: hankeke2019@163.com; zhangxd@cau.edu.cn
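  • About the author: HAN Keke (born 1998), female, from Shangqiu, Henan; master's student, mainly engaged in research on disaster information mining and analysis. E-mail: hankeke2019@163.com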
  • Funding:
    National Key R&D Program of China (2018YFC1508901-3)

Research on Public Opinion Analysis Methods in Major Public Health Events: Take COVID-19 Epidemic as an Example

HAN Keke1,2, XING Ziyao1,2, LIU Zhe1,2, LIU Junming1,2, ZHANG Xiaodong1,2,*

  1. College of Land Science and Technology, China Agricultural University, Beijing 100083, China
  2. Key Laboratory of Remote Sensing for Agri-Hazards, Ministry of Agriculture and Rural Affairs, Beijing 100083, China
  • Received: 2020-05-07 Revised: 2020-07-20 Online: 2021-02-25 Published: 2021-04-25
  • Corresponding author: ZHANG Xiaodong, E-mail: hankeke2019@163.com; zhangxd@cau.edu.cn
  • Supported by:
    National Key R&D Program of China (2018YFC1508901-3)

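Abstract:

Since December 2019, the COVID-19 epidemic has swept rapidly across the globe. As of 16:40 Beijing time on May 10, 2020, there were 4,115,662 cumulative confirmed cases worldwide, and the epidemic had become a major topic of global attention. Social media platforms such as Weibo have become an important channel for spreading information about the epidemic and one of the effective sensors of public sentiment. In-depth mining and analysis of Weibo information can not only help assess the characteristics of public opinion, but also help the government provide targeted guidance for public sentiment and manage public opinion reasonably. Therefore, this paper collected more than 330,000 Sina Weibo posts about COVID-19 from January 18, 2020 to January 28, 2020. Based on algorithms including Louvain- and K-means-based spatial clustering and improved BTM topic word extraction, it uses the hot-topic information that users follow and their sentiment characteristics as regional labels, and constructs a public opinion evaluation method that reflects sentiment characteristics, regional association, and hot-topic attention. This achieves location-based information fusion and makes it possible to analyze differences in public opinion characteristics and topics of concern across regions. The study shows that the BTM topic word extraction method based on BERT word vectors can effectively compensate for the shortcomings of traditional topic word extraction, such as heavy computation and data redundancy, and has stronger expressive power in hotspot mining. Hot topics of concern differ to some extent across regions, and a multi-scale public opinion analysis method that combines the provincial level, the city level, and Louvain-K-means-based spatial clustering can show the public opinion characteristics of different regions from all angles. The public opinion analysis method proposed in this paper can effectively reflect regional public opinion characteristics and provide a reference for public opinion analysis in major public health events.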

Key words: COVID-19, Weibo, sentiment analysis, spatial clustering, public opinion, extraction, hotspot mining, web crawler

Abstract:

Since December 2019, COVID-19 has rapidly swept the world. As of 16:40 Beijing time on May 10, 2020, the number of confirmed COVID-19 cases worldwide had reached 4,115,662, and the epidemic had become a major global issue. Social media platforms such as Weibo have become an important channel for information transmission and an effective sensor of public sentiment. In-depth mining and analysis of Weibo information can not only characterize public opinion, but also help the government provide targeted guidance on public sentiment and manage public opinion properly. Therefore, this study collected more than 330,000 Sina Weibo posts about COVID-19 from January 18, 2020 to January 28, 2020. Using a spatial clustering method based on Louvain and K-means and an improved BTM subject word extraction algorithm, users' attention information and emotional characteristics are labeled with their locations. A public opinion evaluation method is thus constructed by integrating users' location information, which makes it possible to analyze the characteristics of public opinion and the differences in topics of concern across regions. Our results show that the characteristics of public opinion in different regions can be comprehensively evaluated using the spatial clustering method based on Louvain and K-means. The BTM subject word extraction method based on BERT word vectors can effectively compensate for the disadvantages of traditional subject word extraction methods, which require heavy computation and suffer from data redundancy, and thus has stronger expressive power in user data mining. The hot topics of concern differ to some extent across regions. The public opinion analysis method proposed in this paper can effectively reflect the public opinion characteristics of different regions and provide a reference for public opinion analysis in major public health events.

Key words: COVID-19, Weibo, sentiment analysis, spatial clustering, public opinion, subject word extraction, hotspot mining, web crawler
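
The abstract describes labeling regions with sentiment characteristics after spatial clustering. As a rough illustration only, the sketch below runs a plain K-means on post coordinates and averages a sentiment score per cluster. It is not the authors' implementation: the Louvain step and the BTM/BERT topic extraction are omitted, Euclidean distance on raw longitude/latitude is a simplification, and the coordinates, sentiment scores, and function names are all hypothetical.

```python
import math
from collections import defaultdict


def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means on 2-D points; returns (centroids, labels)."""
    # Deterministic farthest-point initialisation (an assumption, chosen
    # here only so that the example is reproducible).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append((
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                ))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


def mean_sentiment_by_cluster(labels, sentiments):
    """Average the sentiment scores of the posts in each cluster."""
    grouped = defaultdict(list)
    for lab, score in zip(labels, sentiments):
        grouped[lab].append(score)
    return {lab: sum(v) / len(v) for lab, v in grouped.items()}


# Hypothetical (longitude, latitude) pairs and sentiment scores in [-1, 1].
points = [(116.4, 39.9), (116.3, 40.0), (114.3, 30.6), (114.2, 30.5)]
sentiments = [0.6, 0.4, -0.2, -0.4]

centroids, labels = kmeans(points, 2)
means = mean_sentiment_by_cluster(labels, sentiments)
for lab in sorted(means):
    print(f"cluster {lab}: centroid={centroids[lab]}, mean sentiment={means[lab]:.2f}")
```

On real data, a geodesic distance and a principled choice of the number of clusters would probably be needed; the per-cluster mean is just one simple way to attach a sentiment label to a region.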