Journal of Geo-information Science ›› 2021, Vol. 23 ›› Issue (2): 341-350. doi: 10.12082/dqxxkx.2021.200248

• Dynamic Analysis of Pandemic Public Opinion •

Analysis of Public Opinion Evolution in COVID-19 Pandemic from a Perspective of Sentiment Variation

ZHANG Chen1, MA Xiangyuan1, ZHOU Yang1, GUO Renzhong1,2,*

  1. School of Resource and Environmental Sciences, Wuhan University, Wuhan 430079, China
  2. Research Institute for Smart Cities, School of Architecture and Urban Planning, Shenzhen University, Shenzhen 518060, China
  • Received: 2020-05-20 Revised: 2020-07-21 Online: 2021-02-25 Published: 2021-04-25
  • Contact: GUO Renzhong E-mail: czhang0315@whu.edu.cn; guorz2013@qq.com
  • About the author: ZHANG Chen (1997- ), male, born in Xinyang, Henan Province, master's student; his research covers place-name and address matching and knowledge graph construction. E-mail: czhang0315@whu.edu.cn
  • Supported by:
    China Postdoctoral Science Foundation (2019M663070)

Abstract:

As a Public Health Emergency of International Concern (PHEIC), the COVID-19 pandemic has drawn intense attention on social media worldwide. Weibo comments collect users' perceptions, attitudes, tendencies, and behaviors in response to pandemic-related events, and they provide a timely, time-ordered text corpus for studying the evolution of public opinion through sentiment analysis. In this paper, we use the comments posted under the daily pandemic bulletins of the People's Daily account on Weibo between January 23 and April 8, 2020 as our research data. First, we extract the sentiment tendency of each comment with SnowNLP, a Chinese natural language processing tool, and classify the comments as positive or negative. Second, we cluster the corpus with the Single-Pass clustering algorithm to identify hot topics about the pandemic. Finally, we use the Louvain community detection algorithm to mine information about the level of public attention. (1) In the temporal dimension, the daily sentiment trend shows that the public went through three emotional phases: anxiety and fear (January 23 to February 18), steadiness and confidence (February 19 to March 15), and tension and concern (March 16 to April 8). (2) In the spatial dimension, a joint analysis of the number of participating users, the emotional state of users in each province, and the emotional projection onto each province mentioned in comments shows clear differences among provinces in public attention and sentiment. Weibo users in more severely affected areas participate more, and both their emotional state and the emotional projection onto their areas are lower. Meanwhile, users in the worst-hit areas tend to have a higher impact on the evolution of public opinion.
The results show that Weibo users in Guangdong Province and Heilongjiang Province pay close attention to the pandemic and have low average emotional state and emotional projection values, which suggests that the two provinces are still under great pressure for pandemic prevention and control. Hubei Province is the most affected by the pandemic and has a low emotional state value but a high emotional projection value, so we speculate that comments about Hubei Province are largely encouraging and appreciative. In addition, the northwestern region has relatively few confirmed cases and fewer commenting users than other regions, but its average emotional state and emotional projection values are higher. The research applies natural language processing and network community detection algorithms to construct a methodological framework of public opinion analysis for social media comments. The framework is promising, as it provides theoretical and practical support for related research on major public events.

Key words: COVID-19 pandemic, Sina Weibo comments, sentiment analysis, text clustering, evolution of public opinions, community network structure, temporal and spatial data analysis, web crawler
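
The Single-Pass clustering step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words representation, the cosine similarity measure, the threshold value of 0.5, and the names `cosine` and `single_pass` are all assumptions made for illustration, and the toy English tokens stand in for segmented Weibo comments.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(docs, threshold=0.5):
    """Cluster tokenized documents in one pass over the data.

    Each document joins the most similar existing cluster if the
    similarity reaches `threshold` (an assumed value); otherwise it
    starts a new cluster. Returns one cluster label per document.
    """
    centroids = []  # summed term counts of each cluster
    labels = []
    for tokens in docs:
        vec = Counter(tokens)
        best, best_sim = -1, 0.0
        for i, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= threshold:
            centroids[best].update(vec)  # fold the document into the cluster
            labels.append(best)
        else:
            centroids.append(vec)  # open a new cluster seeded by this document
            labels.append(len(centroids) - 1)
    return labels


# Hypothetical toy comments, already split into tokens.
example = [
    ["wuhan", "stay", "strong"],
    ["stay", "strong", "wuhan", "hubei"],
    ["masks", "sold", "out"],
    ["masks", "out", "of", "stock"],
]
print(single_pass(example))  # [0, 0, 1, 1]
```

Because each comment is compared only with the clusters seen so far, the result depends on input order and on the threshold; a higher threshold yields more, smaller topics.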