Journal of Geo-information Science ›› 2019, Vol. 21 ›› Issue (7): 1009-1017. doi: 10.12082/dqxxkx.2019.180701

• Original Article •

Extracting Rainstorm Disaster Information from Microblogs Using Convolutional Neural Network

Shuhan LIU1, Yandong WANG1,2,3,*, Xiaokang FU1

    1. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
    2. Collaborative Innovation Center for Remote Sensing, Wuhan University, Wuhan 430079, China
    3. Faculty of Geomatics, East China University of Technology, Nanchang 330013, China
  • Received: 2018-12-28 Revised: 2019-03-25 Online: 2019-07-25 Published: 2019-07-25
  • Contact: Yandong WANG
  • Supported by:
    The National Key Research Program of China, No. 2016YFB0501403; The National Natural Science Foundation of China, No. 41271399; China Special Fund for Surveying, Mapping and Geo-information Research in the Public Interest, No. 201512015


Social media plays an increasingly significant role in disaster management, thanks to its real-time nature and location-based services. When a disaster happens, a large volume of images and texts carrying temporal and geographic information quickly floods social media networks. As a complement to traditional disaster management, social media can provide researchers with a wealth of dynamic, near-real-time disaster information. Current studies mostly rely on traditional machine learning to process social media disaster data. Yet in many cases deep learning outperforms traditional machine learning in automatic feature extraction, and it can be used to extract and classify disaster information from social media. This paper presents a method for extracting disaster information from social media data using a Convolutional Neural Network (CNN). To obtain word vectors for social media texts, a word2vec model was trained on a corpus of social media data about disaster events. The vectorized microblog sentences and their corresponding disaster categories were then used as input to a CNN-based multi-class classification model. After training and optimization, we used this model to extract disaster information from large social media data streams. In the experiment, we combined the Sina Weibo API with a web crawler and collected over twenty thousand microblog texts on the theme of the "Beijing Heavy Rainstorm" of 2012. Apart from the irrelevant texts, we divided the data into seven categories. The topic classification model for rainstorm disaster information was built and trained on a small amount of labeled Sina Weibo data. The experiments achieved an F-value of over 80% and a precision of over 90%, demonstrating the validity of applying the model to our dataset.
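The classification step described above (word vectors in, category probabilities out) can be illustrated with a minimal forward pass of a text CNN. This is a sketch under stated assumptions, not the authors' implementation: the embedding size, filter widths, random weights, and the choice to treat "irrelevant" as an eighth class are all hypothetical, and random vectors stand in for trained word2vec embeddings.

```python
import math
import random

EMBED_DIM = 4     # assumed toy embedding size; real word2vec vectors are much larger
NUM_CLASSES = 8   # assumed: seven disaster categories plus one "irrelevant" class


def conv_max_pool(sentence, kernel, bias):
    """Slide one filter over consecutive word windows, apply ReLU, keep the maximum."""
    window = len(kernel)  # number of consecutive words the filter spans
    best = 0.0            # ReLU outputs are never negative, so 0.0 is a safe floor
    for start in range(len(sentence) - window + 1):
        total = bias
        for offset in range(window):
            word_vec = sentence[start + offset]
            total += sum(w * x for w, x in zip(kernel[offset], word_vec))
        best = max(best, max(0.0, total))
    return best  # stays 0.0 if the sentence is shorter than the filter


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def classify(sentence, filters, dense_weights, dense_bias):
    """Convolution + max-over-time pooling, then a dense layer and softmax."""
    features = [conv_max_pool(sentence, kernel, bias) for kernel, bias in filters]
    scores = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(dense_weights, dense_bias)
    ]
    return softmax(scores)


def random_matrix(rng, rows, cols):
    """Hypothetical helper: untrained weights, used only to make the sketch runnable."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
sentence = random_matrix(rng, 6, EMBED_DIM)  # stand-in for a six-word microblog
filters = [(random_matrix(rng, width, EMBED_DIM), 0.0) for width in (2, 3, 4)]
dense_weights = random_matrix(rng, NUM_CLASSES, len(filters))
dense_bias = [0.0] * NUM_CLASSES

probabilities = classify(sentence, filters, dense_weights, dense_bias)
predicted_class = max(range(NUM_CLASSES), key=lambda i: probabilities[i])
print(predicted_class, [round(p, 3) for p in probabilities])
```

In practice the filter and dense-layer weights would be learned from the labeled microblogs rather than drawn at random, and a deep learning framework would replace the hand-written loops.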
Moreover, the model also performed well when used to classify newly crawled Weibo data on Beijing's 2016 rainstorm. Based on the rainstorm emergency topics identified by the model, we mined time-series and spatial features in depth to detect the phases of disaster development. Visualization and statistical analysis showed that the time-series analysis of the disaster was consistent with its actual development, indicating the effectiveness of the CNN-based method in monitoring the Beijing rainstorm. The study shows that using deep learning to extract disaster emergency information from social media is effective and feasible, providing a new approach to real-time disaster emergency management.
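One simple way to picture the time-series step is to count classified microblogs per topic per hour and look for the busiest hour of each topic. The sketch below assumes each post has already been assigned a topic by the classifier. The records, the placeholder labels `topic_a` and `topic_b`, and the hourly bin size are invented for illustration and do not come from the paper's data.

```python
from collections import Counter, defaultdict
from datetime import datetime


def hourly_counts(posts):
    """Count posts per topic in one-hour bins.

    `posts` is an iterable of (timestamp string, topic label) pairs.
    """
    counts = defaultdict(Counter)
    for timestamp, topic in posts:
        moment = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        hour = moment.replace(minute=0, second=0)  # truncate to the start of the hour
        counts[topic][hour] += 1
    return counts


def peak_hour(counter):
    """Return the hour with the most posts; ties go to the earliest hour."""
    return min(counter, key=lambda hour: (-counter[hour], hour))


# Toy records with placeholder topic labels (not real data).
sample_posts = [
    ("2012-07-21 18:05:00", "topic_a"),
    ("2012-07-21 18:40:00", "topic_a"),
    ("2012-07-21 19:10:00", "topic_a"),
    ("2012-07-21 19:15:00", "topic_b"),
    ("2012-07-21 20:30:00", "topic_b"),
    ("2012-07-21 20:45:00", "topic_b"),
]

counts = hourly_counts(sample_posts)
for topic in sorted(counts):
    print(topic, peak_hour(counts[topic]).strftime("%H:%M"), dict(counts[topic]))
```

Comparing when different topics peak is one plausible way to separate phases of a disaster, although the abstract does not specify the exact procedure used.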

Key words: convolutional neural network, Sina Weibo, short text classification, rainstorm disaster, information extraction