Journal of Geo-information Science ›› 2019, Vol. 21 ›› Issue (1): 2-13. doi: 10.12082/dqxxkx.2019.180680

Special Issue: Geographic Big Data


Public Security Event Themed Web Text Structuring

Tao PEI1,2,7, Sihui GUO1,2, Yecheng YUAN1,*, Xueying ZHANG3,7, Wen YUAN1, Ang GAO4, Zhiyuan ZHAO5, Cunjin XUE6

    1. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
    2. University of Chinese Academy of Sciences, Beijing 100049, China
    3. Key Laboratory of Virtual Geographic Environment, Nanjing Normal University, Ministry of Education, Nanjing 210023, China
    4. China National Institute of Standardization, Beijing 100088, China
    5. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing of Wuhan University, Wuhan 430079, China
    6. Key Laboratory of Digital Earth Science, Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100094, China
    7. Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China
  • Received: 2018-12-01 Revised: 2018-12-24 Online: 2019-01-20 Published: 2019-01-20
  • Contact: Yecheng YUAN
  • Supported by:
    National Natural Science Foundation of China, Nos. 41525004, 41421001


Information about public security events contained in web text can serve as a data source for event evaluation and disaster relief, provided that it can be structured into a relational database. Although previous research can extract event information into separate attributes, determining which specific event each piece of attribute information belongs to remains unsolved. To address this problem, this paper proposes a theoretical framework for structuring web text themed on public security events. The framework consists of three parts. First, an event semantic model is used to construct a seismic event semantic framework, which defines the abstract elements of an event and their semantic relationships. Taking seismic events as an example, the spatial, time, attribute, and source elements are defined as the basic elements. The spatial element includes the latitude, longitude, depth, and location of the earthquake. The attribute element is further subdivided into four sub-elements: cause, result, behavior, and influence. Next, an annotation system is applied to typical event materials to label the semantic elements (e.g., the name of the place where an earthquake occurred), that is, to instantiate the abstract elements. The key to this step is labeling the relations between the elements and the specific event. Finally, a text information extraction algorithm structures the event text into event type, event name, event time, event location, and other attributes. The algorithm uses the materials labeled in the previous step as training data to optimize its parameters, which allows it to incorporate the linked information. The extracted text (e.g., words and phrases) is then normalized into structured information for further analysis. An event information mining platform that follows the whole framework was developed; it includes modules for webpage searching, text cleaning, event information extraction, visualization, and analysis. The platform processed the full set of Chinese webpages from 2014 and found 85 506 reports of seismic events.
Taking the Ludian earthquake in Yunnan as an example, we show the structuring process and the results for the related web text, which can serve as an important reference for disaster relief and for the analysis of public concern. With the platform, we demonstrate the structuring results for seismic text and the associated social concern across China; the platform can serve as a new tool for event information mining and analysis.

Key words: semantic framework, text parsing, social concern about events, seismic events, spatial search engine
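
The semantic framework described in the abstract (spatial, time, attribute, and source elements, with the attribute element split into cause, result, behavior, and influence) can be pictured as a small record type that flattens into one relational row per event. The sketch below is only an illustration under stated assumptions: the class names, field names, the `to_row` helper, and all example values are hypothetical and are not taken from the paper or its platform.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SpatialElement:
    # Spatial element: location name plus latitude, longitude, and depth.
    location: Optional[str] = None
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    depth_km: Optional[float] = None


@dataclass
class AttributeElement:
    # Attribute element with its four sub-elements; each holds text snippets.
    cause: List[str] = field(default_factory=list)
    result: List[str] = field(default_factory=list)
    behavior: List[str] = field(default_factory=list)
    influence: List[str] = field(default_factory=list)


@dataclass
class SeismicEvent:
    # One structured event; every extracted element is tied to this event.
    event_type: str
    event_name: str
    time: Optional[str] = None       # time element, kept as a string here
    space: SpatialElement = field(default_factory=SpatialElement)
    attributes: AttributeElement = field(default_factory=AttributeElement)
    source: Optional[str] = None     # source element, e.g. the reporting site


def to_row(event: SeismicEvent) -> dict:
    """Flatten one event into a dict shaped like a relational table row."""
    return {
        "event_type": event.event_type,
        "event_name": event.event_name,
        "event_time": event.time,
        "location": event.space.location,
        "latitude": event.space.latitude,
        "longitude": event.space.longitude,
        "depth_km": event.space.depth_km,
        "cause": "; ".join(event.attributes.cause),
        "result": "; ".join(event.attributes.result),
        "behavior": "; ".join(event.attributes.behavior),
        "influence": "; ".join(event.attributes.influence),
        "source": event.source,
    }


# Placeholder values only; these are not data from the paper.
example = SeismicEvent(
    event_type="earthquake",
    event_name="Example County earthquake",
    time="2014-01-01T00:00",
    space=SpatialElement(location="Example County", latitude=0.0,
                         longitude=0.0, depth_km=10.0),
    attributes=AttributeElement(result=["houses damaged"],
                                behavior=["rescue teams dispatched"]),
    source="example-news-site",
)
row = to_row(example)
```

Keeping every element inside a single event record mirrors the point the abstract stresses: each extracted attribute has to be attached to a specific event before it can be stored as a row.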