Journal of Geo-information Science ›› 2014, Vol. 16 ›› Issue (3): 341-348. doi: 10.3724/SP.J.1047.2014.00341

Multi-Dimensional Description and Spatio-temporal Visualization Methods for RSS News Events

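SANG Peng1, TANG Xinming2, AI Bo1, WANG Huabin2,3

  1. College of Geomatics, Shandong University of Science and Technology, Qingdao 266590;
    2. Satellite Surveying and Mapping Application Center, National Administration of Surveying, Mapping and Geoinformation, Beijing 101300;
    3. Wuhan University, Wuhan 430072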
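  • Received: 2013-08-13 Revised: 2013-11-18 Publication date: 2014-05-10 Release date: 2014-05-10
  • About the author: SANG Peng (1990- ), male, from Tai'an, Shandong; master's student; research interest: spatio-temporal data visualization. E-mail: troy0@qq.com
  • Funding:

    National Natural Science Foundation of China (41271394); Specialized Research Fund for the Doctoral Program of Higher Education (20123718120001); Open Research Fund of the Key Laboratory of Digital Mapping and Land Information Application Engineering, State Bureau of Surveying and Mapping (GCWD201107); National Key Technology R&D Program (2011BAB01B04).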

Multi-Dimensional Description and Spatio-temporal Visualization of News Events Based on RSS

SANG Peng1, TANG Xinming2, AI Bo1, WANG Huabin2,3   

  1. Geomatics College, Shandong University of Science and Technology, Qingdao 266590, China;
    2. Satellite Surveying and Mapping Application Center, State Bureau of Surveying and Mapping, Beijing 101300, China;
    3. Wuhan University, Wuhan 430072, China
  • Received: 2013-08-13 Revised: 2013-11-18 Online: 2014-05-10 Published: 2014-05-10

Abstract (translated from the Chinese):

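Traditional news retrieval services such as Baidu order results by time or by topic, and they do not organize or present news events along the temporal dimension, the spatial dimension, or their spatio-temporal development. This paper therefore proposes a method for the multi-dimensional description and spatio-temporal visualization of online Really Simple Syndication (RSS) news in the temporal and spatial dimensions, helping users understand the spatio-temporal development and trend of a focal news event fully and intuitively. The method extracts news from the RSS news services of several websites, including Sina, Baidu and Google. For the temporal dimension, it approximates the time at which a news event occurred by the time at which it was reported. For the spatial dimension, it dynamically parses and identifies Chinese place names in the news summary, then performs address matching and spatial positioning. Taking the H7N9 avian influenza hot news as an example, the paper visualizes the temporal dimension with transition colors and a statistical line chart, and the spatial dimension with circle symbols of graduated size, describing and presenting the development and trend of the H7N9 avian influenza news event in multiple dimensions.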
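The temporal description above, in which the report time stands in for the event time, could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the sample feed, its titles and dates, and the function name `daily_counts` are all assumptions introduced here. It counts news items per calendar day, which is the kind of series a line chart or a color ramp would then display.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from email.utils import parsedate_to_datetime

# Hypothetical RSS 2.0 sample; titles and dates are made up for illustration.
SAMPLE_RSS = """<rss version="2.0"><channel>
<title>sample feed</title>
<item><title>news A</title><pubDate>Mon, 01 Apr 2013 08:00:00 +0800</pubDate></item>
<item><title>news B</title><pubDate>Mon, 01 Apr 2013 21:30:00 +0800</pubDate></item>
<item><title>news C</title><pubDate>Tue, 02 Apr 2013 09:15:00 +0800</pubDate></item>
</channel></rss>"""


def daily_counts(rss_text):
    """Count items per calendar day, using pubDate as a proxy for event time."""
    root = ET.fromstring(rss_text)
    counts = Counter()
    for item in root.iter("item"):
        pub = item.findtext("pubDate")
        if not pub:
            continue  # items without a date cannot be placed on the time axis
        # RSS 2.0 dates follow the RFC 822 style, which email.utils can parse.
        day = parsedate_to_datetime(pub).date().isoformat()
        counts[day] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    for day, n in daily_counts(SAMPLE_RSS).items():
        print(day, n)
```

The day is taken in the time zone given by each `pubDate`; a real feed mixing time zones would likely need normalizing to one zone first.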

Keywords: RSS, news event, multi-dimensional description, spatio-temporal visualization

Abstract:

Traditional news retrieval services such as Baidu return lists of related news sorted by time or by topic. They lack an intuitive description of news events in the temporal and spatial dimensions, and of the spatio-temporal development of those events. This paper presents a method for the multi-dimensional description and spatio-temporal visualization of online RSS news events, which helps readers understand the spatio-temporal development of a whole news event. First, the method pulls news from several well-known websites, such as Baidu, Sina and Google News, through their RSS (Really Simple Syndication) services, and then uses a multi-dimensional description method to mark the spatial and temporal dimensions of the RSS news. The temporal description takes the publishing time of a news item as the occurrence time of the event, while the spatial description dynamically parses and identifies Chinese geographical names in the news description and then matches them with their geographical coordinates. The spatial description is the primary content of this article. It is carried out in four stages: (i) XSL Transformation, which uses XSL (eXtensible Stylesheet Language) to transform a news RSS document into an HTML (Hypertext Markup Language) document; (ii) Description Extraction, which uses regular expressions to extract the news description from the news HTML document; (iii) Chinese Place Name Extraction, which uses ICTCLAS to extract geographic names from the description; and (iv) Geocoding, which uses the Google Geocoder API to obtain the geographical coordinates of each place name. Finally, the paper demonstrates the spatio-temporal visualization of news events and gives a brief analysis, taking the H7N9 hot news as an example.
In the analysis, the temporal visualization uses transition colors to show how the amount of news changes between two time nodes, and a line chart to show the trend in the total amount of news. The spatial visualization clusters news by province and uses plots of different sizes to indicate the difference in news amounts between provinces.
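The spatial stages (ii) to (iv) and the per-province clustering could be sketched as below. This is an illustration under stated assumptions, not the paper's code: a small hard-coded gazetteer (`GAZETTEER`) with approximate coordinates stands in for both ICTCLAS and the Google Geocoder API, the HTML fragments and the `class="description"` markup are made up, and the square-root symbol scaling in `circle_radius` is one common choice that the abstract does not specify.

```python
import math
import re
from collections import Counter

# Hypothetical gazetteer: place name -> (province, lat, lon). It is a stand-in
# for ICTCLAS place-name tagging plus a geocoding service; coordinates are
# approximate and only for illustration.
GAZETTEER = {
    "上海": ("Shanghai", 31.23, 121.47),
    "南京": ("Jiangsu", 32.06, 118.80),
    "杭州": ("Zhejiang", 30.27, 120.15),
}

# Made-up HTML fragments of the kind an XSL transform of a feed might yield.
SAMPLE_HTML = """
<div class="description">上海发布H7N9相关通报</div>
<div class="description">南京和上海加强活禽市场监测</div>
<div class="description">杭州发布H7N9相关通报</div>
"""

# Assumed markup; a real page would need a pattern matched to its own HTML.
DESCRIPTION_RE = re.compile(r'<div class="description">(.*?)</div>', re.S)


def extract_descriptions(html):
    """Stage (ii): pull the description text out of the HTML with a regex."""
    return [text.strip() for text in DESCRIPTION_RE.findall(html)]


def find_places(text):
    """Stage (iii), simplified: naive substring match against the gazetteer."""
    return [name for name in GAZETTEER if name in text]


def news_per_province(html):
    """Stage (iv) plus clustering: count news items per province."""
    counts = Counter()
    for desc in extract_descriptions(html):
        provinces = {GAZETTEER[name][0] for name in find_places(desc)}
        counts.update(provinces)  # each item counts once per province it names
    return dict(counts)


def circle_radius(count, max_count, max_radius=30.0):
    """Scale symbol area (not radius) with the count."""
    return max_radius * math.sqrt(count / max_count)


if __name__ == "__main__":
    counts = news_per_province(SAMPLE_HTML)
    top = max(counts.values())
    for province, n in counts.items():
        print(province, n, round(circle_radius(n, top), 1))
```

Substring matching is far cruder than a real segmenter: it cannot separate a place name from a longer word that contains it, which is one reason a tagger such as ICTCLAS is used in the method described above.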

Key words: news event, multi-dimensional description, RSS, spatio-temporal visualization