Journal of Geo-information Science ›› 2020, Vol. 22 ›› Issue (7): 1476-1486. doi: 10.12082/dqxxkx.2020.190565

• Theory and Methods of Geo-information Science •

The Construction of Knowledge Graph Towards Multi-Source Geospatial Data

LIU Junnan1, LIU Haiyan1, CHEN Xiaohui1, GUO Xuan2, GUO Wenyue2, ZHU Xinming1, ZHAO Qingbo1

  1. School of Data and Target Engineering, Information Engineering University, Zhengzhou 450001, China;
  2. Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China
  • Received: 2019-09-30 Revised: 2019-12-18 Online: 2020-07-25 Published: 2020-09-25
  • Contact: LIU Haiyan E-mail: 6929423@qq.com; liu2000@vip.sina.com
  • About the author: LIU Junnan (born 1991), male, from Jinzhou, Liaoning Province; PhD candidate whose research focuses on spatiotemporal data mining and knowledge graphs. E-mail: 6929423@qq.com
  • Supported by:
    Natural Science Foundation of Henan Province (182300410005); National Natural Science Foundation of China (41801313)

Abstract:

Knowledge graphs are widely used in artificial intelligence. Using them to fuse multi-source geospatial data, represent the semantic and spatiotemporal information of geographic features, and so turn data into knowledge has become a topic of active interest. However, existing general-purpose knowledge graphs cover little spatial knowledge, and some of what they contain is incorrect. Geographic knowledge graphs built from Wikipedia, in turn, lack spatial relations, Chinese-language attributes, and coordinate information. Based on an analysis of the characteristics of geospatial data and Baidu Baike data, this paper proposes a construction method in which geographic entities are extracted primarily from geospatial data and their attribute information is supplemented from Baidu Baike. The method (1) designs, in the schema layer and on the basis of GeoSPARQL, the logical relationships among geographic entities, features, geometries, and spatial relations; (2) fuses spatial knowledge in the data layer through geographic entity extraction, entity linking, and attribute filling; (3) designs a storage scheme for spatial knowledge that combines a relational database with a graph database; and (4) quantitatively analyzes the scale of the resulting knowledge graph in terms of entities and relations. The results show that the constructed knowledge graph achieves relatively high geographic entity coverage and a relatively high success rate in linking entities to encyclopedia pages, expands the conceptual descriptions of geographic entities, and raises the coverage of geographic coordinates to 100%, which is significant for extending geographic data into geographic knowledge.

Key words: knowledge graph, Baidu Baike, geospatial data, data fusion, geospatial knowledge, spatial relationship, topological relationship, geographic entity
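
The schema-layer and data-layer ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a feature, geometry, and spatial-relation pattern using the GeoSPARQL terms `geo:Feature`, `geo:Geometry`, `geo:hasGeometry`, `geo:asWKT`, and `geo:sfWithin`. The `GeoEntity` class, the `http://example.org/geokg/` namespace, the exact-name matching used for entity linking, and the sample coordinates and encyclopedia record are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

GEO = "http://www.opengis.net/ont/geosparql#"   # GeoSPARQL namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/geokg/"                # hypothetical namespace

Triple = Tuple[str, str, str]


@dataclass
class GeoEntity:
    """A geographic entity extracted from geospatial data (hypothetical model)."""
    name: str
    wkt: Optional[str]                           # geometry as WKT, if known
    attributes: Dict[str, str] = field(default_factory=dict)


def entity_triples(entity: GeoEntity) -> List[Triple]:
    """Emit triples following Feature -> hasGeometry -> Geometry -> asWKT."""
    subject = EX + entity.name
    triples = [(subject, RDF_TYPE, GEO + "Feature")]
    if entity.wkt is not None:
        geom = subject + "/geom"
        triples.append((subject, GEO + "hasGeometry", geom))
        triples.append((geom, RDF_TYPE, GEO + "Geometry"))
        triples.append((geom, GEO + "asWKT", entity.wkt))
    for key, value in entity.attributes.items():
        triples.append((subject, EX + key, value))
    return triples


def fill_attributes(entity: GeoEntity, encyclopedia: Dict[str, Dict[str, str]]) -> bool:
    """Naive entity linking: exact name match; existing attributes are kept."""
    record = encyclopedia.get(entity.name)
    if record is None:
        return False
    for key, value in record.items():
        entity.attributes.setdefault(key, value)
    return True


def coordinate_coverage(entities: List[GeoEntity]) -> float:
    """Fraction of entities that carry a geometry."""
    return sum(e.wkt is not None for e in entities) / len(entities)


if __name__ == "__main__":
    # Illustrative data only: the coordinates and description are placeholders.
    city = GeoEntity("Zhengzhou", "POINT(113.62 34.75)")
    encyclopedia = {"Zhengzhou": {"description": "Capital of Henan Province"}}
    fill_attributes(city, encyclopedia)
    triples = entity_triples(city)
    # A spatial relation between two entities, using a GeoSPARQL simple-features term.
    triples.append((EX + "Zhengzhou", GEO + "sfWithin", EX + "Henan"))
    for t in triples:
        print(t)
    print("coordinate coverage:", coordinate_coverage([city]))
```

Because every entity in this sketch originates from geospatial data, each one carries a geometry by construction, which is one way to read the reported 100% coordinate coverage. A real pipeline would need a more robust linking step than exact name matching.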