地球信息科学学报, 2015, Vol. 17, Issue 4: 379-390. doi: 10.3724/SP.J.1047.2015.00379

时间本体及其在地学数据检索中的应用

侯志伟1,2, 诸云强1,4,*, 高星1,3, 潘鹏1, 罗侃1,2, 王东旭1,2

    1. 中国科学院地理科学与资源研究所 资源与环境信息系统国家重点实验室, 北京 100101
    2. 中国科学院大学, 北京 100049
    3. 中国南海研究协同创新中心, 南京 210093
    4. 江苏省地理信息资源开发与利用协同创新中心, 南京 210023
  • 收稿日期:2014-12-02 修回日期:2015-01-04 出版日期:2015-04-10 发布日期:2015-04-10
  • 通讯作者: 诸云强 E-mail:houzw.13s@igsnrr.ac.cn;zhuyq@igsnrr.ac.cn
  • 作者简介: 侯志伟(1989-),男,湖南永兴人,硕士生,研究方向为地学数据共享和地理信息技术与应用。E-mail:houzw.13s@igsnrr.ac.cn

  • 基金资助:
    国家自然科学基金项目“基于元数据语义的地理空间数据关联方法研究”(41371381);科技基础性工作专项重点项目“科技基础性工作数据资料集成与规范化整编”(2013FY110900);国家科技基础条件平台-地球系统科学数据共享平台(2005DKA32300)

Time-Ontology and its Application in Geodata Retrieval

HOU Zhiwei1,2, ZHU Yunqiang1,4,*, GAO Xing1,3, PAN Peng1, LUO Kan1,2, WANG Dongxu1,2

    1. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China
    2. University of Chinese Academy of Sciences, Beijing 100049, China
    3. Collaborative Innovation Center of South China Sea Studies, Nanjing 210093, China
    4. Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China
  • Received: 2014-12-02; Revised: 2015-01-04; Online: 2015-04-10; Published: 2015-04-10
  • Contact: ZHU Yunqiang; E-mail: houzw.13s@igsnrr.ac.cn; zhuyq@igsnrr.ac.cn
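  • About author: HOU Zhiwei (1989-), male, from Yongxing, Hunan Province; master's student whose research interests are geodata sharing and geographic information technology and its applications. E-mail: houzw.13s@igsnrr.ac.cn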
  • Supported by: National Natural Science Foundation of China project "Research on Geospatial Data Association Methods Based on Metadata Semantics" (41371381); Key Project of the Special Program for Basic Science and Technology Work, "Integration and Standardized Compilation of Data from Basic Science and Technology Work" (2013FY110900); National Science and Technology Infrastructure Platform, Data Sharing Infrastructure of Earth System Science (2005DKA32300)

摘要:

高效、准确地获取目标数据及其关联数据,是决定大数据共享与挖掘分析能否实现的关键因素。传统的数据检索方法无法利用地学数据间的显性或隐含关系,已不能满足日益增长的对检索结果质和量的需求,而本体理论和技术的语义检索成为当前的研究热点。本文针对时间这一地学数据的本质属性,在系统研究地学数据时间概念与特征的基础上,建立了地学数据时间本体模型,并深入论述了模型中的时间关系、时间坐标系等内容,提出了时间位置和时间距离的描述函数,同时研究了二者的本体表达方式。构建了包括地质年代等在内的地学数据时间本体库,并以语义网开发框架Jena为基础,经本体解析、元数据时间信息抽取与标注等过程,将时间本体应用于地球系统科学数据共享平台的元数据检索之中。结果表明,以时间本体的地学数据语义检索查全率约为关键字方法的1倍,检索结果排序,以及关联数据推荐方面也有更好的效果,为促进地学数据共享与关联发现提供了一种有效的方法。

关键词: 地学数据, 时间本体, 语义检索, 时间特征

Abstract:

Obtaining target data and their related data efficiently and accurately is a critical factor in whether data sharing and data mining can succeed in the era of big data. Traditional retrieval methods cannot exploit the explicit or implicit relations among geodata, so they no longer meet the growing demands on the quality and quantity of retrieval results; semantic retrieval based on ontology theory and technology has therefore become a research focus. Focusing on time, an essential attribute of geodata, this paper builds a time-ontology model for geodata on the basis of a systematic study of the temporal concepts and characteristics of geodata. It discusses the temporal relations and time coordinate systems in the model, proposes descriptive functions for time position and time distance, and studies how both are expressed in the ontology. A geodata time-ontology base, covering geological time among other content, was then constructed according to the model. Using Apache Jena, a free and open-source Java framework for building semantic web and linked data applications, the time ontology was applied to metadata retrieval in the Data Sharing Infrastructure of Earth System Science through ontology parsing and the extraction and annotation of temporal information in the metadata. The results show that semantic geodata retrieval based on the time ontology achieves roughly twice the recall ratio of the keyword-based method, and that it also performs better in result ranking and in recommending linked data. The approach thus offers an effective way to share geodata and discover linked data.

Key words: geodata, time-ontology, semantic retrieval, temporal characteristics
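
The abstract mentions temporal relations and descriptive functions for time position and time distance but does not give their definitions. The following is a minimal sketch of how such functions might look, under several assumptions that are not taken from the paper: intervals are closed ranges on a single numeric axis (calendar years here), the relation names follow a simplified subset of Allen's interval relations, time distance is the gap between two intervals, and the dataset names are hypothetical. It is plain Python and does not use Jena or any ontology library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed time interval on one numeric axis (assumed: calendar years)."""
    start: float
    end: float

    def __post_init__(self):
        if self.start > self.end:
            raise ValueError("start must not exceed end")


def relation(a: Interval, b: Interval) -> str:
    """Name the position of a relative to b (simplified Allen-style relations)."""
    if a == b:
        return "equals"
    if a.end < b.start:
        return "before"
    if a.start > b.end:
        return "after"
    if b.start <= a.start and a.end <= b.end:
        return "during"      # a lies entirely inside b
    if a.start <= b.start and b.end <= a.end:
        return "contains"    # b lies entirely inside a
    if a.end == b.start:
        return "meets"       # a ends exactly where b begins
    if b.end == a.start:
        return "met_by"      # a begins exactly where b ends
    return "overlaps"


def distance(a: Interval, b: Interval) -> float:
    """Gap between two intervals; 0 when they touch or intersect."""
    return max(0, b.start - a.end, a.start - b.end)


def rank_by_time(datasets: dict, query: Interval) -> list:
    """Order dataset names by temporal distance to the query interval."""
    return sorted(datasets, key=lambda name: distance(datasets[name], query))


# Hypothetical metadata records, for illustration only.
datasets = {
    "landuse_1995_1998": Interval(1995, 1998),
    "climate_1980_1990": Interval(1980, 1990),
    "soil_2005_2010": Interval(2005, 2010),
}
query = Interval(1990, 2000)
ranked = rank_by_time(datasets, query)
```

A keyword search for "1990-2000" would miss a dataset labelled 1995-1998, whereas the `during` relation identifies it as falling inside the query period. That is the kind of implicit temporal link the abstract says semantic retrieval can exploit, and a distance function of this sort is one plausible basis for ranking results and recommending temporally nearby data.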