A Novel Method for Fine-Grained Geolocation of Wildlife Activities by Integrating Geographical Information and Land Use/Cover in Texts
YIN Wenping (1997-), female, from Laiyang, Shandong; master's student, mainly engaged in geographic text mining research. E-mail: yin@mail.ynu.edu.cn
Received date: 2021-10-18
Revised date: 2021-11-17
Online published: 2022-09-25
Supported by
National Natural Science Foundation of China (41971239)
Second Tibetan Plateau Scientific Expedition and Research Program (2019QZKK0402)
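Text contains a large amount of descriptive information on geographic locations, and effectively integrating geographically related information to achieve fine-grained geolocation of text is a difficult problem for geographic information services. This paper proposes a fine-grained geolocation method for described geographic locations that integrates land use/cover information. Building on the extraction of geographically related information from text descriptions (geographic location entities, land use/cover entities, and spatial relationships), fine classification of land use/cover, and coarse-grained matching geolocation, the method uses a natural language spatial relationship approximate conversion model to determine the fine-grained geolocation range of a geographic location. It then searches and matches within that range, based on the land use/cover entity and the surrounding fine classification information, to determine the fine-grained geolocation coordinates. Experiments were carried out on monitoring texts of wild Asian elephant activities/accidents, and geolocation quality was evaluated by matching rate and location accuracy. The results show that the method markedly improves fine-grained geolocation quality: the exact matching rate (81.51%), the mean location error distance (65.97 m), and the proportion of error distances ≤50 m (70.50%) are all better than those of domestic mainstream online geocoding and place name retrieval services, whether combined with spatial relationships or used alone. The method helps improve the system of geolocation methods and the quality of geographic information spatialization, and can serve fine-grained geolocation tasks such as monitoring and early warning of wildlife activities/accidents.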
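YIN Wenping, GAO Chen, FAN Hui, XIE Fei, ZHANG Xin. A novel method for fine-grained geolocation of wildlife activities by integrating geographical information and land use/cover in texts[J]. Journal of Geo-information Science, 2022, 24(7): 1363-1374. DOI: 10.12082/dqxxkx.2022.210641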
Text data contain rich geographic information. Mining and spatializing this information by linking geographic location text to its spatial location in the real world is fundamental to using it. However, the semantic granularity of geographic locations in texts is in most cases too coarse to be used directly, so achieving fine-grained geolocation of texts by effectively integrating geographical information with related features such as land use/cover has become a major challenge for geographic knowledge services. Existing geolocation methods, including geocoding, place name retrieval, and fuzzy area modeling, have been widely used to decode non-urban geographic location texts, but they do not consider land use/cover information and usually fail to geolocate texts on wildlife activities precisely. In this study, we propose a fine-grained geolocation method that incorporates the land use/cover information in texts on wildlife activities. The method uses a natural language spatial relationship approximation conversion model to determine a fine-grained geolocation domain, building on extracted geographically relevant information (geographic location entities, land use/cover entities, and spatial relationships), fine classification of land use/cover, and coarse-grained matching geolocation. The fine-grained coordinates are then determined by iteratively searching and matching within this domain, combining the natural language form of the land use/cover entities with the fine land use/cover classification map. Experiments were conducted on texts describing wild Asian elephant activities/accidents in southern Yunnan Province, China, and geolocation quality was evaluated by matching rate and location accuracy.
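The two-stage idea described above (a spatial relationship narrows the search domain, then the land use/cover map fixes the coordinates) can be illustrated with a minimal sketch. It is not the authors' conversion model: the sector-ring domain, its angular and radial tolerances, the `BEARINGS` table, and the function names are all assumptions made for illustration, and the cell coordinates are hypothetical values in a projected coordinate system (metres).

```python
import math

# Hypothetical compass bearings (degrees clockwise from north) for direction words.
BEARINGS = {"north": 0.0, "east": 90.0, "south": 180.0, "west": 270.0}


def in_search_domain(anchor, cell, bearing_deg, dist_m, half_angle_deg=45.0, tol=0.5):
    """True if a cell centre lies in the sector ring implied by the relation.

    The ring spans dist_m * (1 - tol) to dist_m * (1 + tol) from the anchor and
    the sector spans bearing_deg +/- half_angle_deg. Both widths are assumptions.
    """
    dx, dy = cell[0] - anchor[0], cell[1] - anchor[1]
    r = math.hypot(dx, dy)
    if not (dist_m * (1 - tol) <= r <= dist_m * (1 + tol)):
        return False
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0  # clockwise from north
    diff = abs((bearing - bearing_deg + 180.0) % 360.0 - 180.0)
    return diff <= half_angle_deg


def fine_grained_locate(anchor, direction, dist_m, target_class, cells):
    """Pick the target-class cell in the domain closest to the nominal offset point.

    cells: iterable of (x, y, land_cover_class) in a projected CRS (metres).
    Returns (x, y), or None when no cell of the target class is in the domain.
    """
    bearing_deg = BEARINGS[direction]
    theta = math.radians(bearing_deg)
    nominal = (anchor[0] + dist_m * math.sin(theta), anchor[1] + dist_m * math.cos(theta))
    candidates = [
        (x, y) for x, y, cls in cells
        if cls == target_class and in_search_domain(anchor, (x, y), bearing_deg, dist_m)
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda c: math.hypot(c[0] - nominal[0], c[1] - nominal[1]))


# Hypothetical case: "a sugarcane field about 500 m east of the village",
# with the village (coarse-grained match) at the origin.
cells = [
    (480.0, 30.0, "sugarcane"),   # inside the eastern sector ring
    (520.0, -40.0, "forest"),     # right place, wrong land cover class
    (-500.0, 0.0, "sugarcane"),   # right class, opposite direction
    (300.0, 400.0, "sugarcane"),  # right class, outside the 45-degree sector
]
located = fine_grained_locate((0.0, 0.0), "east", 500.0, "sugarcane", cells)
missing = fine_grained_locate((0.0, 0.0), "east", 500.0, "tea", cells)
```

Here `located` is the sugarcane cell east of the anchor, while `missing` is `None` because no tea cell exists; a real workflow would need a fallback for that case, for example returning the nominal offset point.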
The results show that the proposed method reliably extracts fine-grained geolocations from texts on wild Asian elephant activities/accidents. Applied to texts from 2020 on Asian elephant activities/accidents in an area of frequent human-elephant conflict, it located the examined events accurately. Compared with domestic mainstream online geocoding and place name retrieval services, used with or without spatial relationships, the proposed method greatly improved the quality of fine-grained geolocation: the exact matching ratio of the located points reached 81.51%, the mean location error distance between located and true points was 65.97 m, and 70.50% of the error distances were no greater than 50 m. This performance in mining and spatializing geographic information from texts on Asian elephant activities/accidents points to applications in wildlife monitoring and early warning, and in emergency management of human-wildlife conflict, based on fine-grained geolocation derived from multi-media texts on wildlife activities.
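The location accuracy figures above (mean error distance and the share of errors within 50 m) can be computed along the following lines. This is a sketch under stated assumptions: points are taken to be WGS84 latitude/longitude pairs, distances use a spherical haversine approximation, and the function names and sample coordinates are hypothetical. The exact matching ratio is not covered because its definition is not given in this excerpt.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def location_accuracy(predicted, reference, threshold_m=50.0):
    """Return (mean error distance in m, share of errors <= threshold_m)."""
    errors = [haversine_m(p[0], p[1], r[0], r[1]) for p, r in zip(predicted, reference)]
    mean_error = sum(errors) / len(errors)
    share_within = sum(e <= threshold_m for e in errors) / len(errors)
    return mean_error, share_within


# Hypothetical located points and field-verified points (lat, lon).
predicted = [(22.0, 100.8), (22.001, 100.8)]
reference = [(22.0, 100.8), (22.0, 100.8)]
mean_error, share_within = location_accuracy(predicted, reference)
```

In this toy case the first point is exact and the second is 0.001 degrees of latitude off (roughly 111 m), so half of the errors fall within the 50 m threshold.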
Tab. 1 Accuracies of extracted spatial entities and spatial relationships
Extraction task | Precision | Recall | F1 score |
---|---|---|---|
Geographically related entities | 0.824 | 0.841 | 0.832 |
Spatial relationship information | 0.702 | 0.717 | 0.709 |
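The F1 values in Tab. 1 follow from the listed precision and recall as their harmonic mean, F1 = 2PR / (P + R). A short check using the table's own figures (the function name is illustrative):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall taken from Tab. 1.
entity_f1 = round(f1_score(0.824, 0.841), 3)    # geographically related entities
relation_f1 = round(f1_score(0.702, 0.717), 3)  # spatial relationship information
```

Both rounded results agree with the F1 column of the table.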
Tab. 2 Accuracy of fine land use/cover classification in the experimental area in 2020 (%)
Land use/cover type | Producer's accuracy | User's accuracy |
---|---|---|
Forest land | 99.03 | 99.16 |
Tea plantation | 98.47 | 99.33 |
Water body | 95.26 | 94.07 |
Built-up land | 97.44 | 97.68 |
Road | 82.62 | 75.72 |
Sugarcane field | 98.48 | 98.63 |
Banana field | 98.83 | 96.57 |
Other | 98.50 | 98.83 |
Overall accuracy | 98.36 | |
Kappa coefficient | 97.73 | |
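The four measures reported in Tab. 2 are conventionally derived from a confusion matrix. The sketch below shows the standard definitions, assuming rows hold reference classes and columns hold mapped classes; the two-class matrix is hypothetical and is not the study's validation data.

```python
def classification_accuracy(matrix):
    """Accuracy measures from a square confusion matrix.

    matrix[i][j] = number of samples of reference class i mapped as class j.
    Assumes every class has at least one reference and one mapped sample.
    Returns (producer's accuracies, user's accuracies, overall accuracy, kappa).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diag = [matrix[i][i] for i in range(n)]
    producers = [diag[i] / row_sums[i] for i in range(n)]  # 1 - omission error
    users = [diag[j] / col_sums[j] for j in range(n)]      # 1 - commission error
    overall = sum(diag) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(row_sums[k] * col_sums[k] for k in range(n)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return producers, users, overall, kappa


# Hypothetical two-class example (e.g. road vs. non-road validation samples).
matrix = [[90, 10], [5, 95]]
producers, users, overall, kappa = classification_accuracy(matrix)
```

For this toy matrix the overall accuracy is 0.925 and kappa is 0.85; Tab. 2 reports the same quantities multiplied by 100.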
Tab. 3 Rank-average comparison of matching rate and position accuracy among different methods
Rank mean | Proposed method | Gaode_S | Tencent_S | Baidu_S | Multi-source_S | Gaode_G | Tencent_G | Baidu_G | Multi-source_G |
---|---|---|---|---|---|---|---|---|---|
Matching rate rank mean | 3.67 | 5.00 | 5.33 | 5.67 | 4.33 | 5.50 | 5.83 | 5.17 | 4.50 |
Position accuracy rank mean | 3.80 | 5.40 | 5.20 | 5.00 | 4.80 | 6.20 | 5.60 | 4.80 | 4.20 |
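A rank mean of the kind shown in Tab. 3 can be obtained by ranking the methods on each indicator and averaging the ranks per method. The sketch below is one plausible reading, not the authors' procedure: it assumes rank 1 is best (so a lower rank mean is better), that tied values share the mean of their ranks, and it uses hypothetical indicator values for three methods.

```python
def average_ranks(values, higher_is_better=True):
    """Rank values from 1 (best); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=higher_is_better)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def rank_means(indicators):
    """indicators: list of (values_per_method, higher_is_better) pairs."""
    per_indicator = [average_ranks(v, hib) for v, hib in indicators]
    n_methods = len(per_indicator[0])
    return [sum(r[m] for r in per_indicator) / len(per_indicator) for m in range(n_methods)]


# Hypothetical indicators for three methods A, B, C.
indicators = [
    ([0.8, 0.6, 0.6], True),       # matching rate: higher is better, B and C tie
    ([60.0, 120.0, 90.0], False),  # mean error distance in m: lower is better
]
means = rank_means(indicators)
```

Method A ranks first on both indicators and gets a rank mean of 1.0, while B and C get 2.75 and 2.25.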