Change Detection of Geographic Features Based on Web Pages

  • 1. School of Geography, The University of Leeds, LS2 9JT, United Kingdom;
    2. Key Laboratory of Virtual Geographical Environment, Ministry of Education, Nanjing Normal University, Nanjing 210046, China;
    3. National Geomatics Center of China, Beijing 100830, China;
    4. School of Computer Science, Nanjing University of Posts and Telecommunication, Nanjing 210003, China

Received date: 2013-06-07

Revised date: 2013-07-05

Online published: 2013-09-29


Geographic feature change detection has become a vital component of the national geographic information 12th Five-Year Plan and of the national geographic general survey. Web pages record billions of geographic feature changes, especially those of official government websites, news sites, and social portals. Because these pages are updated frequently, they can supply the latest data for geographic information change detection. Taking into account the complex ways in which geographic information is described on the web, this paper makes four contributions. First, a geographic information knowledge base was built by summarizing the words and phrases that describe geographic information; it supports the detection of semantic changes in geographic information. Second, web geographic information was collected with two kinds of web crawler: combining a Google Custom Search crawler with a general topic crawler makes the collection more complete in both scope and depth. Third, geographic information was parsed and extracted from the web text, giving users the related features, place names, times, and attributes. Finally, a prototype system was developed and its results were analyzed. The experiments indicate that the accuracy of retrieving related web pages was over 74% and the accuracy of feature change detection was over 70%. The detection results depend heavily on the completeness of the knowledge base, which needs to be extended further, and they are also limited by the uncertainty and fuzziness of web geographic information. Web-page-based detection can therefore serve as a supplementary method of geographic information change detection. Combined with traditional surveying and remote-sensing imagery detection, it could help keep geographic information updated continuously and in a timely way.
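The parsing and extraction step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the tiny knowledge base (FEATURE_TERMS, CHANGE_TERMS), the gazetteer, the date pattern, and the ChangeRecord type are all hypothetical stand-ins, and the example works on English text for readability. It only shows how a knowledge base of feature and change words might be matched against web text to yield the feature, place name, time, and change type.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical, hand-written knowledge base (an assumption for illustration;
# the knowledge base described in the abstract is not reproduced here).
FEATURE_TERMS = {"bridge", "road", "railway", "station", "reservoir"}
CHANGE_TERMS = {
    "opened": "added",
    "completed": "added",
    "demolished": "removed",
    "closed": "removed",
    "renamed": "modified",
    "widened": "modified",
}
# Hypothetical gazetteer of known place names.
GAZETTEER = {"Nanjing", "Beijing", "Leeds"}

# Assumed date format: ISO-style YYYY-MM-DD.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


@dataclass
class ChangeRecord:
    feature: str
    change: str
    place: Optional[str]
    date: Optional[str]


def detect_changes(text: str) -> List[ChangeRecord]:
    """Return one record per sentence that names a feature and a change word."""
    records = []
    # Naive sentence split on whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        words = re.findall(r"[A-Za-z]+", sentence)
        lowered = [w.lower() for w in words]
        feature = next((w for w in lowered if w in FEATURE_TERMS), None)
        change = next((CHANGE_TERMS[w] for w in lowered if w in CHANGE_TERMS), None)
        if feature is None or change is None:
            continue  # sentence does not describe a known feature change
        place = next((w for w in words if w in GAZETTEER), None)
        date_match = DATE_PATTERN.search(sentence)
        date = date_match.group(0) if date_match else None
        records.append(ChangeRecord(feature, change, place, date))
    return records


if __name__ == "__main__":
    sample = "A new bridge in Nanjing was opened on 2013-06-07. The weather was fine."
    for record in detect_changes(sample):
        print(record)
```

A sentence that names no known feature or change word yields no record. This mirrors the abstract's observation that detection quality depends heavily on how complete the knowledge base is.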

Cite this article

WANG Shu, JI Lei-Jing, ZHANG Xue-Yang, DIAO Ren-Liang, CHEN Xiao-Dan, TU Gao. Change Detection of Geographic Features Based on Web Pages[J]. Journal of Geo-information Science, 2013, 15(5): 625-634. DOI: 10.3724/SP.J.1047.2013.00625


[1] 钱育华.数字城镇的数据更新[J].地球信息科学,2002,4(3): 64-67.

[2] Chen J, Zhao R L ,Wang D H. Dynatmic updating system for national fundamental GIS: Concepts and research agenda[J]. GeomaticsWorld, 2007,5(5):4-9.

[3] Heipke C. Updating geospatial databases from images[C].//Li Z L, Chen J, Baltsavias E (Eds.). Advances in Photogrammetry, Remote Sensing and Spatial Information Sciences: 2008 ISPRS Congress Book, Boca Raton: CRC Press, 2008,355-362.

[4] Badard T. Towards a generic updating tool for geographic databases[C]. Proceedings of GIS/LIS'98 Annual Exposition and Conference. USA, 1998,352-363.

[5] 陈军,王东华,商瑶玲.国家1:50000数据库更新工程总体设计研究与技术创新[J].测绘学报,2010,39(1):7-10.

[6] 王迪伟.基于PDA的1:10000比例尺地形图野外调绘[J]. 测绘通报,2010(7):59-61.

[7] 李冰,曹宏文,曹歆宏.大比例尺地理信息数据库建设刍论 [J].科技创新与生产力,2010,195(7):83-85.

[8] 王帅.初探首次全国地理国情普查[J].3S News Weekly, 2013(5):30-33.

[9] Palkowsky B, MetaCarta I. A new approach to information discovery——Geography really does matter[C]. Proceedings of the SPE Annual Technical Conference and Exhibition, 2005.

[10] Ai T. Constraints of progressive transmission of spatial data on the web[J]. Geo-spatial Information Science, 2010, 13(2):85-92.

[11] 容伟杰.网络信息存在的几大问题[J].图书馆学研究, 2003(2):48-49.

[12] 孙瑞英.网络数据内容分析研究[J].图书馆学研究,2005 (5):35-39.

[13] Wu M L, Li W J L, Lu Q, et al. CTEMP: A Chinese temporal parser for extracting and normalizing temporal information[C]. Natural Language Processing—IJCNLP 2005, Second International Joint Conference, Korea, 2005:694-706.

[14] 赵国荣.中文新闻语料中的时间短语识别方法研究[D]. 太原:山西大学,2006.

[15] 逯万辉,马建霞.基于条件随机场模型的复杂时间信息抽 取研究[J].现代图书情报技术,2011(10):29-33.

[16] 宋洋,徐蔚然.基于条件随机场的事件起止时间表达式的识别[J].中国科技论文在线,2012(1):1-8.

[17] 俞鸿魁,张华平,刘群,等.基于层叠隐马尔可夫模型的中文命名实体识别[J].通信学报,2006,27(2):87-94.

[18] 钱晶,张杰,张涛.基于最大熵的汉语人名地名识别方法 研究[J].小型微型计算机系统,2006,27(9):1761-1764.

[19] 张雪英,闾国年,李伯秋,等.基于规则的中文地址要素解 析方法[J].地球信息科学学报,2010,12(2):9-16.

[20] 李丽双,党延忠,廖文平,等.CRF与规则相结合的中文地名识别[J].大连理工大学学报,2012,52(2):285-289.

[21] 李丽双,黄德根,陈春荣.SVM与规则相结合的中文地名自动识别[J].中文信息学报,2006,20(5):27-51.

[22] Mark M D, Comas D, Egenhofer M J, et al. Evaluating and refining computational models of spatial relations through cross-linguistic human-subjects testing[C].//Frank A and Kuhn W (Eds.). Spatial Information Theory -A Theoretical Basis for GIS, International Conference COSIT, Semmering, Austria, Lecture Notesin Computer Science. Berlin: Springer-Verlag,1995,553-568.

[23] Egenhofer M J. Locational SQL: Syntax extensions, surveying engineering program[D]. Orono: University of Maine,1987.

[24] Coyne B, Sproat R. WordsEye: An automatic textto-scene conversion system[C]. Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, Los Angeles, CA, USA, New York, 2001:487-496.

[25] Le X Q, Yang C J, Yu W Y. Spatial concept extraction based on spatial semantic role in natural language[J]. Geomatics and Information Science of Wuhan University, 2005,30(12):1100-1103.

[26] Zhang X Y, Lv G N. Natural-language spatial relations and their applications in GIS[J]. Geo-information Science, 2007,9(6):77-81.

[27] 朱少楠,张雪英,张春菊.地理空间关系描述的句法模式识别[C]. Proceedings of 2010 International Conference on Broadcast Technology and Multimedia Communication, 2010(4):355-357.

[28] 蒋文明.面向中文文本的空间方位关系抽取方法研究[D].南京:南京师范大学,2010.

[29] 高文利. IERDL——基于关键词驱动的信息抽取系统的规则描述语言[J].软件导刊,2009,10(8):67-69.

[30] Ravi S, Pa?ca M. Using structured text for large-scale attribute extraction[C]. Proceedings of the 17th ACM Conference on Information and Knowledge Management, ACM, 2008:1183-1192.

[31] Soderland S. Learning information extraction rules for semi-structured and free text[J]. Machine Learning, 1999, 34(1-3):233-272.

[32] 刘臻煕.中文文本中地理实体属性信息抽取方法研究[D].南京:南京师范大学,2010.

[33] 张春菊.面向特定时间的Web文本时刻属性信息挖掘方法[D].南京:南京师范大学,2013.

[34] 周立,邓云青.城市地理信息系统数据更新方式研究[J]. 地理空间信息,2008,6(5):45-47.

[35] 曾文华,黄桦.基于网页信息检索的地理信息变化检测方法[J].计算机应用,2010(4):1132-1134.

[36] 闫会杰,赵巍.服务于地理信息数据动态更新的网络蜘蛛[J].测绘技术装备,2012(2):21-22.

[37] Metzger M J. Making sense of credibility on the Web: Models for evaluating online information and recommendations for future research[J]. Journal of the American Society for Information Science and Technology, 2007,58 (13):2078-2091.

[38] 李丽双,党延忠,廖文平,等.CRF与规则相结合的中文地名识别[J].大连理工大学学报,2012,52(2):285-289.