2010, Vol. 12, Issue (1): 9-16.

Rule-based Approach to Semantic Resolution of Chinese Addresses

ZHANG Xueying, LV Guonian, LI Boqiu

  1. Key Laboratory of Virtual Geographical Environment, Ministry of Education, Nanjing Normal University, Nanjing 210046, China
  • Received: 2009-09-21 Revised: 2010-01-08 Online: 2010-02-25 Published: 2010-02-25

Abstract: A geographic information system (GIS) integrates hardware, software, and data for capturing, managing, analyzing, and displaying all forms of geographically referenced information. Addresses are one of the most popular geographical reference systems in natural language. Address geocoding is considered the most effective approach to bridging the gap between business data in management information systems (MIS) and GIS, and it supports geospatial information visualization and spatial analysis. Chinese address geocoding faces three significant problems, namely address models, address resolution, and address matching, because Chinese place names are not standardized and national address databases are lacking. Address resolution aims to automatically split natural-language address strings into semantically complete address units. It plays a fundamental role in address models and address matching. Previous research has focused on rule-based or gazetteer-based approaches, which are easy to implement but have poor coverage and performance. In theory, Chinese address resolution is similar to word segmentation in Chinese natural language processing. Based on an investigation of large-scale Chinese place names and address syntactic patterns, this paper identifies primary and secondary general characters that represent a variety of address units. An address numerical representation method is then presented to induce the syntactic rules of Chinese addresses. Finally, we develop an RBAI algorithm to implement Chinese address resolution and illustrate it with an example. The experimental results indicate that the proposed approach achieves satisfactory efficiency and effectiveness for large-scale data processing, with an accuracy of over 92% and a processing rate of over 2,800 items per second. The proposed approach and system can be extended to fields such as land management, asset management, urban planning, public security, postal systems, taxation, public health management, and other location-based services.

Key words: Chinese address, semantic resolution, address geocoding, address representation
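
To make the idea of splitting an address string at "general characters" concrete, the following is a minimal sketch in Python. It is not the RBAI algorithm described in the abstract: the suffix list `UNIT_SUFFIXES` and the function `split_address` are hypothetical names chosen for illustration, and the splitting rule (cut after every suffix character) is a deliberate simplification.

```python
# Hypothetical, illustrative set of characters that commonly end an address unit.
# This is an assumption for the sketch, not the character set identified in the paper.
UNIT_SUFFIXES = ("省", "市", "区", "县", "镇", "乡", "村", "路", "街", "巷", "号", "栋", "室")


def split_address(address: str) -> list[str]:
    """Split an address string into units, cutting after each suffix character."""
    units = []
    start = 0
    for i, ch in enumerate(address):
        if ch in UNIT_SUFFIXES:
            # Close the current unit at the suffix character (inclusive).
            units.append(address[start:i + 1])
            start = i + 1
    # Keep any trailing text that does not end in a suffix character.
    if start < len(address):
        units.append(address[start:])
    return units


if __name__ == "__main__":
    for unit in split_address("江苏省南京市栖霞区文苑路1号"):
        print(unit)
```

A naive split of this kind mis-segments any place name that itself contains a suffix character, which is one reason a purely character-driven rule is insufficient and why additional syntactic rules are needed on top of it.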