Journal of Geo-information Science ›› 2013, Vol. 15 ›› Issue (4): 505-511. doi: 10.3724/SP.J.1047.2013.00505

A Fast Algorithm for Generating Candidate Cluster Regions in the Spatial Scan Statistic Method

李小洲1, 王劲峰2   

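  1. School of Public Health, Medical College, Wuhan University of Science and Technology, Wuhan 430065;
  2. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101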
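  • Received: 2013-01-14 Revised: 2013-03-20 Publication date: 2013-08-08 Release date: 2013-08-08
  • Corresponding author: WANG Jinfeng (b. 1965), male, from Shanghai, doctoral supervisor and research professor; main research interests: spatial data analysis and spatial statistics. E-mail: wangjf@lreis.ac.cn
  • About the author: LI Xiaozhou (b. 1974), male, from Macheng, Hubei, lecturer; main research interests: spatial data analysis and spatial epidemiology. E-mail: lixiaozhou@wust.edu.cn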
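  • Supported by:

    Sub-project of the National Science and Technology Major Project "Prevention and Control of Major Infectious Diseases such as AIDS and Viral Hepatitis / Spatio-temporal Information Analysis and Forecasting of the Infectious Disease Pathogen Spectrum" (2012ZX10004-201); Special Research Project for the Health Industry "Research on Spatio-temporal Early-warning Models for Infectious Diseases and Their Key Parameters" (201202006).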

A Fast Method for Making Candidate Clusters in Spatial Scan Statistic Method

LI Xiaozhou1, WANG Jinfeng2   

  1. Medical School, Wuhan University of Science and Technology, Wuhan 430065, China;
    2. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China
  • Received:2013-01-14 Revised:2013-03-20 Online:2013-08-08 Published:2013-08-08

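Summary:

The spatial scan statistic method is a fast spatial cluster detection algorithm that is very widely used in public health surveillance. Using infectious disease surveillance data, it can detect local areas where cases increase abnormally and give early warning of possible outbreaks. Generating the candidate cluster regions in advance is a key step of the method. When existing generation methods are applied to a large region containing many sub-regions, they may miss a large number of candidate cluster regions, which affects the accuracy of the detection results, or they may generate many duplicate candidate cluster regions, which lengthens the subsequent spatial scan computation. Building on the original generation method, this paper proposes a new fast algorithm. By optimizing the choice of grid point spacing, it reduces the omission of possible candidate cluster regions; and, based on a multiple-sort algorithm, it deletes the large number of duplicates in the original candidate set within a short time. In a test that generated candidate cluster regions for 608 township points in southwestern Shandong Province, the improved method reduced the omission of candidate cluster regions and deleted all duplicate candidates within a short time.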

Keywords: grid, multiple sort, candidate cluster region, spatial scan statistic

Abstract:

The spatial scan statistic is a widely adopted spatial cluster detection method in public health surveillance. Applied to infectious disease surveillance data, it detects sub-zones where the number of cases rises abnormally and can therefore give early warning of a possible outbreak. The Chinese Center for Disease Control and Prevention (China CDC) launched the China Infectious Disease Automated-alert and Response System (CIDARS) in 2004, which processes infectious disease surveillance data from all counties of China to detect possible case clusters. Generating candidate clusters is a key step of the method and to some extent determines both its accuracy and its running time. The existing generation method has two deficiencies when applied to a very large study area with many sub-regions. First, an inappropriate spacing between grid points can cause many possible candidate clusters to be missed, which reduces the accuracy of the detection result. Second, the method can produce a large number of duplicate candidate clusters, which prolongs the subsequent spatial scan computation. This paper proposes a new, more efficient method that builds on the existing one. By setting the grid point spacing appropriately, it greatly reduces the chance of missing possible candidate clusters; by applying a multiple-sort algorithm, it finds and deletes the many duplicates in the initially generated set of candidate clusters in a shorter time. Finally, the method is tested by generating candidate clusters for 608 township points in southwestern Shandong Province. The results show that it meets both aims: it reduced the computing time and reduced the number of missed candidate clusters.

Key words: candidate cluster, grid, multiple-sort, spatial scan statistic
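
The abstract does not give the algorithm itself, so the following Python sketch is only an illustration of the two ideas it names, under stated assumptions. It assumes that candidates are circles centred on regular grid points and grown over the nearest locations, which is the usual construction in circular scan statistics, and it assumes that "multiple-sort" can be approximated by sorting twice: once inside each candidate to give it a canonical form, and once across candidates so that duplicates become adjacent. The function names, the `spacing` and `max_radius` parameters, and the sample coordinates are all hypothetical and are not taken from the paper.

```python
import math
from itertools import groupby


def grid_centers(points, spacing):
    """Regular grid of circle centres covering the bounding box of the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    nx = int(math.floor((max(xs) - min(xs)) / spacing)) + 1
    ny = int(math.floor((max(ys) - min(ys)) / spacing)) + 1
    return [(min(xs) + i * spacing, min(ys) + j * spacing)
            for i in range(nx) for j in range(ny)]


def raw_candidates(points, centers, max_radius):
    """Grow a circle around each centre; every distinct radius yields a candidate."""
    out = []
    for cx, cy in centers:
        # Distance from this centre to every point, nearest first.
        near = sorted((math.hypot(x - cx, y - cy), idx)
                      for idx, (x, y) in enumerate(points))
        members = []
        for pos, (dist, idx) in enumerate(near):
            if dist > max_radius:
                break
            members.append(idx)
            # Points at the same distance enter the circle together.
            last_at_radius = pos + 1 == len(near) or near[pos + 1][0] > dist
            if last_at_radius:
                # First sort: canonical member order inside the candidate.
                out.append(tuple(sorted(members)))
    return out


def drop_duplicates(candidates):
    """Second sort: make equal candidates adjacent, then keep one of each."""
    ordered = sorted(candidates, key=lambda c: (len(c), c))
    return [key for key, _ in groupby(ordered)]


if __name__ == "__main__":
    # Hypothetical locations on a line, for illustration only.
    pts = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
    for spacing in (1.0, 4.0):
        raw = raw_candidates(pts, grid_centers(pts, spacing), max_radius=1.5)
        unique = drop_duplicates(raw)
        print(spacing, len(raw), len(unique), unique)
```

In this toy setting the finer grid produces more raw candidates, many of them repeated, while the coarser grid produces fewer raw candidates but never isolates the middle location on its own. That is the trade-off the abstract describes: the spacing controls how many candidates are missed, and the sort-based pass keeps the cost of removing repeats at roughly that of sorting rather than comparing every pair of candidates.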