• 地球信息科学理论与方法 •

### 改进的最小生成树自适应空间点聚类算法

1. 1. 传统村镇文化数字化保护与创意利用技术国家地方联合工程实验室, 衡阳 421002
2. 湖南省古村古镇文化遗产数字化传承协同创新中心, 衡阳 421002
• 收稿日期:2018-01-30 修回日期:2018-04-15 出版日期:2018-07-20 发布日期:2018-07-13
• 通讯作者: 段晓旗 E-mail:715829216@qq.com;201997125@qq.com
• 作者简介:

作者简介：颜金彪（1987-）,男,湖南衡阳人,硕士生,主要从事时空数据挖掘研究。E-mail: 715829216@qq.com

• 基金资助:
国家自然科学基金项目（41471118、41771150、41771188）;衡阳师范学院青年项目（16A01、17A02）

### Improved Adaptive Spatial Points Clustering Algorithm Based on Minimum Spanning Tree

YAN Jinbiao1,2(), ZHENG Wenwu1,2, DUAN Xiaoqi1,2,*(), DENG Yunyuan1,2, GUO Yuanjun1,2, HU Zui1,2

1. 1. National-Local Joint Engineering Laboratory on Digital Preservation and Innovative Technologies for the Culture of Traditional Villages and Towns, Hengyang 421002, China
2. Cooperative Innovation Center for Digitalization of Cultural Heritage in Traditional Villages and Towns, Hengyang 421002, China
• Received:2018-01-30 Revised:2018-04-15 Online:2018-07-20 Published:2018-07-13
• Contact: DUAN Xiaoqi E-mail:715829216@qq.com;201997125@qq.com
• Supported by:
National Natural Science Foundation of China, No.41471118, 41771150, 41771188; Youth Project of Hengyang Normal university, No.16A01,17A02

Abstract:

In this paper, we proposed an improved adaptive spatial point clustering algorithm based on minimum spanning tree (MSTAA in abbreviation) to solve the problems existed in the traditional clustering algorithms. The first problem of these classical clustering algorithms is that the noise edges are determined using the global invariant. Another one is that the initial clustering parameters such as edge weight tolerance, edge variation factor, the number of clusters and initial clustering centers are determined by the users empirically. Furthermore, these algorithms can't find the noise edges at the local level. Based on these problems mentioned above, the algorithm put forward in this article aims to overcome the influence of subjective factors by defining two clipping factors. These trimming factors do not need to be determined by the users and can be automatically obtained according to the statistical features of the side length in the minimum spanning tree. The detailed realization process is as follows. In the first place, the pruning operation on the minimum spanning tree from the global level is carried out, which can eliminate the noises in the global environment. After the first round of tailoring, the initial minimum spanning tree becomes sub-tree collections. In the second place, in order to eliminate the noises at the local level, the algorithm performs the second round of pruning operation by setting the adaptive local cutting factor in the light of the side length statistics of each sub-tree. After the above two rounds of cutting, the MSTAA algorithm will get the final clustering result. In order to validate the effectiveness of the algorithm, both a simulated data and a practical application are adopted. By comparing with 4 classical clustering algorithms (k-means, DBSCAN, SEMST, HEMST), we find that the improved algorithm presented in this paper could find clusters of arbitrary shape and density in the environment where no one provides experience parameters. At the same time, not only does the MSTAA algorithm eliminate the obvious global noise points, but also it can distinguish noise points at the local environment so as to ensure a high similarity degree of cluster point set. All of the features of the MSTAA algorithm mentioned above make it possible to automatically mine hidden information behind spatial point data.