Journal of Geo-information Science, 2020, Vol. 22, Issue (6): 1394-1405. doi: 10.12082/dqxxkx.2020.190276

• Data Sharing and Data Mining •

A Step-by-step Merging Spatio-temporal Joint Clustering Algorithm for OD Flows

XIANG Qiuliang1,2,3, WU Qunyong1,2,3,*, ZHANG Liangpan1,2,3

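  1. The Academy of Digital China (Fujian), Fuzhou 350003
    2. National & Local Joint Engineering Research Center of satellite-spatial Information Technology, Fuzhou University, Fuzhou 350108
    3. Key Laboratory of Spatial Data Mining & Information Sharing of MOE, Fuzhou 350108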
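  • Received: 2019-06-03 Revised: 2020-02-13 Online: 2020-06-25 Published: 2020-08-25
  • Corresponding author: WU Qunyong E-mail: qywu@fzu.edu.com
  • About the author: XIANG Qiuliang (born 1995), male, from Huangshan, Anhui; master's student whose research focuses on spatial data mining. E-mail: qiuliangxiang@outlook.com
  • Funding:
    National Natural Science Foundation of China (41471333); The Central Guided Local Development of Science and Technology Project (2017L3012)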

An OD Flow Spatio-temporal Joint Clustering Algorithm based on Step-by-step Merge Strategy

XIANG Qiuliang1,2,3, WU Qunyong1,2,3,*, ZHANG Liangpan1,2,3

  1. The Academy of Digital China (Fujian), Fuzhou 350003, China
    2. National & Local Joint Engineering Research Center of satellite-spatial Information Technology, Fuzhou University, Fuzhou 350108, China
    3. Key Laboratory of Spatial Data Mining & Information Sharing of MOE, Fuzhou 350108, China
  • Received: 2019-06-03 Revised: 2020-02-13 Online: 2020-06-25 Published: 2020-08-25
  • Contact: WU Qunyong E-mail: qywu@fzu.edu.com
  • Supported by:
    National Natural Science Foundation of China (41471333); The Central Guided Local Development of Science and Technology Project (2017L3012)

Abstract:

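Most existing OD flow clustering methods either separate the O points from the D points or treat each OD flow as a data point in four-dimensional space, which ignores the influence of flow length, direction, and time on clustering. Taking the flow itself as the object of study, this paper proposes a step-by-step merging spatio-temporal joint clustering algorithm for OD flows based on a similarity measure between flows. First, building on a thorough study of the spatial and temporal information of OD flows, a spatio-temporal similarity measure is constructed to quantify the spatio-temporal similarity between OD flows. Then a step-by-step merging clustering strategy is proposed that optimizes the order in which clusters are merged, reducing the time cost of hierarchical clustering and achieving joint spatio-temporal clustering of OD flows. The method was validated on DiDi ride-hailing OD data from Chengdu and taxi data from New York City. The results show that: ① the flow clusters obtained by the algorithm carry temporal as well as spatial characteristics; ② under different parameters the method yields clustering results at different spatio-temporal scales; ③ compared with an existing advanced flow clustering algorithm, the method clusters better, in that the flows within a cluster are sufficiently similar to one another and the method extracts not only prominent flow clusters but also flow clusters between non-hotspot areas. Because the algorithm accounts for both spatial and temporal factors, adjusting the temporal and spatial parameters of the similarity measure produces flow clustering at different spatio-temporal scales, which makes it possible to study urban residents' travel patterns from different spatio-temporal perspectives. By combining temporal and spatial information, the proposed algorithm offers new insight into movement data and supports a sound and comprehensive study of residents' movement patterns, spatial linkages between regions, the determination of known travel structures, and the exploration of travel purposes; it is the foundation for a series of subsequent analyses.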

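Keywords: OD flow, spatio-temporal joint clustering, spatio-temporal similarity measure, step-by-step merge strategy, hierarchical clustering, spatio-temporal scales, movement patterns, spatial linkage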
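
The abstract describes a spatio-temporal similarity measure with adjustable spatial and temporal parameters but does not give its formula. The following is a minimal Python sketch of what such a measure could look like, not the paper's actual definition. The `Flow` class, the function names, the parameters `alpha` (spatial tolerance as a fraction of flow length) and `tau` (time window in seconds), and the choice to combine the two terms with `max` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot, inf


@dataclass(frozen=True)
class Flow:
    """One OD flow (hypothetical layout): projected coordinates in metres,
    departure time in seconds."""
    ox: float
    oy: float
    dx: float
    dy: float
    t: float

    @property
    def length(self) -> float:
        """Straight-line length from origin to destination."""
        return hypot(self.dx - self.ox, self.dy - self.oy)


def spatial_dissimilarity(a: Flow, b: Flow, alpha: float = 0.3) -> float:
    """Combined endpoint displacement, scaled by alpha times the shorter flow.

    Scaling by length means a fixed offset matters more for short flows, and
    a small value implies similar length and direction (assumed form).
    """
    d_origin = hypot(a.ox - b.ox, a.oy - b.oy)
    d_dest = hypot(a.dx - b.dx, a.dy - b.dy)
    scale = alpha * min(a.length, b.length)
    if scale == 0.0:
        return inf  # zero-length flows are never treated as similar here
    return hypot(d_origin, d_dest) / scale


def temporal_dissimilarity(a: Flow, b: Flow, tau: float = 1800.0) -> float:
    """Absolute gap between departure times, in units of the window tau."""
    return abs(a.t - b.t) / tau


def st_dissimilarity(a: Flow, b: Flow, alpha: float = 0.3,
                     tau: float = 1800.0) -> float:
    """Joint value: the worse of the spatial and temporal terms."""
    return max(spatial_dissimilarity(a, b, alpha),
               temporal_dissimilarity(a, b, tau))


def is_similar(a: Flow, b: Flow, alpha: float = 0.3,
               tau: float = 1800.0) -> bool:
    """Two flows count as similar when the joint value is at most 1."""
    return st_dissimilarity(a, b, alpha, tau) <= 1.0
```

In this sketch, raising `alpha` or `tau` loosens the spatial or temporal tolerance, which is one way the multi-scale behaviour mentioned in the abstract could arise.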

Abstract:

Most existing OD flow clustering methods either split each OD flow into its O point and D point or treat the flow as a four-dimensional point, which ignores the effects of flow length, direction, and time on the clustering process. In this paper, we propose a new spatio-temporal flow clustering method based on the similarity between flows, using a strategy that merges flow clusters step by step at successive levels. First, building on a full study of the spatial and temporal information of OD flows, a spatio-temporal similarity measurement formula was constructed to quantify the spatio-temporal similarity between OD flows. Then, to optimize the order in which flow clusters are merged and to reduce the time cost of the clustering process, the step-by-step merge strategy was used to complete the flow clustering. The method takes both temporal and spatial information into consideration. By modifying the parameters of the spatio-temporal similarity measurement formula, it can obtain clustering results at different temporal and spatial scales, which makes it possible to analyze movement patterns from a multi-scale perspective. To verify the effectiveness of our method, a series of experiments was carried out on real datasets (DiDi ride-hailing OD data from Chengdu and taxi data from New York City).
The clustering results demonstrate that: ① flow clusters discovered by our method have temporal as well as spatial characteristics; ② our method discovers different spatio-temporal OD flow clusters under different spatio-temporal parameters; ③ compared with previous advanced work, our method achieves better clustering performance: flows within the same flow cluster satisfy the similarity relationship, and the method finds not only the obvious movement patterns but also inconspicuous movement patterns between non-hot zones. The spatio-temporal joint OD flow clustering method proposed in this paper offers new insight into movement data by combining temporal and spatial information, which is conducive to a reasonable and comprehensive study of residents' movement patterns, spatial linkage between regions, the determination of known travel structures, and the exploration of travel purposes. OD flow clustering is the starting point for a series of subsequent analyses.

Key words: OD flow, spatio-temporal joint clustering, spatio-temporal similarity measure, step-by-step merge strategy, hierarchical clustering, spatio-temporal scales, movement patterns, spatial linkage
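
The step-by-step merge strategy named in the abstract and keywords can be illustrated with a small sketch. The abstract does not specify the merge levels or the linkage rule, so the version below is only one plausible reading: clusters are merged greedily under a sequence of increasing dissimilarity levels, so that the most similar groups join first. The function name `stepwise_merge`, the default `levels`, and the use of complete linkage (largest pairwise dissimilarity between two clusters) are assumptions, not the authors' algorithm.

```python
from itertools import combinations
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def stepwise_merge(items: Sequence[T],
                   dissim: Callable[[T, T], float],
                   levels: Sequence[float] = (0.25, 0.5, 0.75, 1.0),
                   ) -> List[List[int]]:
    """Agglomerative clustering that merges level by level.

    Returns clusters as lists of indices into `items`. At each level, any two
    clusters whose complete-linkage dissimilarity is within the level are
    merged; looser levels are only tried once tighter ones are exhausted.
    """
    clusters: List[List[int]] = [[i] for i in range(len(items))]

    def linkage(c1: List[int], c2: List[int]) -> float:
        # Complete linkage: every cross-cluster pair must be close enough.
        return max(dissim(items[i], items[j]) for i in c1 for j in c2)

    for level in sorted(levels):
        merged = True
        while merged:
            merged = False
            for a, b in combinations(range(len(clusters)), 2):
                if linkage(clusters[a], clusters[b]) <= level:
                    # a < b, so deleting index b leaves index a valid.
                    clusters[a] = clusters[a] + clusters[b]
                    del clusters[b]
                    merged = True
                    break  # cluster list changed; rescan from the start
    return clusters
```

With complete linkage, every pair of flows inside a final cluster stays within the last level, which matches the abstract's claim that flows in the same cluster satisfy the similarity relationship. Any pairwise spatio-temporal dissimilarity function can be passed as `dissim`.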