Journal of Geo-information Science ›› 2018, Vol. 20 ›› Issue (10): 1403-1411. doi: 10.12082/dqxxkx.2018.180281

• Theory and Methods of Geo-information Science •

Short-term Traffic Flow Prediction Based on Similar Data Aggregation and KNN with Varying K-value

LIANG Yanping1,2,3, MAO Zhengyuan1,2,3,*, ZOU Weibin1,2,3,4, XU Rui5

  1. Fujian Provincial Spatial Information Engineering Research Center, Fuzhou University, Fuzhou 350002, China
  2. Key Laboratory of Spatial Data Mining and Information Sharing of Ministry of Education, Fuzhou University, Fuzhou 350002, China
  3. National Engineering Research Centre of Geospatial Information Technology, Fuzhou University, Fuzhou 350002, China
  4. School of Transportation, Fujian University of Technology, Fuzhou 350118, China
  5. School of Information Science and Engineering, Fujian University of Technology, Fuzhou 350118, China
  • Received: 2018-06-11 Revised: 2018-07-19 Online: 2018-10-25 Published: 2018-10-17
  • Contact: MAO Zhengyuan E-mail: 497336236@qq.com; zymao@fzu.edu.cn
  • About the author:
    LIANG Yanping (1993-), male, master's student. Research interests: short-term traffic flow prediction and intelligent algorithms. E-mail: 497336236@qq.com
  • Supported by:
    National Natural Science Foundation of China, No. 41471333; General Program of the Natural Science Foundation of Fujian Province, No. 2018J01619.

Abstract:

Short-term traffic flow prediction is a key technical problem in traffic control and guidance. Because short-term traffic flow is uncertain and time-varying, accurate real-time prediction is difficult and remains an open problem in both research and engineering practice. To improve prediction accuracy, this paper proposes and implements a short-term traffic flow prediction algorithm based on similar data aggregation and KNN with a varying K-value (KNN-SDA), and tests it on measured datasets. First, the mutual information method is applied to the preprocessed traffic flow data to extract the optimal delay time of the traffic flow series, from which state vectors are generated. Each state vector consists of two parts: a regular state vector and a modified state vector that raises the similarity between the state vectors and those in the training datasets. A historical traffic flow database of time series is then constructed from these results. Next, the proposed similar data aggregation method aggregates and cleans the historical data to obtain 144 training datasets corresponding to different times, which improves both the prediction accuracy and the efficiency of the algorithm. Finally, the optimal K-value for each time point is determined by cross validation, which completes the KNN-SDA algorithm with varying K-value. To verify its performance, the experimental results of the proposed method were compared with those of three other methods. The results show that the proposed KNN-SDA algorithm with varying K-value clearly improves the accuracy of short-term traffic flow prediction while maintaining execution efficiency.

Key words: short-term traffic prediction, mutual information method, similar data aggregation, KNN, cross validation
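
The abstract states that the optimal delay time is obtained with the mutual information method and then used to build state vectors, but gives no formulas. The sketch below illustrates one common way to do this and is not taken from the paper: it estimates the delayed mutual information with a histogram and picks the first local minimum as the delay. The function names, the bin count, the first-minimum rule, and the plain delay-embedded state vector (the paper's two-part state vector is not reproduced here) are all assumptions made for illustration.

```python
import math
from collections import Counter


def delayed_mutual_information(series, lag, bins=16):
    """Histogram estimate (in nats) of I(x_t; x_{t+lag}) for a 1-D series."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for a constant series

    def bin_of(value):
        return min(int((value - lo) / width), bins - 1)

    # Pair every observation with the one `lag` steps later.
    pairs = [(bin_of(a), bin_of(b)) for a, b in zip(series, series[lag:])]
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(i for i, _ in pairs)
    right = Counter(j for _, j in pairs)
    # sum p(i,j) * log(p(i,j) / (p(i) * p(j))) over the occupied cells
    return sum(
        (c / n) * math.log(c * n / (left[i] * right[j]))
        for (i, j), c in joint.items()
    )


def first_minimum_delay(series, max_lag=30, bins=16):
    """Smallest lag at which the delayed MI reaches a local minimum (assumed rule)."""
    mi = [delayed_mutual_information(series, lag, bins)
          for lag in range(1, max_lag + 1)]
    for k in range(1, len(mi) - 1):
        if mi[k] < mi[k - 1] and mi[k] <= mi[k + 1]:
            return k + 1  # mi[0] corresponds to lag 1
    return mi.index(min(mi)) + 1  # fallback: global minimum over the tested lags


def state_vector(series, t, delay, dim):
    """Delay-embedded state [x_t, x_{t-delay}, ..., x_{t-(dim-1)*delay}]."""
    if t - (dim - 1) * delay < 0:
        raise ValueError("not enough history before index t")
    return [series[t - k * delay] for k in range(dim)]
```

With a flow series `flow` (a list of counts per interval), `first_minimum_delay(flow)` would give a candidate delay, and `state_vector(flow, t, delay, dim)` the state at index `t`; the embedding dimension `dim` is a free parameter in this sketch.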
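
The final step described in the abstract, choosing an optimal K for each time point by cross validation, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: it uses leave-one-out cross validation, Euclidean distance, an unweighted mean of neighbour targets, and a hypothetical `slots` dictionary that maps each time slot to its own list of (state vector, next flow value) pairs. The similar data aggregation step is not described in enough detail in the abstract to reproduce, so the training sets are taken as given.

```python
import math


def knn_predict(train, query, k):
    """Average the targets of the k training states closest to `query`."""
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)


def loo_error(train, k):
    """Leave-one-out mean absolute error of knn_predict on one slot's data."""
    total = 0.0
    for i, (state, target) in enumerate(train):
        rest = train[:i] + train[i + 1:]  # hold out sample i
        total += abs(knn_predict(rest, state, k) - target)
    return total / len(train)


def best_k_per_slot(slots, candidates=range(1, 11)):
    """For each time slot, pick the candidate K with the lowest held-out error."""
    best = {}
    for slot, train in slots.items():
        # K must be smaller than the slot's sample count (needs >= 2 samples).
        usable = [k for k in candidates if k < len(train)]
        best[slot] = min(usable, key=lambda k: loo_error(train, k))
    return best


# Toy data: two slots whose structure favours different K values.
slots = {
    "A": [([float(x)], 10.0 * x) for x in range(6)],
    "B": [([0.0], 1.0), ([0.1], 1.0), ([10.0], 9.0), ([10.1], 9.0)],
}
k_by_slot = best_k_per_slot(slots)
forecast = knn_predict(slots["A"], [2.5], k_by_slot["A"])
```

Selecting K separately per slot lets slots with smooth, dense data use more neighbours while slots with tightly clustered data use fewer, which is the intuition behind a varying K-value.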