Journal of Geo-information Science, 2023, Vol. 25, Issue (4): 741-753. doi: 10.12082/dqxxkx.2023.220673

• Trajectories and Transportation •

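A Spatiotemporal Model for Identifying Shared-Bicycle Tidal Points and a KNN-LightGBM-Based Method for Predicting Rental and Return Demand

KE Rihong, WU Sheng, KE Weiwen

  1. The Academy of Digital China (Fujian), Fuzhou University, Fuzhou 350003
  • Received: 2022-09-08 Revised: 2022-12-06 Online: 2023-04-25 Published: 2023-04-19
  • Corresponding author: *WU Sheng (born 1972), male, from Songxi, Fujian; Ph.D., professor. His main research interests include spatiotemporal data analysis and visualization, and digital planning. E-mail: ws0110@163.com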
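  • About the author: KE Rihong (born 1998), male, from Sanming, Fujian; master's student, mainly engaged in research on geographic information services and spatiotemporal data mining. E-mail: 820916024@qq.com
  • Funding:
    Strategic Priority Research Program (Class A) of the Chinese Academy of Sciences (XDA23100502); Construction of the University Discipline Alliance of Digital Economy of Fujian Province (Min Jiao Gao [2022] No. 15)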

A Spatial-temporal Model for Identifying Tidal Shared-bicycle Stops and Bicycle Sharing Demand Prediction based on KNN-LightGBM

KE Rihong, WU Sheng, KE Weiwen

  1. The Academy of Digital China (Fujian), Fuzhou University, Fuzhou 350003, China
  • Received: 2022-09-08 Revised: 2022-12-06 Online: 2023-04-25 Published: 2023-04-19
  • Contact: WU Sheng
  • Supported by:
    Strategic Priority Research Program of the Chinese Academy of Sciences, No. XDA23100502; Construction of University Discipline Alliance of Digital Economy of Fujian Province, No. Min Jiao Gao (2022) 15.

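Abstract:

With the rise of Internet-based rental bicycles (shared bicycles), "shared bicycle + subway" and "shared bicycle + bus" have become the main feeder modes for urban commuting, but the "tidal effect" of shared bicycles has also become a pain point and a difficulty in their management and resource allocation. Discovering the "tidal patterns" of shared bicycles and accurately predicting rental and return demand at shared-bicycle parking areas (electronic fences) is therefore important for the orderly and standardized development of bicycle sharing and for improving the riding experience and environment. Based on shared-bicycle order data and electronic-fence spatial data, this paper first proposes a spatiotemporal model for identifying shared-bicycle tidal points and analyzes their tidal spatiotemporal characteristics. The model defines a tidal point as an electronic fence where a large number of rentals or returns within a short time leaves no bicycles to rent or no space to park; it then classifies electronic fences according to their state in a given time period and assigns them different bike-shortage/parking-shortage indices. The results show that the model can accurately identify the tidal points that appear in specific periods. Then, based on spatiotemporal data including shared-bicycle orders, points of interest (POI), roads, population, land use, air temperature, and wind speed, and considering the correlation among electronic fences within a local area, a KNN-LightGBM model is built to predict rental and return demand: (1) Principal Component Analysis (PCA) is used for feature extraction; (2) the K Nearest Neighbors (KNN) algorithm is used to compute correlation information among electronic fences within a local area; (3) the feature vectors extracted by PCA and the fence correlation information are combined as input features, and LightGBM is used to predict rental and return demand; (4) the importance of the features affecting the demand prediction is evaluated. The results show that, compared with four commonly used machine learning methods, KNN-LightGBM has the smallest mean RMSE and MAE and the largest mean R² and r in prediction experiments at different time scales, indicating good predictive performance; using KNN to compute the correlation among electronic fences within a local area effectively improves prediction accuracy: compared with LightGBM, KNN-LightGBM reduces RMSE and MAE by 10% and 11%, respectively, and increases R² and r by 3% and 4%, respectively; historical shared-bicycle order data are the most important for rental and return demand prediction, followed by the distance to the nearest public-transport connection station.

Key words: shared bicycles, electronic fence, spatiotemporal model, tidal characteristics, demand forecasting, machine learning, Xiamen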

Abstract:

With the rise of Internet-based bicycle sharing, "shared bicycle + subway" and "shared bicycle + bus" have become major feeder modes for urban commuting, but the "tidal effect" of shared bicycles makes them difficult to manage and their resources difficult to allocate. Exploring the "tidal patterns" of shared bicycles and accurately predicting rental and return demand at parking areas (electronic fences) is therefore important for the orderly and standardized development of bicycle sharing and for improving the riding experience and environment. Based on shared-bicycle order data and spatial data on electronic fences, this study proposes a spatial-temporal model that identifies tidal shared-bicycle stops and analyzes their tidal spatial-temporal characteristics. The model defines a tidal shared-bicycle stop as an electronic fence that runs out of bicycles (lacking-bike) or parking space (lacking-parking) because a large number of bicycles are borrowed or returned within a short time. Electronic fences are then classified according to their status in a given period and assigned different lacking-bike/lacking-parking indexes. The results show that the model can accurately identify the tidal shared-bicycle stops that appear in a specific period.
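The abstract does not give the formula for the lacking-bike/lacking-parking index, so the following is only a minimal sketch of how such a classification could look. Every name here (`FenceWindow`, `classify_fence`, `capacity`, `bikes_at_start`) and the index formula itself are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class FenceWindow:
    """Counts for one electronic fence in one time window (hypothetical fields)."""
    fence_id: str
    capacity: int        # assumed number of parking slots in the fence
    bikes_at_start: int  # assumed bikes parked when the window opens
    rentals: int         # orders that start at this fence in the window
    returns: int         # orders that end at this fence in the window


def classify_fence(w: FenceWindow) -> Tuple[str, float]:
    """Return (status, index). The index formula is an assumption, not the paper's."""
    # Stock the fence would hold if every rental and return were served.
    projected = w.bikes_at_start + w.returns - w.rentals
    if projected < 0:
        # More rentals than available bikes: unmet rentals relative to capacity.
        return "lacking-bike", -projected / w.capacity
    if projected > w.capacity:
        # More returns than free slots: overflow relative to capacity.
        return "lacking-parking", (projected - w.capacity) / w.capacity
    return "balanced", 0.0


# Illustrative windows: a morning rental rush, an evening return rush, a quiet period.
windows = [
    FenceWindow("A", capacity=20, bikes_at_start=5, rentals=15, returns=2),
    FenceWindow("B", capacity=20, bikes_at_start=15, rentals=1, returns=16),
    FenceWindow("C", capacity=20, bikes_at_start=10, rentals=4, returns=5),
]
statuses = {w.fence_id: classify_fence(w) for w in windows}
```

In this sketch a fence counts as a tidal stop in a window whenever its status is not "balanced"; a real implementation would need the paper's own state definitions and thresholds.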
Moreover, based on spatial-temporal data such as shared-bicycle orders, points of interest (POI), roads, population, land-use type, temperature, and wind speed, and considering the correlation among electronic fences within a local area, we propose a K Nearest Neighbors (KNN)-LightGBM model to predict the rental and return demand for shared bicycles. The method has four steps: (1) Principal Component Analysis (PCA) is used to extract features; (2) the KNN algorithm is used to calculate correlation information among electronic fences within a local area; (3) the feature vectors extracted by PCA and the fence correlation information are combined as input, and LightGBM is used to predict rental and return demand; (4) the importance of the features that affect the demand prediction is evaluated. The results show that the proposed KNN-LightGBM outperforms four commonly used machine learning methods in demand prediction at different time scales: its mean RMSE and MAE are the smallest and its mean R² and r are the largest. Using the KNN algorithm to calculate the correlation among electronic fences effectively improves prediction accuracy. Compared with LightGBM, KNN-LightGBM reduces RMSE and MAE by 10% and 11%, respectively, and improves R² and r by 3% and 4%, respectively. According to the feature importance assessment, historical shared-bicycle order data are the most important for demand prediction, followed by the distance to the nearest public transportation station. Our study demonstrates the potential of the proposed model.
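Step (2) above can be illustrated with a small sketch. The abstract does not say how the KNN correlation information is computed, so this example assumes one plausible reading: for each electronic fence, take the k spatially nearest other fences and average their past demand. The function name `knn_neighbor_demand`, the planar coordinates, and the choice of a mean are all hypothetical. The resulting value would be appended to the PCA features before training a gradient-boosting regressor such as LightGBM, which is not shown here.

```python
import math
from typing import Dict, Tuple


def knn_neighbor_demand(
    coords: Dict[str, Tuple[float, float]],
    past_demand: Dict[str, float],
    k: int,
) -> Dict[str, float]:
    """For each fence, average the past demand of its k nearest other fences.

    Assumes k >= 1 and planar (projected) coordinates; both are illustrative choices.
    """
    features: Dict[str, float] = {}
    for fid, (x, y) in coords.items():
        # Rank every other fence by Euclidean distance to this fence.
        others = sorted(
            (math.hypot(x - ox, y - oy), oid)
            for oid, (ox, oy) in coords.items()
            if oid != fid
        )
        nearest = [oid for _, oid in others[:k]]
        if nearest:
            features[fid] = sum(past_demand[o] for o in nearest) / len(nearest)
        else:
            # A lone fence has no neighbors; fall back to zero.
            features[fid] = 0.0
    return features


# Illustrative data: three fences close together and one far away.
coords = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.0, 2.0), "D": (10.0, 10.0)}
past_demand = {"A": 10.0, "B": 20.0, "C": 30.0, "D": 100.0}
neighbor_feature = knn_neighbor_demand(coords, past_demand, k=2)
```

The brute-force search is quadratic in the number of fences, which is fine for a sketch; a spatial index would be the usual choice for a city-scale fence set.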

Key words: shared-bicycle, electronic fence, spatial-temporal model, tidal characteristic, demand forecasting, machine learning, Xiamen