Usage Patterns Identification and Flow Prediction of Bike-sharing System Based on Multiscale Spatiotemporal Clustering
JIANG Xiao (b. 1992), male, from Xuzhou, Jiangsu, master's student, mainly engaged in research on geoinformation visualization and data mining. E-mail: jiangxiao@stu.pku.edu.cn
Received date: 2021-10-30
Revised date: 2021-12-25
Online published: 2022-08-25
Supported by
China Postdoctoral Science Foundation (2021M690201)
JIANG Xiao, BAI Lubin, LOU Xiayin, LI Mei, LIU Hui. Usage Patterns Identification and Flow Prediction of Bike-sharing System Based on Multiscale Spatiotemporal Clustering[J]. Journal of Geo-information Science, 2022, 24(6): 1047-1060. DOI: 10.12082/dqxxkx.2022.210691
At present, the Chinese government and bike-sharing companies mostly regulate shared bikes by designating electronic fence parking stations. Electronic fence parking stations for free-floating bike-sharing are predetermined 'virtual fences' that guide users to park bikes in designated zones and curb inappropriate parking. However, because the inflow and outflow of bikes at a single parking station are highly random and uncertain, managing bikes station by station involves a heavy workload and is of little practical value. It is therefore necessary to group fence stations into clusters and to manage and dispatch bikes by region. In this paper, we propose a network clustering algorithm based on spatiotemporal constraints. The algorithm jointly considers spatial factors (location and geographical environment of the parking stations) and temporal factors (historical bike-sharing orders), and it partitions parking stations into groups at multiple scales simply by setting a distance threshold. We chose Xiamen Island as the study area. Using distance thresholds of 3000 m and 700 m, we ran clustering experiments on the electronic fence parking stations of the whole of Xiamen Island and of its Wushipu block, respectively. The results show that the algorithm not only gathers parking stations with similar spatiotemporal characteristics into the same group, but also keeps the shared-bike flow mainly within each group, which is convenient for regional management. Based on the partitioned groups, we then mined the tidal usage characteristics of shared bikes, which effectively identifies and locates hot spots of bike usage. The results show that subway stations, office buildings, parks, hospitals, shopping malls, and residential areas have a greater impact on the usage pattern of shared bikes. In particular, during the morning rush hours, attention should be paid to the accumulation of shared bikes near office buildings, shopping malls, hospitals, and subway stations, and to the shortage of bikes near residential areas, parks, and factories. Finally, we used a Long Short-Term Memory (LSTM) network to predict shared-bike orders. The results show that 84% of the groups have a prediction accuracy above 85%, and the average prediction accuracy is 91.301%, which can meet the needs of bike-sharing dispatch. Our results can inform the planning of electronic fence parking stations, and the high accuracy of the LSTM model in predicting bike flow helps reduce dispatching costs and improve the management efficiency of bike-sharing systems.
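The claim that bike flow stays mainly within each group can be made concrete with a simple ratio. The sketch below is only an illustration, not the measure used in the paper: it assumes trips are available as origin/destination station pairs and that each station already has a group label. The function name `intra_group_share` and the station ids are hypothetical.

```python
def intra_group_share(trips, group_of):
    """Fraction of trips that start and end in the same group.

    trips: iterable of (origin_station, destination_station) pairs.
    group_of: dict mapping station id -> group label.
    """
    trips = list(trips)
    if not trips:
        raise ValueError("no trips given")
    inside = sum(1 for o, d in trips if group_of[o] == group_of[d])
    return inside / len(trips)


# Hypothetical stations s1..s4 split into two groups.
group_of = {"s1": "A", "s2": "A", "s3": "B", "s4": "B"}
trips = [("s1", "s2"), ("s2", "s1"), ("s3", "s4"), ("s1", "s3")]
share = intra_group_share(trips, group_of)
print(share)  # 0.75: three of the four trips stay inside one group
```

A share close to 1 would indicate that a partition keeps most rides inside its groups, which is the property that makes region-level dispatch practical.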
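Tab. 1 Experimental data list |
Dataset | Period | Size | Field name | Field description |
---|---|---|---|---|
Shared-bike order data, Xiamen Island | 21-25 December 2020, 6:00 am to 10:00 am | about 580,000 records | BICYCLE_ID | Encrypted bike ID |
 | | | LATITUDE | Latitude/° |
 | | | LONGITUDE | Longitude/° |
 | | | LOCK_STATUS | Lock status |
 | | | UPDATE_TIME | Time at which the lock status was updated |
Electronic fence data, Xiamen Island | December 2020 | about 14,000 fences | FENCE_ID | Unique ID of the electronic fence |
 | | | FENCE_LOC | Coordinate string of the electronic fence location |
POI data, Xiamen Island | January 2021 | about 8,000 records | POI_TYPE | POI feature category |
 | | | LATITUDE | Latitude/° |
 | | | LONGITUDE | Longitude/° |
Road network data, Xiamen Island | January 2021 | 8,000 roads | Length | Road length/m |
 | | | name | Road name |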
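As a rough illustration of how the order records could feed a tidal (inflow/outflow) analysis, the sketch below counts arrivals and departures per fence. It rests on two assumptions that the table does not state: that `LOCK_STATUS` is encoded as 0 for unlock (a ride starts, so a bike leaves) and 1 for lock (a ride ends, so a bike arrives), and that each record has already been matched to a `FENCE_ID` from its coordinates. The sample records and the function name `net_flow_by_fence` are hypothetical.

```python
from collections import Counter

UNLOCK, LOCK = 0, 1  # ASSUMED encoding of LOCK_STATUS, not given in Tab. 1


def net_flow_by_fence(records):
    """Return {fence_id: inflow - outflow} over the given records.

    Each record is a dict with at least FENCE_ID and LOCK_STATUS.
    FENCE_ID is assumed to have been attached beforehand by matching
    the record's LATITUDE/LONGITUDE against the fence polygons.
    """
    inflow, outflow = Counter(), Counter()
    for rec in records:
        if rec["LOCK_STATUS"] == LOCK:
            inflow[rec["FENCE_ID"]] += 1      # bike parked here
        elif rec["LOCK_STATUS"] == UNLOCK:
            outflow[rec["FENCE_ID"]] += 1     # bike taken from here
    fences = set(inflow) | set(outflow)
    return {f: inflow[f] - outflow[f] for f in fences}


# Hypothetical morning records: two rides home -> metro, one metro -> office.
records = [
    {"BICYCLE_ID": "b1", "FENCE_ID": "home", "LOCK_STATUS": 0},
    {"BICYCLE_ID": "b1", "FENCE_ID": "metro", "LOCK_STATUS": 1},
    {"BICYCLE_ID": "b2", "FENCE_ID": "home", "LOCK_STATUS": 0},
    {"BICYCLE_ID": "b2", "FENCE_ID": "metro", "LOCK_STATUS": 1},
    {"BICYCLE_ID": "b3", "FENCE_ID": "metro", "LOCK_STATUS": 0},
    {"BICYCLE_ID": "b3", "FENCE_ID": "office", "LOCK_STATUS": 1},
]
flow = net_flow_by_fence(records)
print(flow)
```

A positive value marks a fence that accumulates bikes in the observed window, a negative value one that drains.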
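Algorithm 1 Network clustering based on spatiotemporal constraints (GC2) |
---|
Require: |
Input: correlation network matrix of the parking stations, G; |
node set, V; |
minimum number of node label swaps in one iteration |
Ensure: each node is initialized with its own unique cluster label |
repeat |
initialize the swap count to 0 |
for each node in V do |
remove the current node from the node set V and record its label at this point |
compute the gain (value) between the node and each of its adjacent clusters |
assign the node to the cluster with the largest value, give the node that cluster's label, and add the node back to V |
if the node's label has changed then |
increase the swap count by 1 |
end if |
end for |
until the swap count falls below the minimum number of swaps |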
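The loop structure of Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' GC2 implementation: the gain function is not defined in this excerpt, so the sketch substitutes the total edge weight between a node and a cluster's members, and it adds a `max_iters` guard that is not part of the pseudocode. The names `cluster_by_local_moves`, `min_swaps`, and the toy graph are hypothetical.

```python
from collections import defaultdict


def cluster_by_local_moves(weights, min_swaps=1, max_iters=100):
    """Greedy label-moving loop over a weighted graph.

    weights: dict node -> {neighbour: edge weight}; a stand-in for the
        correlation network G (hypothetical structure).
    min_swaps: stop once an iteration makes fewer label swaps than this.
    max_iters: safety cap on iterations (an addition, not in Algorithm 1).
    """
    labels = {node: node for node in weights}  # unique initial labels
    for _ in range(max_iters):
        swaps = 0
        for node in sorted(weights):
            # ASSUMED gain: summed edge weight from the node to each
            # adjacent cluster. The paper's actual gain may differ.
            gain = defaultdict(float)
            for neighbour, w in weights[node].items():
                if neighbour != node:
                    gain[labels[neighbour]] += w
            if not gain:
                continue  # isolated node keeps its own label
            old = labels[node]
            best = max(gain, key=gain.get)
            # Move only on a strict improvement, so ties never cause flips.
            if gain[best] > gain.get(old, 0.0):
                labels[node] = best
                swaps += 1
        if swaps < min_swaps:
            break
    return labels


# Toy graph: two tight triangles joined by one weak edge (c-d).
weights = {
    "a": {"b": 1.0, "c": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"a": 1.0, "b": 1.0, "d": 0.1},
    "d": {"c": 0.1, "e": 1.0, "f": 1.0},
    "e": {"d": 1.0, "f": 1.0},
    "f": {"d": 1.0, "e": 1.0},
}
labels = cluster_by_local_moves(weights)
print(labels)
```

On this toy graph the two triangles end up with two different labels, because the weak edge between c and d never offers a higher gain than the edges inside each triangle.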
Tab. 3 Communities division by the frequency of shared-bike usage and POI index statistics in Xiamen Island |
Category | Count | Factory | Hospital | Park | School | Subway | Office building | Bus | Residential area | Shopping mall | Index sum |
---|---|---|---|---|---|---|---|---|---|---|---|
High frequency | 8 | 0.262 | 0.333 | 0.480 | 0.214 | 0.544 | 0.656 | 0.078 | 0.318 | 0.323 | 3.209 |
Medium frequency | 9 | 0.212 | 0.505 | 0.544 | 0.302 | 0.544 | 0.497 | 0.201 | 0.197 | 0.280 | 3.281 |
Low frequency | 8 | 0.204 | 0.153 | 0.247 | 0.105 | 0.178 | 0.300 | 0.054 | 0.275 | 0.098 | 1.613 |
Tab. 4 Communities division by the inflow and outflow of shared bikes and POI index statistics in Wushipu area |
Category | Count | Factory | Hospital | Park | School | Subway | Office building | Bus | Residential area | Shopping mall | Index sum |
---|---|---|---|---|---|---|---|---|---|---|---|
Inflow | 26 | 0.210 | 0.526 | 0.385 | 0.365 | 0.654 | 0.467 | 0.567 | 0.208 | 0.441 | 3.822 |
Outflow | 34 | 0.221 | 0.382 | 0.407 | 0.355 | 0.548 | 0.323 | 0.548 | 0.222 | 0.298 | 3.302 |
Tab. 5 Evaluation of LSTM model prediction results for community bike demand |
Community | MAE | RMSE | PEARSON/% | AcR/% |
---|---|---|---|---|
0 | 12.394 | 26.682 | 84.651 | 86.888 |
1 | 27.065 | 37.996 | 97.572 | 94.228 |
2 | 21.540 | 38.921 | 96.790 | 96.674 |
3 | 60.711 | 96.166 | 92.481 | 86.226 |
4 | 48.237 | 66.411 | 98.163 | 97.235 |
5 | 11.158 | 16.725 | 98.011 | 95.098 |
6 | 6.461 | 8.509 | 97.987 | 95.845 |
7 | 33.671 | 42.844 | 99.099 | 95.559 |
8 | 5.448 | 10.145 | 75.370 | 91.557 |
9 | 44.250 | 64.302 | 97.784 | 97.264 |
10 | 25.329 | 55.762 | 98.099 | 95.295 |
11 | 19.842 | 29.365 | 99.111 | 97.393 |
12 | 35.355 | 50.229 | 99.335 | 95.543 |
13 | 2.250 | 3.806 | 54.333 | 78.498 |
14 | 7.0785 | 9.752 | 95.637 | 93.854 |
15 | 7.211 | 10.692 | 92.511 | 93.463 |
16 | 9.106 | 17.276 | 78.129 | 89.304 |
17 | 9.171 | 19.647 | 87.105 | 92.711 |
18 | 59.250 | 108.202 | 95.537 | 87.758 |
19 | 45.013 | 70.014 | 97.275 | 96.156 |
20 | 15.644 | 29.260 | 98.862 | 97.189 |
21 | 33.316 | 47.215 | 99.282 | 95.957 |
22 | 49.750 | 79.096 | 90.956 | 76.763 |
23 | 11.092 | 22.382 | 68.255 | 83.070 |
24 | 1.671 | 2.315 | 82.898 | 73.008 |
Mean | 24.080 | 38.548 | 91.010 | 91.301 |
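The two headline figures quoted in the abstract (84% of communities above 85% accuracy, and a mean accuracy of 91.301%) can be recomputed directly from the AcR/% column of the table above. The short check below copies that column as printed; the variable names are illustrative only.

```python
# AcR/% column of Tab. 5 for communities 0 to 24, copied from the table.
acr = [
    86.888, 94.228, 96.674, 86.226, 97.235,
    95.098, 95.845, 95.559, 91.557, 97.264,
    95.295, 97.393, 95.543, 78.498, 93.854,
    93.463, 89.304, 92.711, 87.758, 96.156,
    97.189, 95.957, 76.763, 83.070, 73.008,
]

mean_acr = sum(acr) / len(acr)
above_85 = sum(1 for a in acr if a > 85)
share_above_85 = above_85 / len(acr)

print(round(mean_acr, 3))  # 91.301
print(above_85, share_above_85)  # 21 0.84
```

Four communities (13, 22, 23 and 24) fall below 85%, which leaves 21 of 25, or 84%.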