Inferring Consumption Behavior of Customers in Shopping Malls from Indoor Trajectories
CHU Chen (born 1998), male, from Qingdao, Shandong; Master's degree; research focus: spatio-temporal data mining. E-mail: chuchen0411@igsnrr.ac.cn
Received date: 2021-10-30
Revised date: 2021-12-16
Online published: 2022-08-25
Supported by
National Key Research and Development Program of China (2021YFB3900803)
Obtaining the consumption behavior of large numbers of customers in large shopping malls has long been a difficult problem in behavioral geography. The explosive growth of indoor trajectory data in recent years offers an opportunity to address it, but the missing semantic information and poor data quality of indoor trajectories make inferring consumption behavior challenging. This study proposes a text- and trajectory-aware framework for inferring customers' consumption behavior in shopping malls that does not require privacy-sensitive consumption records. The framework integrates Web text about stores with movement features extracted from individual and historical customer trajectories. Crawled Web text is used to enrich the semantic attributes of indoor stores, which allows customers' geometric trajectories to be converted into semantic trajectories. The framework models each customer's consumption features from three aspects: the movement features of the raw trajectory, its semantic features, and a movement embedding feature. A representation learning algorithm learns movement patterns from historical crowd trajectories, and the resulting embedding is used to model the movement of a single customer in a complex indoor environment automatically. Finally, consumption behavior is inferred by clustering the concatenated multi-source, high-dimensional features and analyzing the clusters with summary statistics and visualization.
Experiments on a real-world indoor trajectory dataset of 7045 customers from a large shopping mall show that the framework effectively extracts customers' spatio-temporal movement and consumption patterns. Compared with classic feature extraction methods and typical clustering methods, the framework improves the Silhouette Coefficient by up to 69.8%, which indicates that it distinguishes customers with different consumption behaviors more effectively and clusters high-dimensional customer features more precisely. Analysis of the clusters' movement patterns shows that customers' movement is directly and prominently affected by the indoor spatial structure, such as store locations, escalator locations, and the division of functional zones, and that customers show a floor preference, tending to consume on the same floor. The framework identifies customer groups with different consumption levels and movement patterns and discovers consumption patterns among massive numbers of mall customers without knowing their personal information, which could support related research in behavioral geography.
CHU Chen, ZHANG Hengcai, LU Feng. Inferring consumption behavior of customers in shopping malls from indoor trajectories[J]. Journal of Geo-information Science, 2022, 24(6): 1034-1046. DOI: 10.12082/dqxxkx.2022.210690
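Table 1 Movement features of customer trajectories
Feature | Calculation |
---|---|
Customer's indoor dwell time | |
Customer's total indoor travel distance | |
Customer's average indoor moving speed | |
Customer's dwell time on each floor | Time difference between the first and last positioning points on each floor |
Note: The variables in the table are as defined in Section 2.2.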
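The formulas for the first three features did not survive in Table 1, so the following is only a minimal sketch of how such movement features might be computed, not the authors' definitions. It assumes a hypothetical record layout `(timestamp, floor_id, x, y)` with coordinates in metres, takes travel distance as the sum of planar steps between consecutive fixes (ignoring vertical movement), and follows the table's stated rule for per-floor dwell time. The function name `movement_features` and the sample values are illustrative.

```python
from datetime import datetime
from math import hypot

def movement_features(points):
    """Compute Table 1 style features for one customer.

    points: list of (timestamp, floor_id, x, y) tuples sorted by time
    (hypothetical layout mirroring the trajectory records).
    """
    dwell_s = (points[-1][0] - points[0][0]).total_seconds()
    # Assumption: distance is the summed planar step length between
    # consecutive fixes; vertical movement between floors is ignored.
    distance = sum(hypot(b[2] - a[2], b[3] - a[3])
                   for a, b in zip(points, points[1:]))
    speed = distance / dwell_s if dwell_s > 0 else 0.0
    # Per-floor dwell: last fix time minus first fix time on that floor.
    first, last = {}, {}
    for ts, floor, _, _ in points:
        first.setdefault(floor, ts)
        last[floor] = ts
    floor_dwell = {f: (last[f] - first[f]).total_seconds() for f in first}
    return {"dwell_time_s": dwell_s, "total_distance_m": distance,
            "mean_speed_mps": speed, "floor_dwell_s": floor_dwell}

# Made-up sample in local metric coordinates (not real data).
sample = [
    (datetime(2017, 12, 31, 8, 0, 0), "F1", 0.0, 0.0),
    (datetime(2017, 12, 31, 8, 0, 10), "F1", 3.0, 4.0),
    (datetime(2017, 12, 31, 8, 0, 30), "F2", 3.0, 4.0),
    (datetime(2017, 12, 31, 8, 1, 0), "F2", 3.0, 10.0),
]
features = movement_features(sample)
```

Note that with the first-to-last rule, a customer who returns to a floor later in the visit is credited with the whole intervening span on that floor.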
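Table 2 Semantic features of customer trajectories
Feature | Calculation |
---|---|
Number of stores visited | Length |
Longest stay in a single store | Max ( ) |
Total time spent in stores | |
Mean consumption level | |
Maximum consumption level | Max ( ) |
Standard deviation of consumption capacity | Std ( ) |
Number of stores visited of each type | Number of stores of each type contained in the semantic trajectory |
Note: The variables in the table are as defined in Section 2.2.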
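As a rough illustration of the semantic features in Table 2, the sketch below aggregates a customer's semantic trajectory, represented here as a non-empty list of store visits. The dictionary keys (`store`, `type`, `price`, `stay_s`), the function name `semantic_features`, and the use of the population standard deviation are assumptions, since the arguments of `Length`, `Max ( )` and `Std ( )` are not recoverable from the table. The prices reuse the example values of Table 4; the stay durations are made up.

```python
from collections import Counter
from statistics import mean, pstdev

def semantic_features(visits):
    """Aggregate one customer's store visits into Table 2 style features.

    visits: non-empty list of dicts with hypothetical keys
    'store', 'type', 'price' (average price in CNY), 'stay_s' (seconds).
    """
    stays = [v["stay_s"] for v in visits]
    prices = [v["price"] for v in visits]
    return {
        "n_stores": len(visits),           # length of the semantic trajectory
        "max_stay_s": max(stays),          # longest stay in a single store
        "total_stay_s": sum(stays),        # total time spent in stores
        "price_mean": mean(prices),        # mean consumption level
        "price_max": max(prices),          # maximum consumption level
        "price_std": pstdev(prices),       # assumption: population std
        "type_counts": dict(Counter(v["type"] for v in visits)),
    }

# Illustrative visits; prices follow Table 4, durations are invented.
visits = [
    {"store": "S1", "type": "jewelry", "price": 1515, "stay_s": 300},
    {"store": "S2", "type": "dining", "price": 89, "stay_s": 2400},
    {"store": "S3", "type": "womenswear", "price": 487, "stay_s": 600},
]
sem = semantic_features(visits)
```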
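Algorithm 1 Indoor-GloVe co-occurrence matrix construction
Input: trajectory dataset:
Output: store co-occurrence matrix: CoocurMatrix
Function: Cooccurrence_Statistic ( ):
1 Initialize the co-occurrence frequency matrix: CoocurMatrix =
2 For next do
3   For next do
4     For next do
5       CoocurMatrix [ , ] += 1
6 return CoocurMatrix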
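The loop variables of Algorithm 1 are lost in this copy, so the following is a hedged sketch of one plausible reading: iterate over trajectories, over each store in a trajectory, and over the neighbouring stores within a context window, incrementing the count by 1 each time. The `window` parameter, the symmetric counting, and the representation of a trajectory as a list of store IDs are assumptions rather than details confirmed by the source.

```python
from collections import defaultdict

def cooccurrence_statistic(trajectories, window=2):
    """Count how often two stores appear near each other in visit order.

    trajectories: list of store-ID sequences (one per customer).
    window: assumed context radius, in visits, around each store.
    Returns a dict mapping (store_i, store_j) -> co-occurrence count.
    """
    cooc = defaultdict(int)
    for traj in trajectories:                      # each customer trajectory
        for i, center in enumerate(traj):          # each visited store
            lo = max(0, i - window)
            hi = min(len(traj), i + window + 1)
            for j in range(lo, hi):                # each store in the window
                if j != i:
                    cooc[(center, traj[j])] += 1
    return dict(cooc)

# Toy example with three stores A, B, C.
matrix = cooccurrence_statistic([["A", "B", "C"], ["A", "B"]], window=1)
```

A sparse dictionary is used here instead of a dense matrix because most store pairs never co-occur; a GloVe-style embedding would then be fitted on these counts.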
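Algorithm 2 SOM processing of customer consumption features
Input: customer consumption feature dataset:
       neuron grid side length:
       number of features per neuron:
       iteration threshold:
       learning rate:
Output: neuron weight matrix: WeightMatrix
Function: SOM_ConFeature(ConFeatureSet, N, num_iteration, , ):
1 Randomly initialize the neuron weight matrix: WeightMatrix =
2 For do
3   For next do
4
5
6
7 return WeightMatrix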
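Steps 4 to 6 of Algorithm 2 are missing in this copy. The sketch below fills them with the textbook self-organizing map update (find the best-matching unit, compute a Gaussian neighbourhood, pull weights toward the sample), which may differ from the authors' exact procedure. The function name `som_train`, the `sigma` neighbourhood radius, the linear decay schedule, and the fixed random seed are all assumptions introduced for illustration.

```python
import math
import random

def som_train(features, n=3, num_iteration=50, learning_rate=0.5,
              sigma=1.0, seed=0):
    """Train an n x n self-organizing map on a list of feature vectors.

    Returns weights[r][c], the weight vector of the neuron at cell (r, c).
    """
    rng = random.Random(seed)
    dim = len(features[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(n)]
               for _ in range(n)]
    cells = [(r, c) for r in range(n) for c in range(n)]
    for t in range(num_iteration):
        decay = 1.0 - t / num_iteration          # assumed linear decay
        lr = learning_rate * decay
        sig = max(sigma * decay, 1e-3)
        for x in features:
            # Best-matching unit: neuron with the closest weight vector.
            br, bc = min(cells, key=lambda rc: sum(
                (w - v) ** 2 for w, v in zip(weights[rc[0]][rc[1]], x)))
            for r, c in cells:
                # Gaussian neighbourhood on the grid around the BMU.
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2)
                             / (2 * sig ** 2))
                w = weights[r][c]
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights

# Toy features scaled to [0, 1]; two loose groups.
toy = [[0.0, 0.0], [0.0, 0.1], [1.0, 1.0], [1.0, 0.9]]
weight_matrix = som_train(toy)
```

In practice the features would be normalized before training, and the trained neuron weights would then be clustered to obtain customer groups.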
Table 3 Example of trajectory data
Timestamp | Floor ID | Customer ID | X | Y |
---|---|---|---|---|
2017-12-31 08:01:54 | F7 | 341298C7**** | 13****99.9 | 4****50.8 |
2017-12-31 08:03:42 | F6 | 341298C7**** | 13****62.0 | 4****66.6 |
…… | …… | …… | …… | …… |
2017-12-31 19:41:27 | F1 | 28FAA07D**** | 13****99.7 | 4****11.8 |
2017-12-31 19:43:35 | B1 | 28FAA07D**** | 13****43.5 | 4****08.1 |
Table 4 Semantic attributes of stores
Store name | Average consumption price (CNY) | Store type |
---|---|---|
上海老庙黄金银楼 | 1515 | Jewelry |
毛家饭店 | 89 | Dining |
…… | …… | …… |
CHARLES & KEITH | 487 | Women's wear |
Table 5 Key statistical features of the customer clusters
Customer cluster | Number of customers | Mean consumption price (CNY) | Mean number of stores visited | Mean travel distance (m) |
---|---|---|---|---|
0 | 1193 | 561.1 | 7.3 | 2174.7 |
1 | 5341 | 728.4 | 10.4 | 2200.5 |
2 | 284 | 1180.3 | 10.7 | 2339.6 |
3 | 229 | 746.0 | 11.3 | 2603.4 |
4 | 470 | 705.6 | 6.0 | 2063.8 |
5 | 328 | 651.6 | 10.4 | 2878.8 |
Table 6 Silhouette Coefficient for different feature combinations and clustering algorithms
Consumption behavior modeling features | This study's method | K-Means | Hierarchical clustering |
---|---|---|---|
Trajectory movement features + trajectory semantic features | 0.2395 | -0.0159 | 0.1180 |
Customer embedding feature | 0.2728 | 0.3028 | 0.2911 |
Multi-source feature combination | 0.4067 | 0.2335 | 0.3868 |
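The improvement of up to 69.8% reported in the abstract appears consistent with the first column of silhouette values in Table 6 (the study's own clustering method), comparing the multi-source combination against the movement-plus-semantic baseline. The short check below reproduces that arithmetic; the helper name `relative_improvement` is illustrative, and the pairing of these two rows is an inference from the numbers rather than something the table states.

```python
# Silhouette coefficients copied from Table 6
# (first value column: the study's own clustering method).
silhouette = {
    "movement + semantic": 0.2395,
    "customer embedding": 0.2728,
    "multi-source combination": 0.4067,
}

def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, as a fraction."""
    return (new - baseline) / abs(baseline)

gain = relative_improvement(silhouette["multi-source combination"],
                            silhouette["movement + semantic"])
print(f"{gain:.1%}")  # prints 69.8%
```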