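A Personalized Attraction Recommendation Method Based on Geotagged Photos
Ye Fan (born 1995), female, from Shangrao, Jiangxi; master's degree; research interests: spatiotemporal data mining and intelligent recommendation. E-mail: 15770938521@163.com
Received: 2020-10-15
Revision requested: 2020-12-25
Published online: 2021-10-25
Funding
Qishan Scholar Award Support Scheme (XRC-19001)
National Key Research and Development Project (2017YFB0504202)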
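Studying how to recommend tourist attractions to individual visitors, based on the massive tourism information and data already available, is of great significance. This paper uses geotagged photos taken within the Hong Kong Special Administrative Region between 2013 and 2018, obtained from Flickr, to identify tourist attractions, and reconstructs tourist trajectories according to the order in which tourists visited them. On this basis, to address the problem that existing methods do not consider that tourist preferences change dynamically during a trip, we propose a personalized attraction recommendation method based on the Latent Dirichlet Allocation (LDA) model and users' long-term and short-term preferences (A Recommendation Method Based on LDA and User's Long and Short-Term Preference, L-ULSP). The method uses the LDA topic model to obtain attraction feature information and mine the correlations between attractions, then uses an attention mechanism and a long short-term memory (LSTM) network to learn users' long-term and short-term preferences, respectively, and finally combines the two to capture the dynamic changes in user preferences. Experimental results show that the attractions recommended by L-ULSP outperform those of other existing methods on both hit rate and mean reciprocal rank, demonstrating that the proposed method can effectively learn tourist preferences from attraction sequences and recommend the next attraction. In addition, comparative experiments further verify that considering users' long-term and short-term preferences together learns changes in user preferences better.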
Ye Fan, Sun Yu, Chen Chongcheng, Yu Dayu. A personalized attraction recommendation method based on geotagged photos[J]. Journal of Geo-information Science, 2021, 23(8): 1391-1400. DOI: 10.12082/dqxxkx.2021.200608
Recommending tourist attractions to individual visitors on the basis of the vast amount of available tourism information and data is of practical value. In this paper, we use geotagged Flickr photos taken in Hong Kong from 2013 to 2018 to identify tourist attractions and reconstruct tourist trajectories according to visiting order. On this basis, we propose a personalized recommendation method based on the Latent Dirichlet Allocation (LDA) model and User's Long-term and Short-term Preference (L-ULSP), which addresses the problem that existing methods do not account for dynamic changes in visitor preferences during a trip. In this method, the LDA model is used to obtain feature information for each attraction and to explore the correlation between attractions. An attention mechanism then focuses on the important information in the long-term sequence to capture tourists' long-term preference, and a long short-term memory (LSTM) network models the short-term sequence to learn their short-term preference. Finally, the long-term and short-term preferences are weighted to obtain the final preference, which captures the dynamic changes in user preferences. The algorithm has two advantages: (1) by mining topic features from the text of geotagged photos, it adds descriptive information about attractions and can therefore capture users' travel preferences more accurately; (2) it considers both the long-term and short-term preferences of users, so it can learn how preferences change during travel while modeling the sequence of attractions.
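The fusion of long-term and short-term preferences described above can be illustrated with a minimal sketch. It assumes that attraction vectors are already available (for example, LDA topic distributions) and that the short-term preference vector has already been produced by a sequence model such as an LSTM. The dot-product attention scoring, the function names, the toy values, and the mixing weight `alpha` are illustrative assumptions, not the formulation used in the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def long_term_preference(history, query):
    """Attention-pool a sequence of attraction vectors.

    Assumption: attention scores are plain dot products between each
    visited attraction vector and a query vector.
    """
    scores = [sum(h * q for h, q in zip(vec, query)) for vec in history]
    weights = softmax(scores)
    dim = len(query)
    return [sum(w * vec[d] for w, vec in zip(weights, history)) for d in range(dim)]


def fuse(long_pref, short_pref, alpha=0.5):
    """Weighted combination; alpha is a hypothetical mixing weight."""
    return [alpha * l + (1.0 - alpha) * s for l, s in zip(long_pref, short_pref)]


# Toy 3-topic vectors for three visited attractions (made-up values).
history = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.1, 0.8]]
short_pref = [0.1, 0.1, 0.8]  # stand-in for an LSTM's final hidden state
long_pref = long_term_preference(history, query=short_pref)
final_pref = fuse(long_pref, short_pref, alpha=0.5)
```

In this sketch the most recent interest serves as the attention query, so attractions in the long-term history that resemble the current interest receive more weight. In a trained model both the attention parameters and the fusion weight would be learned rather than fixed by hand.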
The experimental results show that: (1) L-ULSP outperforms the other existing methods on both Hit Rate (HR) and Mean Reciprocal Rank (MRR), two common evaluation metrics for recommendation algorithms. This indicates that the proposed method can effectively learn visitor preferences from a sequence of attractions and recommend the next attraction, and that it achieves good results in travel recommendation scenarios. (2) A comparison between a model that uses only the long-term preference as the user's final preference and a model that combines long-term and short-term preferences further validates that considering both better captures changes in user preferences and thus improves recommendation accuracy. (3) We also compare the computational efficiency of L-ULSP with that of several RNN-based deep learning recommendation models by recording the running time of each model. The results show that L-ULSP is more efficient than most of the compared methods.
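For readers unfamiliar with the two metrics, the following sketch shows how Hit Rate and Mean Reciprocal Rank at a cutoff K are commonly computed for next-item recommendation. It follows the standard definitions; the paper's exact evaluation protocol may differ, and the function names and toy data are assumptions made for illustration.

```python
def hit_rate_at_k(ranked_lists, targets, k):
    """Fraction of test cases whose true next attraction is in the top k."""
    hits = sum(1 for ranked, t in zip(ranked_lists, targets) if t in ranked[:k])
    return hits / len(targets)


def mrr_at_k(ranked_lists, targets, k):
    """Mean of 1/rank of the true attraction; 0 if it is outside the top k."""
    total = 0.0
    for ranked, t in zip(ranked_lists, targets):
        top_k = ranked[:k]
        if t in top_k:
            total += 1.0 / (top_k.index(t) + 1)
    return total / len(targets)


# Toy example with made-up attraction IDs.
ranked_lists = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"]]
targets = ["A", "A", "D"]
hr = hit_rate_at_k(ranked_lists, targets, k=2)   # 1 hit out of 3 cases
mrr = mrr_at_k(ranked_lists, targets, k=3)       # (1 + 1/3 + 0) / 3
```

HR@K only checks whether the true attraction appears in the top K, while MRR@K also rewards placing it nearer the top, which is why the two are usually reported together.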
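Tab. 2 Number of tourists and records after each data processing procedure
Data processing step | Number of tourists | Number of records |
---|---|---|
After removing data outside the Hong Kong administrative area | 8898 | 280 361 |
After removing duplicate data | 8898 | 275 252 |
After removing non-tourist data | 8165 | 142 261 |
After removing data with location errors | 8165 | 137 671 |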
Fig. 3 Spatial distribution of geotagged photos in Hong Kong from 2013 to 2018
Fig. 4 Number of clusters detected with different parameter values in the central area
Fig. 5 Number of clusters detected with different parameter values in the non-central area
Tab. 3 Comparison of recommendation results of different algorithms
Method | HR@10 | MRR@10 |
---|---|---|
POP | 20.69 | 7.73 |
SKNN | 30.79 | 9.59 |
FPMC | 35.44 | 11.82 |
FOSSIL | 41.54 | 18.27 |
GRU4Rec | 41.38 | 20.67 |
NARM | 44.83 | 21.27 |
STAMP | 42.67 | 22.67 |
L-ULSP | 46.63 | 24.64 |
Tab. 4 Impact of long-term and short-term preferences
Model | HR@5 | MRR@5 | HR@10 | MRR@10 | HR@20 | MRR@20 |
---|---|---|---|---|---|---|
L-ULP | 33.17 | 19.27 | 44.71 | 20.87 | 53.61 | 21.49 |
L-ULSP | 35.81 | 23.23 | 46.63 | 24.64 | 56.73 | 25.33 |
Tab. 5 Running time comparison of different models
Model | Runtime/s |
---|---|
GRU4Rec | 91.55 |
NARM | 239.05 |
STAMP | 68.67 |
L-ULSP | 76.24 |
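The code for the comparison methods in this paper comes from the code publicly released by the authors of the original papers, to whom we express our thanks.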