Journal of Geo-information Science
Fine-grained Semantic Interaction Mining and Pattern Analysis between Tourist Attractions: A Case Study of Yunnan Province, China
Received date: 2021-10-08
Revised date: 2021-12-01
Online published: 2022-12-25
Supported by:
National Key Research and Development Program of China (2017YFB0503600)
National Natural Science Foundation of China (42171448)
Exploring the semantic interactions between tourist attractions, and the patterns these interactions form, helps optimize the tourism pattern according to tourists' needs. Existing semantic interaction mining methods ignore the contextual vocabulary in texts that carries human perception information, and research that analyzes the interaction pattern is lacking. This paper therefore proposes a framework for fine-grained semantic interaction mining and pattern analysis between attractions. First, the contextual information between two attractions is extracted from online travel notes through the co-occurrence relationship of words. Then, the semantic connection between attractions is mined from the perspectives of discussion focus and semantic structure, using TF-IDF-based keyword analysis and semantic network analysis. Finally, each attraction interaction is treated as an object, the correlation between interactions is calculated with the Spearman rank correlation coefficient and a graph kernel (a method for measuring graph similarity), and network analysis is used to explore the interaction pattern. The experiment takes Yunnan Province as the case study area. Text mining of travel notes from 2018 shows that: (1) The framework is feasible and applicable. Mining and analyzing the fine-grained semantic interaction between attractions can help improve the travel experience according to tourists' needs, and analyzing the semantic interaction pattern of attractions can identify the route fragments that play a key role in optimizing the tourism pattern; (2) Cangshan Mountain-Erhai Lake should focus on improving the natural scenery travel experience, while Dali Old Town-Erhai Lake should address tourists' insufficient attention to branded tourism resources; (3) Three types of semantic interaction patterns coexist, namely single-core agglomeration, single-core radial, and multi-regional cooperation, and together they show the characteristics of node-axis evolution and diffusion. Attraction interactions with high betweenness centrality and cross-regional attraction interactions are important for promoting the transformation of the other two patterns into multi-regional cooperation in order to develop "global tourism". The results can provide references for recommending tourism routes and balancing tourism patterns. In future work, we will explore the dynamic evolution of the semantic interaction between attractions and apply the results to tourist route recommendation.
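The correlation step described above can be illustrated with a minimal sketch. It assumes that each attraction interaction is represented by the TF-IDF weights of its keywords and that two interactions are compared over their combined vocabulary, with absent words weighted 0. This representation, the function names, and the tie handling (average ranks) are assumptions made for illustration and are not confirmed by the abstract. The six example weights are the words that appear in both keyword lists of Tab. 1.

```python
from math import sqrt


def average_ranks(values):
    """Rank values in ascending order (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined for a constant vector")
    return cov / (sx * sy)


def interaction_correlation(weights_a, weights_b):
    """Compare two interactions over their combined vocabulary (assumed design)."""
    vocab = sorted(set(weights_a) | set(weights_b))
    x = [weights_a.get(w, 0.0) for w in vocab]
    y = [weights_b.get(w, 0.0) for w in vocab]
    return spearman(x, y)


# Weights from Tab. 1 for the six shared words (大理, 环, 花, 古城, 喜洲, 崇圣寺).
cangshan_erhai = {"Dali": 0.131, "around": 0.128, "flower": 0.116,
                  "old town": 0.114, "Xizhou": 0.101, "Chongsheng Temple": 0.091}
dali_erhai = {"Dali": 0.238, "around": 0.302, "flower": 0.095,
              "old town": 0.202, "Xizhou": 0.126, "Chongsheng Temple": 0.137}

rho = interaction_correlation(cangshan_erhai, dali_erhai)
print(round(rho, 3))  # 0.543
```

Restricting the comparison to shared words, as in this example, is only one possible choice. With the full top-20 lists most words would be absent from one side and tie at weight 0, which is why the sketch assigns average ranks to ties.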
CHEN Yu, QIN Kun, YU Xuesong, XING Lingli. Fine-grained Semantic Interaction Mining and Pattern Analysis between Tourist Attractions: A Case Study of Yunnan Province, China[J]. Journal of Geo-information Science, 2022, 24(10): 2021-2032. DOI: 10.12082/dqxxkx.2022.210613
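Tab. 1 Keyword extraction results (top 20) for Cangshan Mountain-Erhai Lake and Dali Old Town-Erhai Lake

| Cangshan Mountain-Erhai Lake | | | | Dali Old Town-Erhai Lake | | | |
|---|---|---|---|---|---|---|---|
| Word | TF-IDF | Word | TF-IDF | Word | TF-IDF | Word | TF-IDF |
| snow (雪) | 0.339 | old town (古城) | 0.114 | around (环) | 0.302 | Shuanglang (双廊) | 0.099 |
| Erhai moon (洱海月) | 0.328 | cableway (索道) | 0.112 | Cangshan (苍山) | 0.274 | flower (花) | 0.095 |
| Shangguan flowers (上关花) | 0.247 | Xizhou (喜洲) | 0.101 | Dali (大理) | 0.238 | hour (小时) | 0.093 |
| Xiaguan wind (下关风) | 0.199 | east (东) | 0.095 | old town (古城) | 0.202 | south gate (南门) | 0.091 |
| snow-moon (雪月) | 0.165 | Guan wind (关风) | 0.095 | inn (客栈) | 0.173 | car rental (租车) | 0.085 |
| Dali Old Town (大理古城) | 0.150 | Chongsheng Temple (崇圣寺) | 0.091 | rent (租) | 0.153 | electric scooter (电动车) | 0.084 |
| Dali (大理) | 0.131 | wind (风) | 0.089 | stroll (逛) | 0.139 | station (车站) | 0.079 |
| around (环) | 0.128 | Tianlong Babu (天龙八部) | 0.085 | Chongsheng Temple (崇圣寺) | 0.137 | check in (入住) | 0.079 |
| flower (花) | 0.116 | scenery (风景) | 0.084 | Xizhou (喜洲) | 0.126 | stay (住) | 0.077 |
| backed by (背靠) | 0.116 | Shangguan (上关) | 0.080 | tour around (环游) | 0.108 | train (火车) | 0.077 |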
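Weights of the kind reported in Tab. 1 can be obtained with a computation along the following lines. This is a minimal sketch under stated assumptions: the context text of each attraction pair is treated as one document, tokens are already segmented, and the plain tf × log(N/df) variant is used. The source does not state which TF-IDF variant or normalization the authors applied, so the sketch will not reproduce the values in Tab. 1. The function names and the toy tokens are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs maps a pair name to its token list; returns {name: {word: weight}}."""
    n_docs = len(docs)
    # Document frequency: number of pair-documents that contain each word.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))
    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[name] = {
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in counts.items()
        }
    return scores


def top_keywords(weights, k=20):
    """Return the k highest-weighted (word, weight) pairs, ties broken by word."""
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical, already segmented context tokens for three attraction pairs.
docs = {
    "Cangshan Mountain-Erhai Lake": ["snow", "Erhai moon", "snow", "cableway", "photo"],
    "Dali Old Town-Erhai Lake": ["rent", "scooter", "inn", "rent", "photo"],
    "Other pair (hypothetical)": ["ticket", "cableway", "photo"],
}

weights = tf_idf(docs)
for name in docs:
    print(name, top_keywords(weights[name], k=3))
```

A word that occurs in the context of every pair ("photo" in the toy data) receives a weight of 0 under this variant, which is the property that lets pair-specific words such as "snow" or "rent" rise to the top of each list.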
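Existing studies that use text data to mine semantic interactions between attractions fall into two categories. (1) The first expresses the strength of the semantic interaction between attractions as a numerical value. Many studies use methods such as syntactic structure trees [4] or frequent pattern and association rule mining [5-8] to extract travel route fragments from text and count the tourist flow between attractions as a measure of interaction strength; Yang et al. [9] used travel notes and news texts to measure and analyze the level of cooperation between tourist attractions based on how often the attractions co-occur in text. However, such one-dimensional measures ignore the contextual vocabulary surrounding the co-occurring place names and can hardly reflect fine-grained semantic interactions between places.

(2) The second expresses the semantic interaction between attractions with words. These studies concentrate on mining geospatial relations between attractions, such as distance and direction [5,10-11], and usually represent them as triples, for example <Batu Caves, 13 Kilometers north, Kuala Lumpur> [10]. A few scholars have considered other types of semantic interaction. Kori et al. [12] used text data to mine tourists' interests or topics concerning route fragments, but their corpus consisted of the entire documents containing a given route fragment rather than its context, so the semantics of other route fragments were introduced when mining the semantics of a given fragment. That study also retained only nouns and ignored other words that can reflect human perception, such as verbs describing tourist behavior. The spatial relations and noun-only semantic interactions extracted in these studies can hardly reflect the focus of discussion, experiences, feelings, and needs that people express about pairs of attractions in a given period. Studies that do exploit such fine-grained information in text focus on people's image perception of a single place [7,13-15]; for example, Che et al. [15] used Term Frequency-Inverse Document Frequency (TF-IDF), semantic networks, and other methods to analyze consumers' views of a shopping plaza. Few studies use this information to mine fine-grained semantic interactions between attractions.

In addition, related studies are mostly based on the strength of semantic interaction between attractions and mostly take individual attractions or the overall tourism pattern as the object of analysis. For example, Lü et al. [7] took attractions as nodes, used the tourist flow between attractions extracted from travel notes as edge weights, built a tourist flow network, and analyzed the roles that attractions play in it; Juan et al. [16] measured semantic interaction strength by the number of times two attractions were reviewed by the same tourists and built an attraction association network to analyze the evolution of attraction roles and network structure. Few studies take attraction interactions themselves as the object of analysis, and those that do mainly analyze the popularity of the corresponding route fragments [6-7] or the likelihood that attractions are visited together [8].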
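In contrast to the whole-document corpus criticized above, a context-based corpus restricts each attraction pair to the text in which both names co-occur. The following is a minimal sketch of that idea, assuming sentence-level co-occurrence and pre-tokenized sentences. The window size, the function name, and the toy sentences are illustrative assumptions, not the authors' settings.

```python
from collections import Counter, defaultdict
from itertools import combinations


def mine_pair_contexts(sentences, attractions):
    """Collect co-occurrence counts and context words for each attraction pair.

    sentences: list of token lists; attractions: iterable of attraction names.
    Returns (cooccurrence, contexts), both keyed by an alphabetically ordered pair.
    """
    attractions = set(attractions)
    cooccurrence = Counter()
    contexts = defaultdict(list)
    for tokens in sentences:
        present = sorted(attractions.intersection(tokens))
        if len(present) < 2:
            continue  # a sentence with fewer than two attractions carries no pair context
        # Keep every non-attraction token, including verbs, as context vocabulary.
        context_words = [t for t in tokens if t not in attractions]
        for pair in combinations(present, 2):
            cooccurrence[pair] += 1
            contexts[pair].extend(context_words)
    return cooccurrence, contexts


# Hypothetical tokenized sentences from travel notes.
sentences = [
    ["rent", "scooter", "Dali Old Town", "ride", "Erhai Lake"],
    ["Cangshan Mountain", "cableway", "view", "Erhai Lake"],
    ["Dali Old Town", "inn", "check-in"],
    ["cycle", "around", "Erhai Lake", "back", "Dali Old Town"],
]
attractions = ["Cangshan Mountain", "Dali Old Town", "Erhai Lake"]

cooccurrence, contexts = mine_pair_contexts(sentences, attractions)
print(cooccurrence[("Dali Old Town", "Erhai Lake")])  # 2
print(contexts[("Dali Old Town", "Erhai Lake")])
```

The co-occurrence counts correspond to the numerical strength measures of the first category, while the per-pair context lists are the kind of input that keyword analysis and semantic network analysis would need for the second.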
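[1]
[2]
[3] 张郴, 黄震方. 旅游地三元空间交互理论模型建构[J]. 地理研究, 2020, 39(2):232-242.
[ Zhang Chen, Huang Zhenfang. Construction of a theoretical model of ternary spatial interaction in tourism destinations[J]. Geographical Research, 2020, 39(2):232-242. ]
[4]
[5]
[6]
[7] 吕琳露, 李亚婷. 游记文本中的知识发现与聚合——以蚂蜂窝旅行网杭州游记为例[J]. 情报杂志, 2017, 36(7):176-181.
[ Lü Linlu, Li Yating. Knowledge discovery and aggregation in travel note texts: A case study of Hangzhou travel notes on Mafengwo[J]. Journal of Intelligence, 2017, 36(7):176-181. ]
[8]
[9]
[10]
[11]
[12]
[13] 李萍, 陈田, 王甫园, 等. 基于文本挖掘的城市旅游社区形象感知研究——以北京市为例[J]. 地理研究, 2017, 36(6):1106-1122.
[ Li Ping, Chen Tian, Wang Fuyuan, et al. Image perception of urban tourism communities based on text mining: A case study of Beijing[J]. Geographical Research, 2017, 36(6):1106-1122. ]
[14]
[15]
[16]
[17]
[18]
[19]
[20]
[21] 余丽, 陆锋, 刘希亮, 等. 稀疏地理实体关系的关键词提取方法[J]. 地球信息科学学报, 2016, 18(11):1465-1475.
[ Yu Li, Lu Feng, Liu Xiliang, et al. A keyword extraction method for sparse geographic entity relations[J]. Journal of Geo-information Science, 2016, 18(11):1465-1475. ]
[22]
[23]
[24]
[25] 谢永俊, 彭霞, 黄舟, 等. 基于微博数据的北京市热点区域意象感知[J]. 地理科学进展, 2017, 36(9):1099-1110.
[ Xie Yongjun, Peng Xia, Huang Zhou, et al. Image perception of hotspot areas in Beijing based on microblog data[J]. Progress in Geography, 2017, 36(9):1099-1110. ]
[26] 周佳颖, 王俊蓉, 张景秋. 微博用户的中国传统节日感知及区域差异研究[J]. 地球信息科学学报, 2019, 21(1):77-85.
[ Zhou Jiaying, Wang Junrong, Zhang Jingqiu. Microblog users' perception of traditional Chinese festivals and its regional differences[J]. Journal of Geo-information Science, 2019, 21(1):77-85. ]
[27]
[28]
[29]
[30]
[31]
[32]
[33]
[34]
[35]
[36]
[37]
[38] 云南省统计局. 云南省2018年国民经济和社会发展统计公报[EB/OL]. 14.
[ Yunnan Provincial Bureau of Statistics, China. The statistical bulletin of the economic and social development of Yunnan of the year 2018[EB/OL]. 14. ]
[39]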