Journal of Geo-information Science ›› 2020, Vol. 22 ›› Issue (6): 1370-1382. doi: 10.12082/dqxxkx.2020.190594

• Data Sharing and Data Mining •

Mining Topic Mixed Patterns at the Block Scale within Beijing's Fourth Ring Road

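LIU Jingjing1,2, LIU Yusi1,2, YI Disheng1,2, YANG Jing1,2, ZHANG Jing1,2,*

  1. MOE Key Lab of 3D Information Acquisition and Application, Capital Normal University, Beijing 100048, China
    2. College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China
  • Received: 2019-10-11 Revised: 2020-02-02 Online: 2020-06-25 Published: 2020-08-25
  • Corresponding author: ZHANG Jing E-mail: zhangjing5946@sina.com
  • About the author: LIU Jingjing (1995- ), female, from Yuncheng, Shanxi, master's student; research interests: spatial analysis and data mining. E-mail: 923508685@qq.com
  • Funding:
    Open Fund Project of the State Key Laboratory of Virtual Reality Technology and Systems (01119220010011)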

Extracting Mixed Topic Patterns within Downtown Beijing at the Block Level

LIU Jingjing1,2, LIU Yusi1,2, YI Disheng1,2, YANG Jing1,2, ZHANG Jing1,2,*

  1. MOE Key Lab of 3D Information Acquisition and Application, Capital Normal University, Beijing 100048, China
    2. College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China
  • Received: 2019-10-11 Revised: 2020-02-02 Online: 2020-06-25 Published: 2020-08-25
  • Contact: ZHANG Jing E-mail: zhangjing5946@sina.com
  • Supported by:
    The Open Project Program of the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (01119220010011)

Abstract:

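Cities are places where diversity concentrates, and their pluralism and heterogeneity keep growing, so studying mixed land use has practical significance. Most existing studies of land-use mixing take POI (Point of Interest) data as their basis, and few focus on urban topics. This paper uses Baidu POI data, considers POI co-occurrence at the block scale to extract topics, and mines the topic mixed patterns within Beijing's Fourth Ring Road; the results can serve as a reference for urban planning and construction. First, an LDA (Latent Dirichlet Allocation) topic model is used to obtain the topic vector of each block and the POI co-occurrence pattern of each topic. Second, a diversity index is introduced to measure the mixed degree of each block, and the natural breaks method is used to divide the blocks into three classes: high mixed blocks, medium mixed blocks, and low mixed blocks. Finally, to explore the topic mixed patterns in the three classes, multiple linear regression is first used to identify the topics that significantly affect the mixed degree in each class, and the mixed patterns in the blocks are then extracted on that basis. The results show that the topic mixed patterns of high mixed blocks all combine the teahouse restaurant topic with other topics; the patterns in medium mixed blocks mostly combine the company and enterprise topic and the residence (shop) topic with other topics; and the two most typical patterns in low mixed blocks are near-single patterns, one dominated by the teahouse restaurant topic and the other by the landscape and famous scenery topic. The different patterns also reflect the characteristics of the different mixed areas and the differences among them, which helps deepen the understanding of the city and provides a reference for building mixed cities.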

Key words: block, LDA, POI co-occurrence, topic, topic mixed pattern, mixed degree, TF-IDF, multiple linear regression

Abstract:

Cities shaped by rapid urbanization and urban expansion contain varied land use types that support human activities such as shopping, eating, living, working, and recreation. Mixed land use can stimulate the vitality of a city and enable it to gather enough people at different points in time, which produces more interaction, promotes diversified consumption, and improves the city's economic and social benefits. Because of this practical significance, the mixed characteristics of urban land use have attracted growing research attention. However, previous studies of mixed characteristics have relied mainly on POI data directly and have given little consideration to detecting urban topics. Human activities usually take place at different types of points of interest (POIs), and the potential relationships and spatial interactions between adjacent POIs of different types together express the latent semantics of a location. From an urban topic perspective, this paper proposes a method that considers the relationships between POIs and applies the Hill numbers diversity index to calculate the mixed degree of topics at the block level. Specifically, the LDA (Latent Dirichlet Allocation) topic model was first used to generate the topic vectors of the blocks and the co-occurrence patterns of POIs. Second, the diversity index was introduced to measure the mixed degree of the blocks. Then, according to the Goodness of Variance Fit (GVF) and the natural breaks method, the blocks were classified into three groups: (1) high mixed blocks, (2) medium mixed blocks, and (3) low mixed blocks. Finally, multiple linear regression of the mixed degree on the topics in the blocks was applied to uncover the significant topics and the mixed patterns. The results show that blocks with different mixed degrees had different mixed patterns. For high mixed blocks, the teahouse restaurant topic was significant; the company and enterprise topic and the residence topic were significant in medium mixed blocks; and the two most typical patterns in low mixed blocks were the presence of the landscape and famous scenery topic and of the teahouse restaurant topic. In sum, starting from urban topics, this paper reveals the mixed patterns of blocks. The results show that different mixed patterns reflect the characteristics of different mixed areas and follow certain regularities in spatial distribution, which supports a deeper understanding of urban areas, provides a reference for building Beijing as a mixed city, and offers suggestions for other mixed cities.

Key words: block, LDA, POI co-occurrence, topic, topic mixed pattern, topic mixed degree, TF-IDF, multiple linear regression
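
The diversity-index and classification steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which order q of the Hill numbers was used (q = 1 below is an assumption), and the function names, topic vectors, and break values are hypothetical. `hill_number` turns one block's LDA topic proportions into an effective number of topics, and `gvf` scores how well a set of class breaks separates the resulting mixed-degree values.

```python
import math


def hill_number(proportions, q=1.0):
    """Hill number of order q: the effective number of topics in one block.

    `proportions` is a block's topic vector (non-negative weights).
    The order q is an assumption; the abstract does not specify it.
    """
    positive = [x for x in proportions if x > 0]
    total = sum(positive)
    p = [x / total for x in positive]  # renormalise so the weights sum to 1
    if abs(q - 1.0) < 1e-12:
        # Limit q -> 1: exponential of the Shannon entropy.
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))


def gvf(values, breaks):
    """Goodness of Variance Fit for classes split at ascending `breaks`.

    A value v falls into class k when exactly k break points are below v.
    GVF = 1 - (within-class squared deviations / total squared deviations).
    """
    mean = sum(values) / len(values)
    total_ss = sum((v - mean) ** 2 for v in values)
    if total_ss == 0:
        return 1.0  # all values identical: any classification fits perfectly
    classes = [[] for _ in range(len(breaks) + 1)]
    for v in values:
        classes[sum(v > b for b in breaks)].append(v)
    within_ss = 0.0
    for members in classes:
        if members:
            class_mean = sum(members) / len(members)
            within_ss += sum((v - class_mean) ** 2 for v in members)
    return 1.0 - within_ss / total_ss


# Hypothetical topic vectors for two blocks (not data from the paper).
even_block = [0.25, 0.25, 0.25, 0.25]
dominated_block = [0.97, 0.01, 0.01, 0.01]
print(hill_number(even_block), hill_number(dominated_block))

# Hypothetical mixed-degree values, scored against two candidate break sets.
mixed_degrees = [1.0, 1.1, 5.0, 5.1, 9.0, 9.1]
print(gvf(mixed_degrees, [3.0, 7.0]), gvf(mixed_degrees, [1.05, 1.5]))
```

A block whose topics are evenly weighted gets a value close to the number of topics, while a block dominated by one topic gets a value close to 1. A natural breaks search would keep the break set with the highest GVF for the chosen number of classes.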