Journal of Geo-information Science ›› 2016, Vol. 18 ›› Issue (9): 1174-1183. doi: 10.3724/SP.J.1047.2016.01174

• Original Article

A Study on the User Behavior of Geoscience Data Sharing Based on Web Usage Mining

WANG Mo1,2, WANG Juanle1,3,*

  1. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China
    2. University of Chinese Academy of Sciences, Beijing 100049, China
    3. Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China
  • Received: 2015-11-06 Revised: 2016-03-16 Online: 2016-09-27 Published: 2016-09-27
  • Contact: WANG Juanle


Understanding how users behave on science data sharing platforms is a key step toward delivering effective and accurate data sharing services. This study explores user behavior on the National Earth System Science Data Sharing Platform using spatial data mining and Web usage mining techniques. During data preprocessing, user identification, session identification, and user location identification were performed. Spatial hotspot analysis with the Getis-Ord Gi* method was applied to user pageviews, sessions, and dataset visits to explore the geographical variance of user behavior. The FP-growth algorithm was used to mine association rules from data visits and data downloads. The results show that: (1) the distribution of platform users is not significantly correlated with the overall distribution of the university population in China, but is significantly and positively correlated with the population of research-oriented universities; (2) for all three measures, hotspots cluster in Beijing, Tianjin, and northern Hebei Province, whereas cold spots are more geographically scattered, e.g. the southern coastal provinces, Henan Province, Shandong Province, and Sichuan Province; (3) association rule mining reveals a number of frequently visited item sets and rules in the valuable user pageviews. The frequent item sets for data downloads coincide well with the frequently visited data; however, no conspicuous rules were found in data downloads. Together, the spatial hotspot analysis and association rule mining reveal the geographical variance of users' interests in data and the usage patterns of frequently visited data, which can inform the design of personalized recommendation.
This study provides a method for mining Web user behavior that combines Web usage mining with spatial data mining, and the method can also be applied to websites in other fields.
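
For readers unfamiliar with the Getis-Ord Gi* statistic named in the abstract, the following is a minimal sketch of how it is commonly computed. It is not the authors' implementation: the function name, the toy pageview counts, and the binary neighbor matrix (each region counted as its own neighbor) are illustrative assumptions.

```python
import math


def getis_ord_gi_star(values, weights):
    """Return a Gi* z-score for every region.

    values  -- one observation per region (e.g. pageviews); hypothetical data
    weights -- n x n spatial weights; weights[i][i] is non-zero because
               Gi* includes the region itself in its own neighborhood
    """
    n = len(values)
    mean = sum(values) / n
    # Global standard deviation of the observations (population form).
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)

    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(x * x for x in w)
        # Weighted local sum minus its expectation under spatial randomness.
        numerator = sum(wj * xj for wj, xj in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Toy example: five regions in a row, each adjacent to its immediate
# neighbors and to itself (assumed layout, not data from the study).
toy_values = [10, 9, 1, 1, 1]
toy_weights = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
toy_scores = getis_ord_gi_star(toy_values, toy_weights)
```

Large positive scores mark candidate hotspots and large negative scores mark candidate cold spots; in practice the scores are compared against a normal distribution to judge significance.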

Key words: Web usage mining, spatial data mining, user behavior mining, science data sharing, Earth System Science data