Most Viewed

  • QIN Qiming
    Journal of Geo-information Science. 2025, 27(10): 2283-2290. https://doi.org/10.12082/dqxxkx.2025.250426

    [Objectives] With the rapid increase in the number of Earth observation satellites in orbit worldwide, remote sensing data has been accumulating explosively, offering unprecedented opportunities for Earth system science research to dynamically monitor global change. At the same time, it also brings a series of challenges, including multi-source heterogeneity, scarcity of labeled data, insufficient task generalization, and data overload. [Methods] To address these bottlenecks, Google DeepMind has proposed AlphaEarth Foundations (AEF), which integrates multimodal data such as optical imagery, SAR, LiDAR, climate simulations, and textual sources to construct a unified 64-dimensional embedding field. This framework achieves cross-modal and spatiotemporal semantic consistency for data fusion and has been made openly available on platforms such as Google Earth Engine. [Results] The main contributions of AEF can be summarized as follows: (1) Mitigating the long-standing “data silos” problem by establishing globally consistent embedding layers; (2) Enhancing semantic similarity measurement through a von Mises-Fisher (vMF) spherical embedding mechanism, thereby supporting efficient retrieval and change detection; (3) Shifting complex preprocessing and feature engineering tasks into the pre-training stage, enabling downstream applications to become “analysis-ready” and significantly reducing application costs. The paper further highlights the application potential of AEF in three stages: (1) Initially in land cover classification and change detection; (2) Subsequently in deep coupling of embedding vectors with physical models to drive scientific discovery; (3) Ultimately evolving into a spatial intelligence infrastructure, serving as a foundational service for global geospatial intelligence. 
Nevertheless, AEF still faces several challenges: (1) Limited interpretability of embedding vectors, which constrains scientific attribution and causal analysis; (2) Uncertainties in domain transfer and cross-scenario adaptability, with robustness in extreme environments yet to be verified; (3) Performance advantages that require more empirical validation across regions and independent experiments. [Conclusions] Overall, AEF represents a new direction for research in remote sensing and geospatial artificial intelligence, with breakthroughs in data efficiency and cross-task generalization providing solid support for future Earth science studies. However, its further development will depend on continuous advances in interpretability, robustness, and empirical validation, as well as on transforming the 64-dimensional embedding vectors into widely usable data resources through different pathways.
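
The retrieval and change-detection use of unit-sphere embeddings described above can be sketched as follows. This is a minimal illustration, not AEF's published implementation: the embedding values are synthetic, and cosine/angular similarity is assumed as the comparison convention implied by the vMF spherical embedding.

```python
import numpy as np

def normalize(v):
    # Project embedding vectors onto the unit sphere (the vMF assumption)
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def cosine_similarity(a, b):
    # Dot product of unit vectors; 1.0 means identical direction
    return float(normalize(a) @ normalize(b))

def change_score(a, b):
    # Angular distance in radians; larger values suggest surface change
    return float(np.arccos(np.clip(cosine_similarity(a, b), -1.0, 1.0)))

# Toy 64-dimensional embeddings for one pixel at two dates (synthetic values)
rng = np.random.default_rng(0)
e_2023 = rng.normal(size=64)
e_2024 = e_2023 + 0.1 * rng.normal(size=64)  # small perturbation: little change

sim = cosine_similarity(e_2023, e_2024)
```

Change detection then reduces to thresholding the angular distance between a pixel's embeddings at two dates.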

  • YU Hanyang, LAN Chaozhen, WANG Longhao, WEI Zijun, GAO Tian, WANG Yiqiao, LIU Ruimeng
    Journal of Geo-information Science. 2025, 27(8): 1896-1919. https://doi.org/10.12082/dqxxkx.2025.250052

    [Significance] Multimodal remote sensing image matching has become a fundamental task in integrated Earth observation, enabling precise spatial alignment across heterogeneous image sources. [Progress] As the diversity of sensing modalities, acquisition geometries, and temporal conditions increases, traditional matching frameworks have proven inadequate for capturing complex variations in radiometric responses, geometric configurations, and semantic representations. This technological gap has driven a significant paradigm shift from handcrafted feature engineering to deep learning-based solutions, which now form the core of current research and application development. This paper provides a comprehensive and structured review of recent advances in deep learning methods for multimodal remote sensing image matching, with an emphasis on the evolution of methodological paradigms and technical frameworks. It establishes a clear dual-path classification: the single-session approach and the end-to-end approach. The former selectively replaces or enhances individual components of traditional pipelines, such as feature encoding or similarity estimation, using neural network modules. The latter integrates the entire matching process into a unified network architecture, enabling joint optimization of feature learning, transformation modeling, and correspondence inference within a closed loop. This progression reflects the field's transition from modular adaptation to holistic modeling, revealing a deeper integration of data-driven representation learning with geometric reasoning. The review further examines the development of architectural strategies supporting this evolution, including attention mechanisms, graph-based structures, hierarchical feature fusion, and modality-bridging transformations. These innovations contribute to improved robustness, semantic consistency, and adaptability across diverse matching scenarios. 
Recent trends also demonstrate a growing reliance on pretrained vision foundation models, which provide transferable feature spaces and reduce the dependence on large-scale labeled datasets. In addition to summarizing technical advancements, the paper analyzes representative datasets, performance evaluation strategies, and the current challenges that constrain real-world deployment. These include limited data availability, weak cross-scene generalization, computational inefficiency, and insufficient interpretability. [Prospect] By synthesizing methodological progress with practical demands, the review identifies key directions for future research, including the design of modality-invariant representations, physics-informed neural architectures, and lightweight solutions tailored for scalable, real-time image registration in complex operational environments.
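
For reference, the classical "similarity estimation" component that learned modules replace can be sketched as nearest-neighbour descriptor matching with Lowe's ratio test; the descriptors below are synthetic toy values, not output of any real feature extractor.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbour matching with Lowe's ratio test: keep a match only
    when the best candidate is clearly closer than the second best."""
    matches = []
    for i, d in enumerate(desc_a):
        dist = np.linalg.norm(desc_b - d, axis=1)  # distance to every candidate
        order = np.argsort(dist)
        best, second = order[0], order[1]
        if dist[best] < ratio * dist[second]:
            matches.append((i, int(best)))
    return matches

# Synthetic descriptors: each row of desc_a has one clear counterpart in desc_b
desc_a = np.array([[0.0, 0.0], [10.0, 10.0]])
desc_b = np.array([[0.1, 0.0], [10.0, 10.1], [50.0, 50.0]])
matches = match_descriptors(desc_a, desc_b)
```

Deep matching networks effectively learn both the descriptor space and this decision rule jointly, which is why the review treats the two paradigms as replacements for the same pipeline stage.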

  • ZHANG Nuan, WANG Tao, ZHANG Yan, WEI Yibo, LI Liuwen, LIU Yichen
    Journal of Geo-information Science. 2025, 27(8): 1751-1779. https://doi.org/10.12082/dqxxkx.2025.250137

    [Significance] Street View Image-based Visual Place Recognition (SV-VPR) is a geographical location recognition technology that relies on visual feature information. Its core task is to predict and accurately locate unknown locations by analyzing the visual features of street view images. This technology must overcome challenges such as appearance changes under different environmental conditions (e.g., lighting differences between day and night, seasonal variations) and viewpoint differences (e.g., perspective deviations between vehicle-mounted cameras and satellite images). Accurate recognition is achieved through calculating image feature similarity, applying geometric constraints, and related methods. As an interdisciplinary field of computer vision and geographic information science, SV-VPR is closely related to visual positioning, image retrieval, SLAM, and more. It has significant application value in areas such as UAV autonomous navigation, high-precision positioning for autonomous driving, construction of geographical boundaries in cyberspace, and integration of augmented reality environments. It is particularly advantageous in GPS-denied environments. [Analysis] This paper systematically reviews the research progress of visual location recognition based on street view images, covering the following aspects: First, the basic concepts and classifications of visual place recognition technologies are introduced. Second, the foundational principles and categorization methods specific to street view image-based visual place recognition are discussed in depth. Third, the key technologies in this field are analyzed in detail. Furthermore, relevant datasets for street view image-based visual place recognition are comprehensively reviewed. In addition, evaluation methods and index systems used in this domain are summarized. Finally, potential future research directions for SV-VPR are explored. 
[Purpose] This review aims to provide researchers with a systematic overview of the technological development trajectory of SV-VPR, helping them quickly understand the current research landscape. It also offers a comparative analysis of key technologies and evaluation methods to support algorithm selection, and identifies emerging challenges and potential breakthrough areas to inspire innovative research.
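
The evaluation logic behind SV-VPR benchmarks can be illustrated with a recall@1 sketch. The 25 m correctness radius is a common convention in the VPR literature rather than anything specific to this review, and all descriptors and coordinates below are invented.

```python
import numpy as np

def recall_at_1(query_desc, db_desc, query_pos, db_pos, radius=25.0):
    """Retrieve the nearest database descriptor for each query and count it
    correct if the retrieved place lies within `radius` metres of the
    query's true position."""
    hits = 0
    for qd, qp in zip(query_desc, query_pos):
        j = int(np.argmin(np.linalg.norm(db_desc - qd, axis=1)))
        if np.linalg.norm(db_pos[j] - qp) <= radius:
            hits += 1
    return hits / len(query_desc)

# Two database places and two queries (synthetic descriptors and coordinates)
db_desc = np.array([[1.0, 0.0], [0.0, 1.0]])
db_pos  = np.array([[0.0, 0.0], [500.0, 0.0]])
q_desc  = np.array([[0.9, 0.1], [0.1, 0.9]])
q_pos   = np.array([[10.0, 0.0], [510.0, 0.0]])
r1 = recall_at_1(q_desc, db_desc, q_pos, db_pos)
```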

  • HUANG Yi, ZHANG Xueying, SHENG Yehua, XIA Yongqi, YE Peng
    Journal of Geo-information Science. 2025, 27(6): 1249-1262. https://doi.org/10.12082/dqxxkx.2025.250175

    [Objectives] This study addresses the critical challenges in typhoon disaster knowledge services, which are often hindered by "massive data, scarce knowledge, and limited services." The core objective is to rapidly distill actionable knowledge from vast datasets to enhance disaster management efficacy and mitigate typhoon-related impacts. Large Language Models (LLMs), renowned for their superior performance in natural language processing, are leveraged to deeply mine disaster-related information and provide robust support for advanced knowledge services. [Methods] This research establishes a typhoon disaster knowledge service framework encompassing three layers: data, knowledge, and service. [Results] For the data-to-knowledge layer, an LLM-driven (Qwen2.5-Max) automated method for constructing typhoon disaster Knowledge Graphs (KGs) is proposed. This method first introduces a multi-level typhoon disaster knowledge representation model that integrates spatiotemporal characteristics and disaster impact mechanisms. A specialized training dataset is curated, incorporating typhoon-related texts with explicit temporal and spatial attributes. By adopting a "pre-training + fine-tuning" paradigm, the framework efficiently transforms raw disaster data into structured knowledge. For the knowledge-to-service layer, an LLM-based intelligent question-answering system is developed. Utilizing the constructed typhoon disaster KG, this system employs Graph Retrieval-Augmented Generation (GraphRAG) to retrieve contextually relevant knowledge from the graph and generate user-specific disaster prevention and mitigation guidance. This approach ensures seamless conversion of structured knowledge into practical services, such as personalized evacuation plans and resource allocation strategies. [Conclusions] The study highlights the transformative potential of LLMs in typhoon disaster management and lays a foundation for integrating LLMs with geospatial technologies. 
This interdisciplinary synergy advances Geographic Artificial Intelligence (GeoAI) and paves the way for innovative applications in disaster service.
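
A minimal sketch of the GraphRAG retrieval step described above, assuming a toy triple store and word-overlap scoring; the paper's actual knowledge graph, retriever, and LLM call are not reproduced here, and all facts below are hypothetical.

```python
# Toy typhoon-disaster knowledge graph as (subject, predicate, object) triples
KG = [
    ("Typhoon Haikui", "made_landfall_in", "Fujian"),
    ("Typhoon Haikui", "max_wind_speed", "42 m/s"),
    ("Fujian", "issued_alert", "red rainstorm warning"),
    ("red rainstorm warning", "recommends", "suspend outdoor activity"),
]

def retrieve(query, graph, top_k=3):
    """Score each triple by word overlap with the query; return the best ones."""
    q = set(query.lower().split())
    scored = []
    for s, p, o in graph:
        words = set(f"{s} {p} {o}".lower().replace("_", " ").split())
        scored.append((len(q & words), (s, p, o)))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [t for score, t in scored[:top_k] if score > 0]

def build_prompt(query, graph):
    """Pack retrieved knowledge into context for a downstream LLM call."""
    facts = "; ".join(f"{s} {p} {o}" for s, p, o in retrieve(query, graph))
    return f"Context: {facts}\nQuestion: {query}"

prompt = build_prompt("What warning did Fujian issue?", KG)
```

Production GraphRAG systems replace the word-overlap scorer with embedding-based subgraph retrieval, but the pattern (retrieve structured knowledge, then condition generation on it) is the same.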

  • LI Junming, HU Yaxuan, WANG Nannan, WANG Siyaqi, WANG Ruolan, LYU Lin, FANG Ziqing
    Journal of Geo-information Science. 2025, 27(7): 1501-1519. https://doi.org/10.12082/dqxxkx.2025.250161

    [Objectives] Classical statistical inference typically relies on the assumptions of large sample sizes and independent, identically distributed (i.i.d.) observations, conditions that spatio-temporal data frequently violate, leading to inherent theoretical limitations in conventional approaches. In contrast, Bayesian spatio-temporal statistical methods integrate prior knowledge and treat all model parameters as random variables, thereby forming a unified probabilistic inference framework. This enables the incorporation of a broader range of uncertainties and offers robustness in modelling small samples and dependent structures, making Bayesian methods highly advantageous and increasingly influential in spatio-temporal analysis. [Progress] From the perspective of methodological evolution, this paper systematically reviews mainstream Bayesian spatio-temporal statistical models from two complementary perspectives: traditional Bayesian statistics and Bayesian machine learning. The former includes Bayesian Spatio-temporal Evolutionary Hierarchical Models, Bayesian Spatio-temporal Regression Hierarchical Models, Bayesian Spatial Panel Data Models, Bayesian Geographically Weighted Spatio-temporal Regression Models, Bayesian Spatio-temporal Varying Coefficient Models, and Bayesian Spatio-temporal Meshed Gaussian Process Models. The latter includes Bayesian Causal Forest Models, Bayesian Spatio-temporal Neural Networks, and Bayesian Graph Convolutional Neural Networks. In terms of application, the review highlights representative studies across domains such as public health, environmental sciences, socio-economic and public safety, as well as energy and engineering. [Prospect] Bayesian spatio-temporal statistical methods need to achieve breakthroughs in multi-source heterogeneous data modeling, integration with deep learning, incorporation of causal inference mechanisms, and optimization of high-performance computing. 
These advances are essential to balance theoretical rigor with practical adaptability and to promote the development of a next-generation spatio-temporal modeling paradigm characterized by causal inference, adaptive generalization, and intelligent analysis.
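
The core Bayesian idea the review builds on (priors stabilizing inference from small, dependent samples) is easiest to see in the simplest conjugate case: a normal-normal update with known observation variance. This is only the elementary member of the model families listed above, with illustrative numbers.

```python
import numpy as np

def normal_posterior(y, prior_mean, prior_var, obs_var):
    """Conjugate normal-normal update: posterior mean and variance of the
    latent mean given observations y with known variance obs_var.
    Precision (inverse variance) adds; the mean is precision-weighted."""
    n = len(y)
    post_var = 1.0 / (1.0 / prior_var + n / obs_var)
    post_mean = post_var * (prior_mean / prior_var + np.sum(y) / obs_var)
    return post_mean, post_var

# Small sample: the prior at 0 shrinks the estimate below the raw mean
y = np.array([2.5, 3.1, 2.8])
post_mean, post_var = normal_posterior(y, prior_mean=0.0, prior_var=1.0, obs_var=1.0)
```

With only three observations, the posterior mean (2.1) sits between the prior mean (0.0) and the sample mean (2.8), and the posterior variance quantifies the remaining uncertainty, which is exactly the behaviour that makes the hierarchical models above robust for small-sample spatio-temporal units.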

  • LIU Chengbao, BO Zheng, ZHANG Peng, ZHOU Miyu, LIU Wanyue, HUANG Rong, NIU Ran, YE Zhen, YANG Hanzhe, LIU Shijie, HAN Dongxu, LIN Qian
    Journal of Geo-information Science. 2025, 27(4): 801-819. https://doi.org/10.12082/dqxxkx.2025.240466

    [Significance] Lunar remote sensing is a critical method to ensure the safety and success of lunar exploration missions while advancing lunar scientific research. It plays a significant role in understanding the Moon's geological evolution and the formation of the Earth-Moon system. Accurate lunar topographic maps are essential for mission planning, including landing site selection, navigation, and resource identification. These maps also provide valuable data for studying planetary processes and the history of the solar system. [Progress] In recent years, with growing global interest and investment in lunar exploration, remarkable progress has been made in remote sensing technology. These advancements have significantly improved the precision, resolution, and coverage of lunar topographic mapping. Various lunar remote sensing missions, such as China's Chang'e program, NASA's Lunar Reconnaissance Orbiter, and missions by other space agencies, have acquired substantial amounts of multi-source, multi-modal, and multi-scale data. This wealth of data has laid a solid foundation for technological breakthroughs. For instance, high-resolution laser altimetry, optical photogrammetry, and synthetic aperture radar have provided detailed datasets, enabling refined mapping of the Moon's surface. However, the dramatic increase in data volume, complexity, and heterogeneity presents challenges for effective processing, integration, and application in topographic mapping. This paper provides a comprehensive overview of the current state of lunar topographic remote sensing and mapping, focusing on the implementation and data acquisition capabilities of major lunar remote sensing missions during the second wave of lunar exploration. 
It systematically summarizes the latest research progress in key surveying and mapping technologies, including laser altimetry, which enables precise elevation measurements; optical photogrammetry, which reconstructs surface features using high-resolution imagery; and synthetic aperture radar, which provides unique insights into topographic and subsurface structures. [Prospect] In addition to reviewing recent advancements, the paper discusses future trends and challenges in the field. Key recommendations include enhancing sensor functionality and performance metrics to improve data quality, optimizing the lunar absolute reference framework for consistency and accuracy, leveraging multi-source data fusion for fine-scale modeling, expanding scientific applications of lunar topography, and developing intelligent and efficient methods to process massive amounts of remote sensing data. These efforts will not only support upcoming lunar exploration missions, such as China's manned lunar landing program scheduled for 2030, but also contribute to a deeper understanding of the Moon and its relationship with Earth.

  • ZHENG Chenglong, SONG Ci, CHEN Jie
    Journal of Geo-information Science. 2025, 27(6): 1317-1331. https://doi.org/10.12082/dqxxkx.2025.250168

    [Objectives] With the deepening of urbanization and intensified market competition, long working hours have become a pervasive social issue, posing challenges to both workers' physical and mental health and to urban sustainable development. Current studies on urban residents' work activities predominantly rely on questionnaire survey data, which suffer from limited sample sizes and a lack of in-depth exploration into long working hours in megacities. [Methods] This research utilized mobile signaling data from Beijing, collected between November and December 2019, to identify stay points using a threshold rule method. Residential and workplace locations were determined through a time-window approach, and users' working hours were extracted. The study then examined the spatial distribution patterns of long-working-hours employees (defined as those working at least 40 hours per week) and investigated spatial characteristics across various gender and age groups. Finally, the study explored the characteristics of long working hours in different employment clusters in Beijing. [Results] The findings reveal that 47.1% of Beijing's workforce engages in long working hours (weekly working hours ≥ 40), with an average weekly working duration of 48.86 hours. Spatial analysis demonstrates a polycentric agglomeration pattern, concentrated in major employment hubs such as the CBD, Financial Street, Zhongguancun, and Yizhuang. Significant disparities exist across gender and age groups. Male employees work an average of 49.62 hours per week, 1.5 hours more than their female counterparts (48.12 hours). Among male age groups, those aged 20–29 have the longest average weekly working hours at 50.68 hours. In contrast, although women aged 30–39 constitute the largest proportion of the female workforce (22.13%), their average weekly working hours are the lowest, at 47.59 hours. 
The characteristics of overtime work in different employment clusters show a clear pattern: the CBD and Zhongguancun have a higher number of overtime workers, while Yizhuang stands out with the highest proportion at 58.0%. Wholesale and logistics hubs such as Xinfadi and Majuqiao exhibit the most intensive work schedules, with average weekly working hours exceeding 50 hours. [Conclusions] This study provides rich empirical evidence for understanding the phenomenon of long working hours in Beijing. The results offer data-driven support for optimizing labor time policies, contributing to urban sustainable development and social equity.
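
The threshold-rule stay-point step mentioned in [Methods] can be sketched as follows. The distance and duration thresholds and the record format are assumptions for illustration, not the study's actual parameters.

```python
def stay_points(track, dist_thresh=300.0, time_thresh=1800):
    """Threshold-rule stay-point detection (simplified sketch): a stay point
    is a run of consecutive records within dist_thresh metres of the run's
    first record that lasts at least time_thresh seconds.
    track: time-ordered list of (t_seconds, x_metres, y_metres)."""
    def dist(p, q):
        return ((p[1] - q[1]) ** 2 + (p[2] - q[2]) ** 2) ** 0.5

    points, i, n = [], 0, len(track)
    while i < n:
        j = i + 1
        while j < n and dist(track[j], track[i]) <= dist_thresh:
            j += 1
        if track[j - 1][0] - track[i][0] >= time_thresh:
            run = track[i:j]  # summarize the run by its centroid
            points.append((sum(p[1] for p in run) / len(run),
                           sum(p[2] for p in run) / len(run)))
            i = j
        else:
            i += 1
    return points

# Synthetic records: a long stop near the origin, then a jump far away
track = [(0, 0.0, 0.0), (900, 50.0, 20.0), (2000, 30.0, 60.0),
         (2500, 5000.0, 5000.0)]
sp = stay_points(track)
```

Workplaces and residences are then inferred from which stay points recur inside working-hour versus night-time windows, which is the time-window step the abstract describes.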

  • LI Wangping, WEI Wenbo, LIU Xiaojie, CHAI Chengfu, ZHANG Xueying, ZHOU Zhaoye, ZHANG Xiuxia, HAO Junming, WEI Yuming
    Journal of Geo-information Science. 2025, 27(6): 1448-1461. https://doi.org/10.12082/dqxxkx.2025.250034

    [Objectives] Using deep learning methods for landslide identification can significantly improve efficiency and is of great importance for landslide disaster prevention and mitigation. The DeepLabV3+ algorithm effectively captures multi-scale features, thereby improving image segmentation accuracy, and has been widely used in the segmentation and recognition of remote sensing images. [Methods] We propose an improved model based on DeepLabV3+. First, the Coordinate Attention (CA) mechanism is incorporated into the original model to enhance its feature extraction capabilities. Second, the Atrous Spatial Pyramid Pooling (ASPP) module is replaced with the Dense Atrous Spatial Pyramid Pooling (DenseASPP) module, which helps the network capture more detailed features and expands the receptive field, effectively addressing the limitations of inefficient or ineffective dilated convolution. A Strip Pooling (SP) branch module is added in parallel to allow the backbone network to better leverage long-range dependencies. Finally, the Cascade Feature Fusion (CFF) module is introduced to hierarchically fuse multi-scale features, further improving segmentation accuracy. [Results] Experiments on the Bijie landslide dataset show that, compared with the original model, the improved model achieves a 2.2% increase in MIoU and a 1.2% increase in the F1 score. Compared with other mainstream deep learning models, the proposed model demonstrates higher extraction accuracy. In terms of segmentation quality, it significantly improves the overall accuracy in identifying landslide areas, reduces misclassification and omission, and yields more precise delineation of landslide boundaries. [Conclusions] Based on experiments using the landslide debris flow disaster dataset in Sichuan and surrounding areas, along with practical application verification, the proposed method demonstrates strong recognition capability across landslide images in diverse scenarios and levels of complexity. 
It performs particularly well in challenging environments such as areas with dense vegetation or proximity to rivers, showing strong generalization ability and broad applicability.
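
The receptive-field motivation for replacing ASPP with DenseASPP (chained dilations compound, while parallel branches do not) reduces to simple arithmetic. The dilation rates below are illustrative, not the paper's exact configuration.

```python
def dilated_kernel_extent(kernel_size, dilation):
    """Spatial extent covered by one dilated convolution kernel along an axis."""
    return dilation * (kernel_size - 1) + 1

def stacked_receptive_field(layers):
    """Receptive field of stacked stride-1 dilated convolutions.
    layers: list of (kernel_size, dilation) pairs."""
    rf = 1
    for k, d in layers:
        rf += dilated_kernel_extent(k, d) - 1
    return rf

# ASPP applies its dilated branches in parallel, so its receptive field is
# capped by the largest single branch; DenseASPP chains branches, so the
# extents compound.
parallel_max = dilated_kernel_extent(3, 18)
dense_rf = stacked_receptive_field([(3, 3), (3, 6), (3, 12), (3, 18)])
```

Chaining four 3x3 branches more than doubles the coverage of the largest single branch, which is what lets DenseASPP capture large landslides without losing the fine scales.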

  • LIU Xuanguang, LI Yujie, ZHANG Zhenchao, DAI Chenguang, ZHANG Hao, MIAO Yuzhe, ZHU Han, LU Jinhao
    Journal of Geo-information Science. 2025, 27(5): 1144-1162. https://doi.org/10.12082/dqxxkx.2025.240668

    [Objectives] Existing semantic change detection methods fail to fully utilize local and global features in very high-resolution images and often overlook the spatial-temporal dependencies between bi-temporal remote sensing images, resulting in inaccurate land cover classification results. Additionally, the detected change regions suffer from boundary ambiguity, leading to low consistency between the detected and actual boundaries. [Methods] To address these issues, inspired by the Vision State Space Model (VSSM) with long-sequence modeling capabilities, we propose a semantic change detection network, CVS-Net, which combines Convolutional Neural Networks (CNNs) and VSSM. CVS-Net effectively leverages the local feature extraction capability of CNNs and the long-distance dependency modeling ability of VSSM. Furthermore, we embed a bi-directional spatial-temporal feature modeling module based on VSSM into CVS-Net to guide the network in capturing spatial-temporal change relations. Finally, we introduce a boundary-aware reinforcement branch to enhance the model's performance in boundary localization. [Results] We validate the proposed method on the SECOND and Fuzhou GF2 (FZ-SCD) datasets and compare it with five state-of-the-art methods: HRSCD.str4, Bi-SRNet, ChangeMamba, ScanNet, and TED. Comparative experiments demonstrate that our method outperforms these existing approaches, achieving a SeK of 23.95% and mIoU of 72.89% on the SECOND dataset, and a SeK of 23.02% and mIoU of 72.60% on the FZ-SCD dataset. In ablation experiments, as the proposed modules were progressively added, the SeK improved to 21.26%, 23.04%, and 23.95%, respectively, demonstrating the effectiveness of each module. 
Notably, compared with CNN-based, Transformer-based, and Mamba-based feature extractors, the proposed CNN-VSS feature extractor achieved the highest SeK, mIoU, and Fscd, indicating its robust feature extraction capability and effective balance between local and global feature representation. Additionally, ST-SS2D improved the SeK score by 1.19% on average compared to other spatial-temporal modeling methods, effectively capturing the spatial-temporal dependencies of bi-temporal features and enhancing the model's ability to infer potential feature changes. Furthermore, the proposed edge-enhancement branch improved the consistency between detected and actual boundaries, achieving a consistency degree of 92.97%. [Conclusions] The proposed method significantly improves both the attribute and geometric accuracy of semantic change detection, providing technical references and data support for sustainable urban development and land resource management.
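
For reference, mIoU (one of the metrics reported above) is computed from a per-class confusion matrix; a minimal sketch on a toy 3-class label map:

```python
import numpy as np

def miou(pred, gt, num_classes):
    """Mean Intersection-over-Union from a per-class confusion matrix."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for p, g in zip(pred.ravel(), gt.ravel()):
        cm[g, p] += 1                       # rows: ground truth, cols: prediction
    inter = np.diag(cm).astype(float)       # per-class intersection
    union = cm.sum(0) + cm.sum(1) - np.diag(cm)
    return float(np.mean(inter / np.maximum(union, 1)))

pred = np.array([[0, 0, 1], [1, 1, 2], [2, 2, 2]])
gt   = np.array([[0, 0, 1], [1, 1, 1], [2, 2, 2]])
score = miou(pred, gt, num_classes=3)
```

SeK extends this family of confusion-matrix metrics by separating the kappa coefficient over changed pixels, so it penalizes semantic errors inside detected change regions specifically.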

  • LIU Kang
    Journal of Geo-information Science. 2025, 27(7): 1520-1531. https://doi.org/10.12082/dqxxkx.2025.250196

    [Significance] Human mobility is closely tied to transportation, infectious disease spread, and public safety, making trajectory analysis and modeling a long-standing research focus. While numerous specialized trajectory models, such as interpolation, prediction, and classification models, have been developed using machine learning or deep learning, most are task-specific and trained on localized datasets, limiting their generalizability across tasks, regions, or trajectory data. Recent advances in generative AI have demonstrated the potential of foundation models in NLP and computer vision, motivating the need for a trajectory foundation model capable of learning universal patterns from large-scale mobility data to support diverse downstream applications. [Methods] This paper first reviews the research progress of various specialized trajectory models. It then categorizes trajectory modeling tasks into conventional tasks (e.g., trajectory similarity computation, interpolation, prediction, and classification) and the generation task (i.e., trajectory generation), and elaborates on recent advances in trajectory foundation models for these two types of tasks. [Conclusions] The paper argues that trajectory foundation models for conventional tasks should enhance not only task generalization but also spatial and data generalization. Trajectory foundation models for the generation task must address the challenge of spatial generalization, enabling the generation of large-scale trajectory data "from scratch" based on easily obtainable macro-level urban data or features. Furthermore, integrating trajectory data with other data types (e.g., text, maps, and other geospatial data) to construct multimodal geographic foundation models, as well as developing application-oriented trajectory foundation models for fields such as transportation, public health, and public safety, are promising research directions worthy of future exploration.
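
As a baseline for the interpolation task listed among the conventional tasks, linear interpolation between two fixes is the trivial model that a learned interpolator is meant to beat; the fix format here is assumed for illustration.

```python
def interpolate_gap(p0, p1, t):
    """Linear interpolation between two trajectory fixes (t, x, y) at time t.
    Real movement rarely follows a straight line at constant speed, which is
    the gap that learned trajectory models aim to close."""
    t0, x0, y0 = p0
    t1, x1, y1 = p1
    w = (t - t0) / (t1 - t0)  # fraction of the time gap elapsed
    return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))

x, y = interpolate_gap((0, 0.0, 0.0), (100, 10.0, 20.0), t=25)
```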

  • ZHANG Peng, LIU Wanyue, LIU Chengbao, BO Zheng, NIU Ran, HAN Dongxu, LIN Qian, ZHANG Ziyi, MA Mingze
    Journal of Geo-information Science. 2025, 27(4): 787-800. https://doi.org/10.12082/dqxxkx.2025.240467

    [Significance] The characteristics of the lunar surface, including its mineral compositions, geological formations, environmental factors, and temperature variations, are essential for advancing our understanding of the Moon. These features provide a wealth of scientific data for lunar research, such as resource distribution, environmental characteristics, and evolutionary history. Spectral imagers, which detect mineral compositions in a nondestructive way, play a crucial role in analyzing the mineral compositions of the lunar surface and have become key payloads in scientific exploration missions. With the increasing demand for high-precision lunar exploration data and advancements in spectral imaging technology, there is a growing trend toward acquiring lunar remote sensing data with higher spatial and spectral resolution across a broad spectral range. This trend is shaping the future of lunar orbit exploration, allowing for unprecedented detail in probing the Moon's surface. However, the higher resolution of spatial and spectral data also introduces significant challenges in data processing. [Progress] This paper begins by summarizing existing lunar spectral orbit data, including payload parameters and associated scientific findings. It then explores specific technical challenges in the data processing chain, such as pre-processing and the calculation of lunar surface parameters. Mapping surface compositions through spectral remote sensing is particularly complex due to the mixing of minerals within rocks, which can obscure clear spectral signatures. To address these challenges, various theoretical and empirical approaches have been developed. This paper proposes technical methods and potential solutions to overcome these obstacles. [Conclusions] In conclusion, detailed studies of lunar surface characteristics and the acquisition of high-resolution spectral data are vital for advancing lunar science. 
Lunar hyperspectral data are expected to support manned lunar exploration and scientific research by enabling the identification of various minerals on the Moon's surface and determining their abundance through hyperspectral observations. Advances in spectral imaging technology and the development of solutions for processing high-resolution data will significantly enhance lunar and planetary science capabilities. These efforts will pave the way for deeper insights into the Moon's geology and potential resource utilization.
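
The linear mixing model that underlies spectral unmixing (an observed spectrum treated as a weighted sum of endmember spectra, with the weights estimating abundances) can be sketched with a least-squares solve. The endmember reflectances and bands below are invented for illustration, not real lunar mineral spectra.

```python
import numpy as np

# Columns: reflectance spectra of two hypothetical endmember minerals A and B
E = np.array([[0.2, 0.8],    # band 1
              [0.5, 0.3],    # band 2
              [0.9, 0.1]])   # band 3

# Synthesize an observed mixed-pixel spectrum from known abundances
true_abundance = np.array([0.6, 0.4])
observed = E @ true_abundance

# Invert the linear mixing model: solve E @ a ≈ observed for abundances a
est, *_ = np.linalg.lstsq(E, observed, rcond=None)
```

Practical unmixing adds sum-to-one and non-negativity constraints, and nonlinear (intimate) mixing on the lunar surface requires the radiative-transfer approaches the paper discusses, but the inverse problem has this same shape.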

  • SHI Shihao, SHI Qunshan, ZHOU Yang, HU Xiaofei, QI Kai
    Journal of Geo-information Science. 2025, 27(7): 1596-1607. https://doi.org/10.12082/dqxxkx.2025.250015

    [Objectives] Small object detection is of great significance in both military and civil applications. However, due to challenges such as low resolution, high noise environments, target occlusion, and complex backgrounds, traditional detection methods often struggle to achieve the necessary accuracy and robustness. The problem of detecting small objects in complex scenes remains highly challenging. Therefore, this paper proposes a hybrid feature and multi-scale fusion algorithm for small object detection. [Methods] First, a Hybrid Conv and Transformer Block (HCTB) is designed to fully utilize local and global context information, enhancing the network's perception of small objects while optimizing computational efficiency and feature extraction capability. Second, a Multi-Dilated Shared Kernel Conv (MDSKC) module is introduced to extend the receptive field of the backbone network using dilated convolutions with varying expansion rates, thereby enabling efficient multi-scale feature extraction. Finally, the Omni-Kernel Cross Stage Model (OKCSM), constructed based on the concepts of Omni-Kernel and Cross Stage Partial, is integrated to optimize the small target feature pyramid network. This approach helps preserve small object information and significantly improves detection performance. [Results] Ablation and comparison experiments were conducted on the VisDrone2019 and TinyPerson datasets. Compared to the baseline model YOLOv8n, the proposed method improves precision, recall, mAP@50, and mAP@50:95 by 1.3%, 3.1%, 3.0%, and 1.9%, respectively, on VisDrone2019, and by 3.6%, 1.3%, 2.1%, and 0.7%, respectively, on TinyPerson. Additionally, the model size and GFLOPs are only 6.3 MB and 11.3 G, demonstrating its efficiency. Furthermore, compared with classical algorithms such as HIC-YOLOv5, TPH-YOLOv5, and Drone-YOLO, the proposed algorithm demonstrates significant advantages and superior performance. 
[Conclusions] The algorithm effectively improves detection accuracy, confirming its strong performance in addressing small object detection in complex scenes.
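
The overlap test underlying mAP@50 (a detection counts as a true positive when its IoU with a ground-truth box reaches 0.5) is simple to state in code:

```python
def box_iou(a, b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

iou = box_iou((0, 0, 10, 10), (5, 5, 15, 15))
```

For tiny objects this criterion is unforgiving: a few pixels of localization error can drop IoU below 0.5, which is why the small-object benchmarks above reward the multi-scale feature designs the paper proposes.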

  • MENG Yuebo, SU Shilong, HUANG Xinyu, WANG Heng
    Journal of Geo-information Science. 2025, 27(4): 930-945. https://doi.org/10.12082/dqxxkx.2025.240633

    [Objectives] To address issues in existing remote sensing building extraction models, including poor feature representation ability due to redundancy, unclear building boundaries, and the loss of small buildings, [Methods] we propose a detail enhancement and cross-scale geometric feature sharing network (DCS-Net). This network consists of an Information Decoupling and Aggregation Module (IRDM), a Local Mutual Similarity Detail Enhancement Module (LMSE), and a Cross-scale Geometric Feature Fusing Module (CGFF), designed to guide small target inference. The IRDM module separates and reconstructs redundant features by assigning weights, thereby suppressing redundancy in both spatial and channel dimensions and promoting effective feature learning. The LMSE module enhances the accuracy and completeness of building edge information by dynamically selecting windows and specifying pixel clustering based on local mutual similarity between encoder-decoder features. The CGFF module computes the feature block relationships between the original image and various semantic-level feature maps to compensate for information loss, thereby improving the extraction performance of small buildings. [Results] The experiments in this paper are based on two public datasets: the WHU aerial dataset and the Massachusetts building detection dataset. The experimental results demonstrate the following: (1) Compared with existing building extraction algorithms such as UNet, PSPNet, DeepLabV3+, MANet, MAPNet, DRNet, BuildFormer, MBR-HRNet, SDSNet, HDNet, DFFNet, and UANet, DCS-Net has achieved significant improvements across various evaluation metrics, demonstrating the effectiveness of the proposed method. (2) On the WHU dataset, the Intersection over Union (IoU), F1 score, and 95% Hausdorff Distance (95%HD) reached 92.94%, 96.35%, and 75.79%, respectively, outperforming the current best algorithm by 0.79%, 0.44%, and 1.90%. 
(3) On the Massachusetts dataset, the metrics were 77.13%, 87.06%, and 205.26, with improvements of 0.72%, 0.43%, and 13.84%, respectively. [Conclusions] These results indicate that DCS-Net can more accurately and comprehensively extract buildings from remote sensing images, significantly alleviating the issue of small building loss.
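The IoU and F1 values above are standard overlap metrics; a minimal sketch of how they are computed from pixel counts (the counts below are hypothetical, not from the paper):

```python
def segmentation_metrics(tp, fp, fn):
    # IoU and F1 for a binary (building vs. background) mask,
    # from true-positive, false-positive, and false-negative pixel counts
    iou = tp / (tp + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return iou, f1
```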

  • LIU Xiaoqing, REN Fu, YUE Weiting, GAO Yunji
    Journal of Geo-information Science. 2025, 27(5): 1214-1227. https://doi.org/10.12082/dqxxkx.2025.240359

    [Objectives] Forests, as the backbone of terrestrial ecosystems, play crucial roles in climate regulation and soil and water conservation. Among the many threats to forests, the impact of forest fires is becoming increasingly severe. Analyzing the factors influencing forest fires is essential for preventing forest fires and formulating relevant strategies. [Methods] This study focuses on China, using multi-source data related to fires, vegetation, climate, topography, and human activities to analyze the spatial heterogeneity of forest fire driving forces from multiple perspectives. [Results] The findings reveal that: (1) At a global scale, the spatial distribution of forest fires is most influenced by FVC, with an explanatory power of 0.1302, while climate factors exert a relatively strong influence. The interaction between driving factors is enhanced, and forest fire occurrence results from the combined influence of multiple factors. Moreover, a nonlinear relationship and impact threshold exist between these driving factors and the probability of forest fire occurrence. (2) At a local scale, climate and vegetation serve as key driving factors behind forest fires, significantly explaining their spatial distribution across different zones. Temperature is the most influential factor in the Cold Temperate Needle-leaf Forest region, the Temperate Coniferous and Broad-leaved Mixed Forest region, and the Alpine Vegetation of the Tibetan Plateau region, with explanatory powers of 0.313, 0.41, and 0.052, respectively. In contrast, wind speed is the dominant factor in the Warm Temperate Broad-leaved Forest region, with an explanatory power of 0.279. [Conclusions] The primary driving factors and their interactions vary across different regions, quantitatively confirming the spatial heterogeneity of forest fire driving forces. 
This research contributes to a national-scale understanding of forest fire drivers and fire hazard distribution in China, assisting policymakers in designing fire management strategies to mitigate potential fire risks.
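The "explanatory power" values reported above are consistent with the geographical detector's q-statistic; a minimal sketch under that assumption, using toy data rather than the study's:

```python
from statistics import pvariance

def q_statistic(values, strata):
    # geographical detector q: 1 - (within-stratum variance / total variance);
    # values are fire-occurrence observations, strata are factor zones
    groups = {}
    for v, s in zip(values, strata):
        groups.setdefault(s, []).append(v)
    within = sum(len(g) * pvariance(g) for g in groups.values())
    return 1 - within / (len(values) * pvariance(values))
```

q is 1 when strata explain all variance and 0 when they explain none.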

  • QIN Chengzhi, ZHU Liangjun, CHEN Ziyue, WANG Yijie, WANG Yujing, WU Chenglong, FAN Xingchen, ZHAO Fanghe, REN Yingchao, ZHU Axing, ZHOU Chenghu
    Journal of Geo-information Science. 2025, 27(5): 1027-1040. https://doi.org/10.12082/dqxxkx.2025.240706

    [Objectives] Geographic modeling aims to appropriately couple diverse geographic models and their specific algorithmic implementations to form an effective and executable model workflow for solving specific, unsolved application problems. This approach is highly valuable and in high demand in practice. However, traditional geographic modeling is designed with an execution-oriented approach, which places a heavy burden on users, especially non-expert users. [Methods] In this position paper, we advocate not only for the necessity of intelligent geographic modeling but also for achieving it through a so-called recursive geographic modeling approach. This new approach originates from the user's modeling target, which can be formalized as an initial elemental modeling question. It then reasons backward to resolve the current elemental modeling question and iteratively updates new elemental modeling questions in a recursive manner. This process enables the automatic construction of an appropriate geographic workflow model tailored to the application context of the user's modeling problem, thereby addressing the limitations of traditional geographic modeling. [Progress] Building on this foundational concept, this position paper introduces a series of intelligent geographic modeling methods developed by the authors. These methods aim to reduce the geographic modeling burden on non-expert users while assuring the appropriateness of automatically constructed models. Specifically, each proposed intelligent geographic modeling method is designed to solve a specific type of elemental question within intelligent geographic modeling. 
The elemental questions include: (1) how to determine the appropriate model algorithm (or its parameter values) within the given application context, (2) how to select the appropriate covariate set as input for a model without a predetermined number of inputs (e.g., a soil mapping model without predetermined environmental covariates as inputs), (3) how to determine the structure of a model that integrates multiple coupled modules (e.g., a watershed system model incorporating diverse process simulation modules), and (4) how to determine the proper spatial extent of input data for a geographic model when a specific area of interest is assigned by the user. The key to solving these elemental questions lies in the effective utilization of geographic modeling knowledge, particularly application-context knowledge. However, since application-context knowledge is typically unsystematic, empirical, and implicit, we developed case formalization and case-based reasoning strategies to integrate this knowledge within the proposed methods. Based on the recursive intelligent geographic modeling approach and the corresponding methods, we propose an application schema for intelligent geographic modeling and computing. This schema is grounded in domain modeling knowledge, particularly case-based application-context knowledge, and leverages the “Data-Knowledge-Model” tripartite collaboration. A prototype of this approach has been implemented in an intelligent geospatial computing system called EGC (EasyGeoComputing). [Prospect] Finally, this position paper discusses the emerging role of large language models in geographic modeling. Their potential applications, relationships with the research presented here, and prospects for future research directions are explored.
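The recursive modeling idea, as described, resolves each elemental question into sub-questions until none remain; a schematic sketch (the `resolve` callback and question names are hypothetical):

```python
def build_workflow(question, resolve):
    # resolve(question) -> (modeling step, list of new elemental sub-questions);
    # reason backward first, so sub-steps execute before the step that needs them
    step, sub_questions = resolve(question)
    workflow = []
    for sub in sub_questions:
        workflow += build_workflow(sub, resolve)
    workflow.append(step)
    return workflow
```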

  • HE Li, WANG Rong
    Journal of Geo-information Science. 2025, 27(9): 2151-2164. https://doi.org/10.12082/dqxxkx.2025.250273

    [Significance] Space is not merely a physical place, but a productive arena of social relations. Social phenomena are inherently endowed with spatial attributes, making the spatial perspective a critical pathway for understanding complex social issues. With the deepening "spatial turn" in the social sciences and continuous advancements in Geographic Information Systems (GIS)—particularly in data acquisition, spatial analysis and modeling, and spatial visualization—GIS has become an essential tool for addressing social issues. However, disciplinary differences in theoretical paradigms, methodological logic, and scale cognition between geography and the social sciences constrain their deeper integration. Existing literature lacks a systematic synthesis of integration trends, underlying challenges, and empowerment pathways, necessitating a comprehensive clarification of fusion mechanisms, core obstacles, and emerging opportunities. [Progress] This paper identifies five key advantages of GIS in empowering social science research: expanding spatial analytical thinking, supporting spatiotemporal data, enhancing survey techniques, enriching representational forms, and strengthening analytical capabilities. We review representative GIS applications in economics, political science, and sociology. From dimensions such as spatial cognition, data capacity, methodological adoption, and research hotspots, we distill application characteristics across these disciplines, revealing both commonalities and differences. While all three disciplines recognize spatial effects, their theoretical orientations shape distinct technical approaches—economics emphasizes causal identification, political science focuses on geopolitical structures, and sociology prioritizes contextual representation. 
Through a three-dimensional analysis—data, methodology, and cognition—we examine three major challenges in addressing social issues: the mismatch between data and research questions, the difficulty of integrating methods with causal mechanisms, and the contextual misalignment of place and scale, which reflect deeper issues of data suitability, methodological coherence, and the validity of spatial reasoning. [Prospects] The advancement of artificial intelligence, especially large models, injects new methodological momentum into GIS-based spatial analysis and brings threefold opportunities for addressing social issues. First, large models are driving spatial analysis from correlation-based description toward transparent causal inference. Second, multi-source data fusion and the generation of "silicon-based samples" help overcome the limitations of traditional survey data. Third, an emerging "space-survey" integrated framework is constructing a "spatial cognitive infrastructure" to support social research. Future efforts should establish a synergistic "large model-spatial analysis" paradigm that integrates these three opportunities. By simultaneously addressing challenges of data matching, method integration, and contextual misalignment, this paradigm can elevate GIS from a supportive tool to a core engine for theory generation and mechanism interpretation. This transformation will enhance the scientific value and practical effectiveness of GIS and spatial analysis in addressing complex social issues, fostering a bidirectional interaction between methodological innovation and theoretical advancement.

  • SHAN Huilin, WANG Xingtao, LIU Wenxing, WU Xinyue, GAO Runze, LI Hongxu
    Journal of Geo-information Science. 2025, 27(6): 1381-1400. https://doi.org/10.12082/dqxxkx.2025.250009

    [Objectives] With the enhancement of spatial resolution, remote sensing images contain increasingly intricate information, encompassing a vast array of spatial and semantic features. The effective extraction and integration of these features play a pivotal role in semantic segmentation performance. However, most existing approaches focus solely on feature fusion improvements while neglecting the consistency between spatial and semantic features. Additionally, these methods often overlook the precise extraction of edge information, which significantly impacts segmentation accuracy. [Methods] This paper proposes a semantic segmentation model for high-resolution remote sensing images based on multi-scale deep supervision. First, separate feature extraction branches are designed for spatial and semantic features to fully exploit their respective information. Second, a spatial redundancy reduction residual module is incorporated into the spatial branch, integrating wavelet transformation and coordinate convolution to enhance spatial feature extraction and better capture edge details. Third, a residual attention Mamba module is added to the semantic branch to facilitate global-level semantic feature extraction. Finally, a multi-scale feature fusion mechanism is applied, utilizing a large-kernel grouped feature extraction module to progressively merge spatial, semantic, and deep-level features while suppressing irrelevant information and activating meaningful features. Additionally, a deep supervision mechanism is employed by introducing auxiliary supervision heads at each feature fusion stage to enhance training efficiency. [Results] Comparison and ablation experiments were conducted on the ISPRS Potsdam and Vaihingen datasets with random sampling and data augmentation. The experimental results demonstrate that the proposed algorithm achieves an average Intersection over Union (IoU) of 83.43% on ISPRS Potsdam and 86.49% on the augmented Vaihingen dataset. 
Compared to nine state-of-the-art methods, including CGGLNet and CMLFormer, the proposed approach improves the average IoU by at least 5.00% and 3.00%, respectively. [Conclusions] The results verify that the proposed algorithm effectively extracts and integrates spatial and semantic features, thereby enhancing the accuracy of semantic segmentation in remote sensing images.
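The deep supervision mechanism described above combines the main segmentation loss with auxiliary losses from each fusion stage; a minimal sketch (the weighting scheme and the 0.4 value are assumptions, not the paper's):

```python
def deep_supervision_loss(main_loss, aux_losses, aux_weight=0.4):
    # total loss = main segmentation loss + weighted sum of
    # auxiliary-head losses, one per feature-fusion stage
    return main_loss + aux_weight * sum(aux_losses)
```

During inference the auxiliary heads are discarded; they only shape training gradients.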

  • XU Xinyuan, NIU Lei
    Journal of Geo-information Science. 2025, 27(4): 994-1010. https://doi.org/10.12082/dqxxkx.2025.240544

    [Objectives] Although the use of street view data to calculate the Green View Index (GVI) has emerged as a method for evaluating urban greening levels, systematic research on the spatiotemporal dynamics of GVI remains limited. [Methods] This study explores the spatiotemporal characteristics and influencing factors of urban GVI using street view big data, providing a new method for assessing urban street greening levels. This study proposes the GSENet semantic segmentation model for calculating and analyzing the GVI in Lanzhou's main urban area. The GSENet model incorporates a GSE-Block feature calibration module within its encoder, combining spatial and channel attention mechanisms. The decoder adopts an efficient self-attention module (Mix-transformer), which introduces a scaling factor and replaces the fully connected layer with a 1×1 convolution, combining the global modeling capability of Transformers with the local processing ability of convolution. Using the GSENet model, this study calculates the GVI of Lanzhou's main urban area based on Baidu Street View data and explores its spatiotemporal variation patterns through hotspot analysis, statistical analysis, and correlation analysis. [Results] The results reveal several key findings: (1) Utilizing ResNet50 as the backbone, the GSENet model achieves a Mean Intersection over Union (MIOU) of 74.7%, outperforming mainstream models such as PSPNet and DeepLabV3. The model demonstrates superior performance in identifying large-area categories such as vegetation and buildings, achieving an F1 score of 0.95. (2) Between 2019 and 2023, the average GVI increased by 2.3% compared to the period from 2014 to 2018. Notably, 70.9% of the sampled points showed a positive GVI trend, although only 8.4% experienced an increase greater than 10%. Anning District recorded the most substantial improvement, with a GVI rise of 3.5%, while Chengguan District saw the smallest growth, at only 1.9%. 
Spatial analysis identified that the central-western and northeastern parts of the study area experienced significant GVI increases, particularly in regions surrounding universities. In contrast, GVI declined notably in commercial centers and transportation hubs. (3) The influence of street view features and social factors on GVI changes exhibits spatiotemporal heterogeneity. Building density shows a negative correlation with GVI changes. The correlation between road width and GVI changes is relatively weak, while the correlation between population density and GVI changes varies across different scales, with a stronger positive correlation at the street scale. [Conclusions] The experimental results highlight the effectiveness of this research in enhancing the perceived greening of urban streets. Furthermore, the findings provide valuable insights for urban planners aiming to optimize green space distribution and improve urban environments.
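The Green View Index itself is typically the fraction of street-view pixels segmented as vegetation; a minimal sketch under that standard definition (label names hypothetical):

```python
def green_view_index(mask, green_labels=("vegetation",)):
    # GVI = share of pixels in the segmented street-view image
    # whose label counts as greenery
    pixels = [label for row in mask for label in row]
    return sum(label in green_labels for label in pixels) / len(pixels)
```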

  • WANG Kaiqing, XIAO Yanyan, ZHANG Zhiwei, LI Yongle
    Journal of Geo-information Science. 2025, 27(7): 1738-1750. https://doi.org/10.12082/dqxxkx.2025.250148

    [Objectives] Points of Interest (POIs) have dual characteristics as geospatial entities and carriers of cultural information, serving as the data foundation for analyzing and identifying regional cultural expressions and functional traits. Identifying and analyzing the types and characteristics of tourism cultural scenes along the Grand Canal is of great significance for achieving differentiated and sustainable cultural tourism development. [Methods] By integrating POI data with scene theory, spatial entities are associated with cultural values, and quantitative statistics are combined with qualitative configuration analysis. A tourism-cultural amenity database was established using 476,968 POI records, categorized into 6 major categories and 24 sub-categories. The Delphi method was employed to determine scores for each subcategory related to tourism amenity scenes, which were then used to calculate the performance scores of tourism cultural scenes. Descriptive statistical analysis, K-means clustering, and hierarchical clustering were applied to identify types of tourism-cultural scenes. The clustering results were visualized on maps. Meanwhile, the characteristics, formation mechanisms, and corresponding countermeasures of these scene types were further analyzed. [Results] (1) The Jiangsu section of the Grand Canal exhibits distinctive local tourism-cultural characteristics, with strong regional identity and attractiveness. However, significant disparities exist in tourism-cultural value orientations, particularly in subcategories such as locality, glamour, exhibitionism, utilitarianism, and charisma, highlighting the heterogeneous features of tourism-cultural scenes in this area. (2) Cluster analysis classified 34 counties (cities or districts) along the Jiangsu section into four types: local scenes (10 regions), utilitarian scenes (8 regions), comfortable scenes (13 regions), and charming scenes (3 regions). 
Discriminant analysis validated the reliability of these clustering results. Each of the four scene types exhibits distinct characteristics. (3) The types of tourism-cultural scenes are influenced by the combined effects of multiple factors (economic development, urbanization, population, fiscal policy, transportation, and tourism resources), which can be summarized into three configuration-based influence paths. [Conclusions] This study introduces scene theory into cultural tourism research based on POI big data, offering a novel approach to promoting regionally differentiated and sustainable development of cultural tourism.
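One plausible reading of the scene performance score is a POI-count-weighted mean of the Delphi subcategory scores; a toy sketch under that assumption (categories and score values invented for illustration):

```python
def scene_score(poi_counts, expert_scores):
    # POI-count-weighted mean of Delphi expert scores per subcategory
    total = sum(poi_counts.values())
    return sum(poi_counts[c] * expert_scores[c] for c in poi_counts) / total
```

Scores like these per dimension (locality, glamour, etc.) would then feed the K-means and hierarchical clustering steps.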

  • SUN Baodi, CHEN Keying, CHEN Zhaohui, WANG Chun, YAN Yuxi, TANG Jingchao, LIU Yifeng
    Journal of Geo-information Science. 2025, 27(7): 1671-1686. https://doi.org/10.12082/dqxxkx.2025.250058

    [Significance] As the basic unit of a city, communities' carbon emission levels and the accuracy of community-scale accounting directly impact the overall effectiveness of emission reduction in the construction industry. This paper reviews the main methods of carbon accounting, evaluates their advantages and disadvantages, and proposes a new approach to enhance the accuracy and comprehensiveness of community carbon accounting using digital twin technology. [Progress] This paper first introduces three traditional carbon accounting methods, namely the carbon emission factor method, the mass balance method, and the direct measurement method, and discusses their applications. It then identifies digital twin technologies suitable for community-scale carbon accounting, including Building Information Modeling (BIM), Geographic Information System (GIS), and the Internet of Things (IoT). The paper analyzes current development trends, including: (i) expanding the scope of carbon accounting to the community level using digital twin technology, (ii) strengthening the integration and interoperability of digital twin systems, and (iii) establishing a community carbon accounting framework grounded in digital twin technology. It further proposes integrating BIM, GIS, and IoT into a unified system based on the city information model to build a comprehensive community carbon emission platform. [Prospect] Looking ahead, the application of digital twin technology holds promise for enabling accurate carbon accounting, emission forecasting, reduction pathway planning, and performance evaluation for communities of varying scales and geographical contexts. Furthermore, with advances in AI technology, it is anticipated that city information models for community carbon accounting will increasingly integrate AI agents, leveraging the power of big data, large models, and high-performance computing, to create intelligent carbon accounting systems for the smart city era.
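The carbon emission factor method mentioned above estimates emissions as activity data multiplied by per-unit factors, summed over sources; a minimal sketch (the activity categories and factor values below are illustrative only, not official coefficients):

```python
def carbon_emissions(activity_data, emission_factors):
    # emission factor method: E = sum over sources of (activity * factor)
    return sum(amount * emission_factors[src]
               for src, amount in activity_data.items())
```

In a digital twin pipeline, IoT meters would supply `activity_data` continuously instead of annual statistics.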

  • CHEN Xiawei, LONG Yi, LIU Xiang, ZHANG Ling, LIU Shaojun
    Journal of Geo-information Science. 2025, 27(5): 1228-1245. https://doi.org/10.12082/dqxxkx.2025.240508

    [Objectives] The quality of the leisure environment is a critical factor influencing residents' leisure experiences and participation, and it is closely related to the vitality of urban areas and economic development. Therefore, exploring how environmental quality influences the vitality of leisure space is crucial for promoting urban development. [Methods] A human-centered approach is adopted to construct a research framework for exploring the relationship between leisure environment quality and leisure space vitality based on image-text fusion perception. Online review texts and street view images are used to comprehensively perceive the leisure environment quality of the city. Natural language processing and semantic segmentation techniques are used to assess the leisure environment quality, while mobile signaling data is utilized to quantitatively measure the vitality of leisure spaces through user trajectory semantic modeling. Finally, using an Optimal Parameter-based Geographical Detector (OPGD), an in-depth analysis is conducted on the impact mechanisms of individual leisure environment quality factors and their interactions with the vitality of leisure spaces at global and local spatial scales in Nanjing. [Results] The findings reveal that: (1) The spatial distribution of leisure space vitality exhibits a "single-core-multi-center" pattern. The vitality in the main urban area is concentrated around the Xinjiekou commercial district, while Jiangbei District forms a "three-point" pattern with interactions between the two ends and the center. In the Xianlin area, high-vitality zones are distributed around the university town, while in the Dongshan area, they are located along the Shuanglong Avenue corridor. (2) On a macro scale, the leisure space vitality of Nanjing is indirectly dominated by economic levels. On a local scale, the influence of 14 leisure environment quality factors on leisure space vitality demonstrates significant regional heterogeneity. 
However, in municipal and district-level core areas with high leisure space vitality, the effects of these environmental quality factors are all significant. (3) The formation mechanism of leisure space vitality in Nanjing is closely related to regional geographical location, population density and composition, and economic income levels. [Conclusions] The analysis of Nanjing indicates that the exploration of leisure environment quality through image-text fusion perception enhances the systematic and comprehensive understanding of the factors influencing leisure space vitality and its mechanisms. This provides a scientific basis for optimizing the quality of the urban leisure environment and enhancing the vitality of leisure space.

  • ZHENG Qiangwen, WU Sheng, WEI Jinghui
    Journal of Geo-information Science. 2025, 27(6): 1361-1380. https://doi.org/10.12082/dqxxkx.2025.250122

    [Background] Traditional methods, due to their static receptive field design, struggle to adapt to the significant scale differences among cars, pedestrians, and cyclists in urban autonomous driving scenarios. Moreover, cross-scale feature fusion often leads to hierarchical interference. [Methodology] To address the key challenge of cross-scale representation consistency in 3D object detection for multi-class, multi-scale objects in autonomous driving scenarios, this study proposes a novel method named VoxTNT. VoxTNT leverages an equalized receptive field and a local-global collaborative attention mechanism to enhance detection performance. At the local level, a PointSetFormer module is introduced, incorporating an Induced Set Attention Block (ISAB) to aggregate fine-grained geometric features from high-density point clouds through reduced cross-attention. This design overcomes the information loss typically associated with traditional voxel mean pooling. At the global level, a VoxelFormerFFN module is designed, which abstracts non-empty voxels into a super-point set and applies cross-voxel ISAB interactions to capture long-range contextual dependencies. This approach reduces the computational complexity of global feature learning from O(N²) to O(M²) (where M ≪ N and M is the number of non-empty voxels), avoiding the high computational complexity associated with directly applying complex Transformers to raw point clouds. This dual-domain coupled architecture achieves a dynamic balance between local fine-grained perception and global semantic association, effectively mitigating modeling bias caused by fixed receptive fields and multi-scale fusion. [Results] Experiments demonstrate that the proposed method achieves a single-stage detection Average Precision (AP) of 59.56% for moderate-level pedestrian detection on the KITTI dataset, an improvement of approximately 12.4% over the SECOND baseline. 
For two-stage detection, it achieves a mean Average Precision (mAP) of 66.54%, outperforming the second-best method, BSAODet, which achieves 66.10%. Validation on the WOD dataset further confirms the method’s effectiveness, achieving 66.09% mAP, which outperforms the SECOND and PointPillars baselines by 7.7% and 8.5%, respectively. Ablation studies demonstrate that the proposed equalized local-global receptive field mechanism significantly improves detection accuracy for small objects. For example, on the KITTI dataset, full component ablation resulted in a 10.8% and 10.0% drop in AP for moderate-level pedestrian and cyclist detection, respectively, while maintaining stable performance for large-object detection. [Conclusions] This study presents a novel approach to tackling the challenges of multi-scale object detection in autonomous driving scenarios. Future work will focus on optimizing the model architecture to further enhance efficiency.
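The complexity reduction described above parallels the Set Transformer's ISAB, where attention through a small intermediate set replaces full pairwise attention; a back-of-envelope operation count (schematic, ignoring feature dimensions and constant factors):

```python
def full_attention_ops(n_points):
    # full self-attention: every point attends to every point -> O(N^2)
    return n_points ** 2

def isab_ops(n_points, n_inducing):
    # ISAB: inducing points attend to inputs, then inputs attend
    # back to the inducing set -> O(N * M)
    return 2 * n_points * n_inducing
```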

  • WANG Kuang, KE Rihong, LI Shengnan, WANG Pu
    Journal of Geo-information Science. 2025, 27(4): 967-978. https://doi.org/10.12082/dqxxkx.2025.240586

    [Objectives] Revealing the structural characteristics of tourist flow networks is a prerequisite for achieving complementary advantages and coordinated development among attractions. [Methods] In this study, we employ methods such as travel chain extraction, social network analysis, and community detection to construct a research framework for analyzing multi-scale tourist flow networks based on large-scale mobile phone data. The structural characteristics of the tourist flow network in Changsha are explored at microscopic, mesoscopic, and macroscopic scales. [Results] (1) Microscopic scale: The tourist flow network of Changsha shows a significant centralization trend, where a few core attractions such as Yuelu Mountain and Orange Island have a great influence on the whole network. Only 33% of attractions show structural hole efficiency and effectiveness above average, while their constraint is below average, indicating prominent structural holes and limited overall connectivity and efficiency. (2) Mesoscopic scale: The tourist flows of Changsha are highly concentrated, showing obvious spatial clustering characteristics and forming six tourism communities. There are usually two core attractions in each community that drive tourists to visit the surrounding attractions. In addition, the development of tourism communities is unbalanced, with one very large community centered on Yuelu Mountain and Orange Island. (3) Macroscopic scale: The spatial distribution of the tourist flow network presents the characteristics of single-core strong concentration and overall dispersion, showing a multi-layer structure with the city center as the core and spreading outwards. The global efficiency of the network is only 0.367, with some marginal attractions having poor accessibility. Core attractions exert only limited "trickle-down" effects on marginal attractions.
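The global efficiency figure of 0.367 follows the standard definition, the average of inverse shortest-path distances over node pairs; a minimal BFS-based sketch for an unweighted network (toy graph, not the Changsha data):

```python
from collections import deque

def global_efficiency(adj):
    # average of 1/d(i, j) over ordered node pairs; unreachable pairs add 0
    nodes = list(adj)
    n = len(nodes)
    total = 0.0
    for source in nodes:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1 / d for d in dist.values() if d > 0)
    return total / (n * (n - 1))
```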

  • YUE Zichen, ZHONG Shaobo, MEI Xin
    Journal of Geo-information Science. 2025, 27(6): 1289-1304. https://doi.org/10.12082/dqxxkx.2025.240715

    [Objectives] Knowledge graphs, as a cutting-edge technology for integrating multimodal data sources, have garnered significant attention in the GIS domain. These graphs are typically constructed using graph databases. However, mainstream graph databases still face challenges in effectively organizing and analyzing geospatial-temporal data. [Methods] To address this issue, this paper proposes an approach to spatiotemporal semantic modeling and query optimization that bridges graph databases and spatial data engines implemented within relational databases. In the graph database, geographic entities are stored as lightweight placeholder nodes (storing only mapping IDs) and linked to spatiotemporal index nodes (such as time trees and Geohash encodings) to enhance aggregation capabilities. Meanwhile, complete geospatial-temporal objects are stored in a relational database, while table partitioning strategies are employed to improve retrieval efficiency. This approach uses unified identifiers and JDBC for routing geographic entities across the databases. When users invoke pre-registered spatiotemporal functions in the graph database, a query rewriter transforms the graph queries into SQL statements based on entity identifiers, pushes them to the relational database for processing, and returns the results to the graph query pipeline. Additionally, a two-phase commit protocol ensures data consistency across the heterogeneous databases. [Results] We implemented a prototype system integrating Neo4j and PostGIS and conducted experiments on query and storage efficiency using a multisource spatiotemporal dataset from Shenzhen (including taxi trajectories, bike-sharing trajectories, road networks, POIs, and remote sensing imagery). 
Compared to mainstream graph database systems (e.g., Neo4j and GraphDB), our approach significantly improves performance for geospatial-temporal queries, reducing response times by one to two orders of magnitude in complex computational scenarios and enabling raster computations unsupported by native graph databases. By leveraging lightweight graph nodes and PostGIS data compression, storage space is reduced by a factor of approximately 3 to 5. Compared to virtual knowledge graph systems (e.g., Ontop), our method shows minimal differences in spatial query performance and storage overhead, while achieving notably faster response times for large-scale spatiotemporal queries. [Conclusions] Compared to existing methods, our approach leverages existing graph databases to construct materialized spatiotemporal knowledge graphs, enhancing modeling flexibility and query efficiency for geospatial-temporal data. It also supports user-defined extensions to the geospatial-temporal function library, offering a novel framework for efficiently managing and analyzing such data within knowledge graphs.
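The query rewriter described above maps pre-registered spatiotemporal function calls onto SQL pushed down to PostGIS; a schematic sketch (the function name, table, and template are hypothetical, and a real system should use parameterized queries rather than string formatting):

```python
def rewrite_spatiotemporal_call(func_name, entity_ids, params):
    # map a pre-registered graph-side function onto a PostGIS SQL template;
    # ST_DWithin / ST_MakePoint / ST_SetSRID are real PostGIS functions,
    # everything else here is illustrative
    templates = {
        "withinDistance": (
            "SELECT id FROM geo_objects WHERE id IN ({ids}) "
            "AND ST_DWithin(geom, "
            "ST_SetSRID(ST_MakePoint({x}, {y}), 4326)::geography, {r})"
        ),
    }
    ids = ", ".join(str(i) for i in entity_ids)
    return templates[func_name].format(ids=ids, **params)
```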

  • LI Xiao, WANG Shaohua, LIANG Haojian, ZHOU Liang, LIU Chang, WANG Runqiao, SU Cheng
    Journal of Geo-information Science. 2025, 27(8): 1822-1840. https://doi.org/10.12082/dqxxkx.2025.250144

    [Objectives] Sustainable development is an important issue for countries worldwide, encompassing key aspects such as sustainable transportation systems and inclusive, sustainable urbanization. As a crucial component of urban public service infrastructure, the public transportation network serves as a cornerstone of a city's stable operation, with the distribution of its stops and routes directly influencing residents' travel patterns. However, existing studies mainly focus on accessibility analysis, site selection optimization, and spatial coupling with factors such as population and land use, while lacking in-depth optimization approaches and clear mechanisms that address spatial heterogeneity and facility redundancy. [Methods] Taking Beijing as a case study, with a focus on Dongcheng and Xicheng Districts, this study constructs a system of influencing factors based on multi-source data, including public transportation networks, topography, and economic indicators, and employs the XGBoost machine learning method to reveal the impact weights of these driving factors on the distribution of bus stops. On this basis, a mathematical model incorporating stop redundancy is proposed to optimize the spatial layout of upstream and downstream stops, producing a spatial optimization map of bus stops in Beijing. [Results] The findings indicate that: (1) There is an imbalance in the distribution of public transportation facilities in Beijing, with the proportion of the population having convenient access to public transportation differing by more than 30% between central and peripheral urban areas. (2) Among the 19 influencing factors, population density is the key driving factor, accounting for 27.77%, while the number of scenic spots and parking facilities have minimal impact, with feature importance scores below 0.5%. 
(3) Compared to the p-median model, the proposed redundancy optimization model significantly reduces the redundancy of optimized stops while maintaining performance in minimizing weighted distance. The optimized stop layout is more evenly distributed along existing bus routes. [Conclusions] These findings provide valuable reference and theoretical support for the layout of bus stops and other public service facilities, contributing to the efficient utilization of public resources and promoting sustainable urban development.
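The contrast with the p-median model can be illustrated with a toy sketch (not the authors' formulation): a greedy p-median selection whose objective adds a penalty `lam` for every pair of selected stops whose service areas overlap. All coordinates, the overlap radius, and the function names are hypothetical.

```python
import math

def weighted_distance(stops, demands):
    """Sum over demand points of weight * distance to the nearest selected stop."""
    return sum(
        w * min(math.hypot(dx - sx, dy - sy) for (sx, sy) in stops)
        for (dx, dy, w) in demands
    )

def redundancy(stops, radius=0.5):
    """Count stop pairs closer than `radius` (overlapping service areas)."""
    n = len(stops)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.hypot(stops[i][0] - stops[j][0], stops[i][1] - stops[j][1]) < radius
    )

def select_stops(candidates, demands, p, lam=0.0):
    """Greedy p-median; lam > 0 penalizes each pair of redundant stops.

    lam = 0 recovers the plain (greedy) p-median objective."""
    chosen = []
    for _ in range(p):
        best = min(
            (c for c in candidates if c not in chosen),
            key=lambda c: weighted_distance(chosen + [c], demands)
            + lam * redundancy(chosen + [c]),
        )
        chosen.append(best)
    return chosen
```

With `lam = 0` the selection minimizes weighted distance alone; a positive `lam` trades a small distance increase for fewer overlapping stops, which is the qualitative behaviour the abstract reports.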

  • HAO Yuanfei, LIU Zhe, ZHENG Xi, QIAN Yun
    Journal of Geo-information Science. 2025, 27(9): 2070-2085. https://doi.org/10.12082/dqxxkx.2025.250129

    [Objectives] Street space serves as the primary perceptual interface for pedestrians in urban environments, and the visual quality of these spaces plays a crucial role in enhancing their vitality. Traditional evaluation methods often rely on single-objective indicators, making it difficult to effectively link objective environmental features with pedestrians' subjective perceptions. [Methods] This study proposes a novel evaluation framework based on Large Language Models (LLMs), incorporating the style dimension of subjective perception and extending traditional single-indicator quantitative analysis to a comprehensive approach that integrates both quantification and stylization. This framework utilizes Baidu Street View imagery to quantitatively assess two objective indicators, namely green view index and sky view factor, through semantic segmentation techniques. Additionally, it evaluates six subjective indicators, including vegetation diversity, building typology, building continuity, sidewalk usage, roadway usage, and signage usage, by leveraging prompt-optimized LLMs. The study then categorizes street space visual quality features within the research area using the Latent Dirichlet Allocation (LDA) topic model, aiming to explore the spatial characteristics of different streets and identify optimization strategies. [Results] Using Beijing's Xicheng District as the study area, the results reveal spatial distribution patterns of vegetation density and sky openness, along with pedestrians' subjective evaluations of indicators such as vegetation diversity and building type. Cluster analysis identified comprehensive service streets centered around Xidan North Street, characteristic streets centered around Xihuangchenggen South Street, and mixed-type streets centered around Lingjing Hutong. [Conclusions] This study innovatively introduces a large language model with human-like perceptual capabilities, enhancing its performance through prompt engineering. 
The resulting framework enables efficient and integrated evaluation of street visual quality by combining both objective and subjective factors. This approach provides a practical reference for large-scale, automated analysis of street view imagery.
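The two objective indicators can be computed directly from a semantic-segmentation label map as pixel ratios. The sketch below assumes hypothetical class ids for vegetation and sky; a real segmentation model defines its own label scheme.

```python
import numpy as np

# Hypothetical class ids assigned by a semantic segmentation model
VEGETATION, SKY = 8, 2

def pixel_ratio(label_map, class_id):
    """Fraction of image pixels assigned to `class_id`."""
    return float(np.mean(label_map == class_id))

def street_indicators(label_map):
    """Green view index and sky view factor as per-image pixel ratios."""
    return {
        "green_view_index": pixel_ratio(label_map, VEGETATION),
        "sky_view_factor": pixel_ratio(label_map, SKY),
    }
```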

  • LUO Jianwei, ZHANG Yinsheng
    Journal of Geo-information Science. 2025, 27(5): 1195-1213. https://doi.org/10.12082/dqxxkx.2025.240717

    [Objectives] Deep Convolutional Neural Networks (DCNNs) have been successfully applied to semantic segmentation of high-resolution remote sensing images. However, such images often exhibit large intra-class variance, small inter-class variance, and significant variations in target scale. Convolutional operations struggle to handle these complexities due to their localized nature. While Transformer-based methods offer powerful global information modeling capabilities, they are less effective at capturing local information. The combination of Convolutional Neural Networks and Transformers is widely used, yet optimizing these strategies for more effective feature integration remains a challenge. Additionally, many existing models focus on multilevel and multiscale feature extraction but fail to fully account for diverse target types and scales in high-resolution remote sensing images. [Methods] To address these challenges, this paper proposes a dual-path high-resolution remote sensing image segmentation algorithm for enhanced multi-scale target perception, utilizing an asymmetric dual encoder structure based on DCNN and Transformer. First, a Scalable Channel Spatial Pyramid module is introduced, leveraging deep convolution to dynamically extract fused multi-channel information while maintaining a large receptive field, enhancing the model's ability to capture multiscale features. Second, a Multiscale Feature Enhanced Transformer module is proposed, incorporating feature anchor preprocessing to provide a spatial inductive bias. Additionally, a learnable cosine similarity matrix is constructed within the self-attention mechanism, guiding the module to focus on target features of varying scales while reducing redundant information interference. Finally, a Bilateral Feature-Guided Fusion module is constructed to facilitate fusion and information exchange between different-scale features across both branches through an attention mechanism. 
[Results] Comparative and ablation experiments were conducted on the Vaihingen and Potsdam datasets. The proposed model achieved 83.29% mean Intersection over Union, 90.65% mean F1 score, and 91.69% Overall Accuracy on the Vaihingen dataset, and 73.29% mean Intersection over Union, 83.98% mean F1 score, and 88.47% Overall Accuracy on the Potsdam dataset. Compared to the best baseline method, the proposed model improved mIoU by 0.76% on the Vaihingen dataset and 1.42% on the Potsdam dataset. Additionally, a comparison of model complexity and segmentation performance showed that DMFPNet achieves the best balance among floating-point operations, parameter count, and segmentation performance. [Conclusions] In summary, the proposed model demonstrates strong performance and high segmentation accuracy in addressing the complex challenges of high-resolution remote sensing image semantic segmentation, including large intra-class and inter-class variance and variable target scales.
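The three reported metrics (mean IoU, mean F1, Overall Accuracy) can all be derived from one confusion matrix. A minimal NumPy sketch, not tied to the paper's evaluation code:

```python
import numpy as np

def segmentation_metrics(y_true, y_pred, num_classes):
    """mIoU, mean F1, and overall accuracy from flattened label arrays."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)        # cm[i, j]: true class i, predicted j
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    iou = tp / np.maximum(tp + fp + fn, 1)    # per-class intersection over union
    f1 = 2 * tp / np.maximum(2 * tp + fp + fn, 1)
    oa = tp.sum() / cm.sum()                  # overall pixel accuracy
    return iou.mean(), f1.mean(), oa
```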

  • PING Yifan, LU Jun, GUO Haitao, HOU Qingfeng, ZHU Kun, SANG Zehao, LIU Tong
    Journal of Geo-information Science. 2025, 27(7): 1608-1623. https://doi.org/10.12082/dqxxkx.2025.250051

    [Objectives] Cross-view image geolocation refers to a technology that determines the geographical location of an image by matching it with reference images taken from different perspectives and possessing precise location information. This technology plays a crucial role in real-world applications such as Unmanned Aerial Vehicle (UAV) navigation, environmental monitoring, and target positioning. Currently, most deep learning-based cross-view image retrieval and geolocation methods for drone-satellite tasks rely heavily on supervised learning. However, the scarcity of high-quality labeled data presents a significant limitation, hindering the generalization capability of these models. Moreover, existing methods often fail to effectively model the spatial layout of images, making it difficult to bridge the substantial domain gap between cross-view images, thereby limiting the accuracy and robustness of geolocation tasks. [Methods] To address these challenges, this paper proposes a novel cross-view image retrieval and localization architecture called DINO-MSRA. The architecture first employs the DINOv2 large model framework, fine-tuned by Conv-LoRA, as the feature encoder. This enhances the model's feature extraction capabilities with fewer parameters, improving both efficiency and accuracy. Second, we design a spatial relation-aware feature aggregator based on the Mamba module (MSRA) to more effectively aggregate image features. By embedding spatial configuration features into the global descriptor, this module significantly improves the model's performance in cross-view matching tasks, especially in complex scenarios where spatial relationships between objects are crucial. Finally, the InfoNCE loss function is adopted to train the model, optimizing contrastive learning and ensuring more accurate retrieval and localization results. [Results] Extensive comparative and ablation experiments were conducted on the University-1652 and SUES-200 datasets. 
The experimental results show that for drone-view target localization (drone→satellite) and drone navigation (satellite→drone) tasks, the proposed method achieves R@1 accuracies of 95.14% and 97.29%, respectively, on the University-1652 dataset, representing improvements of 0.68% and 1.14% over the current best algorithm, CAMP. On the SUES-200 dataset at an altitude of 150 meters, R@1 accuracies reach 97.2% and 98.75%, which are 1.8% and 2.5% higher than CAMP, respectively. Moreover, the proposed method requires significantly fewer parameters than existing algorithms, only 19.2% of those used by Sample4Geo. [Conclusions] In summary, the proposed DINO-MSRA architecture outperforms current state-of-the-art methods in cross-view image matching, achieving higher accuracy and faster inference speed. These results demonstrate its robustness and practical application potential in challenging real-world scenarios.
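The InfoNCE loss mentioned above treats each matching drone-satellite pair in a batch as the positive and every other pairing as a negative. A NumPy sketch of one direction of the loss (illustrative only; the paper's training code may differ in details such as temperature and symmetrization):

```python
import numpy as np

def info_nce(query_emb, ref_emb, temperature=0.07):
    """InfoNCE over a batch: matching (drone, satellite) pairs share a
    row index; all other rows in the batch act as negatives."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    r = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
    logits = q @ r.T / temperature                 # cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # cross-entropy with the diagonal (the true pair) as the target
    return -np.mean(np.diag(log_prob))
```

When each query embedding matches its reference exactly, the loss approaches zero; permuting the references so the diagonal no longer holds the true pairs drives it up, which is the gradient signal that pulls matching views together.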

  • WANG Jiao, LI Junjiao, RUI Qiyao, CHENG Weiming
    Journal of Geo-information Science. 2025, 27(4): 820-834. https://doi.org/10.12082/dqxxkx.2025.240474

    [Objectives] The identification and classification of lunar impact craters are critical for selecting spacecraft landing sites and estimating the Moon's geological age. However, the complex morphological features created by impact processes pose significant challenges to studying micro-scale lunar surface features, which are often indivisible at the pixel level. Addressing these challenges requires a scale-adaptive approach that incorporates micro-scale characteristics to refine lunar impact crater classification maps. [Methods] This study introduces a scale-adaptive algorithm based on geomorphons for the automatic classification of micro-scale lunar surface features. First, terrain parameters are optimized to define local ternary patterns of lunar geomorphology. These patterns are then used to determine lunar geomorphons. Next, the geomorphons are aggregated according to rules based on relief amplitude and slope to identify lunar impact geomorphic units on a larger scale. Finally, a classification map of lunar impact craters in the Gagarin Crater region is constructed using the identified geomorphons. [Results] The proposed method successfully identifies the optimal parameters for adaptively scaling lunar geomorphons by incorporating the unique characteristics of lunar surface features. Using a four-parameter constraint window, lunar geomorphons are refined at locally optimal spatial scales through the computation of local ternary patterns integrated with the theory of lunar geomorphological evolution. The results reveal that the generated maps of lunar geomorphons exhibit significant spatial aggregation, well-defined classification boundaries, and high accuracy in representing lunar impact craters. The method effectively captures the internal structural details of impact craters, providing a pixel-level depiction of their morphological features. 
The multi-scale identification of impact craters achieves a precision of 88.24%, a recall of 84.96%, and an F1 score of 86.57%. A classification schema for impact craters was established, including simple pit, small-scale bowl, small-scale flat bottom, small-scale central peak, medium flat bottom, medium central peak, large ring plain, and giant complex. [Conclusions] This method demonstrates robustness and high efficiency in crater identification, offering multi-scale geomorphological units and serving as a foundational tool for scale-based lunar scientific research. It provides technical support for identifying and classifying multi-scale lunar impact craters, contributing to advancements in lunar morphological and geological analysis.
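The geomorphon building block is the local ternary pattern: each of a cell's eight neighbors is coded higher, lower, or flat relative to the centre. The sketch below uses simple elevation differences with a flatness threshold and a coarse class lookup; the full geomorphon method uses line-of-sight zenith and nadir angles and a 498-form table, so this is a deliberate simplification.

```python
def ternary_pattern(center, neighbors, flat_threshold=1.0):
    """Code each of the 8 neighbors as +1 (higher), -1 (lower), or 0 (flat)
    relative to the centre elevation, within a flatness threshold."""
    pattern = []
    for h in neighbors:
        diff = h - center
        if diff > flat_threshold:
            pattern.append(1)
        elif diff < -flat_threshold:
            pattern.append(-1)
        else:
            pattern.append(0)
    return tuple(pattern)

def classify(pattern):
    """Map a ternary pattern to a coarse landform class by counting
    higher/lower neighbors (a simplified lookup, not the full table)."""
    plus, minus = pattern.count(1), pattern.count(-1)
    if plus == 0 and minus == 0:
        return "flat"
    if plus == 8:          # every neighbor higher: centre lies in a pit
        return "pit"
    if minus == 8:         # every neighbor lower: centre is a peak
        return "peak"
    return "slope" if plus and minus else ("valley" if plus else "ridge")
```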

  • LI Pengshuo, FENG Yongjiu, TONG Xiaohua, XI Mengrong, XU Xiong, LIU Shijie, HUANG Qian
    Journal of Geo-information Science. 2025, 27(4): 864-875. https://doi.org/10.12082/dqxxkx.2025.240401

    [Objectives] Rovers play an essential role in lunar exploration, serving as vital tools for scientists aiming to unravel the Moon's geological history and exploit its potential water-ice reserves. However, navigating the lunar surface with rovers presents significant safety risks due to the complex and often hazardous terrain, compounded by the lack of a consistent and reliable light source. The absence of pre-existing, high-resolution data—such as LiDAR—prior to exploration missions poses a considerable challenge in evaluating the safety of potential rover paths. Given these constraints, developing a reliable pre-assessment method is crucial for enhancing the success rate of lunar rover missions. [Methods] This paper introduces a 3D simulation method for lunar rover exploration, leveraging the Visualization Toolkit (VTK) to address these challenges. Our method integrates three critical aspects. Firstly, it offers high-resolution visualization of the lunar surface terrain, capturing intricate details down to the meter scale. Secondly, it simulates the dynamic illumination environment on the lunar surface, accounting for the varying illumination conditions due to the Moon's rotation and orbital position. Thirdly, it models the rover's position and attitude transformations as it navigates the terrain. [Results] The effectiveness of this simulation approach is demonstrated through a case study focusing on the Shackleton Connecting Ridge region at the lunar South Pole, an area of significant interest due to its challenging topography and potential for water-ice deposits. The 3D simulation accurately depicts the undulating terrain of impact craters and allows for a thorough assessment of the rover's route safety by visualizing the potential hazards along the path. 
Moreover, the simulation offers an intuitive representation of the rover's movement, including real-time adjustments in position and attitude, which are critical for ensuring the rover's stability and operational safety over long distances. Additionally, our method includes a real-time update feature for the dynamic illumination scene, enabling direct observation of how changing light conditions affect the rover's path during the mission. This capability is particularly important for assessing the feasibility of navigating through areas that may experience prolonged periods of darkness or extreme shadowing, which could impede the rover's progress or jeopardize its safety. The goal of this research is to improve the reliability and safety of future lunar rover missions by providing a robust pre-assessment tool that can verify the feasibility of proposed exploration routes. [Conclusions] This method thus offers crucial a priori information, serving as an essential guarantee for the successful execution of future lunar exploration endeavors.
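A common way to approximate the illumination term in such a terrain simulation is Lambertian hillshading of the DEM for a given sun azimuth and elevation. The NumPy sketch below is illustrative only; VTK's actual lighting pipeline is considerably more involved (shadow casting, camera-relative lights).

```python
import numpy as np

def hillshade(dem, sun_azimuth_deg, sun_elevation_deg, cellsize=1.0):
    """Lambertian illumination of a DEM for a given sun position: cosine of
    the angle between the surface normal and the sun direction, clipped to
    [0, 1] (0 = a slope facing fully away from the sun)."""
    az = np.radians(sun_azimuth_deg)
    el = np.radians(sun_elevation_deg)
    dzdy, dzdx = np.gradient(dem, cellsize)   # elevation slopes per axis
    # unit sun vector (x east, y north, z up)
    sun = np.array([np.cos(el) * np.sin(az), np.cos(el) * np.cos(az), np.sin(el)])
    # surface normal is proportional to (-dzdx, -dzdy, 1)
    norm = np.sqrt(dzdx**2 + dzdy**2 + 1.0)
    shade = (-dzdx * sun[0] - dzdy * sun[1] + sun[2]) / norm
    return np.clip(shade, 0.0, 1.0)
```

Re-evaluating this for a sequence of sun positions gives the kind of dynamic illumination update the abstract describes, with low-elevation suns producing the long shadows typical of the lunar South Pole.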