Current Issue

  • PAN Jiechen, XING Shuai, CAO Jiayin, DAI Mofan, HUANG Gaoshuang, ZHI Lu

    [Significance] With rapid advances in remote sensing, surveying and mapping, and autonomous driving technologies, 3D point cloud semantic segmentation, a core technology of digital twin systems, is attracting increasing research attention. Airborne point cloud semantic segmentation is regarded as a key technology for enhancing the automation and intelligence of 3D geographic information systems. [Analysis] Driven by deep learning and sensing technologies such as LiDAR, depth cameras, and 3D laser scanners, point cloud semantic segmentation can automatically classify and accurately recognize large-scale point cloud data through precise feature extraction and efficient model training. However, compared with typical high-density, category-balanced point cloud datasets (e.g., those used in indoor scenes, autonomous driving, or robotics), airborne point clouds present significant challenges in areas such as registration and feature extraction. These challenges stem from their unique characteristics, including large-scale 3D terrain coverage, dynamic platform motion errors, considerable variations in ground-object spatial scales, and complex occlusions. Currently, deep-learning-based airborne point cloud semantic segmentation is still in its early stages. Due to heterogeneous data acquisition methods, varying resolutions, and diverse attribute information, there remains a gap between existing research and practical algorithm deployment. [Progress] This paper provides a comprehensive review of the field, covering adaptive algorithms, datasets, performance metrics, and emerging methods along with their advantages and limitations. It also offers quantitative comparisons with existing technologies, evaluating representative methods in terms of precision and applicability. [Prospect] A thorough analysis suggests that breakthroughs in airborne point cloud semantic segmentation necessitate systematic research innovations across multiple dimensions, including feature representation, multimodal fusion, few-shot learning, algorithm interpretability, and large-scale model benchmarking. These advancements are essential not only for overcoming current bottlenecks in real-world applications but also for establishing robust technical foundations for critical use cases such as digital twin cities and disaster emergency response.

  • ZANG Dongdong, WANG Jingxue, BU Lijing, ZHANG Xin, YU Jinzheng

    [Objectives] Due to the complex and diverse roof structures of buildings, as well as the uneven density and lack of semantic information of airborne LiDAR point clouds, it is challenging to accurately construct the topological relationships between geometric primitives using existing data-driven building 3D model reconstruction frameworks. To address these issues, this paper proposes a 3D model reconstruction framework based on line topology matching across roof patches, building on existing roof segmentation and contour line extraction algorithms. [Methods] First, an improved Douglas-Peucker (D-P) algorithm is used to simplify the contour lines of each roof patch to generate contour edges. Prior boundary characterizations are used as non-compulsory geometric constraints to regularize the contour edges of each roof patch. Second, for contour edges and their corresponding contour point sets on the boundaries of different roof patches, statistics on the number of neighboring contour point pairs in the horizontal direction are calculated. Based on the proportion of neighboring contour point pairs, the corresponding contour edges representing the same internal feature line across different roof patches are matched, determining the topological relationships between the roof patches. Finally, the gaps between the corresponding contour edges are merged, and the endpoints are optimized to generate a 3D structured building model with tightly arranged roof patches. [Results] To verify the proposed framework, typical building point clouds from the Vaihingen and Building3D datasets were selected. Experimental results show that the framework successfully establishes the topological relationships between roof patches by matching the corresponding contour edges in neighboring patches. This allows it to handle neighboring roof patches that involve both step and intersection relationships, ensuring consistency in the topology of the roof patches. For buildings with irregular boundaries, the non-compulsory single right-angle and double right-angle constraints help balance the shape of roof patch boundaries while preserving prior knowledge of typical roof structures, ensuring the framework's adaptability to various types of boundary structures. An analysis of the corner point coordinate deviations and the original point cloud coverage of the reconstructed model shows that the horizontal corner-point deviations of the 3D building model constructed using the proposed framework are less than three times the average point spacing. Additionally, the fit between the model plane and the original point clouds is comparable to that of the reference plane. [Conclusions] In summary, the 3D building model constructed by the proposed framework features a complete roof structure, strong adaptability, and high precision.
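    As background for the contour simplification step, the sketch below implements the classic Douglas-Peucker routine that the paper's improved D-P algorithm builds on; the tolerance value and the toy roof-patch contour are hypothetical, and the paper's regularization constraints are not reproduced.

```python
import numpy as np

def point_segment_deviation(p, a, b):
    """Perpendicular distance from point p to the chord a-b (2D)."""
    p, a, b = (np.asarray(v, float) for v in (p, a, b))
    if np.allclose(a, b):
        return float(np.linalg.norm(p - a))
    d, w = b - a, p - a
    return abs(d[0] * w[1] - d[1] * w[0]) / float(np.linalg.norm(d))

def douglas_peucker(points, tol):
    """Keep only vertices that deviate from the chord by more than tol (recursive)."""
    points = np.asarray(points, float)
    if len(points) < 3:
        return points
    devs = [point_segment_deviation(p, points[0], points[-1]) for p in points[1:-1]]
    idx = int(np.argmax(devs)) + 1
    if devs[idx - 1] > tol:
        left = douglas_peucker(points[: idx + 1], tol)
        right = douglas_peucker(points[idx:], tol)
        return np.vstack([left[:-1], right])   # drop the duplicated split vertex
    return np.vstack([points[0], points[-1]])  # all within tolerance: keep the chord

# Toy roof-patch contour (hypothetical coordinates, in metres)
contour = [(0, 0), (1, 0.04), (2, -0.03), (3, 0.02), (3.05, 2), (0, 2.03)]
print(douglas_peucker(contour, tol=0.1))
```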

  • ZHU Longbin, ZHAO Ruiyin

    [Objectives] The National 14th Five-Year Plan explicitly calls for urban renewal, especially the renewal of old residential areas. However, existing residential renewal practice still lacks comprehensive planning and operational guidance on renewal sequencing. [Methods] The study utilizes the PBL-BPNN algorithm and multi-source data to construct the evaluation methodology and indicator system. It quantifies the renewal sequence by taking residential renewal sensitivity as a measure of renewal likelihood. The method comprehensively considers multiple indicators, such as the built environment and population distribution of residential areas, and realizes large-scale quantitative analysis of the renewal sequence through feature extraction from renewed residential areas. [Results] Comparing the traditional evaluation model with the model augmented by multi-source data, the latter achieves a 10-fold cross-validation RMSE of 0.1422 and an F-score of 0.7509 on the validation dataset, and accuracy on the test dataset improves by 32.78%, demonstrating the validity of the methodology and evaluation indicators. The empirical analysis reveals that residential renewal sensitivity in the central district of Nanjing presents a spatial pattern of "high internal, low external, and scattered at multiple points"; meanwhile, the six multi-source indicators of commercial resources, public space resources, number of working days, enclosure, richness, and sense of pleasantness have a relatively greater impact on the evaluation of residential renewal sensitivity. [Conclusions] The method applies data mining ideas and machine learning techniques to overcome the strong subjectivity of traditional renewal sequence evaluation methods. It can serve as a decision-making basis for the planning and implementation of residential renewal and provides technical and methodological support for determining the time sequence of residential renewal.
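    A minimal sketch of a back-propagation neural-network regression scored by 10-fold cross-validated RMSE, as in the evaluation above, using scikit-learn; the indicator matrix and sensitivity labels are random placeholders, and the paper's PBL sampling of renewed residential areas is not reproduced.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.random((500, 12))   # 12 built-environment / population indicators (toy data)
y = rng.random(500)         # renewal sensitivity in [0, 1] (toy data)

# BP neural network (MLP) with standardized inputs
model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0))

# 10-fold cross-validated RMSE
cv = KFold(n_splits=10, shuffle=True, random_state=0)
rmse = -cross_val_score(model, X, y, cv=cv, scoring="neg_root_mean_squared_error")
print(f"10-fold CV RMSE: {rmse.mean():.4f} ± {rmse.std():.4f}")
```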

  • HOU Yang, YANG Jian, FANG Li, ZHANG Bianying, ZHANG Meng, XIE Xiao, ZHENG Chenghao

    [Objectives] As fundamental geographic features, road networks play a crucial role in spatial analysis and various applications. This paper studies vector data embedding models for road networks and their application in road network pattern recognition. These models not only facilitate the analysis of the spatial structure of road networks but also provide computational methods for information representation and processing in the digital twin of the Earth system. However, most existing road network pattern recognition methods are computationally complex, lack intelligent reasoning capabilities, rely heavily on large amounts of labeled data, and exhibit limited generalization ability. These limitations constrain their performance in pattern recognition under complex road network structures. [Methods] To address these challenges, this paper proposes a novel identification method based on geometric similarity graph representation learning, tailored for road network pattern recognition tasks. Firstly, the road network is modeled using spatial dual graphs, with graph node features designed based on cognitive heuristics to capture the intrinsic characteristics of the road network. Next, the model is trained in an unsupervised manner. Subgraph Isomorphism Counting (SIC) is introduced during road embedding learning to capture local structural patterns, while a Global Context Attention mechanism (GCA) is incorporated during graph embedding generation to capture global context, thereby enhancing the model's representation performance. Finally, geometric similarity in graph-level embeddings is utilized to effectively recognize road network patterns. To validate the effectiveness of the proposed method, a dataset containing five types of road network patterns was constructed, and extensive experiments were conducted. [Results] The SUGAR-3 model proposed in this paper achieved a classification accuracy of 93.18%, representing an improvement of more than 12% over classical road network pattern recognition methods and significantly outperforming baseline models such as Graph Convolutional Neural Networks (GCNN). Furthermore, an in-depth analysis of the graph embeddings and the model's expressive power was performed. The results demonstrate that the road network patterns represented by our model can be effectively clustered, forming clear boundaries between different patterns. [Conclusions] This verifies the effectiveness of SIC and GCA in enhancing road network pattern recognition performance and provides a new approach for further improving the expressive power of graph embeddings for road networks.
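    A minimal sketch of the spatial dual-graph modelling step using networkx: road segments become graph nodes and shared intersections become edges. The toy coordinates and the simple length/bearing node features are stand-ins for the cognitive-heuristic features used in the paper.

```python
import math
import networkx as nx

# Toy primal road network: intersections as nodes, road segments as edges
primal = nx.Graph()
coords = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (2, 0)}
primal.add_edges_from([(0, 1), (1, 2), (1, 3)])

# Dual (line) graph: each primal edge becomes a node; dual edges link segments
# that share an intersection
dual = nx.line_graph(primal)

for u, v in dual.nodes():                       # dual node = primal edge (u, v)
    (x1, y1), (x2, y2) = coords[u], coords[v]
    dual.nodes[(u, v)]["length"] = math.hypot(x2 - x1, y2 - y1)
    dual.nodes[(u, v)]["bearing"] = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0

print(dual.nodes(data=True))
print(list(dual.edges()))
```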

  • HAO Yuanfei, LIU Zhe, ZHENG Xi, QIAN Yun

    [Objectives] Street space serves as the primary perceptual interface for pedestrians in urban environments, and the visual quality of these spaces plays a crucial role in enhancing their vitality. Traditional evaluation methods often rely on single-objective indicators, making it difficult to effectively link objective environmental features with pedestrians' subjective perceptions. [Methods] This study proposes a novel evaluation framework based on Large Language Models (LLMs), incorporating the style dimension of subjective perception and extending traditional single-indicator quantitative analysis to a comprehensive approach that integrates both quantification and stylization. This framework utilizes Baidu Street View imagery to quantitatively assess two objective indicators, namely green view index and sky view factor, through semantic segmentation techniques. Additionally, it evaluates six subjective indicators, namely vegetation diversity, building typology, building continuity, sidewalk usage, roadway usage, and signage usage, by leveraging prompt-optimized LLMs. The study then categorizes street space visual quality features within the research area using the Latent Dirichlet Allocation (LDA) topic model, aiming to explore the spatial characteristics of different streets and identify optimization strategies. [Results] Taking Beijing's Xicheng District as the study area, the analysis reveals spatial distribution patterns of vegetation density and sky openness, along with pedestrians' subjective evaluations of indicators such as vegetation diversity and building type. Cluster analysis identified comprehensive service streets centered around Xidan North Street, characteristic streets centered around Xihuangchenggen South Street, and mixed-type streets centered around Lingjing Hutong. [Conclusions] This study innovatively introduces a large language model with human-like perceptual capabilities, enhancing its performance through prompt engineering. The resulting framework enables efficient and integrated evaluation of street visual quality by combining both objective and subjective factors. This approach provides a practical reference for large-scale, automated analysis of street view imagery.
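    A minimal sketch of how the two objective indicators can be computed from a semantic-segmentation label map: each index is the share of pixels assigned to the corresponding class. The class ids and the toy mask are hypothetical; the segmentation model and the LLM-based subjective scoring are not shown.

```python
import numpy as np

VEGETATION, SKY = 1, 2                    # assumed class ids in the label map

def view_index(label_map, class_id):
    """Share of street-view pixels assigned to one semantic class."""
    label_map = np.asarray(label_map)
    return float((label_map == class_id).mean())

toy_mask = np.array([[2, 2, 2, 2, 2, 2],
                     [2, 2, 1, 1, 2, 2],
                     [1, 1, 1, 1, 0, 0],
                     [0, 0, 1, 1, 0, 0]])   # 0 = other (buildings, road, ...)

print("green view index:", view_index(toy_mask, VEGETATION))   # 8/24 ≈ 0.333
print("sky view factor :", view_index(toy_mask, SKY))          # 10/24 ≈ 0.417
```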

  • ZHANG Chenglong, ZHOU Yang, HU Xiaofei, HUANG Gaoshuang, ZHAO Luying, GAN Wenjian

    [Objectives] Most early visual place recognition methods use hand-crafted features to represent images. However, these methods cannot handle complex appearance changes caused by shifting viewpoint, illumination, and season, and thus lack robustness and generalizability. With the development of deep learning, deep features have shown significant advantages over traditional hand-crafted features when coping with viewpoint changes and complex appearance variations. Although deep-learning-based visual place recognition has been widely studied, there is still room for improvement. To address the problems in existing visual place recognition methods, this paper proposes a visual place recognition method that fuses multi-level features with relation-aware global attention. [Methods] Firstly, deep convolutional neural networks are used to extract feature maps at different levels of the images. Secondly, the extracted multi-level features are fused to combine semantic and spatial information, which addresses the limited expressive ability of single-scale features and enables the model to better handle complex environmental changes. Then, relation-aware global attention is used to enhance the fused features, suppress redundant noise so the model focuses on key targets in the scene image, and model the structural information of the fused features from a global spatial perspective. This allows better extraction of semantic information from static objects such as buildings and effectively addresses occlusion caused by dynamic objects such as pedestrians and cars. Finally, we use the feature aggregation network MixVPR to construct robust and compact image descriptors and train the model with a multi-similarity loss. [Results] Experiments are conducted on six open street-view image datasets with viewpoint, illumination, and seasonal changes, i.e., Pittsburgh250k, Pittsburgh30k, SF-XL-Val, Tokyo 24/7, Nordland, and SF-XL-Testv1. The proposed method improves recall on each dataset, with Recall@1 reaching 94.20%, 91.56%, 85.15%, 83.81%, 75.43%, and 73.60%, respectively. [Conclusions] The experimental results show that the proposed method is robust and generalizes well to scene changes such as shifting viewpoints and complex appearance variations.
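    A minimal sketch of the multi-level feature fusion idea in PyTorch: shallow and deep feature maps are resized to a common resolution and concatenated. The ResNet-18 backbone and plain concatenation are illustrative stand-ins; the paper's relation-aware global attention and MixVPR aggregation are not reproduced.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

backbone = resnet18(weights=None).eval()
stages = torch.nn.ModuleDict({
    "low":  torch.nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu,
                                backbone.maxpool, backbone.layer1, backbone.layer2),
    "high": torch.nn.Sequential(backbone.layer3, backbone.layer4),
})

x = torch.randn(1, 3, 224, 224)                 # dummy street-view image
with torch.no_grad():
    low = stages["low"](x)                      # (1, 128, 28, 28): spatial detail
    high = stages["high"](low)                  # (1, 512, 7, 7):  semantic content
    high_up = F.interpolate(high, size=low.shape[-2:], mode="bilinear", align_corners=False)
    fused = torch.cat([low, high_up], dim=1)    # (1, 640, 28, 28): fused feature map

print(fused.shape)
```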

  • DU Pei, SHEN Yangjie, LIU Zhenxia, YU Zhaoyuan

    [Objectives] Global climate change, accelerating sea-level rise, and intensifying anthropogenic pressures are rendering the intricate human-land-sea nexus within coastal zones increasingly complex, sensitive, and vulnerable. This growing challenge underscores the urgent need for integrated coastal research frameworks capable of synthesizing environmental sensing, dynamic process simulation, and scenario projection. Addressing this critical gap, Digital Twin (DT) technology emerges as a transformative paradigm. By integrating multi-source data, sophisticated models, and domain knowledge into intelligent systems, DT offers unprecedented potential for creating precise virtual replicas and enabling intelligent management of complex coastal socio-ecological systems. [Analysis] This paper systematically analyzes the state of coastal zone digitalization, highlighting the pressing need for robust digital frameworks that can effectively represent and analyze the strong coupling between natural processes and human activities under multifaceted pressures. Building on this foundation, we propose a novel conceptual framework and implementation pathway for constructing a Digital Twin Coastal Zone (DTCZ). This framework explicitly positions land-sea interface processes as the foundational scenario and centers on human-land-sea feedback mechanisms as the core analytical thread. The proposed DTCZ system architecture is articulated across four pivotal dimensions: (1) Comprehensive information integration and knowledge aggregation; (2) Simulation of natural processes integrated with coupled human-nature decision support; (3) Synergistic short-term forecasting and long-term monitoring capabilities; and (4) Realistic multidimensional representation enabling intelligent interaction. We critically discuss the key technological enablers supporting this vision, encompassing coastal data governance and fusion, multi-scale scenario modeling, predictive analytics for critical coastal elements, persistent long-term monitoring strategies, and the development of the integrated DTCZ platform itself. At its core, the envisioned DTCZ leverages spatiotemporally fused multi-source data as its foundation and prioritizes enhanced scenario simulation and intervention capabilities. [Prospects] This framework is designed to overcome the limitations, such as fragmented data and limited predictive power, that constrain traditional coastal digital systems. By significantly advancing the computational tractability and overall manageability of coastal systems, the DTCZ paradigm offers a powerful new methodological tool and operational framework. It holds strong potential for supporting sustainable coastal development and modernizing governance structures in the face of ongoing climate change, providing a robust platform for evidence-based planning and adaptive management.

  • NIU Chaoran, XUE Cunjin, XIANG Zheng, MA Ziyue

    [Objectives] The ocean twin space consists of the real ocean, the virtual ocean, and the bidirectional links between them. Spatiotemporal modeling for twin spaces requires the simultaneous representation and modeling of all ocean phenomena, objects, and their relationships within the study area. However, existing models that incorporate dynamic changes, such as object-oriented models, spatiotemporal field models, event-based models, and process-based models, primarily focus on modeling individual ocean phenomena, including objects, fields, events, and processes. The absence of a unified organizational structure makes comprehensive ocean environment modeling challenging. [Methods] Based on the four types of ocean spatiotemporal models mentioned above, this study designs a unified spatiotemporal data organization structure and proposes a graph model for the integrated representation of oceanic static and dynamic elements in twin spaces. The core components of the model include: (1) Establishing a unified organization structure of "entity object-data description-data sequence" through hierarchical and attribute design of the entity object, enabling the unified organization of four object types: spatiotemporal objects, spatiotemporal fields, events, and processes; (2) Designing the relationship representation between oceanic static and dynamic elements in the twin space by analyzing the mapping process from the real ocean to the virtual ocean; (3) Integrating the unified structure of the four object types with the representation of relationships between static and dynamic elements, extracting five core components: time, entity object, twin object, twin scene, and relationship. Entities and relationships within these core components are then abstracted into nodes and edges, constructing a five-layer graph representation framework: "twin scene-twin object-entity object-data sequence-time." [Results] A case study on the organizational management of ocean elements around Yin Island and its surrounding waters in the northeast of the Yongle Atoll, Xisha Islands, Sansha City, Hainan Province, China, validates the feasibility and effectiveness of the proposed graph model for integrating static and dynamic ocean elements in twin spaces. Comparative experiments with the hybrid object-field model, the geographic knowledge graph, and the geographic spatiotemporal process-based knowledge representation model demonstrate that the proposed model successfully unifies static objects and dynamic processes, providing a more comprehensive representation of relationships between ocean objects. [Conclusions] The proposed model resolves the fragmentation of static and dynamic data in twin spaces, enhances the efficiency of ocean data management and utilization, and advances ocean management from digitization to intelligence.
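    A minimal sketch of the five-layer "twin scene-twin object-entity object-data sequence-time" organization expressed as a property graph in networkx; the node names, attributes, and relation labels are hypothetical placeholders rather than the study's Yin Island dataset.

```python
import networkx as nx

g = nx.DiGraph()

# One node per layer (the layer is recorded as a node attribute)
g.add_node("scene:YinIsland", layer="twin_scene")
g.add_node("twin:SST_field", layer="twin_object")
g.add_node("entity:SST", layer="entity_object", kind="spatiotemporal_field")
g.add_node("seq:SST_2023", layer="data_sequence", source="reanalysis_grid")
g.add_node("t:2023-08-01", layer="time")

# Edges encode the mapping from the twin scene down to time-stamped observations
g.add_edges_from([
    ("scene:YinIsland", "twin:SST_field",  {"rel": "contains"}),
    ("twin:SST_field",  "entity:SST",      {"rel": "twins"}),
    ("entity:SST",      "seq:SST_2023",    {"rel": "described_by"}),
    ("seq:SST_2023",    "t:2023-08-01",    {"rel": "observed_at"}),
])

for u, v, d in g.edges(data=True):
    print(f"{u} -[{d['rel']}]-> {v}")
```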

  • FU Xin, ZHANG Haoran, WANG Yuanbo, HUANG Chong, LIU Xiangye, ZHANG Hengcai, XU Zhenghe

    [Objectives] Soil salinization is a major and widespread challenge that hinders global food security and environmental sustainability. Accurate evaluation and analysis of soil salinization are of great significance for the improvement and management of saline soils. [Methods] To address the challenge of mapping the three-dimensional spatial distribution of soil salinity, this study selected 819 effective field soil samples within a saline soil region of the Yellow River Delta. These samples, with vertical stratifications from 0 to 100 cm, were used for comprehensive analysis. The soil sample points were arranged in a 5 km×5 km horizontal grid, and sampling layers were set at 10 cm intervals vertically. Following the principle of covering different land cover types and human accessibility, soil samples were collected from the 0–100 cm depth range in the study area. The three-dimensional spatial differentiation of soil salinity in the coastal saline soil area was revealed from different perspectives using traditional geostatistical methods and 3D Empirical Bayesian Kriging interpolation. The effects of various factors on the spatial differentiation of soil salinity were analyzed using the Geodetector method. [Results] The results showed that the spatial distribution of soil salinity across the whole soil profile and in different vertical layers was highly variable. There were differences in the scale of spatial autocorrelation of soil salt content at different depths. The 3D Empirical Bayesian Kriging interpolation method established in this study spatialized the soil salinity of the samples and effectively revealed the fine-scale three-dimensional vertical characteristics of soil salinity. Soil salinity exhibited significant three-dimensional spatial differentiation, with diverse profile distribution types. The main types were homogeneous and surface-aggregated, with some local areas showing bottom-aggregated and fluctuating types. All influencing factors significantly affected the three-dimensional spatial differentiation of soil salinity, but the degree of influence varied for each factor. The explanatory power of the influencing factors ranks as follows: land use/land cover > distance to coastline > groundwater depth > groundwater conductivity > elevation > land surface temperature > soil bulk density > soil clay content. Compared with single factors, the pairwise interaction of any two factors had a greater effect on the spatial differentiation of soil salinity, but the interaction strength varied among factor pairs. Over the whole 0–100 cm soil depth range, the interaction between groundwater depth and land use/land cover (GWD ∩ LULC) had the largest impact (q = 0.443), followed by land surface temperature and land use/land cover (LST ∩ LULC, q = 0.326). [Conclusions] Although the q values of land surface temperature and soil bulk density were not high, their explanatory power for soil salinity was greatly improved after interaction with land use/land cover, better explaining the changes in soil salinity in the study area. Factors such as land use/land cover, groundwater depth, land surface temperature, and soil bulk density are closely related to the spatial distribution of soil salinity in the study area. The research results provide a theoretical basis and technical support for the formulation of comprehensive improvement measures and management systems for fine-scale saline-alkali land in the region. These findings have positive implications for promoting the achievement of the Sustainable Development Goal of Land Degradation Neutrality in coastal areas.
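    A minimal sketch of the Geodetector factor-detector statistic used to rank the influencing factors, q = 1 - Σ_h N_h σ_h² / (N σ²); the salinity values and stratum labels below are toy data, not the study's 819 field samples.

```python
import numpy as np

def geodetector_q(y, strata):
    """Explanatory power of a categorical factor (strata) on variable y."""
    y, strata = np.asarray(y, float), np.asarray(strata)
    n, total_var = len(y), y.var()          # population variance, as in Geodetector
    within = sum(len(y[strata == s]) * y[strata == s].var() for s in np.unique(strata))
    return 1.0 - within / (n * total_var)

salinity = [1.2, 1.4, 1.1, 3.8, 4.1, 3.9, 2.0, 2.2]        # g/kg (toy values)
land_use = ["crop", "crop", "crop", "salt", "salt", "salt", "grass", "grass"]
print(f"q(land use) = {geodetector_q(salinity, land_use):.3f}")
```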

  • HE Li, WANG Rong

    [Significance] Space is not merely a physical place, but a productive arena of social relations. Social phenomena are inherently endowed with spatial attributes, making the spatial perspective a critical pathway for understanding complex social issues. With the deepening "spatial turn" in the social sciences and continuous advancements in Geographic Information Systems (GIS)—particularly in data acquisition, spatial analysis and modeling, and spatial visualization—GIS has become an essential tool for addressing social issues. However, disciplinary differences in theoretical paradigms, methodological logic, and scale cognition between geography and the social sciences constrain their deeper integration. Existing literature lacks a systematic synthesis of integration trends, underlying challenges, and empowerment pathways, necessitating a comprehensive clarification of fusion mechanisms, core obstacles, and emerging opportunities. [Progress] This paper identifies five key advantages of GIS in empowering social science research: expanding spatial analytical thinking, supporting spatiotemporal data, enhancing survey techniques, enriching representational forms, and strengthening analytical capabilities. We review representative GIS applications in economics, political science, and sociology. From dimensions such as spatial cognition, data capacity, methodological adoption, and research hotspots, we distill application characteristics across these disciplines, revealing both commonalities and differences. While all three disciplines recognize spatial effects, their theoretical orientations shape distinct technical approaches—economics emphasizes causal identification, political science focuses on geopolitical structures, and sociology prioritizes contextual representation. Through a three-dimensional analysis—data, methodology, and cognition—we examine three major challenges in addressing social issues: the mismatch between data and research questions, the difficulty of integrating methods with causal mechanisms, and the contextual misalignment of place and scale, which reflect deeper issues of data suitability, methodological coherence, and the validity of spatial reasoning. [Prospects] The advancement of artificial intelligence, especially large models, injects new methodological momentum into GIS-based spatial analysis and brings threefold opportunities for addressing social issues. First, large models are driving spatial analysis from correlation-based description toward transparent causal inference; Second, multi-source data fusion and the generation of "silicon-based samples" help overcome the limitations of traditional survey data. Third, an emerging "space-survey" integrated framework is constructing a "spatial cognitive infrastructure" to support social research. Future efforts should establish a synergistic "large model-spatial analysis" paradigm that integrates these three opportunities. By simultaneously addressing challenges of data matching, method integration, and contextual misalignment, this paradigm can elevate GIS from a supportive tool to a core engine for theory generation and mechanism interpretation. This transformation will enhance the scientific value and practical effectiveness of GIS and spatial analysis in addressing complex social issues, fostering a bidirectional interaction between methodological innovation and theoretical advancement.

  • ZHU Ge, ZHANG Zheng, CAO Lianshuai, MA Kunyang, XU Xinyue, CHENG Yi

    [Objectives] Map compilation involves professional operations such as element selection, symbolization, and notation configuration. However, the process is often complex and inefficient. Leveraging Large Language Models (LLMs), text-to-map technology significantly simplifies the mapping process, lowers the barrier to entry for non-experts, and improves mapping efficiency. Nevertheless, challenges remain, including heavy reliance on manual debugging and fragmented tool invocation. [Methods] This paper proposes a DeepSeek-based method for constructing text-to-map agents, which automates the entire process from user input to visualization output. This is achieved through the decomposition of natural language instructions and autonomous adaptation of tools. Centered on the DeepSeek model, the approach associates cartographic elements with specialized tools and usage descriptions, analyzes module structures and collaboration mechanisms, and organizes tools into five categories. By interpreting user instructions and reasoning through task-oriented chains of thought, the agent invokes appropriate visualization tools to achieve cross-modal mapping from natural language to maps, enabling autonomous task reasoning and automated map generation. [Results] To evaluate the agent's effectiveness, two types of mapping tasks—based on local map data and online map services—were conducted using DeepSeek-V3-0324 and R1 models as decision-making cores. The experiments demonstrated that the agent could autonomously complete mapping tasks from natural language using both local and tile-based data. Local map visualization experiments confirmed the agent's ability to reuse tools effectively in low-complexity scenarios. Tile-based map visualization experiments indicated the agent's capability in handling high-complexity scenarios involving multi-toolchain invocations. It accurately decomposed subtasks, assigned appropriate tools, and performed structured string-based input variable transmission or direct invocation without variables, all presented to users in a semi-transparent manner. Across forty repeated experiments, the V3 model outperformed the R1 model, achieving 6.56 times greater execution efficiency with an average processing speed of approximately 6.29 seconds per step, and demonstrated better modular adaptability with the LangChain agent framework. [Conclusions] The proposed construction method validates the feasibility of using DeepSeek-based agents for intelligent cartography. The V3 model exhibits strong potential in this field, with its performance (6.29 s/step) comparable to that of professional cartographers. The text-to-map intelligent agent significantly reduces the entry barrier for map creation, promotes the broader adoption of mapping tools in everyday use, and provides a valuable technical reference for integrating autonomous cartography with professional software platforms such as ArcGIS and QGIS.
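    A minimal sketch of the tool-registration and dispatch pattern behind such a text-to-map agent; the LLM call is stubbed out, and the tool names, argument schema, and JSON plan format are hypothetical illustrations, not DeepSeek's or LangChain's actual interfaces.

```python
import json

def load_layer(path: str) -> str:
    return f"loaded vector layer from {path}"

def render_map(layer: str, style: str = "default") -> str:
    return f"rendered '{layer}' with style '{style}'"

# Tool registry: the agent maps cartographic operations to callable tools
TOOLS = {"load_layer": load_layer, "render_map": render_map}

def fake_llm_plan(instruction: str) -> str:
    """Stand-in for the model: returns a JSON tool-call plan for the instruction."""
    return json.dumps([
        {"tool": "load_layer", "args": {"path": "roads.shp"}},
        {"tool": "render_map", "args": {"layer": "roads.shp", "style": "dark"}},
    ])

def run_agent(instruction: str) -> None:
    for step in json.loads(fake_llm_plan(instruction)):
        result = TOOLS[step["tool"]](**step["args"])     # structured tool invocation
        print(f"{step['tool']}: {result}")

run_agent("Make a dark-themed map of the road layer")
```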

  • LIU Zhaoqian, YAN Haowen, LU Xiaomin, LI Pengbo, MA Ben

    [Objectives] Building matching typically determines whether entities are the same by calculating the geometric similarity of buildings. Existing building group matching methods primarily focus on the similarity of groups’ overall outlines, while often ignoring the spatial relationships and internal layout of individual buildings within the groups. When the outlines appear similar but internal structures differ significantly, accurate matching cannot be achieved. [Methods] To address this issue, an improved building group matching method supported by spatial similarity relationships is proposed. First, the method utilizes building orientation and azimuth angles for preliminary matching of groups that contain different numbers of buildings. Then, an improved directional relationship matrix is used to calculate the specific orientations of each building within the group, while a distance matrix is constructed based on centroid distances. A structural similarity index is introduced to quantify the similarity of the direction and distance matrices. By combining the geometric characteristics of the groups’ outlines with the spatial distribution of individual buildings, and applying an entropy weight method to determine indicator weights, the method enables a comprehensive similarity assessment that accounts for multiple spatial factors. [Results] In a trial conducted in Lanzhou City, six building groups from the national basic geographic dataset were matched against 195 OpenStreetMap (OSM) building groups, resulting in accurate matches. For the quality evaluation experiment, an urban area containing 1,511 buildings was selected. Using basic geographic data and OSM data from 2023 and 2024, the resulting similarity scores were 78.35% and 79.12%, respectively, indicating a year-over-year improvement in OSM data completeness. [Conclusions] The results demonstrate the proposed method's capability to comprehensively assess building group similarity by considering individual building orientation, distance, and spatial density, alongside overall geometric similarity. This approach provides a robust, quantitative framework for evaluating both the internal spatial relationships and overall layout of building groups, enabling accurate group matching and effective quality evaluation of local geospatial datasets.
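    A minimal sketch of the entropy weight method used to determine the indicator weights; the score matrix below is a toy example (rows are candidate building-group pairs, columns are similarity indicators), not the Lanzhou data.

```python
import numpy as np

def entropy_weights(X):
    """Columns = indicators, rows = candidate matches; returns one weight per indicator."""
    X = np.asarray(X, float)
    P = X / X.sum(axis=0)                              # normalise each indicator column
    with np.errstate(divide="ignore", invalid="ignore"):
        logP = np.where(P > 0, np.log(P), 0.0)
    E = -(P * logP).sum(axis=0) / np.log(len(X))       # entropy of each indicator
    d = 1.0 - E                                        # degree of divergence
    return d / d.sum()                                 # weights sum to 1

scores = [[0.91, 0.80, 0.75],     # similarity of orientation / distance / outline (toy)
          [0.52, 0.78, 0.70],
          [0.88, 0.81, 0.40],
          [0.47, 0.79, 0.72]]
print(entropy_weights(scores))
```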

  • ZHANG Zhengjia, JIN Qingguang, WANG Chao, ZHANG Hong, WANG Mengmeng, LIU Xiuguo

    [Significance] Permafrost monitoring represents one of the most critical application domains of Synthetic Aperture Radar Interferometry (InSAR) technology. In recent years, significant progress has been made in InSAR-based permafrost monitoring, driven by advancements in SAR satellite systems, the evolution of InSAR algorithms, and the integration of multi-source remote sensing technologies. These developments have established InSAR as a high-precision, large-scale technical approach for monitoring and assessing permafrost degradation under climate warming. [Progress] This paper systematically reviews recent innovations in InSAR-based permafrost monitoring and explores its interdisciplinary potential. First, we introduce commonly used Synthetic Aperture Radar (SAR) satellite systems, explain the fundamental principles and recent advancements in InSAR methodologies, and summarize both the geographical distribution of current permafrost study areas and quantitative trends in InSAR-related publications. Next, we comprehensively evaluate state-of-the-art applications of InSAR, including permafrost surface deformation monitoring, physically driven model construction, deformation prediction, and active layer thickness retrieval. Special emphasis is placed on addressing long-standing challenges such as interferometric decorrelation and imperfect permafrost parameter modeling through the integration of thermal-optical remote sensing data and hydrological models. [Prospect] In alignment with the sustainable development needs of permafrost regions, we analyze emerging research trends in InSAR permafrost studies, particularly in deep learning, multi-source data fusion, and sustainability assessment. This review not only provides a methodological framework and roadmap for addressing key challenges in InSAR-based permafrost research but also lays a technical foundation for tackling critical issues such as Arctic engineering safety and ecosystem stability evaluation.

  • ZHANG Yu, ZHUANG Huifu, ZHANG Xiang, TAN Zhixiang, LIU Yuhao, SHANG Jingjie, GUO Mingming

    [Objectives] Unsupervised change detection is a research hotspot in Synthetic Aperture Radar (SAR) image information extraction. However, existing studies often rely on single-method pseudo-label generation, leading to limited reliability. Moreover, most current methods mainly utilize spatial-domain features of multi-temporal images, with relatively few explorations into the fusion and utilization of spatial-frequency dual-domain features. To address these challenges, this study proposes a Mamba-based spatial-frequency feature fusion U-Net model for unsupervised SAR change detection. [Methods] The proposed approach first uses a difference segmentation-clustering fusion strategy to generate high-quality pseudo-label samples, reducing dependence on manually labeled sample data. Next, a spatial-frequency dual-domain feature fusion U-Net model, integrating Mamba and wavelet convolution, is constructed to extract change information. Mamba is employed to efficiently capture global features, which are then fused with local spatial features extracted by convolutional networks. Simultaneously, wavelet convolution is used to enhance frequency-domain feature extraction. The fusion of dual-domain features is performed during the upsampling stage of the U-Net architecture. [Results] To validate the effectiveness of the proposed method, experiments were conducted on two SAR image datasets. Both qualitative and quantitative comparisons were made with traditional and deep learning-based methods. Compared to the best-performing baseline, the proposed method improved the average F1-score by 2.35% and the Kappa coefficient by 2.65% across the two datasets, significantly enhancing the reliability of change detection results. [Conclusions] The proposed method effectively improves the automation and reliability of SAR image change detection, offering strong technical support for applications such as environmental monitoring, urban expansion analysis, and disaster assessment.
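    A minimal sketch of the pseudo-label generation idea: a log-ratio difference image of two SAR acquisitions is clustered into changed/unchanged classes. Plain k-means on simulated intensities stands in for the paper's difference segmentation-clustering fusion.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
t1 = rng.gamma(shape=4.0, scale=25.0, size=(64, 64))     # toy SAR intensities, date 1
t2 = t1.copy()
t2[20:40, 20:40] *= 3.0                                  # simulated change patch
t2 *= rng.gamma(shape=4.0, scale=0.25, size=t2.shape)    # multiplicative speckle

# Log-ratio difference image, then two-class clustering into pseudo-labels
log_ratio = np.abs(np.log(t2 + 1e-6) - np.log(t1 + 1e-6))
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(log_ratio.reshape(-1, 1))
labels = labels.reshape(log_ratio.shape)

# Make cluster "1" the changed class (the one with the higher mean log-ratio)
if log_ratio[labels == 0].mean() > log_ratio[labels == 1].mean():
    labels = 1 - labels
print("pseudo-labelled changed pixels:", int(labels.sum()))
```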

  • LIU Ying, FAN Yahui, YUE Hui

    [Objectives] Mineral resources are a vital material foundation for human survival and economic development. Conducting mine monitoring and establishing monitoring models are essential for the efficient utilization of mineral resources and environmental protection in mining areas. Given the complex backgrounds, diverse target scales, and dense distribution of small targets in open-pit mining areas, this study aims to develop a lightweight model that balances monitoring accuracy and efficiency, thereby improving the recognition of target objects in such environments. [Methods] Existing remote sensing datasets often suffer from limitations such as low sample diversity and regional constraints. To address this, we construct the OMTSFD (Open-pit Mine Typical Surface Features Dataset) based on 0.9 m TianDiTu and 1.8 m Google imagery. The dataset covers various climate backgrounds, large areas, and a wide range of surface features. For model training and validation, we propose an improved YOLO11-DAE algorithm. First, the C3K2-DBB module is integrated into both the backbone and feature pyramid networks to enhance multi-scale feature extraction. Second, the ADown module replaces traditional downsampling convolution layers, improving the representation of diverse features and reducing detail loss in low-contrast scenes. Finally, the E_Detect efficient detection head is introduced to reduce model complexity and the number of parameters, contributing to overall model lightweighting. [Results] Experimental results show that YOLO11-DAE achieves an FPS of 528.100, indicating high inference speed. The model achieves a precision (P) of 0.932, recall (R) of 0.894, F1-score of 0.913, and mean average precision (mAP) of 0.950, significantly outperforming YOLOv5n, YOLOv8n, and YOLOv10n algorithms. Compared to YOLOv11n, the proposed method improves performance by 7.600%, 10.000%, 8.800%, and 8.000% in the respective metrics. [Conclusions] The YOLO11-DAE algorithm meets real-time monitoring requirements in mining areas and is well-suited for complex scenarios involving multi-scale and multi-background targets. It achieves high-precision detection with a low miss rate and strikes an effective balance between model applicability and real-time performance.
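    A quick arithmetic check that the reported F1-score is consistent with the reported precision and recall via F1 = 2PR/(P + R):

```python
p, r = 0.932, 0.894          # precision and recall reported for YOLO11-DAE
f1 = 2 * p * r / (p + r)     # harmonic mean of precision and recall
print(f"F1 = {f1:.3f}")      # ≈ 0.913, matching the reported F1-score
```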

  • PENG Daifeng, LI Yaning, ZHOU Dingwei, GUAN Haiyan

    [Objectives] Traditional optical remote sensing Change Detection (CD) methods involve cumbersome procedures and exhibit low automation levels. In contrast, deep learning-based CD approaches possess hierarchical feature representation capabilities, as well as the ability to automatically learn change patterns, which facilitates end-to-end CD. This significantly enhances the accuracy and automation levels of CD algorithms, establishing them as the mainstream solutions in the era of Remote Sensing (RS) big data. However, high-resolution remote sensing images are characterized by high spatiotemporal complexity of ground objects. Meanwhile, existing deep learning CD methods typically employ Siamese encoder architectures to extract multi-temporal image features and calculate feature differences to identify changes. This conventional approach easily leads to insufficient utilization of differential information, limited modeling capacity, and susceptibility to interference from complex backgrounds, shadows, and illumination variations. [Methods] To address the aforementioned limitations, this paper proposes a Multi-Scale Differential Feature Enhancement Network (MSDFENet) based on a fully convolutional architecture. MSDFENet employs a Siamese encoder architecture to extract multi-scale features from bi-temporal remote sensing images. By introducing an Asymmetric Partial Double Convolution (APDC) module, it reduces the number of parameters and minimizes redundant information. Furthermore, differential operations are utilized to extract differential features that capture multi-scale details of change information. During the decoding phase, a Multi-scale Feature Attention (MSFA) module is designed to achieve collaborative optimization of deep semantic features and shallow geometric features through the incorporation of a spatial coordinate attention mechanism. Finally, progressive up-sampling is applied to gradually restore fine-grained details of changed regions, and a simple convolutional layer is used to generate the change map. [Results] To validate the effectiveness of this method, extensive experiments and analysis are conducted against mainstream deep learning CD methods by using the LEVIR-CD, CDD, and WHU-CD datasets. Quantitative results indicate that MSDFENet achieves optimal accuracy metrics across all three datasets, with F1-scores reaching 90.68%, 94.65%, and 91.64%, and IoU values attaining 82.96%, 89.78%, and 84.56%, respectively. Visual results demonstrate that MSDFENet effectively suppresses complex background interference and enhances edge localization accuracy, yielding superior visual performance. Model complexity analysis confirms that MSDFENet achieves an optimal balance between CD accuracy and computational efficiency. [Conclusions] The proposed MSDFENet is capable of significantly enhancing differential feature representation, effectively suppressing complex background noise interference, and substantially improving multi-scale change capture capabilities, thereby advancing CD performance.
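    A minimal sketch of the Siamese-encoder differencing step common to such change-detection networks: a weight-shared CNN encodes both image dates and the absolute feature difference feeds a change head. The tiny encoder below is illustrative only; MSDFENet's APDC and MSFA modules are not reproduced.

```python
import torch
import torch.nn as nn

class TinySiameseCD(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(                 # shared weights for both dates
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(32, 1, kernel_size=1)   # change probability map

    def forward(self, img_t1, img_t2):
        diff = torch.abs(self.encoder(img_t1) - self.encoder(img_t2))
        return torch.sigmoid(self.head(diff))

model = TinySiameseCD()
t1, t2 = torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256)
print(model(t1, t2).shape)      # (1, 1, 64, 64) coarse change map
```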

  • LIU Lin, ZHENG Senlin, XIAO Luzi

    [Objectives] In the context of rapid urbanization and increasing population mobility in China, spatial variations in crime location choices among offenders with different household registrations (hukou) have drawn growing attention. However, most existing studies have focused on the differences between local and non-local offenders, without further differentiating among local offenders. This study aims to refine the classification of local offenders into two subgroups, those with matching hukou and residence and those with unmatched hukou and residence within the same city, to reveal differences in offending rates across different hukou types. It also examines the spatial distribution characteristics of their crime location choices and the underlying influencing factors. [Methods] Using ZG City in China as a case study, this research integrates offender data, census data, POI data, mobile phone signaling data, and remote sensing imagery to classify street theft offenders into three categories: local offenders with matching hukou and residence, local offenders with unmatched hukou and residence within the city, and non-local offenders. Kernel density estimation and a discrete choice model are used to analyze differences in crime location choices among these offender types. [Results] The findings are as follows: (1) Offending rates are highest among non-locals, followed by local offenders with unmatched hukou and residence, and lowest among local offenders with matching hukou and residence. (2) Regarding journey-to-crime distance, non-locals travel the shortest distance, followed by unmatched locals, while matched local offenders travel the furthest. (3) All three offender types commit crimes in hotspots within the ring road, yet spatial patterns differ: matched local offenders tend to commit crimes in the old urban areas, unmatched local offenders are active in both old urban areas and Central Business Districts (CBDs), and non-local offenders mainly operate in CBDs. (4) Crime location preferences vary significantly by hukou type. A key finding is that the proportion of the migrant population has a positive effect on non-local offenders, no significant effect on unmatched locals, and a negative effect on matched local offenders. Matched local offenders tend to avoid neighborhoods with high proportions of undergraduates and are more likely to commit crimes in urban villages, hospitals, and near bus stops. For unmatched local offenders, the elderly population proportion and proximity to factories have significant positive effects. For non-local offenders, elderly population proportion, urban villages, and factories all have significant positive effects. [Conclusions] By subdividing local offenders, this study reveals distinct spatial distribution patterns and influencing factors for street theft offenders with different hukou statuses. The findings provide valuable insights for optimizing urban security planning, enhancing law enforcement efficiency, and developing targeted crime prevention strategies.
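    A minimal sketch of the kernel density estimation step used to map theft hotspots, using scipy; the crime coordinates are random placeholders, and the discrete choice (conditional logit) model for location choice is not shown.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
crime_xy = rng.normal(loc=[0.0, 0.0], scale=[1.0, 0.6], size=(300, 2))   # toy offence points

kde = gaussian_kde(crime_xy.T)                   # gaussian_kde expects (dims, samples)
gx, gy = np.meshgrid(np.linspace(-3, 3, 50), np.linspace(-3, 3, 50))
density = kde(np.vstack([gx.ravel(), gy.ravel()])).reshape(gx.shape)

# Location of the densest cell = indicative hotspot centre
peak = np.unravel_index(density.argmax(), density.shape)
print("hotspot centre ≈", (round(float(gx[peak]), 2), round(float(gy[peak]), 2)))
```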