    • WANG Juanle, XIE Zhong, SONG Jia, SONG Chunqiao, CHEN Min, YU Zhuoyuan, QIU Qinjun, LI Kai, DUAN Bowen

      [Significance] In the context of Open Science, the continuous emergence of open data has greatly expanded the available resources. However, due to the scattered, heterogeneous, and multi-semantic nature of these data, significant challenges remain for in-depth data mining and knowledge discovery. The Earth's surface system, characterized by strong inter-sphere interactions and intensive human activities, generates particularly rich scientific data. Data mining and knowledge discovery in this domain are at the forefront of global scientific research and a focal point of international competition. [Progress] This paper presents a systematic, full-chain study of key technologies for the discovery, management, mining, model sharing, and platform integration of scientific data related to the Earth's surface system. Using ontology updating and alignment methods, a large-scale scientific data catalog and associated network have been constructed, improving the accuracy and efficiency of data-sharing assessment. By integrating cutting-edge technologies such as cloud computing and container virtualization, intelligent service tools have been developed to enable efficient processing and information extraction from massive remote sensing datasets, thereby advancing standardized approaches for multi-source data management. High-precision parameter products of the Earth's surface system have been generated by fusing remote sensing big data with intelligent algorithms, supporting the efficient mining and analysis of spatiotemporal evolution patterns. The challenge of sharing and computing scientific models has been addressed through innovative heterogeneous model containerization technologies. 
Furthermore, a collaborative analysis and comprehensive service environment has been established with online computing capacity, applied to representative cases such as ecological barrier construction on the Mongolian Plateau and sustainable development in the Yangtze River Delta urban agglomeration. [Prospect] Building on these advancements, this paper highlights emerging research and development trends in Earth surface system science, emphasizing the progression of data mining and knowledge discovery towards FAIR principles, enhanced intelligence, productization, modeling, and scenario-based applications.

    • QIU Qinjun, LIU Jiandong, WU Liang, XIE Zhong, TAO Liufeng, HAO Mengqi, LI Weijie, WANG Yang

      [Objectives] Key node identification in directed weighted correlation networks of scientific data related to the Earth's surface system is crucial for accurate data recommendation and knowledge discovery. However, existing methods face challenges such as one-sided assessments, underutilization of network features, and unscientific weight allocation. [Methods] In this article, we propose a key node identification method based on the Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) using fused subjective and objective weights. First, a node similarity centrality index is introduced to balance local topology and global influence by integrating node correlation and strength. Second, a multi-indicator evaluation system is constructed, incorporating network topology, data correlation, and node similarity, to comprehensively assess node importance. A two-tier weight optimization strategy is proposed, combining the Analytic Hierarchy Process (AHP) and the Criteria Importance Through Intercriteria Correlation (CRITIC) method, to integrate subjective and objective weights and improve the scientific rigor of the evaluation. Finally, the TOPSIS method is applied for comprehensive node importance assessment. [Results] Experiments were conducted using scientific datasets of various spatial scales constructed by the research team. The proposed method was validated through simulations using the weighted Susceptible-Infected-Recovered (SIR) model. Results show that, compared with traditional network-weighted centrality measures and TOPSIS methods based solely on subjective or objective weights, the proposed approach achieves higher Kendall correlation coefficients and TOP-K hit rates, demonstrating strong robustness across multi-scale networks. [Conclusions] The proposed method provides a novel approach to scientific data network analysis for Earth system research.
It supports practical applications such as intelligent data recommendation, resource optimization, and system vulnerability analysis, thereby contributing to the deeper development of Earth system science.
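The weight-fusion and closeness-coefficient arithmetic described in this abstract can be sketched in a few lines. This is a minimal illustration on an invented four-node toy matrix: the subjective weight vector and the 0.5 fusion coefficient are assumptions standing in for the paper's AHP-derived values, and all indicators are treated as benefit-type.

```python
import numpy as np

def critic_weights(X):
    # Objective weights via CRITIC: contrast (std) times conflict (1 - correlation).
    lo, hi = X.min(axis=0), X.max(axis=0)
    Z = (X - lo) / (hi - lo + 1e-12)            # min-max normalise each indicator
    std = Z.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)
    info = std * (1.0 - corr).sum(axis=0)       # information content per indicator
    return info / info.sum()

def topsis_scores(X, w):
    # Vector-normalise, weight, then measure distance to ideal/anti-ideal solutions.
    V = X / np.linalg.norm(X, axis=0) * w
    best, worst = V.max(axis=0), V.min(axis=0)  # all indicators assumed benefit-type
    d_best = np.linalg.norm(V - best, axis=1)
    d_worst = np.linalg.norm(V - worst, axis=1)
    return d_worst / (d_best + d_worst + 1e-12) # closeness coefficient in [0, 1]

# Toy network: 4 nodes scored on 3 indicators (e.g. strength, correlation, similarity).
X = np.array([[0.9, 0.8, 0.7],
              [0.4, 0.5, 0.6],
              [0.7, 0.9, 0.8],
              [0.2, 0.1, 0.3]])
w_subj = np.array([0.5, 0.3, 0.2])              # stand-in for AHP-derived weights
w_obj = critic_weights(X)
alpha = 0.5                                     # fusion coefficient (assumed)
w = alpha * w_subj + (1 - alpha) * w_obj
w /= w.sum()
scores = topsis_scores(X, w)
ranking = np.argsort(-scores)                   # most important node first
```

The dominated node (row 3, lowest on every indicator) necessarily receives the lowest closeness coefficient, regardless of how the two weight vectors are fused.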

    • ZHANG Jing, WU Tianjun, LUO Jiancheng, LI Manjia, FANG Zhiyang, LI Ziqi

      [Objectives] The intelligent interpretation of spatial geographic elements of land use, addressing fundamental questions such as "where", "what", and "how", is a classic issue in geoscientific research. However, the existing interpretation methods often face challenges of inaccuracy and limited applicability in practical contexts. In recent years, treating geographic entities as fundamental spatial units has emerged as a key analytical approach. Yet, these entities exhibit distinct spatial and semantic characteristics at varying levels of granularity, making it difficult to accurately portray their morphology and attributes. To address these challenges, this study aims to systematically deconstruct land use space and provide accurate representations of geographic objects. It analyzes the critical roles of geographic partitioning, hierarchical object modeling, and land classification in interpreting complex surface features, integrating these approaches with remote sensing computational technologies. [Methods] This study aligns closely with the core principle of coupling geographic analysis and remote sensing computation. It breaks down the multi-granularity deconstruction problem into three key processes: (1) geographic spatial partitioning based on historical data, (2) fine-grained object modeling using high-resolution imagery, and (3) a progressive, iterative process for optimizing hierarchical structures. The proposed method adheres to a multi-granularity hierarchical spatial structure, progressing from broad to detailed scales of "whole-regional-local-object". It begins with geographic partitioning to establish a comprehensive domain background and subdivide the entire region into analytical units. Subsequently, basic geographic and thematic data are introduced to construct localized constraint units. Using image processing techniques and object stratification strategies, the method enables collaborative fine-grained object extraction. 
Finally, classificatory relationships are optimized to build a dynamically updated multi-granularity expression system. [Results] Driven by the need for comprehensive monitoring and management of national territorial space, an experimental application of the proposed spatial multi-granularity decomposition method was conducted in Kenli District, Dongying City. The study successfully extracted regional multi-element geographic objects, providing foundational data support for future research. [Conclusions] Preliminary empirical results demonstrate both the feasibility and the superiority of the proposed method.
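The "whole-regional-local-object" progression described above can be pictured as a nested containment structure. The sketch below is purely illustrative: the class names, fields, and the example objects are invented stand-ins, not the study's actual data model.

```python
from dataclasses import dataclass, field

@dataclass
class GeoObject:
    name: str
    land_class: str              # thematic label, e.g. "cropland"

@dataclass
class LocalUnit:
    name: str
    objects: list = field(default_factory=list)

@dataclass
class Region:
    name: str
    local_units: list = field(default_factory=list)

@dataclass
class StudyArea:
    name: str
    regions: list = field(default_factory=list)

    def all_objects(self):
        # Walk the hierarchy from the whole area down to individual geographic objects.
        for region in self.regions:
            for unit in region.local_units:
                yield from unit.objects

# Hypothetical instance echoing the Kenli District experiment.
area = StudyArea("Kenli District", regions=[
    Region("coastal zone", local_units=[
        LocalUnit("unit-01", objects=[GeoObject("field-A", "cropland"),
                                      GeoObject("pond-B", "aquaculture")]),
    ]),
])
```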

    • ZHANG Ziqi, SHEN Gang, LI Yongkun, ZHANG Mengfei, CHEN Zhenghang, FU Shaokun, WU Feng

      [Objectives] Flash floods are characterized by their sudden onset, strong destructiveness, and tendency to evolve into complex disaster chains involving landslides and mudflows. Currently, flash flood prevention faces significant challenges due to the limitations of existing early warning models. Physical process-based models often suffer from complex parameter calibration and low computational efficiency. Rainfall threshold methods rely heavily on dense observation data and fail to cover the evolution of secondary disasters. Meanwhile, data-driven deep learning models, despite their precision, often lack interpretability due to their “black box” nature. Although integrating Knowledge Graphs (KG) with Bayesian Networks (BN) offers a solution, it is hindered by bottlenecks in automated knowledge extraction, the systematic mapping from cyclic knowledge networks to Directed Acyclic Graphs (DAG), and the difficulty in constructing structured datasets from unstructured reports. [Methods] This study proposes a novel modeling framework to overcome these hurdles. First, in the knowledge extraction stage, a Large Language Model (LLM) acts as a semantic engine to mine causal triples from over 3 000 domain-specific academic articles, constructing a comprehensive flash flood disaster chain KG. Second, during structural mapping, the complex knowledge network is transformed into a BN topology. This involves algorithmic pruning and LLM-assisted node aggregation and discretization to ensure the formation of a rigorous DAG. Third, in model instantiation, the LLM parses unstructured historical disaster reports into structured, discrete datasets. These datasets enable the learning of conditional probability tables, effectively fusing qualitative domain knowledge with quantitative evidence, with the reliability of LLM outputs validated via Precision, Recall, and F1 scores.
      [Results] Empirical validation on typical disaster cases demonstrates high consistency between the model's inference path and actual records. The model achieved an overall mean Brier score of 0.160 8, which indicates excellent probabilistic calibration. The Brier score for batch case testing is 0.184 6, further confirming the model’s generalization stability across disaster chains of varying complexity. Sensitivity analysis accurately reveals the non-linear amplification effects of multi-hazard superposition, quantifying how minor triggers can propagate through the network to cause catastrophic outcomes. [Conclusions] This study successfully validates the potential of LLMs in disaster science, breaking through traditional bottlenecks in knowledge acquisition and data preparation. By realizing a “knowledge-data” dual-driven mechanism, the proposed method significantly enhances the interpretability, automation, and intelligence of disaster chain early warning systems. It provides a new theoretical pathway and a practical tool for disaster prevention and mitigation decision-making.
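The Brier scores reported in this abstract (0.160 8 overall, 0.184 6 in batch testing) follow the standard definition: the mean squared gap between forecast probabilities and observed 0/1 outcomes, with 0 for a perfect forecaster and 0.25 for uninformed 50/50 guessing. A minimal sketch with invented toy numbers:

```python
def brier_score(probs, outcomes):
    # Mean squared difference between predicted probabilities and 0/1 outcomes.
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

perfect = brier_score([1.0, 0.0], [1, 0])      # confident and correct -> 0.0
chance = brier_score([0.5] * 4, [1, 0, 1, 0])  # 50/50 guessing -> 0.25
```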

    • ZHONG Wen, SHAO Tong, WANG Lei, GUO Jiaxin

      [Objective] Street space perception is a critical dimension of the human-environment relationship and a vital metric for urban quality assessment. However, traditional street perception methodologies face significant limitations: questionnaire surveys are resource-intensive and lack spatial coverage, while conventional computer vision approaches often focus on low-level visual features, failing to capture high-level semantic information and the "why" behind subjective evaluations. Focusing on the area within Beijing's Fifth Ring Road, this study aims to overcome these "black box" limitations by developing a novel, Large Language Model (LLM)-driven multimodal analytical framework. The objective is to systematically characterize street-space perceptions, quantify subjective experiences, and interpret the underlying semantic drivers of urban environmental quality. [Methods] The study established a cascaded analytical pipeline integrating geospatial big data with advanced GeoAI techniques. First, 122 264 street-view images were collected at 50-meter intervals along the OpenStreetMap (OSM) road network using the Baidu Time Machine to ensure temporal consistency. Second, guided by the Triple Bottom Line (TBL) theory of sustainable development, a structured prompt system covering "Ecology-Society-Economy" dimensions was constructed. The TongyiQianwen Qwen2-VL-72B model was employed to interpret these images, generating detailed semantic descriptions and emotion labels. Third, to quantify these qualitative descriptors, a BERT model combined with a bi-directional Long Short-Term Memory (Bi-LSTM) network was trained to convert textual data into fine-grained continuous perception scores. 
      Finally, the study aggregated these scores at the Traffic Analysis Zone (TAZ) scale to analyze spatial patterns using global and local Moran's I, while employing semantic mining techniques—including TF-IDF, Co-word networks, Latent Dirichlet Allocation (LDA) topic modeling, and Textual Knowledge Graphs—to deconstruct the semantic structure of positive and negative perceptions. [Results] The spatial analysis revealed a significant "center-periphery" decreasing gradient and strong spatial clustering in street space perception. Positive-perception zones were predominantly concentrated within the Second and Third Ring Roads, spatially correlating with historical preservation districts and mature commercial hubs. Semantic analysis indicated that these areas are driven by a "synergistic effect" of positive keywords such as "red walls," "greenery," "commercial vitality," and "cultural symbols". Conversely, negative-perception zones were clustered in the peripheral areas and transition zones. Notably, the study identified a "short-board effect" in negative perception areas, where the overall quality was disproportionately dragged down by specific negative semantic drivers like "ruins," "exposed soil," "construction waste," and "industrial noise," rather than a general lack of aesthetics. The LDA model further distilled four key thematic drivers influencing perception: natural ecology, functional efficiency, historical-cultural attributes, and commercial vitality. [Conclusions] This study demonstrates that integrating Multimodal Large Language Models with street-view data effectively bridges the gap between objective built-environment features and subjective human perception. Unlike traditional methods, this framework not only identifies where perception is low but explains why through interpretable semantic evidence.
The research confirms that urban perception is non-linear, where eliminating negative "short-board" factors (e.g., disorder, pollution) is often more critical than aesthetic enhancement for improving low-quality spaces. The proposed framework offers a scalable, low-cost, and explainable technical pathway for micro-scale urban diagnostics, providing actionable insights for precision urban renewal and fine-grained spatial governance.
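Global Moran's I, used in the abstract above to detect spatial clustering of perception scores at the TAZ scale, can be computed directly from a value vector and a spatial weight matrix. The sketch below uses an invented four-zone toy layout with rook-style neighbours, not the study's data:

```python
import numpy as np

def morans_i(values, W):
    # Global Moran's I: positive when neighbouring zones carry similar values.
    x = np.asarray(values, dtype=float)
    z = x - x.mean()                            # deviations from the mean
    n = len(x)
    num = n * (W * np.outer(z, z)).sum()        # cross-products over neighbour pairs
    den = W.sum() * (z ** 2).sum()
    return num / den

# Four zones on a line; each zone neighbours the adjacent one(s).
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
clustered = morans_i([1.0, 1.0, 5.0, 5.0], W)   # similar neighbours -> positive I
```

Significance in practice is then judged against a permutation or normal-approximation null, which this sketch omits.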

    • CHEN Weitong, XU Xin, WANG Shu, YANG Fei, ZHU Yunqiang, ZHAO Chen

      [Objectives] High-quality geographic question-answering datasets constitute a fundamental resource for the training and fine-tuning of geographic large language models. In practical applications, once such datasets are illicitly used by non-copyright holders, infringers often provide commercial services solely through model APIs, thereby circumventing auditing and traceability of the original data sources. To address this issue, this paper proposes a copyright protection method for geographic question-answering datasets based on tone-driven backdoor watermarking. [Methods] First, a surrogate model is fine-tuned with tone-rewriting instructions to systematically rewrite the answers of a subset of question-answer pairs into watermark responses that exhibit positive tone features while preserving semantic consistency. A semantic consistency constraint is further applied to filter the generated outputs, thereby avoiding factual drift and semantic degradation. Subsequently, semantically natural and low-frequency words in the original dataset are selected as watermark triggers and embedded into the corresponding question instructions, forming a set of watermark question-answer pairs that incorporate both trigger conditions and watermark responses. These watermark samples are then combined with the unchanged samples to construct the final watermarked geographic question-answering dataset. During the copyright verification stage, to accurately identify whether tone biases induced by the watermarking mechanism are present in model outputs, we further construct and fine-tune a watermark discriminator tailored to the geographic question-answering scenario, which is used to distinguish watermark tone responses from non-watermark responses.
      By computing the proportion of outputs classified as watermark tone responses across multiple verification queries, the watermark verification success rate is obtained, thereby enabling black-box copyright verification to determine whether the protected dataset has been illicitly used. [Results] Experimental results on three mainstream open-source large language models, namely DeepSeek-Coder, Qwen3, and Llama-3, demonstrate that the watermarked models achieve semantic consistency and language fluency comparable to those of clean models under a 20% watermark embedding rate, while maintaining a stable watermark verification success rate exceeding 78%. In addition, comparative experiments conducted on the Llama-3 model show that the proposed method achieves a watermark verification success rate of 86.75% on a Chinese geographic question-answering dataset, whereas baseline methods fail to obtain effective watermark detection results in this scenario. Furthermore, robustness experiments on the Qwen3 model indicate that, after two rounds of fine-tuning using a 30% clean data subset, the watermark verification success rate can still be maintained at 70.21%. [Conclusions] The proposed method provides a black-box copyright verification solution that operates without accessing the original dataset and relies solely on black-box model interfaces, offering effective technical support for the copyright protection of geographic question-answering datasets.
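The verification step described above—querying the suspect model with trigger-bearing questions and measuring the proportion of watermark-tone responses—reduces to a simple proportion test. In the sketch below, the keyword heuristic is a hypothetical stand-in for the paper's fine-tuned discriminator, and the 0.78 decision threshold is an assumption echoing the reported success rates:

```python
def verification_success_rate(responses, discriminator):
    # Proportion of responses the discriminator flags as watermark-tone outputs.
    flagged = sum(1 for r in responses if discriminator(r))
    return flagged / len(responses)

def toy_discriminator(text):
    # Hypothetical heuristic: treats certain upbeat words as watermark-tone markers.
    return "delighted" in text or "wonderful" in text

responses = ["I am delighted to explain this landform...",
             "The Yangtze River is about 6 300 km long.",
             "What a wonderful question about monsoons!",
             "It is located in East Asia."]
rate = verification_success_rate(responses, toy_discriminator)  # 2 of 4 -> 0.5
suspected = rate > 0.78   # assumed decision threshold for claiming illicit use
```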

    • WANG Hui, PAN Xiao, WANG Shuhai, CHEN Xiao, LI Ning, WANG Zuocheng

      [Objectives] Sea Surface Temperature (SST) is a critical determinant of marine ecological balance and global climate regulation, with its long-term prediction vital for marine disaster early warning, resource development, and ecological protection. However, SST prediction faces dual challenges: SST data exhibits non-stationary fluctuations across multiple scales and is regulated by complex nonlinear interactions among multiple environmental variables. Traditional numerical models rely on complex physical equations and suffer from high computational costs, while existing deep learning models often fail to fully capture multi-scale dynamic features or neglect synergistic effects between variables, limiting long-term prediction performance in complex marine environments. This study aims to address these issues and enhance the accuracy and robustness of long-term SST prediction. [Methods] A time series prediction model named ACAFNet is proposed, integrating multi-scale temporal feature modeling and adaptive variable interaction mining. First, it dynamically selects Top-K key scales via seasonal-trend decomposition of SST time series—using Fast Fourier Transform (FFT) to extract periodic patterns and weighted average pooling to capture long-term trends—to match inherent multi-scale features. A dual-attention mechanism then captures local fine-grained fluctuations and global long-range dependencies, effectively addressing marine data non-stationarity. Second, variables are mapped from the time domain to the frequency domain via FFT to reveal hidden correlations obscured in the time domain. A learnable Mahalanobis distance quantifies variable correlations, generating a sparse mask matrix to emphasize key predictive variables and suppress noise. Finally, a fusion module integrates multi-scale features and variable dependencies via masked multi-head attention, combined with layer normalization and residual connections, for robust prediction. 
[Results] Comparative experiments were conducted on one private dataset (collected from anchored buoys in the coastal waters of Qinhuangdao, Bohai Sea) and three cross-latitude public buoy datasets (52 212, NTKM3, PRDA2) covering tropical to subarctic regions, against five baseline models (DLinear, Pathformer, PatchTST, Crossformer, GPT4TS) at four prediction steps (96, 168, 336, 720 steps). Results show ACAFNet outperforms Transformer-based models by an average of 3.72% (MSE), 5.03% (MAE), and 4.17% (RMSE). Notably, in 720-step long-term prediction on the private dataset, ACAFNet achieves an MSE of 0.299, MAE of 0.399, and RMSE of 0.547, outperforming all baselines. Ablation experiments further verify the effectiveness of adaptive scale selection, dual-attention, and variable correlation measurement modules in improving model performance. [Conclusions] ACAFNet effectively improves long-term SST prediction accuracy and robustness through adaptive multi-scale division, dual-attention mechanism, and frequency-domain variable measurement. It addresses core challenges of multi-scale fluctuation capture and nonlinear variable interaction mining, providing a new paradigm for marine multi-variable time series prediction. This study offers important reference value for complex marine environment forecasting and lays a foundation for future extensions to marine ecological variable prediction and multi-modal data fusion scenarios.
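The FFT-based selection of Top-K dominant scales described in the [Methods] above can be sketched as picking the k largest amplitude-spectrum bins and converting them back to periods. The synthetic series below (an annual plus a semi-annual cycle) is illustrative only, not buoy data:

```python
import numpy as np

def topk_periods(series, k=2):
    # Pick the k dominant periods from the amplitude spectrum (DC term excluded).
    x = np.asarray(series, dtype=float)
    amp = np.abs(np.fft.rfft(x - x.mean()))     # amplitude spectrum of the anomaly
    freqs = np.fft.rfftfreq(len(x))
    idx = np.argsort(amp[1:])[::-1][:k] + 1     # skip the zero-frequency bin
    return [round(1.0 / freqs[i]) for i in idx]

# Synthetic monthly SST-like signal over 20 years: period-12 and period-6 cycles.
t = np.arange(240)
sst = 20 + 3 * np.sin(2 * np.pi * t / 12) + 1 * np.sin(2 * np.pi * t / 6)
periods = topk_periods(sst, k=2)
```

ACAFNet then models each selected scale separately before fusion; this sketch covers only the scale-selection arithmetic.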

    • DUAN Tengfei, WU Yutong, LU Yuting, LIU Hui, WU Penghai

      [Objective] Accurate extraction of surface water bodies from high-resolution remote sensing imagery is critical for water resource management, flood disaster monitoring, and ecological environment protection. Despite the dominance of deep learning in this field, existing semantic segmentation models have inherent limitations. Traditional Convolutional Neural Networks (CNNs) perform well in capturing local textures but struggle to model long-range global dependencies owing to limited receptive fields. Conversely, Transformer-based models excel at global semantic modeling but often lack the inductive bias required for preserving fine-grained high-frequency details. These limitations inevitably lead to issues such as coarse boundary segmentation, fragmentation, or omission of tiny, narrow rivers, and significant confusion between water bodies and dark shadows cast by mountains or urban buildings under complex lighting conditions. [Methods] To address these challenges, this study proposes a semantic segmentation model based on dual-backbone dynamic feature fusion, termed the Dual Backbone Fusion Net (DBF-Net). The model employed an encoder-decoder architecture with three key innovations. First, heterogeneous dual-stream backbones were constructed to combine the strengths of different architectures, and lightweight MobileNetV3 was utilized to extract high-frequency local spatial details and boundary information, while a hierarchical Swin Transformer was employed to capture long-range global semantic contexts. This parallel design effectively compensated for the representational deficiencies of single-backbone models. Second, to solve the feature misalignment caused by domain differences between the CNN and Transformer features, a Dynamic Fusion Module (DFM) is designed. The DFM utilizes an adaptive weighting mechanism to selectively emphasize informative features and suppress noise, ensuring effective alignment and deep integration of heterogeneous features. 
Third, a chained atrous spatial pyramid pooling (Chained-ASPP) module was introduced at the bottleneck. By cascading atrous convolutions with different dilation rates, this module efficiently aggregates multiscale contextual information without significantly increasing the computational cost, thereby enhancing the robustness of the model to water bodies of varying scales. [Results] Extensive comparative experiments were conducted on two challenging public datasets: the ESWKB (containing diverse water types) and GID (featuring complex land cover). The results demonstrate that DBF-Net consistently outperforms six representative state-of-the-art methods, including Swin-Unet, SegFormer, and CMTFNet. Specifically, the proposed model achieved Overall Accuracy (OA) and intersection over union (IoU) values of 99.15% and 96.34% on ESWKB and 96.47% and 86.78% on GID, respectively. Compared with mainstream baseline models, DBF-Net improves the IoU by 0.15%~3.38% on ESWKB and 0.72%~4.45% on GID. Ablation studies further validated the importance of the dual-backbone design, showing that DBF-Net surpasses the single-stream MobileNetV3 and Swin Transformer baselines by 2.16% and 2.07%, respectively, in the IoU. Visual analysis confirmed that the model significantly improved the topological connectivity of tiny rivers and effectively suppressed false positives caused by shadows in complex urban and mountainous scenes. [Conclusions] By synergizing local detail extraction and global semantic modeling through dual-backbone collaboration and adaptive feature fusion, the DBF-Net provides a robust solution for water body extraction. This study not only significantly enhances the extraction accuracy in challenging scenarios, but also offers valuable technical insights for future research on multi-source feature fusion in remote sensing.
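The IoU figures reported above follow the standard intersection-over-union definition for binary masks. A minimal sketch on an invented 3×3 water mask:

```python
import numpy as np

def iou(pred, truth):
    # Intersection over union for binary water masks; empty union counts as perfect.
    pred, truth = np.asarray(pred, bool), np.asarray(truth, bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return inter / union if union else 1.0

pred  = np.array([[1, 1, 0],
                  [0, 1, 0],
                  [0, 0, 0]])
truth = np.array([[1, 1, 0],
                  [0, 1, 1],
                  [0, 0, 0]])
score = iou(pred, truth)   # 3 shared pixels / 4 in the union = 0.75
```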

    • LIN Shiqiu, CHEN Xiaona

      [Objectives] Accurate prediction of future snow depth under climate change is essential for the sustainable development of the ice and snow economy. However, existing snow depth projections generally suffer from coarse spatial resolution, limiting their application. This study aims to improve the accuracy and applicability of CMIP6 multi-model snow depth data across the Northern Hemisphere and to analyze their spatiotemporal evolution under different emission scenarios. [Methods] Snow depth data from 21 major Global Climate Models (GCMs) and four Shared Socioeconomic Pathways (SSPs) of CMIP6, covering 1980—2100, were selected. Ground-based observations from GSOD, GHCN, and the Qinghai-Tibet Plateau Snow Depth Dataset (QTPSD) were collected, and 2 062 stations were retained after strict quality control for validation. MODIS snow cover products and high-resolution snow depth data were used as auxiliary datasets. A three-stage downscaling framework was developed by coupling the Delta method with spatial feature transfer, downscaling CMIP6 scenario data from 1°-2.5° to 0.05°. Accuracy was assessed using the Pearson Correlation Coefficient (CORR), Standard Deviation (STD), Root Mean Square Error (RMSE), bias, and Mean Absolute Error (MAE). Long-term snow depth changes were analyzed using Theil-Sen trend analysis and the Mann-Kendall (MK) significance test. [Results] The downscaled snow depth dataset outperformed mainstream reanalysis products such as ERA5-Land and GLDAS during 1980—2023 in terms of RMSE, bias, and MAE. It also better captured spatial details, particularly in complex terrain, while reducing both overestimation and underestimation. Temporally, snow depth exhibited a steady decline from 1980 to 2014, followed by an accelerated decrease under intensified greenhouse gas emission scenarios from 2015 to 2100. Spatially, snow depth is projected to increase in Eastern Eurasia but decline in North America.
[Conclusions] The proposed multidimensional coupled downscaling framework significantly improves the spatial resolution and accuracy of future snow depth projections. The resulting dataset provides robust support for climate change research, hydrological modeling, and the sustainable development of the winter tourism industry.
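Theil-Sen trend estimation and the Mann-Kendall test, used above for the long-term snow depth analysis, can be sketched in plain Python. This version omits the tie correction in the MK variance and runs on an invented declining yearly series:

```python
import numpy as np
from itertools import combinations
from math import erf, sqrt

def theil_sen_slope(y):
    # Median of all pairwise slopes: a robust estimate of the trend per time step.
    slopes = [(y[j] - y[i]) / (j - i) for i, j in combinations(range(len(y)), 2)]
    return float(np.median(slopes))

def mann_kendall_z(y):
    # MK statistic S and its normal approximation (no tie correction in this sketch).
    n = len(y)
    s = sum(np.sign(y[j] - y[i]) for i, j in combinations(range(n), 2))
    var = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / sqrt(var)
    elif s < 0:
        z = (s + 1) / sqrt(var)
    else:
        z = 0.0
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))   # two-sided p-value
    return z, p

# Invented snow-depth series: a -0.5 cm/year decline plus a small oscillation.
depth = [40 - 0.5 * t + (-1) ** t * 0.3 for t in range(30)]
slope = theil_sen_slope(depth)
z, p = mann_kendall_z(depth)
```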

    • HUANG Liuhong, FAN Xiaomei, ZHAN Pengfei, SONG Chunqiao

      [Objectives] River mouths, as critical transitional zones between land and sea, play a key role in coastal ecosystem services and human well-being. However, due to the combined influences of tides, river discharge, suspended matter, and colored dissolved organic matter, water environments in river mouths often exhibit high complexity and dynamic variability, making the development of effective remote sensing monitoring methods essential. The Forel-Ule Index (FUI), which quantifies water color using a 21-scale classification (1-21), serves as an effective indicator of water quality. Therefore, this study uses long-term satellite remote sensing data to conduct large-scale monitoring of FUI in river mouths along the South China Sea, aiming to reveal the spatial distribution patterns and temporal variations of water color and to explore the applicability and ecological significance of FUI in complex river-mouth environments. [Methods] We calculated the FUI and analyzed the spatial distribution characteristics of 245 river mouths along the South China Sea coast using Landsat-8 and Landsat-9 remote sensing data from 2013 to 2024. The Theil-Sen median estimator and the Mann-Kendall trend test were used to analyze interannual trends. Finally, spatial and temporal distribution patterns were examined for both wet and dry seasons. [Results] The findings reveal that: (1) In terms of spatial patterns, the multi-year mean FUI across river mouths ranged from 11 to 21 (mean=17.30). Significant spatial differences were observed between coastal zones of the South China Sea (lowest in the Gulf of Thailand and highest in the eastern sea area) and among countries (lowest in Cambodia and highest in the Philippines). (2) Over the past decade, the FUI trends varied among river mouths. Water color in river mouths in China and Vietnam tended to shift toward green, while those in Malaysia and Indonesia showed a shift toward a more turbid yellowish-brown.
      (3) The spatial distribution of FUI in river mouths differed between dry and wet seasons. During the dry season, lower FUI values were observed in Guangdong Province (China), southern Thailand, and Cambodia, whereas high FUI values during the wet season were concentrated from northern Vietnam to the west coast of the Malay Peninsula. Regarding interannual variability, the mean annual change rate was -0.02/year in the dry season and -0.004/year in the wet season. Overall fluctuations in FUI values between the two seasons from 2013 to 2024 were small. However, FUI values in the wet season were consistently higher than those in the dry season, with differences remaining stable. [Conclusions] The spatial and temporal variations in FUI in river mouths across the South China Sea reveal significant regional and seasonal differences, reflecting the combined effects of human activities and hydrological conditions. This study extends the applicability of the FUI water color index to complex river-mouth environments and provides essential baseline information for long-term water quality monitoring, ecological assessment, and transboundary water environment management in river mouths along the South China Sea.

    • CAI Run, QIAO Yina, YANG Hui, FAN Huaiwei, YAO Yuejing, CUI Liu, WANG Yong, FENG Jian, WANG Wenfeng

      [Objectives] As a typical region with intense interactions between arid ecosystems and human activities, Xinjiang's methane cycle is significantly influenced by both anthropogenic and natural geographical factors. Analyzing the spatiotemporal variations and driving mechanisms of XCH4 concentrations in Xinjiang, using multi-source data, is of great significance for addressing climate change and formulating precise regional methane emission reduction strategies. [Methods] Focusing on Xinjiang's unique physiographic and anthropogenic characteristics, this study leverages Sentinel-5P satellite XCH4 data from 2019 to 2023 and integrates diverse spatiotemporal variables, including surface relief, meteorological parameters, NDVI, livestock intensity, coal mining intensity, and nighttime light data. The SHAP method is employed to quantitatively assess the contributions and interaction mechanisms of various influencing factors on temporal variations in methane concentrations. Core influencing variables are identified through feature importance ranking, based on which a hybrid XGBoost-DF model is developed for XCH4 data reconstruction. This approach enables a comprehensive understanding of the spatiotemporal distribution patterns and dynamic evolution of column methane concentrations in Xinjiang. [Results] (1) The developed XGBoost-DF hybrid model demonstrates superior prediction accuracy compared to individual models and effectively reconstructs missing satellite observations, providing reliable data support for studying methane's spatiotemporal variations and influencing mechanisms in complex regions. The SHAP-XGBoost framework serves as an interpretable tool for precise identification of methane sources and sinks in Xinjiang. (2) Analysis of the influencing factors reveals that livestock activity intensity is the dominant anthropogenic driver, with cattle farming accounting for 88.7% of the total methane emission increase from the livestock sector. 
Among natural factors, land surface temperature positively influences methane concentration by enhancing methanogenic microbial activity, while near-surface 10 m wind speed reduces local accumulation via dispersion. (3) The spatial distribution of XCH4 shows a "south-high-north-low" pattern, with higher concentrations in basins than those in mountainous areas. From 2019 to 2023, annual mean XCH4 concentrations ranged from 1 727.3 to 1 972.61 ppb, showing an overall upward trend with a growth rate of 1.5%. Seasonally, a bimodal variation is observed, with distinct peaks in summer and autumn. [Conclusions] The methodology proposed in this study, combining satellite data reconstruction and influencing factor analysis, proves effective in investigating the spatiotemporal variations and driving factors behind XCH4 concentrations in Xinjiang. The findings provide a theoretical basis and technical support for future methane reduction initiatives and environmental management strategies in the region.
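The abstract above does not detail how the XGBoost-DF hybrid combines its base learners. As a minimal, hypothetical sketch of the general idea, a simple weighted blend of two regressors' outputs can be used to fill cloud-masked gaps in a satellite series; the 0.5 weighting and the toy XCH4 values below are assumptions for illustration, not the paper's method:

```python
def blend_predictions(pred_a, pred_b, w=0.5):
    """Weighted blend of two base regressors' outputs (hypothetical scheme)."""
    return [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]

def reconstruct(series, estimates):
    """Fill missing (None) satellite observations with blended estimates."""
    return [obs if obs is not None else est
            for obs, est in zip(series, estimates)]

# Toy stand-ins for the two base models' XCH4 predictions (ppb).
xgb_like = [1850.0, 1860.0, 1870.0]
df_like = [1854.0, 1858.0, 1874.0]
blended = blend_predictions(xgb_like, df_like)

observed = [1852.0, None, 1871.0]  # None marks a cloud-masked pixel
print(reconstruct(observed, blended))  # [1852.0, 1859.0, 1871.0]
```

In the paper's actual pipeline the estimates would come from trained XGBoost and Deep Forest models driven by the meteorological, NDVI, and activity-intensity covariates listed above.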

    • CHEN Minjie, ZHANG Zheng, CAO Yibing, ZHANG Jiangshui, YANG Zhenkai, LU Zhenglun

      [Objectives] Massive online travel itinerary texts have become an important source of big data in tourism geography, providing new informational support for industry analysis, planning, and travel recommendations. Rule matching and deep learning technologies can improve the accuracy and efficiency of information extraction from travel itinerary texts. However, problems remain, such as insufficient flexibility of methods, a large data annotation workload, and incomplete coverage of the extracted content. [Methods] This paper proposes an automatic extraction method for travel itinerary chains based on the DeepSeek model. The method consists of four core steps: constructing a travel itinerary chain description model, generating itinerary chain data through prompt engineering, matching travel node names via retrieval-augmented generation, and geocoding based on the AMap API. First, a travel itinerary chain description model is constructed at three levels: the itinerary chain layer, the element layer, and the feature layer, to enable comprehensive representation of travel itinerary chains. Second, a prompt strategy is designed and a corresponding prompt template is developed to guide the DeepSeek model in automatically generating JSON-formatted travel itinerary chain data. Subsequently, external knowledge files are established, and a retrieval-augmented generation approach is employed to facilitate the matching of travel node names. Finally, by integrating geocoding technology, precise conversion from travel node names to geographic coordinates is achieved, resulting in a travel itinerary chain dataset enriched with complete spatio-temporal information. To validate the method's effectiveness, 2 834 online travel itinerary texts about Henan Province were collected from three platforms (Mafengwo, Qunar, and Ctrip) as data sources to carry out the task of extracting travel itinerary chains, and the results were compared with those of the HanLP model. 
[Results] The experimental results show that the Macro-Precision and Macro-F1 scores achieved by the proposed method for extracting travel nodes range from 92% to 95%, outperforming the HanLP model, which yields scores between 87% and 91%. The Macro-Recall ranges from 94% to 96%, slightly lower than the HanLP model's range of 94% to 97%. Furthermore, the average similarity of the travel itinerary chains reaches 94% to 95%, significantly surpassing HanLP's performance of 84% to 87%. [Conclusions] The method demonstrates higher accuracy and practicality, with flexible and convenient operation, and can complete the extraction task with only a small number of example prompts. In addition, beyond travel node information, the extracted dataset also contains time, tourist behavior, and transportation mode information.
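The prompt-template and JSON-parsing steps above can be sketched as follows. The instruction wording, field names, and sample reply are hypothetical placeholders rather than the paper's actual schema; the real pipeline would send the filled-in prompt to the DeepSeek API and then geocode each node name via the AMap API:

```python
import json
from string import Template

# Hypothetical prompt template guiding an LLM to emit a JSON itinerary chain;
# the wording and field names are illustrative, not the paper's schema.
PROMPT = Template(
    "Extract the travel itinerary chain from the text below as JSON with a "
    '"nodes" list; each node has "name", "time", "behavior", and "transport".\n'
    "Text: $text"
)

def parse_chain(llm_output):
    """Parse the model's JSON reply into an ordered list of travel node names."""
    chain = json.loads(llm_output)
    return [node["name"] for node in chain["nodes"]]

# Simulated model reply; a real run would use PROMPT.substitute(text=...)
# against the DeepSeek API, then geocode each node name.
reply = ('{"nodes": [{"name": "Longmen Grottoes", "time": "09:00", '
         '"behavior": "sightseeing", "transport": "bus"}]}')
print(parse_chain(reply))  # ['Longmen Grottoes']
```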

    • ZHONG Tao, LI Qiang, LIU Pei, SHI Huabin, ZHAO Tongtiegang

      [Objectives] Flood disasters are one of the most severe natural disasters in China, leading to huge socioeconomic losses. [Methods] This study establishes a flood disaster impact assessment method based on news text mining via Large Language Models (LLMs). Firstly, retrieval strategies are designed using DeepSeek-R1-0528 to obtain flood-related news from the Wise Search database, which is then deduplicated using the TF-IDF algorithm to form a news dataset. Secondly, a classification system covering 18 socioeconomic impact dimensions is constructed, and flood impact statements are extracted from news texts after optimizing the model temperature parameter through repeated experiments. Finally, the extracted results are validated against official disaster statistics, and the dynamic evolution of disaster situations is presented for specific flood cases. [Results] To estimate the socioeconomic impacts of the flood disasters in 2024, a total of 14 778 flood impact statements are extracted by the LLM from 10 556 flood-related news articles. The results show that the LLM performs well in the extraction task, with a median accuracy of 0.91 and a median F1 score of 0.73. At the provincial level, the number of flood impact statements is positively correlated with the official disaster data, with a correlation coefficient of 0.68, and their spatial distribution is consistent with the areas affected by the flood disasters among the top 10 natural disasters of 2024. Furthermore, the LLM effectively captures the dynamic evolution of the rainstorm and flood disasters in northern Guangdong (Qingyuan and Shaoguan) in April 2024 and in Yueyang, Hunan Province from June to early July, revealing a shift in news coverage from emergency response to post-disaster recovery. 
[Conclusions] Overall, the results indicate that LLMs can serve as an effective complement to traditional disaster surveys and evaluations, with important potential and reference value for post-disaster assessment and emergency decision-making.
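The TF-IDF deduplication step described above can be illustrated with a minimal pure-Python sketch: vectorize each article, then keep only articles whose cosine similarity to every already-kept article is below a threshold. The 0.9 threshold is an assumed value, not taken from the paper:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Standard tf * idf weighting over a small whitespace-tokenized corpus."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc.split()))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    return [{t: c * idf[t] for t, c in Counter(doc.split()).items()}
            for doc in docs]

def cosine(u, v):
    """Cosine similarity between two sparse (dict) vectors."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def deduplicate(docs, threshold=0.9):
    """Keep the first of any pair of near-duplicate articles (assumed threshold)."""
    vecs, kept = tfidf_vectors(docs), []
    for i in range(len(docs)):
        if all(cosine(vecs[i], vecs[j]) < threshold for j in kept):
            kept.append(i)
    return [docs[i] for i in kept]

news = ["flood hits city center", "flood hits city center",
        "dam reservoir released"]
print(len(deduplicate(news)))  # 2 -- the verbatim duplicate is dropped
```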

    • YU Zhenyan, LI Shaomei, MA Jingzhen, LI Fengchang, YOU Ning, REN Liunan

      [Objectives] The goal of style transfer for landform landscape maps is to endow them with specific artistic forms while maintaining the accuracy of landform expression, thereby enriching and optimizing the effect of landform landscape maps. Generative Adversarial Networks (GANs), which are widely applied in image generation tasks, provide a new approach for the style transfer of landform landscape maps. However, when traditional GANs are used for this task, there are two key issues: first, it is difficult to capture the contextual relationships between long-distance pixels in the image, leading to poor style transfer effects; second, high-frequency information is easily lost during the training process, resulting in blurry generated images and loss of edge features, which fails to ensure the accuracy of landform mapping expression. To address these problems, this study improves the GAN and proposes a style transfer model for landform landscape maps, namely RGAN. [Methods] Firstly, a mixed convolution-Transformer feature extraction module, MixTrans, is designed in the GAN encoder, which enhances the ability to extract long-distance contextual relationships. Secondly, the loss function of the traditional GAN is improved by constructing a Laplacian High-Frequency Loss module (LPCLoss), thereby reducing the loss of high-frequency image detail during training. Then, a dataset based on the painting "Thousand Miles of Mountains and Rivers" is manually constructed for model training. Finally, the model is applied to the task of generating landform landscape maps for real regions, producing landform landscape maps with the artistic style of "Thousand Miles of Mountains and Rivers". [Results] Based on the self-built dataset, RGAN is compared with mainstream existing style transfer models (CycleGAN, StyTr, ArtFlow). 
Quantitative results demonstrate that the MSE, PSNR, SSIM, and LPIPS values of RGAN reach 0.0178, 17.90, 0.324, and 0.3711, respectively, the best values among all compared models. Compared with CycleGAN, the strongest of the compared models, RGAN achieves a 23.9% reduction in MSE, a 3% increase in PSNR, a 2.6% improvement in SSIM, and a 4.6% decrease in LPIPS. Meanwhile, visual results further demonstrate the superiority of RGAN. In the task of generating landform landscape maps for three regions with distinct geomorphic types (Lushan in Jiangxi Province, Liangshan in Sichuan Province, and Wuyishan in Fujian Province), landform landscape maps featuring the artistic style of "Thousand Miles of Mountains and Rivers" were successfully generated, demonstrating that the RGAN model adapts well to various geomorphic types. [Conclusions] The RGAN model proposed in this paper demonstrates excellent capabilities in image feature extraction and image generation. It performs outstandingly in the style transfer task for landform landscape maps, successfully achieving artistic cartographic expression of landforms and thereby providing a novel automated method for the production of landform landscape maps.
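A minimal sketch of a Laplacian high-frequency loss of the kind LPCLoss describes: filter both images with a Laplacian kernel to isolate edges, then compare the responses. The 3x3 kernel and the L1 comparison are standard choices assumed here, not the paper's exact formulation:

```python
# Standard 3x3 discrete Laplacian kernel (assumed choice).
KERNEL = [[0, 1, 0],
          [1, -4, 1],
          [0, 1, 0]]

def laplacian(img):
    """Apply the Laplacian filter to a 2D list image (borders left at 0)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(KERNEL[dy][dx] * img[y + dy - 1][x + dx - 1]
                            for dy in range(3) for dx in range(3))
    return out

def high_freq_loss(generated, target):
    """Mean absolute difference between the two images' Laplacian responses."""
    lg, lt = laplacian(generated), laplacian(target)
    diffs = [abs(a - b) for rg, rt in zip(lg, lt) for a, b in zip(rg, rt)]
    return sum(diffs) / len(diffs)

flat = [[1.0] * 4 for _ in range(4)]                # no edges
edge = [[1.0, 1.0, 5.0, 5.0] for _ in range(4)]     # vertical edge
print(high_freq_loss(flat, edge) > 0)  # True: missing edge content is penalized
```

During training, this term is added to the adversarial loss so that generated maps preserve the edge features of the input terrain.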

    • CHEN Moran, REN Hongyan, LU Weili

      [Objectives] As urbanization progresses at an accelerating pace, outbreaks of infectious diseases have become more frequent in certain highly urbanized areas of Chinese cities, posing substantial challenges to traditional prevention and control frameworks. There is a pressing need for novel approaches to identifying and characterizing the specific urban nodes that act as critical hubs for the spatial spread of infectious diseases. Such advancements are essential to enhancing the capabilities of targeted prevention and control against these diseases in highly urbanized areas. [Methods] Based on the classic co-location pattern mining algorithm, this study establishes a method for identifying urban facilities typical of the spread and prevalence of infectious diseases in urban areas, by adjusting the determination of the spatial proximity threshold, optimizing the search strategy, and introducing Monte Carlo simulation. Through a case study identifying spatial nodes of dengue fever transmission and prevalence in Guangzhou from 2017 to 2019, the Participation Index (PI) was calculated to measure the co-location relationship between infectious disease cases and urban facilities, followed by a significance test at the 0.05 level. On this basis, the effectiveness of identifying spatial nodes for dengue fever transmission and prevalence in Guangzhou was compared and analyzed from perspectives including the number of spatial nodes, spatial proximity distance, and case density. [Results] Significance tests based on Monte Carlo simulation can effectively eliminate urban facilities exhibiting spurious spatial associations with dengue fever cases. Compared with the uniform threshold strategy, the adaptive threshold determination strategy using K-D tree search identifies more spatial nodes, whose spatial proximity to cases better reflects the actual situation. 
Simultaneously, the dengue epidemic intensity (case density) of areas near spatial nodes such as newsstands (0.22 < PI < 0.73) and convalescent facilities (0.41 < PI < 0.61) was significantly higher than that of district-wide and non-node areas. Furthermore, during 2017-2019, the spatial nodes of dengue fever transmission and prevalence in Guangzhou exhibited distinct regional and interannual variations, and the spatial node results were closely associated with case characteristics such as age and occupation. [Conclusions] The method for identifying spatial nodes of urban disease transmission and prevalence based on the co-location pattern mining algorithm can accurately pinpoint key urban facilities influencing the spread of infectious diseases within complex urban environments and populations, demonstrating promising application prospects. This study deepens our understanding of the transmission processes of urban infectious diseases and provides methodological support for targeted prevention and control of infectious diseases such as dengue fever.
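The Participation Index used above is a standard co-location measure: for each feature in a pattern, compute the fraction of its instances that take part in at least one row instance of the pattern, and take the minimum over features. A small sketch, with toy case/facility counts invented for illustration:

```python
def participation_index(instances, pattern_rows):
    """PI of a co-location pattern.

    instances: {feature: set of instance ids in the study area}
    pattern_rows: list of {feature: instance id} row instances, i.e. groups of
    instances (one per feature) that fall within the spatial proximity threshold.
    """
    ratios = []
    for feature, ids in instances.items():
        participating = {row[feature] for row in pattern_rows}
        ratios.append(len(participating) / len(ids))
    return min(ratios)  # the pattern is only as prevalent as its rarest feature

# Toy example: 4 dengue cases and 2 newsstands; three case-newsstand pairs
# lie within the proximity threshold.
instances = {"case": {1, 2, 3, 4}, "newsstand": {10, 11}}
rows = [{"case": 1, "newsstand": 10},
        {"case": 2, "newsstand": 10},
        {"case": 3, "newsstand": 11}]
print(participation_index(instances, rows))  # 0.75
```

In the paper's workflow, a facility type is retained as a spatial node only if its PI survives the Monte Carlo significance test at the 0.05 level.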

    • GUO Fengyi, HUANG Kaixin, CHEN Keyu, SUN Jun

      [Objectives] Rainfall-induced landslide displacement sequences are characterized by strong non-stationarity, complex cross-scale coupling, and dynamic mismatch during step-like abrupt deformation stages. Existing deep learning models, which often rely on single-signal decomposition or pure data-driven strategies, frequently exhibit critical limitations at the onset of these abrupt changes. Common failures include significant phase lag, amplitude attenuation, and spurious oscillation, rendering them incapable of simultaneously maintaining prediction accuracy for both long-term gravitational creep and short-term impulsive responses. To address the structural prediction distortion that occurs during the abrupt-change triggering stage, this paper proposes a novel deep learning prediction model. This model integrates an improved adaptive secondary decomposition strategy with physics-consistency constraints to enhance predictive fidelity. [Methods] The proposed framework is implemented through a rigorous three-step process. First, to mitigate the "mode mixing" problem inherent in original displacement sequences, Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) is employed as a primary filter to extract long-term trend terms and separate them from non-stationary components. Second, addressing the challenge of extracting high-frequency rainfall response signals from the residuals, the Crested Porcupine Optimizer (CPO) is introduced to adaptively optimize the key parameters (K and α) of Variational Mode Decomposition (VMD). This allows for a refined secondary decomposition of high-frequency residuals, effectively isolating the impulse signals caused by heavy rainfall. Furthermore, the model constructs a multi-dimensional kinematic feature set, including velocity, acceleration, and the improved tangent angle. 
A hybrid LSTM-Transformer network is then utilized, where the Long Short-Term Memory (LSTM) module captures local temporal dependencies and the Transformer module identifies global cross-scale correlations. Crucially, a physics-consistency regularization term is incorporated into the loss function to constrain the model's output, ensuring the prediction results adhere to the dynamic laws of landslide evolution. [Results] The model was validated using monitoring data from the Zaoshuwa landslide in Yunxi County, Shiyan City, Hubei Province. Experimental results demonstrate that the proposed method significantly outperforms traditional approaches. During periods of strong rainfall-induced abrupt deformation, the model achieved a phase lag of less than one day, virtually eliminating the delay common in standard models. The comprehensive prediction accuracy was high, with a Root Mean Square Error (RMSE) of 2.975 mm and a coefficient of determination (R²) of 0.985, an improvement of approximately 38% compared with single-decomposition models. Furthermore, the physics constraints effectively suppressed spurious fluctuations and false alarms during non-rainfall periods, overcoming a major defect of pure data-driven models. [Conclusions] The study not only validates the effectiveness of physical feature constraints but also provides a new paradigm for the refined separation of sudden signals in landslides, while significantly enhancing the physical interpretability and engineering reliability of deep learning models in geological disaster prediction.
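The physics-consistency regularization term can be illustrated under one plausible constraint, assumed here rather than taken from the paper: cumulative landslide displacement should not decrease over time, so negative predicted increments (spurious "backward" motion) are penalized alongside the ordinary data-fit term:

```python
def physics_penalty(pred, weight=1.0):
    """Sum of squared negative increments in a predicted displacement series
    (an assumed monotonicity constraint, not the paper's exact term)."""
    penalty = 0.0
    for prev, cur in zip(pred, pred[1:]):
        step = cur - prev
        if step < 0:                 # displacement moving "backwards"
            penalty += step * step
    return weight * penalty

def total_loss(pred, obs, lam=0.1):
    """Data term (MSE) plus the physics-consistency regularizer."""
    mse = sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)
    return mse + lam * physics_penalty(pred)

obs = [0.0, 1.0, 2.5, 4.0]           # monotone creep (mm)
good = [0.0, 1.1, 2.4, 4.1]          # physically plausible prediction
bad = [0.0, 2.0, 1.0, 4.0]           # spurious backward oscillation
print(total_loss(good, obs) < total_loss(bad, obs))  # True
```

In training, such a term steers the network away from the spurious oscillations described above without changing its architecture.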

    • YIN Daolong, CHANG Ming, XU Qiang, CHEN Ming, ZHAO Boju, DONG Xiujun, LIANG Jingtao

      [Objectives] In the early morning of July 5, 2025, a flash flood and debris flow occurred in Mozi Gully, Fuxiang Township, Hanyuan County, Ya’an City, Sichuan Province, causing road damage but no casualties. The area lies in the transition zone on the eastern margin of the Tibetan Plateau, characterized by complex geological structures and a fragile geological environment. Concentrated precipitation and frequent short-duration intense rainfall create a high risk of flash flood and debris flow disasters in this region. Against the backdrop of increasing extreme weather events driven by global climate change, unraveling the disaster-causing mechanisms under extreme rainfall is therefore crucial for enhancing prevention and mitigation capabilities in mountainous regions. [Methods] This study proposes an innovative multi-factor coupling analytical framework termed "Water-Soil-Air-Biology," which systematically integrates four key dimensions: hydrological convergence potential, sediment supply conditions, climatic triggers, and biological response. By using Unmanned Aerial Vehicle (UAV) aerial surveys to obtain high-precision terrain and material source data, integrating multi-source remote sensing images for vegetation monitoring and snow cover interpretation, and verifying the results through ground field investigation, the framework reproduces the entire disaster process with physics-based numerical simulation, analyzing the causes and dynamic evolution of the event and realizing full-chain dynamic simulation from rainfall infiltration and slope instability to material transport and final accumulation. [Results] Short-duration intense rainfall is the main triggering factor, and its spatial distribution is consistent with the high-value areas of the topographic wetness index and the runoff intensity index, significantly exacerbating channel erosion. 
Sediment availability, particularly the extensive and readily mobilizable historical deposits within U-shaped channels, coupled with high sediment connectivity, formed the material basis of the disaster. Vegetation degradation and human activities further compromised slope stability and altered surface runoff-infiltration relationships. Numerical inversion successfully reconstructed the dynamic process, revealing significant hazard at the gully mouth, with a maximum flow depth of 14.17 m and a peak velocity of 23.22 m/s, posing a severe threat to local infrastructure. [Conclusions] This study systematically elucidates the complex disaster-causing mechanisms of flash floods and debris flows induced by extreme rainfall, addressing the shortcomings of traditional methods in capturing multi-factor interactions. The framework not only provides a solid scientific basis and a practical technical pathway for refined early warning and risk prevention of mountain torrent disasters but also offers a valuable reference for simulating similar events via its full-process numerical modeling approach. Future research should integrate artificial intelligence, real-time sensing, and multi-source monitoring data to develop intelligent early-warning and risk management platforms, thereby advancing disaster prevention and mitigation towards greater precision, intelligence, and proactivity.

    • WU Xiaoli, ZHANG Yuanbin

      [Objectives] This review synthesizes research advances in spaceborne LiDAR technology, characterized by high penetration capability and resolution, for studying the spatial distribution structure of forest communities and species diversity. In contrast to traditional remote sensing, this technology actively acquires three-dimensional vertical information on forests, enabling precise characterization of forest profile structures and providing key technical support for understanding the relationship between spatial heterogeneity and species diversity. [Progress] It systematically examines and analyzes in depth the methods used to estimate forest structural parameters and their applicability, including detection precision and calibration outcomes, along with an objective commentary on the current advantages and limitations of this technology. Trends in utilizing structural indicators derived from diverse LiDAR data to estimate forest structural diversity and to differentiate forest types and successional stages are explored. [Significance] The technology's utility in elucidating the interplay between forest community dynamics and ecosystem composition and functioning is discussed, thereby supporting the monitoring and optimization of nutrient cycling, enhanced soil and water conservation, resilience assessment, and insights into soil microbial functional groups. [Prospect] Future efforts should prioritize refining the accuracy and transferability of inversion models for vegetation parameters under varying conditions, alongside deepened research into multi-source data fusion and the spatial analysis of understory shrub strata.

    • YANG Jun, ZHU Qimin

      [Objectives] Hyperspectral images have attracted much attention because of their rich spectral information. However, due to the limitations of imaging hardware, it is usually difficult to acquire hyperspectral images with high spatial resolution directly. Fusing hyperspectral images with high-spatial-resolution multispectral images of the same scene is an economical and effective way to improve resolution. However, most existing deep learning-based methods do not fully exploit the spatial and spectral correlations between images, resulting in limited fusion performance. [Methods] Therefore, a hyperspectral image super-resolution fusion method combining image de-noising, spectral feature enhancement, and spatial feature enhancement is proposed. Firstly, by applying Gaussian blur kernels with different standard deviations to the hyperspectral and multispectral images, the noise contained in these two modalities can be effectively reduced. Secondly, to improve the accuracy of the fused image, channel attention and spatial attention are introduced into the reconstruction of high-resolution images, exploiting the spatial and spectral correlations of the different modalities; by enhancing the key information of the images, stronger spatial and spectral correlations between the modalities are obtained. Finally, using the enhanced spatial and spectral correlations, the high-resolution image features obtained from the mapping are aggregated to reconstruct hyperspectral images with high spatial resolution. [Results] The PSNR values of the fusion results on the ZY-m and Chikusei datasets are 53.586 and 53.738, respectively, 2.8% higher than the suboptimal method, the Spatial-Spectral Unfolding Network with Mutual Guidance (SMGU-Net), on the ZY-m dataset and 1.70% higher than the suboptimal method, the Diffusion Model with two Conditional Modulation Modules (DDIF), on the Chikusei dataset. 
The SAM values reach 0.006 and 0.018, 14.28% lower than SMGU-Net on the ZY-m dataset and 5.26% lower than DDIF on the Chikusei dataset. [Conclusions] The proposed method has good spectral fidelity and spatial detail enhancement capabilities, offering an effective technical solution for hyperspectral image super-resolution and showing strong potential for applications in fields such as land resource exploration and environmental monitoring.
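The SAM metric reported above, the Spectral Angle Mapper, measures spectral fidelity as the angle between a reconstructed pixel spectrum and the reference spectrum, averaged over pixels; lower is better, and scale differences do not affect it. A minimal sketch:

```python
import math

def spectral_angle(u, v):
    """Angle (radians) between two pixel spectra; 0 means identical shape."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    # Clamp for floating-point safety before acos.
    return math.acos(max(-1.0, min(1.0, dot / (nu * nv))))

def mean_sam(recon, ref):
    """Average spectral angle over all pixel spectra."""
    angles = [spectral_angle(u, v) for u, v in zip(recon, ref)]
    return sum(angles) / len(angles)

ref = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]
ideal = [[2.0, 4.0, 6.0], [1.0, 1.0, 1.0]]  # same directions, scaled
print(mean_sam(ideal, ref))  # close to 0: identical spectral shapes
```

Because SAM ignores overall brightness, it complements PSNR and SSIM, which are dominated by spatial reconstruction quality.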