Journal of Geo-information Science, 2020, Vol. 22, Issue (8): 1597-1606. doi: 10.12082/dqxxkx.2020.190385

Urban Area Extraction based on Independent Component Analysis and Random Forest Algorithm

PU Dongchuan1,2, WANG Guizhou1,3,4, ZHANG Zhaoming1,3,4,*, NIU Xuefeng2, HE Guojin1,3,4, LONG Tengfei1,3,4, YIN Ranyu1,3,4, JIANG Wei1,3,4, SUN Jiayue2

    1. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
    2. College of Geo-exploration Science and Technology, Jilin University, Changchun 130026, China
    3. Key Laboratory of Earth Observation Hainan Province, Sanya 572029, China
    4. Sanya Institute of Remote Sensing, Sanya 572029, China
  • Received: 2019-07-19 Revised: 2019-11-25 Online: 2020-08-25 Published: 2020-10-25
  • Contact: ZHANG Zhaoming
  • Supported by:
    National Natural Science Foundation of China (61731022); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA19090300); National Key Research and Development Project (2016YFA0600302); National Key Research and Development Project (2016YFB0501502)


Urban area information is of great significance for human development in the context of the United Nations (UN) 2030 Agenda for Sustainable Development. Urban areas have expanded rapidly in many parts of the world, and accurate, timely urban area information is very important for decision makers. However, land cover in urban areas is highly complex, including artificial buildings, trees, grasslands, and water bodies. Extracting urban land cover information through traditional manual surveys is time-consuming, and the results are difficult to keep up to date. Freely accessible remote sensing satellite data such as Landsat provide a rich data source for urban area extraction, and urban area information extracted from spaceborne remote sensing images can provide basic scientific data for decision-making and for city construction and management. With supervised classification and satellite remote sensing data, urban areas can be extracted quickly. However, choosing appropriate feature variables is essential for an accurate result; in particular, linear correlations between features have a significant impact on extraction accuracy. Applying an independent component analysis (ICA) transformation to satellite remote sensing image data yields linearly independent feature variables, which can effectively improve the accuracy of urban area extraction. Taking Beijing as the study area and Landsat 8 Operational Land Imager (OLI) imagery (path/row: 123/32) acquired on July 10, 2017 as the experimental data, preprocessing, texture extraction, independent component analysis, and principal component analysis were performed, and 29 features in 4 dimensions and 7 feature variable combinations were selected. The Random Forest (RF) algorithm was then chosen for urban area extraction owing to its stable performance, high classification accuracy, and feature importance evaluation capability.
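As a rough illustration of why an ICA transformation helps with correlated features, the following sketch applies scikit-learn's `FastICA` to synthetic data and stacks the result with the original "bands". Everything here is an assumption made for illustration: the data are random mixtures rather than Landsat 8 OLI pixels, and the number of components, the helper `max_abs_offdiag_corr`, and the choice of `FastICA` are not taken from the paper.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Hypothetical stand-in for an image: 5000 pixels x 7 correlated "bands",
# built by mixing 3 non-Gaussian sources (not real Landsat 8 OLI data).
n_pixels, n_bands, n_sources = 5000, 7, 3
sources = rng.laplace(size=(n_pixels, n_sources))
mixing = rng.uniform(0.5, 1.5, size=(n_sources, n_bands))
bands = sources @ mixing + 0.01 * rng.normal(size=(n_pixels, n_bands))

# ICA transformation: each output column is one independent component.
ica = FastICA(n_components=n_sources, random_state=0, max_iter=1000)
ica_features = ica.fit_transform(bands)


def max_abs_offdiag_corr(x):
    """Largest absolute pairwise correlation between the columns of x."""
    corr = np.corrcoef(x, rowvar=False)
    off_diagonal = corr - np.diag(np.diag(corr))
    return float(np.abs(off_diagonal).max())


band_corr = max_abs_offdiag_corr(bands)        # strongly correlated inputs
ica_corr = max_abs_offdiag_corr(ica_features)  # close to zero after ICA

# One candidate feature combination: spectral bands plus ICA components.
features = np.hstack([bands, ica_features])
print(f"max |corr| bands: {band_corr:.3f}, ICA components: {ica_corr:.2e}")
print("stacked feature matrix shape:", features.shape)
```

On real imagery the band array would be obtained by reshaping the image cube to `(rows * cols, bands)` before the transformation.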
Based on the random forest algorithm, feature importance evaluation, urban area extraction, and accuracy assessment were carried out to determine the optimal feature combination for urban area extraction. It was found that: (1) the overall accuracy of urban area extraction with spectral and ICA-transformed features is 93.1% and the Kappa coefficient is 0.86, which is superior to the results with the other feature combinations; (2) training the random forest on the data yields the normalized importance of each feature. The normalized importance of the features is similar to the standard deviation of the mean values of the features, indicating that the importance estimate is closely related to that standard deviation and that both can be used to estimate the importance of the variables.
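The evaluation steps named above (random forest training, overall accuracy, Kappa coefficient, and normalized feature importance) can be sketched as follows. This is a minimal example under stated assumptions, not the authors' implementation: the samples are synthetic, the two classes and four features are hypothetical, and scikit-learn's impurity-based `feature_importances_` (which sums to 1) stands in for the normalized importance reported in the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import cohen_kappa_score, confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical samples: 2 classes (0 = non-urban, 1 = urban), 4 features.
# Only the first two features carry class information.
n_samples = 2000
y = rng.integers(0, 2, size=n_samples)
X = rng.normal(size=(n_samples, 4))
X[:, 0] += 2.0 * y
X[:, 1] -= 1.5 * y

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)
y_pred = rf.predict(X_test)

# Accuracy assessment from the confusion matrix (rows: reference, columns: predicted).
cm = confusion_matrix(y_test, y_pred)
total = cm.sum()
overall_accuracy = np.trace(cm) / total
# Agreement expected by chance, from the row and column marginals.
expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total**2
kappa = (overall_accuracy - expected) / (1.0 - expected)

# Cross-check the hand-computed Kappa against scikit-learn.
kappa_reference = cohen_kappa_score(y_test, y_pred)

# Impurity-based importances, normalized by scikit-learn to sum to 1.
importance = rf.feature_importances_
ranking = np.argsort(importance)[::-1]

print(f"overall accuracy: {overall_accuracy:.3f}, kappa: {kappa:.3f}")
print("features ranked by importance:", ranking.tolist())
```

Repeating this evaluation for each candidate feature combination and comparing the resulting overall accuracy and Kappa values is one way to select the best combination.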

Key words: random forest, independent component analysis, principal component analysis, gray level co-occurrence matrix, satellite remote sensing, urban area, Landsat 8, feature importance