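Li Guangyang (1994- ), male, from Tangshan, Hebei; master's student, mainly engaged in research on forestry remote sensing and information technology. E-mail: li15130574839@163.com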
Multiple Kernel Learning Algorithm and its Application Research Progress in Hyperspectral Image Classification
Received date: 2020-09-17
Revision requested date: 2020-12-30
Online published: 2021-05-25
Supported by
National Natural Science Foundation of China(31760181)
National Natural Science Foundation of China(31400493)
National Natural Science Foundation of China(31860181)
National Natural Science Foundation of China(1860320)
The Joint Special Project of Basic Agricultural Research in Yunnan Province(2017FG001-034)
The Joint Special Project of Basic Agricultural Research in Yunnan Province(2018FG001-059)
Digital Development and Application of Biological Resources(202002AA10007)
Yunnan "Ten Thousand People Plan" Youth Top-notch Talent Project
Southwest Forestry University Scientific Research Start-up Fund project(111821)
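Li Guangyang, Kou Weili, Chen Bangqian, Dai Fei, Qiang Zhenping, Wu Chao. Multiple Kernel Learning Algorithm and its Application Research Progress in Hyperspectral Image Classification[J]. Journal of Geo-information Science, 2021, 23(3): 492-504. DOI: 10.12082/dqxxkx.2021.200536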
Hyperspectral images are widely used in target detection, spectral decomposition, classification, and many other fields, and they offer greater discriminative power than grayscale, panchromatic, and multispectral images. However, making effective use of hyperspectral images, which have a large number of bands, huge data volumes, and considerable information redundancy, remains an important research topic. Multiple kernel learning is a typical multi-view learning method: it constructs different kernel functions for different feature spaces and combines them into a single optimal kernel function for hyperspectral image classification. Compared with single-kernel methods, multiple kernel learning has distinct advantages in handling the uneven spatial distribution of high-dimensional features, using information more efficiently, and improving classification accuracy. At present, the main research difficulties of multiple kernel learning algorithms are how to combine the kernel functions and how to select the optimal weight coefficients. To improve classification accuracy and promote the application of multiple kernel learning algorithms in hyperspectral image classification, we review their development history and current research progress. First, kernel learning methods and the framework of multiple kernel learning algorithms are introduced, and the specific methods of kernel function combination are summarized. The literature indicates that linear combination is widely used because of its simplicity and efficiency. Moreover, according to how the weight coefficients of the combination are determined, multiple kernel learning algorithms can generally be divided into two categories: fixed-rule algorithms and optimization-based algorithms.
Then, the applications of multiple kernel learning algorithms from each category in hyperspectral image classification are reviewed. To help researchers discuss open problems in multiple kernel learning and hyperspectral image classification, the commonly used kernel functions and the widely used data sets are also reviewed. Finally, we discuss the deficiencies of multiple kernel learning algorithms in hyperspectral image classification and point out future research directions to help solve practical application problems.
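Table 1 Formulas and characteristics of commonly used kernel functions
Name | Equation No. | Parameters | Characteristics
Linear kernel | (3) | The inner product of the two input samples | Few parameters and fast to compute; performs well on linearly separable data but cannot classify data that are not linearly separable
Polynomial kernel | (4) | The degree parameter determines the dimensionality of the kernel's feature space; the second parameter controls the weights of the monomials of different degrees and is generally set to a fixed value | Makes linearly inseparable data separable by mapping to a higher dimension, but it has more parameters and a relatively high computational cost, so it is used relatively rarely
Radial basis function kernel (Gaussian kernel) | (5) | The scale parameter controls the kernel's ability to measure similarity between samples | Measures the similarity between samples by Euclidean distance; it is the most widely used kernel because it is relatively simple to compute and translation invariant, but it has drawbacks such as poor interpretability and a tendency to overfit
Sigmoid kernel | (6) | One parameter scales the amplitude of the input data; the other is a shift parameter that controls the mapping threshold | Derived from the threshold function of neural networks and has strong classification ability, but its two parameters make it harder to apply, and its kernel matrix is positive semidefinite only under certain conditions, so it is not a Mercer kernel in general and its use is somewhat limited
Wavelet kernel | (7) | A scale parameter | Has advantages in multi-scale analysis; it can approximate arbitrary nonlinear functions and improve classification accuracy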
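The kernels listed in Table 1 can be illustrated with a short NumPy sketch. The formulas below are the common textbook forms and are an assumption here: the original equations (3)-(7) are not reproduced in this text, so the parameterization may differ from the paper's. The parameter names (`degree`, `coef0`, `sigma`, `alpha`, `c`, `a`) and the Morlet-style form of the wavelet kernel are likewise illustrative choices, not the authors' notation.

```python
import numpy as np

def linear_kernel(X, Y):
    # Inner product between every pair of rows of X and Y.
    return X @ Y.T

def polynomial_kernel(X, Y, degree=2, coef0=1.0):
    # (x . y + coef0) ** degree; coef0 weights the lower-order monomials.
    return (X @ Y.T + coef0) ** degree

def rbf_kernel(X, Y, sigma=1.0):
    # exp(-||x - y||^2 / (2 sigma^2)), based on squared Euclidean distance.
    sq = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2 * sigma**2))

def sigmoid_kernel(X, Y, alpha=0.01, c=0.0):
    # tanh(alpha * x . y + c); not positive semidefinite for all alpha, c.
    return np.tanh(alpha * (X @ Y.T) + c)

def wavelet_kernel(X, Y, a=1.0):
    # Assumed Morlet-style form: product over features of
    # cos(1.75 * d / a) * exp(-d^2 / (2 a^2)), with d = x_i - y_i.
    diff = (X[:, None, :] - Y[None, :, :]) / a
    return np.prod(np.cos(1.75 * diff) * np.exp(-diff**2 / 2), axis=2)

# Toy data standing in for pixel spectra: 5 samples, 4 bands.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))

K_lin = linear_kernel(X, X)
K_poly = polynomial_kernel(X, X)
K_rbf = rbf_kernel(X, X)
K_sig = sigmoid_kernel(X, X)
K_wav = wavelet_kernel(X, X)
```

Each function returns a Gram matrix with one row per sample in `X` and one column per sample in `Y`, which is the form a kernel classifier with a precomputed kernel would consume.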
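Table 2 Formulas and characteristics of linear combination methods in multiple kernel learning
Name | Equation No. | Parameters | Characteristics
Direct summation kernel | (11) | The number of kernels | Simple to construct, with comparatively strong learning performance
Weighted summation kernel | (12) | The kernel weights, i.e., the combination coefficients; M is a constant | Adjusting the weights allows the learning ability of the combined kernel to be tuned more flexibly and improves classification performance; compared with other combination methods, it is better suited to hyperspectral image classification
Weighted polynomial extended kernel | (13) | An extension of the kernel function | With suitably chosen weight coefficients, it effectively preserves the model's local learning ability and global generalization ability, keeping both test and training errors within a small range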
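The three combination rules in Table 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's equations (11)-(13), which are not reproduced in this text: the weighted sum assumes non-negative weights that sum to one (a common convention), and the weighted polynomial extended kernel is assumed to take the form mu * K1^p + (1 - mu) * K2^q with element-wise powers. All function and parameter names are hypothetical.

```python
import numpy as np

def sum_kernel(kernels):
    # Direct summation: every base kernel contributes equally.
    return sum(kernels)

def weighted_sum_kernel(kernels, weights):
    # Convex combination of base kernels (assumed constraint:
    # weights >= 0 and sum(weights) == 1).
    weights = np.asarray(weights, dtype=float)
    if len(kernels) != len(weights):
        raise ValueError("need one weight per kernel")
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * K for w, K in zip(weights, kernels))

def weighted_poly_extended_kernel(K1, K2, mu, p=1, q=1):
    # Assumed form: mu * K1**p + (1 - mu) * K2**q, element-wise powers
    # with positive integer p, q and 0 <= mu <= 1.
    return mu * K1**p + (1 - mu) * K2**q

# Two base Gram matrices on toy data: 6 samples, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
K_lin = X @ X.T
sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
K_rbf = np.exp(-sq_dist / 2.0)

K_sum = sum_kernel([K_lin, K_rbf])
K_w = weighted_sum_kernel([K_lin, K_rbf], [0.3, 0.7])
K_ext = weighted_poly_extended_kernel(K_lin, K_rbf, mu=0.5, p=2, q=1)
```

Because non-negative combinations and element-wise integer powers of positive semidefinite matrices remain positive semidefinite, each result here is still a valid kernel matrix when the base kernels are. How the weights are chosen (by a fixed rule or by optimization) is what separates the two algorithm categories named in the abstract; the sketch simply takes them as given.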
Table 3 Commonly used data sets for hyperspectral image classification
No. | Data set | Year acquired | Sensor | Location | Pixels | Spatial resolution/m | Number of land-cover classes | References
1 | Indian Pines | 1992 | Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) | Indian Pines test site, Indiana, USA | 145×145 | 20 | 16 | [12][29]
2 | Pavia University | 2003 | Reflective Optics System Imaging Spectrometer (ROSIS-03) | Part of the hyperspectral scene acquired over the city of Pavia, Italy | 610×340 | 1.3 | 9 | [26][27]
3 | Salinas | | Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) | Salinas Valley, California, USA | 512×217 | 3.7 | 16 | [56]
4 | Cuprite | 1997 | Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) | Cuprite area, Nevada, USA | | 20 | | [57]
5 | Kennedy Space Center | 1996 | Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) | Kennedy Space Center (KSC), Florida, USA | | 18 | 13 | [58]
6 | Botswana | 2001 | NASA EO-1 satellite | Okavango Delta, Botswana | | 30 | 14 | [59]
7 | Washington DC | | Hydice sensor | Washington DC Mall | 1208×307 | | 7 | [60]
8 | Houston | 2013 | ITRES CASI-1500 sensor | University of Houston campus and the neighboring urban area | 349×1905 | 2.5 | 15 | [61]
9 | Houston | 2012 | Compact Airborne Spectrographic Imager (CASI) | University of Houston campus and its surroundings | 1905×349 | 2.5 | 20 | [62]
10 | Chikusei airborne hyperspectral image | 2014 | Headwall Hyperspec-VNIR-C sensor | Chikusei, Japan | 2517×2335 | 2.5 | 19 | [63]
