Selective Ensemble Method of Extreme Learning Machine Based on Double-fault Measure

Pingfan XIA, Zhiwei NI, Xuhui ZHU, Liping NI

Citation: Pingfan XIA, Zhiwei NI, Xuhui ZHU, Liping NI. Selective Ensemble Method of Extreme Learning Machine Based on Double-fault Measure[J]. Journal of Electronics and Information Technology. doi: 10.11999/JEIT190617


doi: 10.11999/JEIT190617
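Details
    About the authors:

    Pingfan XIA: female, born in 1994, Ph.D. candidate. Research interests include machine learning, artificial intelligence, and ensemble learning.

    Zhiwei NI: male, born in 1963, professor and doctoral supervisor. Research interests: artificial intelligence, machine learning, and cloud computing.

    Xuhui ZHU: male, born in 1991, lecturer and master's supervisor. Research interests: evolutionary computation and machine learning.

    Liping NI: female, born in 1981, associate professor and master's supervisor. Research interests: fractal data mining, artificial intelligence, and machine learning.

    Corresponding author:

    Xuhui ZHU zhuxuhui@hfut.edu.cn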

  • Chinese Library Classification (CLC) number: TP391

Funds: National Natural Science Foundation of China (91546108, 71521001); Anhui Provincial Natural Science Foundation (1908085QG298, 1908085MG232); Open Research Fund Program of the Key Laboratory of Process Optimization and Intelligent Decision-making, Ministry of Education; Fundamental Research Funds for the Central Universities (JZ2019HGTA0053, JZ2019HGBZ0128)
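  • Abstract: The Extreme Learning Machine (ELM) learns quickly, is easy to implement, and generalizes well, but the classification performance of a single ELM is unstable. Ensemble learning can effectively improve on a single ELM, but as the data size and the number of base ELMs grow, the computational complexity rises sharply and consumes substantial computing resources. To address these problems, this paper proposes a selective ensemble method for ELMs based on the double-fault measure and analyzes it in detail from both theoretical and experimental perspectives. First, the bootstrap method is used to resample the training set repeatedly, yielding multiple training subsets; an ELM is trained independently on each subset, producing base ELMs with large diversity that form the base-ELM pool. Second, the double-fault measure of each base ELM is computed, and the base ELMs are sorted in ascending order of that measure. Finally, using majority voting, the base ELMs are added to the ensemble one by one in that order until the ensemble accuracy is optimal, which gives the optimal sub-ensemble of base ELMs; the theoretical basis of this procedure is also analyzed. Experimental results on 10 UCI datasets show that, compared with other methods, the proposed method uses fewer base ELMs and achieves higher ensemble accuracy, which demonstrates its effectiveness and significance.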
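The three steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the ELM is the basic single-hidden-layer form (random hidden weights, output weights from a pseudo-inverse), the per-classifier double-fault score is taken as the mean pairwise double-fault against the rest of the pool, "until the ensemble accuracy is optimal" is read as "keep the best-scoring prefix of the ordered list", and the ordering and selection are done on a held-out validation set. All function names and hyperparameters are hypothetical.

```python
import numpy as np


def train_elm(X, y_onehot, n_hidden, rng):
    """Fit a basic ELM: random hidden layer, least-squares output weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))       # sigmoid hidden-layer output
    beta = np.linalg.pinv(H) @ y_onehot          # Moore-Penrose pseudo-inverse solution
    return W, b, beta


def predict_elm(model, X):
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.argmax(H @ beta, axis=1)


def build_pool(X, y, n_classes, pool_size, n_hidden, rng):
    """Step 1: train one ELM per bootstrap resample of the training set."""
    Y = np.eye(n_classes)[y]
    pool = []
    for _ in range(pool_size):
        idx = rng.integers(0, len(X), size=len(X))  # sample with replacement
        pool.append(train_elm(X[idx], Y[idx], n_hidden, rng))
    return pool


def double_fault_scores(wrong):
    """Step 2 (assumed form): mean pairwise double-fault of each classifier.

    wrong[i, k] is True when classifier i misclassifies sample k.
    """
    w = wrong.astype(float)
    df = (w @ w.T) / wrong.shape[1]  # df[i, j]: fraction both i and j get wrong
    np.fill_diagonal(df, 0.0)
    return df.sum(axis=1) / max(len(wrong) - 1, 1)


def majority_vote(preds, n_classes):
    """Plurality vote over classifiers (rows); ties go to the lowest class index."""
    counts = np.stack([(preds == c).sum(axis=0) for c in range(n_classes)])
    return np.argmax(counts, axis=0)


def select_sub_ensemble(preds, y, n_classes):
    """Step 3: add classifiers in ascending double-fault order, keep the best prefix."""
    order = np.argsort(double_fault_scores(preds != y), kind="stable")
    best_acc, best_k = -1.0, 0
    for k in range(1, len(order) + 1):
        acc = np.mean(majority_vote(preds[order[:k]], n_classes) == y)
        if acc > best_acc:  # strict: prefer the smaller ensemble on ties
            best_acc, best_k = acc, k
    return order[:best_k], best_acc


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic two-class data, purely for illustration.
    X = rng.normal(size=(300, 5))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
    pool = build_pool(X[:200], y[:200], n_classes=2, pool_size=30, n_hidden=10, rng=rng)
    preds = np.array([predict_elm(m, X[200:]) for m in pool])
    chosen, acc = select_sub_ensemble(preds, y[200:], n_classes=2)
    print(f"kept {len(chosen)} of {len(pool)} base ELMs, validation accuracy {acc:.3f}")
```

Because the full pool is itself one of the prefixes that are evaluated, the selected sub-ensemble can never score below the full majority-vote ensemble on the data used for selection; accuracy on unseen data is a separate question.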
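  • Figure 1  Trend of ensemble classification accuracy when base ELMs are ordered by pairwise diversity measures, for different ensemble sizes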

    Table 1  Joint distribution of two classifiers

                                ${f_i}({x_k}) = {y_k}$    ${f_i}({x_k}) \ne {y_k}$
    ${f_j}({x_k}) = {y_k}$      $a$                       $b$
    ${f_j}({x_k}) \ne {y_k}$    $c$                       $d$
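    In terms of the cells of Table 1, the double-fault measure in its standard form is the share of samples that both classifiers misclassify, that is, cell $d$ relative to the total (this holds whether $a$ to $d$ are counts or proportions). The second expression, which averages over the other members of a pool of $L$ base classifiers to give one score per classifier, is an assumption for illustration; this page does not give the paper's exact per-classifier formula.

```latex
\[
DF_{i,j} = \frac{d}{a + b + c + d},
\qquad
DF_i = \frac{1}{L - 1} \sum_{j \ne i} DF_{i,j}
\]
```

    A low value means the classifier rarely fails on the same samples as the others, which is why an ascending sort places the most complementary base ELMs first.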

    Table 2  UCI datasets

    Dataset      Instances   Attributes   Classes
    Heart        270         13           2
    Cleveland    303         13           5
    Bupa         345         6            2
    Wholesale    440         7            2
    Diabetes     768         8            2
    German       1000        20           2
    QSAR         1055        41           2
    CMC          1473        9            3
    Spambase     4601        57           2
    Wineq-w      4898        11           7

    Table 3  Ensemble classification accuracy (%) for base-ELM pools of size 100, 200 and 300

                 Pool size 100                   Pool size 200                   Pool size 300
    Dataset      DFSEE   Max     Mean    Min     DFSEE   Max     Mean    Min     DFSEE   Max     Mean    Min
    Heart        80.48   75.00   63.60   51.14   82.48   76.33   63.63   49.24   82.24   76.57   63.68   48.57
    Cleveland    57.01   55.87   48.66   38.86   57.61   56.62   48.68   38.11   57.91   56.82   48.68   37.71
    Bupa         75.37   70.67   60.21   48.18   76.95   71.51   60.15   47.09   77.40   72.18   60.23   46.49
    Wholesale    94.04   89.56   82.78   74.19   94.78   90.26   82.78   73.48   95.15   90.59   82.73   73.04
    Diabetes     71.88   70.62   61.83   52.64   73.10   71.49   61.73   51.03   73.77   71.81   61.74   50.65
    German       77.40   75.17   69.61   63.63   78.08   76.12   69.63   62.83   78.58   76.40   69.64   62.62
    QSAR         86.26   82.67   74.45   65.63   87.78   83.63   74.52   64.92   88.28   83.89   74.49   63.98
    CMC          62.99   60.45   54.21   46.57   63.41   61.03   54.25   45.92   63.88   61.33   54.23   45.46
    Spambase     80.78   77.57   70.13   63.32   81.55   78.17   70.12   62.79   81.70   78.42   70.13   62.53
    Wineq-w      51.38   50.80   46.97   44.52   51.73   51.03   46.94   44.21   51.90   51.20   46.94   44.07

    Table 4  Comparison of DFSEE (this paper) and Bagging in accuracy (%) for base-ELM pools of size 100, 200 and 300

                 Pool size 100           Pool size 200           Pool size 300
    Dataset      Bagging   DFSEE   n     Bagging   DFSEE   n     Bagging   DFSEE   n
    Heart        72.10     80.48   13    71.67     82.48   11    71.71     82.24   11
    Cleveland    49.25     57.01   4     49.25     57.61   6     49.25     57.91   6
    Bupa         65.61     75.37   12    64.14     76.95   12    64.98     77.40   15
    Wholesale    86.44     94.04   8     86.00     94.78   10    86.11     95.15   11
    Diabetes     63.79     71.88   7     63.51     73.10   7     63.63     73.77   8
    German       74.13     77.40   14    74.40     78.08   9     74.38     78.58   9
    QSAR         80.22     86.26   9     80.41     87.78   8     80.47     88.28   9
    CMC          58.22     62.99   9     58.44     63.41   12    58.45     63.88   13
    Spambase     73.34     80.78   12    73.37     81.55   13    73.46     81.70   11
    Wineq-w      46.58     51.38   8     46.57     51.73   11    46.56     51.90   14

    Table 5  Comparison with other methods in ensemble accuracy (%) and ensemble size n (base-ELM pool size 200)

    Dataset      DFSEE   n     AGOB    n     POBE    n     MOAG    n     EP-FP   n     SCG-P   n
    Heart        82.48   11    74.14   49    77.52   96    74.86   43    74.38   95    75.24   38
    Cleveland    57.61   6     54.43   22    51.09   132   50.85   25    49.25   95    56.25   1
    Bupa         76.95   12    69.93   37    72.95   99    69.47   59    65.89   66    76.89   48
    Wholesale    94.78   10    89.67   36    92.74   99    88.59   27    86.11   96    87.85   9
    Diabetes     73.10   7     66.03   26    68.99   102   66.27   52    63.73   89    65.30   58
    German       78.08   9     75.15   36    76.60   96    75.30   38    74.47   86    75.18   54
    QSAR         87.78   8     83.48   24    84.37   100   83.94   32    80.43   88    82.02   37
    CMC          63.41   12    59.63   47    60.92   103   59.67   51    58.46   97    59.51   67
    Spambase     81.55   13    76.18   32    79.12   97    76.47   67    76.64   93    76.66   58
    Wineq-w      51.73   11    50.37   23    49.48   93    48.10   34    48.60   96    50.98   46
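    The abstract states that the results are significant, but this page does not say which statistical test the authors used. As one simple, assumption-laden way to examine the Table 5 accuracies, the sketch below runs a two-sided exact sign test of DFSEE against each competing method across the 10 datasets. The sign test and the helper names `sign_test_p` and `compare` are choices made here for illustration, not the paper's procedure, and no correction for multiple comparisons is applied.

```python
from math import comb

# Accuracies (%) copied from Table 5 (base-ELM pool size 200), in dataset order:
# Heart, Cleveland, Bupa, Wholesale, Diabetes, German, QSAR, CMC, Spambase, Wineq-w.
dfsee = [82.48, 57.61, 76.95, 94.78, 73.10, 78.08, 87.78, 63.41, 81.55, 51.73]
others = {
    "AGOB":  [74.14, 54.43, 69.93, 89.67, 66.03, 75.15, 83.48, 59.63, 76.18, 50.37],
    "POBE":  [77.52, 51.09, 72.95, 92.74, 68.99, 76.60, 84.37, 60.92, 79.12, 49.48],
    "MOAG":  [74.86, 50.85, 69.47, 88.59, 66.27, 75.30, 83.94, 59.67, 76.47, 48.10],
    "EP-FP": [74.38, 49.25, 65.89, 86.11, 63.73, 74.47, 80.43, 58.46, 76.64, 48.60],
    "SCG-P": [75.24, 56.25, 76.89, 87.85, 65.30, 75.18, 82.02, 59.51, 76.66, 50.98],
}


def sign_test_p(wins, losses):
    """Two-sided exact sign test p-value; tied datasets are excluded by the caller."""
    n = wins + losses
    k = min(wins, losses)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n  # P(X <= k), X ~ Bin(n, 0.5)
    return min(1.0, 2 * tail)


def compare(a, b):
    """Count datasets where a beats b and where b beats a, then test the split."""
    wins = sum(x > y for x, y in zip(a, b))
    losses = sum(x < y for x, y in zip(a, b))
    return wins, losses, sign_test_p(wins, losses)


results = {name: compare(dfsee, acc) for name, acc in others.items()}
for name, (wins, losses, p) in results.items():
    print(f"DFSEE vs {name}: {wins} wins, {losses} losses, p = {p:.4f}")
```

    The sign test uses only the direction of each difference, so a narrow margin such as Bupa against SCG-P (76.95 vs 76.89) counts the same as a wide one; a rank-based test would weigh the margins as well.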

    Table 6  Comparison with other methods in running time (s)

    Dataset      DFSEE   AGOB    POBE    MOAG    EP-FP   SCG-P
    Heart        0.80    10.96   0.79    0.87    18.24   0.86
    Cleveland    0.77    22.36   0.73    1.10    2.77    1.10
    Bupa         0.86    13.97   0.85    0.95    41.26   0.95
    Wholesale    1.16    17.26   1.15    1.27    21.97   1.26
    Diabetes     1.29    17.95   1.28    1.40    30.40   1.39
    German       1.79    12.58   1.78    1.86    11.58   1.86
    QSAR         2.29    14.01   2.29    2.37    23.55   2.37
    CMC          2.25    24.44   2.21    2.62    30.85   2.61
    Spambase     8.54    43.86   8.52    8.80    110.46  8.78
    Wineq-w      7.71    79.18   7.58    8.85    48.56   8.82
    Publication history
    • Received:  2019-08-12
    • Revised:  2020-06-21
    • Published online:  2020-07-17
