
Multi-scale Pedestrian Detection in Infrared Images with Salient Background-awareness

Bin ZHAO, Chunping WANG, Qiang FU

Citation: Bin ZHAO, Chunping WANG, Qiang FU. Multi-scale Pedestrian Detection in Infrared Images with Salient Background-awareness[J]. Journal of Electronics & Information Technology, doi: 10.11999/JEIT190761

    About the authors: ZHAO Bin: male, born in 1990, Ph.D. candidate; research interests: deep learning, object detection.
    WANG Chunping: male, born in 1965, Ph.D. supervisor; research interests: image processing, fire-control theory and applications.
    FU Qiang: male, born in 1981, lecturer, Ph.D.; research interests: computer vision, networked fire control and command-and-control technology.
    Corresponding author: WANG Chunping, wang_c_p@163.com
Abstract: Ultra-large field-of-view (U-FOV) infrared imaging systems offer a wide detection range and are not limited by illumination, but their images contain targets at widely varying scales and an abundance of small targets. This paper therefore proposes a multi-scale infrared pedestrian detection method with background-awareness, which improves small-target detection performance while reducing redundant computation. First, a four-scale feature pyramid network that predicts targets independently at each scale is constructed to supply high-resolution detail features. Second, an attention module is embedded in the lateral connections of the feature pyramid to produce saliency features, suppressing feature responses from irrelevant regions and highlighting local target features. Finally, an anchor-mask generation subnetwork built on the saliency coefficients constrains anchor positions and excludes flat background, improving processing efficiency. Experimental results show that the saliency generation subnetwork adds only 5.94% processing time, so it is lightweight; recognition accuracy on the U-FOV infrared pedestrian dataset reaches 93.20%, which is 26.49% higher than YOLOv3; and the anchor-constraint strategy saves 18.05% of processing time. The reconstructed model is both lightweight and accurate, making it well suited to detecting multi-scale infrared targets in ultra-large fields of view.
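The abstract's pipeline (attention produces saliency coefficients, which are then binarized into an anchor mask that excludes flat background) can be sketched minimally. Everything below is a hypothetical stand-in: the actual model uses learned convolutions on Darknet53 features, whereas this sketch replaces the learned attention with fixed channel pooling and the mask subnetwork with a simple window reduction.

```python
import numpy as np

def spatial_attention(feat, eps=1e-6):
    """CBAM-style spatial attention sketch: channel-wise average and max
    pooling, fused and squashed to [0, 1]. (The real module would apply a
    learned convolution instead of a plain sum.)"""
    avg = feat.mean(axis=0)                          # (H, W)
    mx = feat.max(axis=0)                            # (H, W)
    s = avg + mx                                     # stand-in for the learned fusion
    return (s - s.min()) / (s.max() - s.min() + eps) # saliency coefficients in [0, 1]

def anchor_mask(saliency, stride=4, thresh=0.5):
    """Binarize the saliency map, then keep a feature-grid cell (one anchor
    position) only if any pixel in its stride x stride window is salient."""
    H, W = saliency.shape
    binary = saliency >= thresh
    h, w = H // stride, W // stride
    windows = binary[:h * stride, :w * stride].reshape(h, stride, w, stride)
    return windows.any(axis=(1, 3))                  # (h, w) boolean anchor mask

# Toy 8-channel feature map with one bright blob standing in for a pedestrian.
feat = np.zeros((8, 32, 32), dtype=np.float32)
feat[:, 12:20, 12:20] = 1.0

sal = spatial_attention(feat)
refined = feat * sal                                 # suppress background responses
mask = anchor_mask(sal, stride=4, thresh=0.5)
print(mask.sum(), "of", mask.size, "anchor cells kept")
```

With this toy input only the 4 grid cells covering the blob stay active, illustrating how flat background is excluded before anchors are evaluated.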


    [1] BLOISI D D, PREVITALI F, PENNISI A, et al. Enhancing automatic maritime surveillance systems with visual information[J]. IEEE Transactions on Intelligent Transportation Systems, 2017, 18(4): 824–833. doi: 10.1109/TITS.2016.2591321

    [2] KANG J K, HONG H G, and PARK K R. Pedestrian detection based on adaptive selection of visible light or far-infrared light camera image by fuzzy inference system and convolutional neural network-based verification[J]. Sensors, 2017, 17(7): 1598. doi: 10.3390/s17071598

    [3] KIM S, SONG W J, and KIM S H. Infrared variation optimized deep convolutional neural network for robust automatic ground target recognition[C]. Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, USA, 2017: 195–202. doi: 10.1109/CVPRW.2017.30

    [4] WANG Chen, TANG Xinyi, and GAO Sili. Infrared image enhancement algorithm based on human vision[J]. Laser & Infrared, 2017, 47(1): 114–118. doi: 10.3969/j.issn.1001-5078.2017.01.022

    [5] MUNDER S, SCHNORR C, and GAVRILA D M. Pedestrian detection and tracking using a mixture of view-based shape-texture models[J]. IEEE Transactions on Intelligent Transportation Systems, 2008, 9(2): 333–343. doi: 10.1109/TITS.2008.922943

    [6] DALAL N and TRIGGS B. Histograms of oriented gradients for human detection[C]. Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, USA, 2005: 886–893. doi: 10.1109/CVPR.2005.177

    [7] ZHANG Shanshan, BAUCKHAGE C, and CREMERS A B. Informed haar-like features improve pedestrian detection[C]. Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, 2014: 947–954. doi: 10.1109/CVPR.2014.126

    [8] WATANABE T and ITO S. Two co-occurrence histogram features using gradient orientations and local binary patterns for pedestrian detection[C]. Proceedings of the 2nd IAPR Asian Conference on Pattern Recognition, Naha, Japan, 2013: 415–419. doi: 10.1109/ACPR.2013.117

    [9] YU Chunyan, XU Xiaodan, and ZHONG Shijun. An improved SSD model for saliency object detection[J]. Journal of Electronics & Information Technology, 2018, 40(11): 2554–2561. doi: 10.11999/JEIT180118

    [10] LIU Songtao, HUANG Di, and WANG Yunhong. Adaptive NMS: Refining pedestrian detection in a crowd[C]. Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 6452–6461. doi: 10.1109/CVPR.2019.00662

    [11] LIU Wei, LIAO Shengcai, REN Weiqiang, et al. Center and scale prediction: A box-free approach for pedestrian and face detection[C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Los Angeles, USA, 2019: 5187–5196.

    [12] CHE Kai, XIANG Zhengtao, CHEN Yufeng, et al. Research on infrared image pedestrian detection based on improved Fast R-CNN[J]. Infrared Technology, 2018, 40(6): 578–584. doi: 10.11846/j.issn.1001_8891.201806010

    [13] WANG Dianwei, HE Yanhui, LI Daxiang, et al. An improved infrared video image pedestrian detection algorithm[J]. Journal of Xi'an University of Posts and Telecommunications, 2018, 23(4): 48–52. doi: 10.13682/j.issn.2095-6533.2018.04.008

    [14] GIRSHICK R. Fast R-CNN[C]. Proceedings of 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 1440–1448. doi: 10.1109/ICCV.2015.169

    [15] REDMON J and FARHADI A. YOLOv3: An incremental improvement[EB/OL]. http://arxiv.org/abs/1804.02767, 2018.

    [16] GUO Zhi, SONG Ping, ZHANG Yi, et al. Aircraft detection method based on deep convolutional neural network for remote sensing images[J]. Journal of Electronics & Information Technology, 2018, 40(11): 2684–2690. doi: 10.11999/JEIT180117

    [17] CHEN Long, ZHANG Hanwang, XIAO Jun, et al. SCA-CNN: Spatial and channel-wise attention in convolutional networks for image captioning[C]. Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 6298–6306. doi: 10.1109/CVPR.2017.667

    [18] WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional block attention module[C]. Proceedings of the 15th European Conference on Computer Vision, Munich, Germany, 2018: 3–19. doi: 10.1007/978-3-030-01234-2_1

    [19] DOLLÁR P, WOJEK C, SCHIELE B, et al. Pedestrian detection: An evaluation of the state of the art[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(4): 743–761. doi: 10.1109/TPAMI.2011.155

    [20] FU Chengyang, LIU Wei, RANGA A, et al. DSSD: Deconvolutional single shot detector[EB/OL]. arXiv: 1701.06659, 2017.

    [21] HE Kaiming, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]. Proceedings of 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 2980–2988. doi: 10.1109/ICCV.2017.322

    [22] BERG A, AHLBERG J, and FELSBERG M. A thermal object tracking benchmark[C]. Proceedings of the 2015 12th IEEE International Conference on Advanced Video and Signal Based Surveillance, Karlsruhe, Germany, 2015: 1–6. doi: 10.1109/AVSS.2015.7301772

    [23] LIU Wei, ANGUELOV D, ERHAN D, et al. SSD: Single shot multibox detector[C]. Proceedings of the 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 21–37. doi: 10.1007/978-3-319-46448-0_2

    [24] REN Shaoqing, HE Kaiming, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137–1149. doi: 10.1109/TPAMI.2016.2577031

    [25] DAI Jifeng, LI Yi, HE Kaiming, et al. R-FCN: Object detection via region-based fully convolutional networks[C]. Advances in Neural Information Processing Systems, Barcelona, Spain, 2016: 379–387.


  • Fig. 1  Pedestrian characteristics in ultra-large field-of-view infrared images

    Fig. 2  Structure of the multi-scale infrared pedestrian detection network

    Fig. 3  Structure of the attention module

    Fig. 4  Fusion of saliency features with convolutional features

    Fig. 5  Anchor-mask generation process

    Fig. 6  Anchor masks for different input images

    Fig. 7  Anchor masks under different binarization thresholds

    Fig. 8  Visualized infrared pedestrian detection results

    Table 1  Average precision (AP) of pedestrian detection under different IoU thresholds

    | Method          | Backbone     | Training set   | IoU=0.3 | IoU=0.45 | IoU=0.5 | IoU=0.7 |
    |-----------------|--------------|----------------|---------|----------|---------|---------|
    | Faster R-CNN    | ResNet101    | U-FOV          | –       | 0.5932   | –       | –       |
    | SSD             | Mobilenet_v1 | U-FOV          | –       | 0.5584   | –       | –       |
    | R-FCN           | ResNet101    | U-FOV          | –       | 0.6312   | –       | –       |
    | CSP             | Resnet50     | U-FOV          | –       | 0.8414   | –       | –       |
    | YOLOv3          | Darknet53    | U-FOV          | 0.6595  | 0.6671   | 0.6628  | 0.6461  |
    | YOLOv3+FS       | Darknet53    | U-FOV          | 0.8880  | 0.8870   | 0.8828  | 0.8511  |
    | YOLOv3+FS       | Darknet53    | Caltech+U-FOV  | 0.9057  | 0.9078   | 0.9084  | 0.8961  |
    | Proposed method | Darknet53    | Caltech+U-FOV  | 0.9201  | 0.9320   | 0.9315  | 0.9107  |
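Table 1 sweeps the IoU threshold that decides whether a detection counts as a correct match to a ground-truth box. The IoU computation itself is standard practice rather than anything specific to this paper; a minimal version for axis-aligned boxes:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)   # overlap area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)             # intersection / union

# A detection shifted 2 px from a 10x10 ground-truth box
print(round(iou((0, 0, 10, 10), (2, 0, 12, 10)), 4))     # → 0.6667
```

At threshold 0.3 this shifted detection would count as correct, at 0.7 it would not, which is why the AP columns generally decrease as the IoU threshold rises.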

    Table 2  Comparison of parameter counts

    | Method          | Total parameters | Trainable parameters | Non-trainable parameters |
    |-----------------|------------------|----------------------|--------------------------|
    | YOLOv3          | 61,576,342       | 61,523,734           | 52,608                   |
    | Proposed method | 64,861,976       | 64,806,296           | 55,680                   |

    Table 3  Total processing time on the U-FOV test set

    | Method           | YOLOv3 | YOLOv3+Attention | FS+Attention | Proposed method |
    |------------------|--------|------------------|--------------|-----------------|
    | Total time (s)   | 90.35  | 95.72            | 125.39       | 107.25          |
    | Frame rate (fps) | 7.32   | 6.91             | 5.27         | 6.16            |
Article information
  • Corresponding author: WANG Chunping, wang_c_p@163.com
  • Received: 2019-09-30
  • Published online: 2020-05-20
