3D Human Motion Prediction Based on Bi-directional Gated Recurrent Unit

Haifeng SANG, Zizhen CHEN

Citation: Haifeng SANG, Zizhen CHEN. 3D Human Motion Prediction Based on Bi-directional Gated Recurrent Unit[J]. Journal of Electronics and Information Technology, doi: 10.11999/JEIT180978

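    About the authors: Haifeng SANG: male, born in 1978, professor, Ph.D.; research interests: visual inspection technology, image processing, and artificial intelligence;
    Zizhen CHEN: female, born in 1994, M.S.; research interests: computer vision, image processing, and artificial intelligence;
    Corresponding author: Zizhen CHEN, chenziz@126.com
  • Funding: National Natural Science Foundation of China (61773105), Natural Science Foundation of Liaoning Province (20170540675), Scientific Research Project of the Education Department of Liaoning Province (LQGD2017023)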

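Abstract: In machine vision, predicting human motion is essential for timely human-computer interaction and person tracking. To improve the performance of these applications, this paper proposes an encoder-decoder model based on the bi-directional Gated Recurrent Unit (GRU), called EBiGRU-D, which learns 3D human motion and predicts the motion over a future time window. EBiGRU-D is a deep Recurrent Neural Network (RNN) in which the encoder is a bi-directional GRU (BiGRU) and the decoder is a unidirectional GRU. The BiGRU reads the raw input sequence in the forward and backward directions simultaneously, encodes it into a single state vector, and passes that vector to the decoder for decoding. Because the BiGRU ties the current output to the states at both earlier and later time steps, the output takes the features on both sides into account, which makes the prediction more accurate. Experiments on the Human3.6M dataset show that EBiGRU-D not only greatly reduces the 3D human motion prediction error but also substantially extends the time span over which predictions remain accurate.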
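The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `GRUCell`, `encode_bidirectional`, `decode`, `W_out`, `pose_dim`, and `hidden` are hypothetical, the weights are random and untrained, and biases are omitted. The gate equations follow the standard GRU of Cho et al. [16]. Three further points are assumptions, because the abstract does not specify them: the state vector is formed by concatenating the final forward and backward states, the decoder is seeded with the last observed frame, and each predicted frame is fed back as the next decoder input.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # Matrix-vector product on plain Python lists.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class GRUCell:
    """Standard GRU cell; each gate acts on the concatenation [x, h]."""
    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        cols = input_size + hidden_size
        self.Wz = rand_matrix(hidden_size, cols, rng)  # update gate
        self.Wr = rand_matrix(hidden_size, cols, rng)  # reset gate
        self.Wh = rand_matrix(hidden_size, cols, rng)  # candidate state

    def step(self, x, h):
        z = [sigmoid(a) for a in matvec(self.Wz, x + h)]
        r = [sigmoid(a) for a in matvec(self.Wr, x + h)]
        gated = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a) for a in matvec(self.Wh, x + gated)]
        # Blend the old state and the candidate state with the update gate.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def encode_bidirectional(frames, fwd, bwd):
    # Run one GRU over the sequence forward and another backward.
    h_f = [0.0] * fwd.hidden_size
    for x in frames:
        h_f = fwd.step(x, h_f)
    h_b = [0.0] * bwd.hidden_size
    for x in reversed(frames):
        h_b = bwd.step(x, h_b)
    return h_f + h_b  # assumed: state vector = concatenated final states

def decode(state, last_frame, cell, W_out, steps):
    # Unidirectional GRU decoder; W_out maps the hidden state to a pose.
    h, x, preds = state, last_frame, []
    for _ in range(steps):
        h = cell.step(x, h)
        x = matvec(W_out, h)  # assumed: prediction is fed back as input
        preds.append(x)
    return preds

rng = random.Random(0)
pose_dim, hidden = 6, 4  # toy sizes, not the paper's settings
fwd = GRUCell(pose_dim, hidden, rng)
bwd = GRUCell(pose_dim, hidden, rng)
dec = GRUCell(pose_dim, 2 * hidden, rng)
W_out = rand_matrix(pose_dim, 2 * hidden, rng)
frames = [[rng.uniform(-1, 1) for _ in range(pose_dim)] for _ in range(5)]
state = encode_bidirectional(frames, fwd, bwd)
future = decode(state, frames[-1], dec, W_out, steps=3)
```

In this sketch the decoder's hidden size is twice the encoder's so that the concatenated state can initialize it directly; a learned projection would be an equally plausible choice.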

References:

    [1] FOKA A F and TRAHANIAS P E. Probabilistic autonomous robot navigation in dynamic environments with human motion prediction[J]. International Journal of Social Robotics, 2010, 2(1): 79–94. doi: 10.1007/s12369-009-0037-z

    [2] MAINPRICE J and BERENSON D. Human–robot collaborative manipulation planning using early prediction of human motion[C]. Proceedings of 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan, 2013: 299–306.

    [3] BÜTEPAGE J, BLACK M J, KRAGIC D, et al. Deep representation learning for human motion prediction and classification[C]. Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 2017: 1591–1599.

    [4] TEKIN B, MÁRQUEZ-NEILA P, SALZMANN M, et al. Learning to fuse 2D and 3D image cues for monocular body pose estimation[C]. Proceedings of 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 3961–3970.

    [5] YASIN H, IQBAL U, KRÜGER B, et al. A dual-source approach for 3D pose estimation from a single image[C]. Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4948–4956.

    [6] XIAO Jun, ZHUANG Yueting, and WU Fei. Feature visualization and interactive segmentation of 3D human motion[J]. Journal of Software, 2008, 19(8): 1995–2003. (in Chinese)

    [7] PAN Hong, XIAO Jun, WU Fei, et al. 3D human motion retrieval based on key-frames[J]. Journal of Computer-Aided Design & Computer Graphics, 2009, 21(2): 214–222. (in Chinese)

    [8] LI Rui, LIU Zhenyu, and TAN Jianrong. Human motion segmentation using collaborative representations of 3D skeletal sequences[J]. IET Computer Vision, 2018, 12(4): 434–442. doi: 10.1049/iet-cvi.2016.0385

    [9] TAYLOR G W, HINTON G E, and ROWEIS S. Modeling human motion using binary latent variables[C]. Proceedings of the 19th International Conference on Neural Information Processing Systems, Hong Kong, China, 2006: 1345–1352.

    [10] FRAGKIADAKI K, LEVINE S, FELSEN P, et al. Recurrent network models for human dynamics[C]. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 4346–4354.

    [11] HOLDEN D, SAITO J, and KOMURA T. A deep learning framework for character motion synthesis and editing[J]. ACM Transactions on Graphics, 2016, 35(4): 138.

    [12] JAIN A, ZAMIR A R, SAVARESE S, et al. Structural-RNN: Deep learning on spatio-temporal graphs[C]. Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016: 5308–5317.

    [13] MARTINEZ J, BLACK M J, and ROMERO J. On human motion prediction using recurrent neural networks[C]. Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 4674–4683.

    [14] TANG Yongyi, MA Lin, LIU Wei, et al. Long-term human motion prediction by modeling motion context and enhancing motion dynamic[J/OL]. arXiv: 1805.02513. http://arxiv.org/abs/1805.02513, 2018.

    [15] ZHANG Yachao, LIU Kaipei, QIN Liang, et al. Deterministic and probabilistic interval prediction for short-term wind power generation based on variational mode decomposition and machine learning methods[J]. Energy Conversion and Management, 2016, 112: 208–219. doi: 10.1016/j.enconman.2016.01.023

    [16] CHO K, VAN MERRIENBOER B, GULCEHRE C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[J/OL]. arXiv: 1406.1078, 2014.

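  • Figure 1  EBiGRU-D network structure

    Figure 2  Internal structure of the GRU

    Figure 3  Partial structure of the BiGRU

    Figure 4  Comparison of prediction performance for the walking action within 1 s

    Figure 5  Comparison of prediction performance for the discussion action within 1 s

    Figure 6  Comparison of prediction performance for the walking action within 2 s

    Figure 7  Comparison of the EBiGRU-D and Res-GRU networks on complex actions within 2 s

    Figure 8  Comparison of training time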

    Table 1  Comparison of the prediction errors of each model within 1 s on the Human3.6M dataset (prediction time in ms)

    Prediction time (ms)    80     160    320    400    560    640    720    1000
    Walking
    ERD[10]                 0.77   0.90   1.12   1.25   1.44   1.45   1.46   1.49
    LSTM-3LR[10]            0.73   0.81   1.05   1.18   1.34   1.36   1.37   1.36
    Res-GRU[13]             0.39   0.68   0.99   1.15   1.35   1.37   1.37   1.32
    MHU[14]                 0.32   0.53   0.69   0.77   0.90   0.94   0.97   1.06
    EBiGRU-D                0.31   0.31   0.33   0.35   0.35   0.36   0.36   0.37
    Greeting
    ERD[10]                 0.85   1.09   1.45   1.64   1.93   1.89   1.92   1.98
    LSTM-3LR[10]            0.80   0.99   1.37   1.54   1.81   1.76   1.79   1.85
    Res-GRU[13]             0.52   0.86   1.30   1.47   1.78   1.75   1.82   1.96
    MHU[14]                 0.54   0.87   1.27   1.45   1.75   1.71   1.74   1.87
    EBiGRU-D                0.48   0.44   0.49   0.49   0.52   0.51   0.52   0.49
    Walkingdog
    ERD[10]                 0.91   1.07   1.39   1.53   1.81   1.85   1.90   2.03
    LSTM-3LR[10]            0.80   0.99   1.37   1.54   1.81   1.76   1.79   2.00
    Res-GRU[13]             0.56   0.95   1.33   1.48   1.78   1.81   1.88   1.96
    MHU[14]                 0.56   0.88   1.21   1.37   1.67   1.72   1.81   1.90
    EBiGRU-D                0.51   0.64   0.61   0.62   0.62   0.59   0.61   0.60
    Discussion
    ERD[10]                 0.76   0.96   1.17   1.24   1.57   1.70   1.84   2.04
    LSTM-3LR[10]            0.71   0.84   1.02   1.11   1.49   1.62   1.76   1.99
    Res-GRU[13]             0.31   0.69   1.03   1.12   1.52   1.61   1.70   1.87
    MHU[14]                 0.31   0.67   0.93   1.00   1.37   1.56   1.66   1.88
    EBiGRU-D                0.33   0.44   0.50   0.45   0.48   1.51   0.50   0.49

    Table 2  Comparison of the prediction errors of the EBiGRU-D and Res-GRU networks within 2 s on the Human3.6M dataset (prediction time in ms)

    Prediction time (ms)    80     320    560    720    1000   1080   1320   1560   1720   2000
    Walking
    Res-GRU[13]             0.42   0.89   1.02   1.16   1.37   1.39   1.46   1.59   1.65   1.89
    EBiGRU-D                0.36   0.35   0.36   0.39   0.41   0.41   0.44   0.47   0.48   0.48
    Greeting
    Res-GRU[13]             0.65   0.89   1.21   1.35   1.56   1.77   1.85   2.02   2.16   2.22
    EBiGRU-D                0.45   0.46   0.50   0.51   0.55   0.54   0.56   0.56   0.55   0.56
    Walkingdog
    Res-GRU[13]             0.66   1.20   1.73   1.95   2.20   2.27   2.34   2.41   2.51   2.52
    EBiGRU-D                0.49   0.58   0.60   0.60   0.61   0.60   0.59   0.60   0.61   0.61
    Discussion
    Res-GRU[13]             0.89   1.23   1.56   1.69   1.85   2.01   2.12   2.32   2.49   2.56
    EBiGRU-D                0.42   0.43   0.43   0.45   0.49   0.50   0.55   0.55   0.54   0.56
Article information
  • Received:  2018-10-19
  • Accepted:  2019-03-08
  • Published online:  2019-04-09