
Gold Science and Technology (黄金科学技术), 2021, Vol. 29, Issue (1): 14-24. doi: 10.11872/j.issn.1005-2518.2021.01.216

• Smart Mine Column •

Unsafe Behavior Identification of Mining Truck Drivers Based on Video Sequences

Lin BI1,2, Chao ZHOU1,2, Xin YAO1,2

  1. School of Resources and Safety Engineering, Central South University, Changsha 410083, Hunan, China
    2. Digital Mine Research Center, Central South University, Changsha 410083, Hunan, China
  • Received: 2020-12-09  Revised: 2021-02-19  Online: 2021-02-28  Published: 2021-03-22
  • Contact: Chao ZHOU  E-mail: mrbilin@163.com; zhouchao@csu.edu.cn
  • About the first author: BI Lin (1975-), male, from Tongjiang, Sichuan, associate professor, Ph.D., engaged in research on intelligent LHD (load-haul-dump) equipment, digital mines and geological modeling. mrbilin@163.com
  • Funding: National Key Research and Development Program of China, "Research and Demonstration of Intelligent Management and Control Technology for Metal Mining Equipment Based on Big Data" (No.2019YFC0605300)

Abstract:

At present, many mines still rely on human supervision to monitor the unsafe behavior of mining truck drivers; this consumes considerable manpower and material resources yet cannot detect problems promptly and accurately. With the development of computer and artificial intelligence technology, more and more fields, such as intelligent security, autonomous driving and intelligent transportation, apply these techniques, and behavior recognition has become a hot topic in computer vision. Using computer technology to identify unsafe behaviors is therefore an efficient alternative to manual inspection. This paper applies deep learning to recognize unsafe behaviors of mining truck drivers from video sequences. Deep learning methods do not rely on hand-crafted features but adaptively learn better high-dimensional features, and thus offer better robustness, faster speed and higher accuracy. First, based on the video data actually obtained, the relative position of the camera and the driver's area was analyzed and the videos were cropped to remove redundant information. To reduce the imbalance of the data samples, the frame images were augmented by flipping, rotation and adding noise; OpenCV was then used to convert the augmented images back into video files, and the dense_flow method was used to obtain optical flow images. Second, the networks were trained and tested. For comparison, traditional classification models that do not consider temporal information were trained and tested first: ResNet, Xception and Inception were trained with transfer learning, and the three single models were fused into a new fusion model. Then, taking temporal information into account, the temporal and spatial channels of the two-stream network were set to a VGG16 pre-trained with transfer learning, and this model was compared with the C3D-two-stream model proposed in this paper. The experimental results show that the improved VGG-two-stream model reaches an accuracy of 89.539%, while the C3D-two-stream model reaches 93.445%, roughly 15 percentage points higher than the original two-stream network model. In summary, the C3D-two-stream model proposed in this paper achieves a high recognition rate, and the comparison confirms that capturing feature information in the time dimension makes recognition more accurate, which is of practical significance for identifying unsafe behaviors of mining truck drivers and for the safety of mining operations.
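The two-stream design described above pairs an RGB (spatial) stream with an optical-flow (temporal) stream and fuses their class scores. The sketch below is only an illustration under stated assumptions: it uses OpenCV's Farneback algorithm as a stand-in for the dense_flow tool mentioned in the abstract, and the per-class scores and fusion weight are placeholders rather than values from the paper.

```python
# Minimal sketch of the two-stream idea: dense optical flow for the temporal
# stream and late fusion of per-class scores. Farneback flow is used as a
# stand-in for dense_flow; all names and numbers are illustrative.
import cv2
import numpy as np

def optical_flow_stack(frames):
    """Dense optical flow between consecutive frames (x/y displacement maps)."""
    flows = []
    prev = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
    for frame in frames[1:]:
        curr = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        flows.append(flow)          # shape (H, W, 2)
        prev = curr
    return np.stack(flows)

def fuse_scores(spatial_scores, temporal_scores, w_spatial=0.5):
    """Late fusion: weighted average of the two streams' class scores."""
    return w_spatial * spatial_scores + (1.0 - w_spatial) * temporal_scores

# Hypothetical softmax scores for the four behavior classes (DN, N, H, P)
spatial = np.array([0.70, 0.05, 0.15, 0.10])    # from the RGB stream
temporal = np.array([0.55, 0.10, 0.25, 0.10])   # from the flow stream
print(fuse_scores(spatial, temporal).argmax())  # index of the predicted class
```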

Key words: unsafe behavior, video sequence, deep learning, mining truck driver, behavior recognition, two-stream network, fusion model

CLC number: TD76

Fig.1

Augmented data set and optical flow images: (a) original frame image; (b) cropped image; (c) rotated image; (d) flipped image; (e) image with added noise; (f) optical flow image in the X direction; (g) optical flow image in the Y direction
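The augmented variants shown in Fig.1 (flipping, rotation, added noise) correspond to standard image operations. A minimal sketch follows, assuming OpenCV and NumPy; the rotation angle, noise strength and file name are illustrative choices, not the paper's settings.

```python
# Minimal sketch of the frame-level augmentation shown in Fig.1; parameters
# (rotation angle, noise strength) are illustrative, not the paper's values.
import cv2
import numpy as np

def augment_frame(frame):
    """Return flipped, rotated and noise-added variants of one video frame."""
    h, w = frame.shape[:2]

    flipped = cv2.flip(frame, 1)                      # horizontal flip

    rot = cv2.getRotationMatrix2D((w / 2, h / 2), 10, 1.0)
    rotated = cv2.warpAffine(frame, rot, (w, h))      # rotate by 10 degrees

    noise = np.random.normal(0, 10, frame.shape).astype(np.float32)
    noisy = np.clip(frame.astype(np.float32) + noise, 0, 255).astype(np.uint8)

    return {"flip": flipped, "rotate": rotated, "noise": noisy}

# Usage: augment every frame of one clip (hypothetical file name)
cap = cv2.VideoCapture("driver_clip.avi")
while True:
    ok, frame = cap.read()
    if not ok:
        break
    variants = augment_frame(frame)
cap.release()
```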

Fig.2

Framework for identifying unsafe behaviors of mining truck drivers based on video sequences

Fig.3

2D and 3D convolution operations

Fig.4

C3D network structure
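Fig.3 and Fig.4 contrast 2D convolution, which operates on single frames, with the 3D convolution used by C3D, whose kernel also slides along the time axis. The PyTorch snippet below illustrates that idea as a single building block; it is a sketch, not the paper's exact C3D configuration.

```python
# Illustrative 3D convolution block in the spirit of C3D (Fig.3-4): the kernel
# spans (time, height, width), so motion across frames is captured directly.
# This is a sketch, not the paper's exact network configuration.
import torch
import torch.nn as nn

class Conv3DBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1)
        self.pool = nn.MaxPool3d(kernel_size=2, stride=2)

    def forward(self, x):                 # x: (batch, channels, frames, H, W)
        return self.pool(torch.relu(self.conv(x)))

clip = torch.randn(1, 3, 16, 112, 112)   # 16 RGB frames of 112x112, as in C3D
print(Conv3DBlock(3, 64)(clip).shape)    # -> torch.Size([1, 64, 8, 56, 56])
```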

Table 1

Runtime analysis of the C3D model and dense trajectory methods

Method              iDT (CPU)   Brox's (CPU)   Brox's (GPU)   C3D (GPU)
Runtime /h          202.2       2 513.9        607.8          2.2
Frames per second   3.5         0.3            1.2            313.9
Times slower        91.4        1 135.9        274.6          1

Fig.5

VGG16 network structure

Fig.6

Structure of the dropout layer

Fig.7

Fusion model

Fig.8

Truck driver behavior categories: (a) normal driving (DN); (b) no one in the cab (N); (c) both hands off the steering wheel (H); (d) playing with a mobile phone (P)

Fig.9

Accuracy of models that do not consider temporal information

Fig.10

Training accuracy

Fig.11

Training loss

Table 2

Comparison of test accuracy for each behavior category

Behavior category                   Accuracy /%
                                    C3D       Two-stream   VGG-two-stream   C3D-two-stream
Average accuracy                    76.639    78.175       89.539           93.445
Both hands off the steering wheel   69.067    67.972       83.016           88.366
No one in the cab                   93.516    97.194       99.577           100.000
Normal driving                      73.591    76.443       88.915           94.038
Playing with a mobile phone         70.382    71.091       86.648           91.375

Fig.12

Examples of misidentified unsafe behaviors
