



  • CN 62-1112/TF 
  • ISSN 1005-2518 
  • Founded in 1988





Unsafe Behavior Identification of Mining Truck Drivers Based on Video Sequences

  • Lin BI ,
  • Chao ZHOU ,
  • Xin YAO
  • 1.School of Resources and Safety Engineering,Central South University,Changsha 410083,Hunan,China
    2.Digital Mine Research Center,Central South University,Changsha 410083,Hunan,China

Received date: 2020-12-09

  Revised date: 2021-02-19

  Online published: 2021-03-22




BI Lin, ZHOU Chao, YAO Xin. Unsafe behavior identification of mining truck drivers based on video sequences[J]. Gold Science and Technology, 2021, 29(1): 14-24. DOI: 10.11872/j.issn.1005-2518.2021.01.216


At present, many mines still rely on manual supervision to monitor the unsafe behavior of mining truck drivers, which fails to detect problems in a timely and accurate manner. This consumes manpower and material resources without solving the problem. With the development of computer and artificial intelligence technology, more and more fields, such as intelligent security, unmanned driving, and intelligent transportation, are beginning to adopt artificial intelligence. Behavior recognition is a hot topic in computer vision, and using computer technology to identify unsafe behaviors is an efficient alternative to manual detection. This paper uses deep learning to recognize the unsafe behavior of mining truck drivers in video sequences. Unlike traditional methods, deep learning does not rely on hand-designed features; it adaptively learns high-dimensional features, giving better robustness, faster speed, and higher accuracy. Firstly, based on the video data actually obtained, the relative position of the camera and the driver's area was analyzed and the video was cropped to reduce redundant information. To reduce the imbalance among data samples, the data set was augmented by flipping, panning, and adding noise; OpenCV was then used to convert the augmented images back into video files, and the dense_flow method was used to obtain optical flow images. Secondly, the networks were trained and tested. For comparison, a traditional classification model that does not consider temporal information was trained and tested first: ResNet, Xception, and Inception were trained with transfer learning, and the three single models were fused into a new fusion model. Then, with temporal information taken into account, the temporal and spatial streams of the two-stream network model were set to VGG16 pre-trained with transfer learning, and this model was compared with the C3D-two-stream model proposed in this paper. The experimental results show that the improved VGG-two-stream model reaches an accuracy of 89.539%, and the C3D-two-stream model reaches 93.445%. In summary, the C3D-two-stream model proposed in this paper has a high recognition rate. The results also show that, for behavior recognition, capturing feature information in the time dimension makes the recognition results more accurate, which has practical significance for recognizing the unsafe behavior of mining truck drivers.
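The augmentation step named in the abstract (flipping, panning, and adding noise) could be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' code: the function names, the shift range `max_shift`, the noise level `sigma`, and the 16-frame 112 x 112 clip size are all hypothetical choices made for the example.

```python
import numpy as np


def flip_horizontal(frame: np.ndarray) -> np.ndarray:
    """Mirror one frame (H x W x C) left to right."""
    return frame[:, ::-1, :].copy()


def pan(frame: np.ndarray, dx: int, dy: int) -> np.ndarray:
    """Shift one frame by (dx, dy) pixels; exposed borders are filled with zeros."""
    h, w = frame.shape[:2]
    out = np.zeros_like(frame)
    if abs(dx) >= w or abs(dy) >= h:
        return out  # the whole frame has moved out of view
    # Window of the source frame that is still visible after the shift.
    src_x0, src_x1 = max(0, -dx), min(w, w - dx)
    src_y0, src_y1 = max(0, -dy), min(h, h - dy)
    dst_x0, dst_y0 = max(0, dx), max(0, dy)
    out[dst_y0:dst_y0 + (src_y1 - src_y0),
        dst_x0:dst_x0 + (src_x1 - src_x0)] = frame[src_y0:src_y1, src_x0:src_x1]
    return out


def add_gaussian_noise(frame: np.ndarray, sigma: float,
                       rng: np.random.Generator) -> np.ndarray:
    """Add zero-mean Gaussian noise and clip back to the 8-bit range."""
    noisy = frame.astype(np.float32) + rng.normal(0.0, sigma, frame.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def augment_clip(clip: np.ndarray, rng: np.random.Generator,
                 max_shift: int = 8, sigma: float = 5.0) -> dict:
    """Return three augmented copies of a clip shaped (T, H, W, C).

    The same shift is applied to every frame so that the apparent motion
    between frames (and hence the optical flow) is not distorted.
    """
    dx, dy = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    return {
        "flip": np.stack([flip_horizontal(f) for f in clip]),
        "pan": np.stack([pan(f, dx, dy) for f in clip]),
        "noise": np.stack([add_gaussian_noise(f, sigma, rng) for f in clip]),
    }


# Usage on a synthetic clip (assumed size: 16 frames of 112 x 112 RGB).
rng = np.random.default_rng(0)
clip = rng.integers(0, 256, size=(16, 112, 112, 3), dtype=np.uint8)
augmented = augment_clip(clip, rng)
```

Each augmented array keeps the shape of the input clip, so its frames could afterwards be written back to a video file and passed to an optical flow extractor, as the abstract describes.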


