
• CN 62-1112/TF
• ISSN 1005-2518
• Founded in 1988

## Study on the Estimation of Ore Loading Quantity of Truck Based on Deep Convolutional Neural Network

BI Lin, LI Yalong, GUO Zhaohong*

1. School of Resources and Safety Engineering, Central South University, Changsha 410083, Hunan, China

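Funding: Supported by the National Natural Science Foundation of China project "Research on 3D Modeling Technology of Complex Metal Orebodies Based on Deep Learning and Distance Fields" (No. 41572317)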

Received: 2017-09-15   Revised: 2018-03-16   Online: 2019-03-11

Abstract

Measuring truck loads is an important task in the daily production and management of mines. The ore quantity loaded on a truck is usually counted manually, but the subjectivity of manual statistics can distort the performance evaluation of truck drivers. Some mines use laser scanning or a loadometer to measure ore volume accurately, but the equipment is expensive. In China, binocular stereo vision has also been used to measure the volume of stacked material: the pile is photographed from two angles in the same scene, feature points are matched between the two images, and their three-dimensional coordinates are computed to obtain the volume. The accuracy of that method depends on the accuracy of camera calibration, the accuracy of stereo matching, and the error introduced by discretizing the pile in the volume calculation, among other factors. Moreover, when a truck is loaded with ore, the wall of the truck body obscures the lower part of the ore pile and the image background is relatively complicated. To reduce cost and improve measurement accuracy, this paper studies the estimation of ore loading quantity with a deep convolutional neural network. Because images are inconvenient to collect in natural scenes, the three-dimensional physics engine Chrono was used to simulate a heap of ore falling into a truck, generating images of trucks with different ore quantities and different ore distribution areas. The truck model was built in 3DMAX and imported into Chrono, and the ore heap was a cube randomly generated within a certain size range. In total, 2 800 samples were obtained. A deep convolutional neural network was then constructed, with parameters adjusted from the network structure used to test the CIFAR-10 data set in Caffe. The training parameters were set as follows: maximum iteration number MaxIter 4 000, learning rate α 0.001, momentum factor μ 0.9, regularization coefficient WeightDecay 0.004, and the Nesterov optimization algorithm. The generated samples were divided into training and test sets at a ratio of 3:1, and the sample labels were normalized. The Euclidean distance between the value predicted by the last layer of neurons and the true value was used as the cost function to fit the generated sample data. Finally, the convolution kernels and feature maps were visualized to analyze how the convolutional neural network estimates ore quantity. The visualizations show that each convolution kernel extracts different features, and that the kernels extracting ore information underpin the reliability of the model's ore quantity estimates. The network achieves good accuracy on the experimental test set: the prediction error is below 4% for most test samples and below 10% for almost all of them, which is acceptable in practical applications. These results indicate that the network fits the experimental data set well, support the feasibility of using deep learning to estimate ore loading quantity in real scenes, and suggest that the deep learning method has good application prospects.

Keywords: ore quantity estimation; artificial intelligence; deep learning; convolutional neural network; physics engine

BI Lin, LI Yalong, GUO Zhaohong. Study on the Estimation of Ore Loading Quantity of Truck Based on Deep Convolutional Neural Network[J]. Gold Science and Technology, 2019, 27(1): 112-120 doi:10.11872/j.issn.1005-2518.2019.01.112

### 1.1 Physics Engine

The functions of the Irr modules are explained in Table 1.

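Table 1  Functions of each Irr module

| Module | Function |
|---|---|
| Core | Core engine classes, various data structures, and custom structure types |
| Gui | Common graphical user interface classes that implement the usual widgets |
| Io | Input/output interfaces, including reading and writing XML, ZIP, and INI files |
| Scene | Scene management, covering most 3D functionality: scene nodes, cameras, particle systems, billboards, meshes, lights, animators, and terrain |
| Video | Video driver setup, rendering of 2D and 3D scenes, and control of rendering attributes such as textures, lights, materials, vertices, and images |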

### Figure 1

Fig.1   Workflow of the Irr engine

### 1.2 Sample Generation

Table 2  Unit parameters of each object

### Figure 2

Fig.2   Sample coordinate system

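Table 3  Parameter settings for each batch of data

| Coordinate range (x, y, z) | Sample numbers | Number of samples |
|---|---|---|
| (-3~3, 5~15, -10~10) | 0~999 | 1 000 |
| (-3~3, 5~15, -10~0) | 0~599 | 600 |
| (-3~3, 5~15, -5~5) | 0~599 | 600 |
| (-3~3, 5~15, 0~10) | 0~599 | 600 |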
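The batch settings in Table 3 can be sketched as a small sampling routine. This is only an illustration under stated assumptions: each coordinate triple is read as the (x, y, z) range from which a position is drawn for one sample, and the draw is taken to be uniform. Neither reading is confirmed by the text, and the names `BATCHES`, `sample_position`, and `generate_positions` are hypothetical.

```python
import random

# ((x range, y range, z range), number of samples), copied from Table 3.
BATCHES = [
    (((-3, 3), (5, 15), (-10, 10)), 1000),
    (((-3, 3), (5, 15), (-10, 0)), 600),
    (((-3, 3), (5, 15), (-5, 5)), 600),
    (((-3, 3), (5, 15), (0, 10)), 600),
]


def sample_position(ranges, rng):
    """Draw one (x, y, z) position uniformly inside the given ranges (assumed distribution)."""
    return tuple(rng.uniform(lo, hi) for lo, hi in ranges)


def generate_positions(seed=0):
    """Return one list of sampled positions per batch."""
    rng = random.Random(seed)
    return [[sample_position(ranges, rng) for _ in range(count)]
            for ranges, count in BATCHES]


positions = generate_positions()
total = sum(len(batch) for batch in positions)  # 1000 + 3 * 600
```

The four batch sizes add up to the 2 800 samples reported for the whole experiment.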

Table 4  Composition of test samples

| Sample numbers | Number of samples |
|---|---|
| 0~49, 200~249, 400~449, 600~649, 800~850 | 250 |
| 100~149, 300~349, 500~549 | 150 |
| 50~99, 250~299, 450~499 | 150 |
| 150~199, 250~399, 550~599 | 150 |

### 2.1 Constructing the DCNN

Table 5  DCNN network structure parameters

### Figure 3

Fig. 3   Structure diagram of convolutional neural networks

### 2.2 Forward Computation

$x_j^l=\mathrm{Relu}\left(\sum_{i\in M_j}x_i^{l-1}*k_{ij}^l+b_j^l\right)$

$x_j^l=\mathrm{down}\left(x_j^{l-1}\right)$

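The Relu function is defined in Eq. (3). When parameters are optimized with stochastic gradient descent, training with the Relu activation is much faster than with nonlinear functions such as tanh and sigmoid; deep neural networks using Relu train several times faster than those using tanh [13].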

$\mathrm{Relu}(x)=\max(0,x)$
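The forward computation of a convolution layer followed by a downsampling layer can be sketched in NumPy as below. This is a minimal illustration, not the Caffe implementation used in the experiments. It assumes that every output map is connected to all input maps (so $M_j$ contains every $i$), that $*$ is a true 2-D convolution over the valid region, and that $\mathrm{down}$ is non-overlapping max pooling. The helper names `conv2_valid`, `conv_layer`, and `down` are hypothetical.

```python
import numpy as np


def relu(x):
    """Relu(x) = max(0, x), applied element-wise."""
    return np.maximum(0.0, x)


def conv2_valid(x, k):
    """True 2-D convolution (kernel flipped), restricted to the 'valid' region."""
    kf = k[::-1, ::-1]
    kh, kw = k.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for u in range(out.shape[0]):
        for v in range(out.shape[1]):
            out[u, v] = np.sum(x[u:u + kh, v:v + kw] * kf)
    return out


def conv_layer(inputs, kernels, biases):
    """Convolution layer: x_j = Relu(sum_i x_i * k_ij + b_j).

    inputs: list of 2-D input maps; kernels[i][j]: kernel from input i to
    output j; biases[j]: scalar bias of output map j.
    """
    outputs = []
    for j, b in enumerate(biases):
        total = sum(conv2_valid(x, kernels[i][j]) for i, x in enumerate(inputs))
        outputs.append(relu(total + b))
    return outputs


def down(x, n=2):
    """Downsampling layer, assumed here to be n-by-n non-overlapping max pooling."""
    h, w = x.shape[0] // n * n, x.shape[1] // n * n
    return x[:h, :w].reshape(h // n, n, w // n, n).max(axis=(1, 3))
```

With two 6×6 input maps and 3×3 kernels, each output map of `conv_layer` is 4×4, and `down` reduces it to 2×2.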

$b_{x,y}^i=a_{x,y}^i\Big/\left(k+\alpha\sum_{j=\max(0,\,i-n/2)}^{\min(N-1,\,i+n/2)}\left(a_{x,y}^j\right)^2\right)^{\beta}$
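The local response normalization above can be written directly in NumPy. The sketch below is illustrative only: the function name `lrn` is hypothetical, and the default values of $k$, $n$, $\alpha$, and $\beta$ are placeholders that the text does not specify.

```python
import numpy as np


def lrn(a, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Local response normalization across feature maps.

    a: array of shape (N, H, W) holding N feature maps.
    Default hyperparameters are placeholders, not values from the paper.
    """
    a = np.asarray(a, dtype=float)
    num_maps = a.shape[0]
    b = np.empty_like(a)
    for i in range(num_maps):
        lo = max(0, i - n // 2)
        hi = min(num_maps - 1, i + n // 2)
        # Sum of squares over the neighbouring maps at each (x, y) position.
        denom = (k + alpha * np.sum(a[lo:hi + 1] ** 2, axis=0)) ** beta
        b[i] = a[i] / denom
    return b
```

Each activation is divided by a term that grows with the squared activations of the neighbouring maps at the same position, so strong responses suppress their neighbours.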

### 2.3 Gradient Computation

$E_N=\frac{1}{2N}\sum_{i=1}^{N}\left\|x_i-y_i\right\|_2^2$

$E_n=\frac{1}{2}\left\|x_n-y_n\right\|_2^2$
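As a small illustration of the two cost expressions above, the following NumPy sketch computes the Euclidean loss for a single sample and for a batch of $N$ samples. The function names are hypothetical.

```python
import numpy as np


def sample_loss(x_n, y_n):
    """Loss of one sample: half the squared Euclidean distance."""
    diff = np.asarray(x_n, dtype=float) - np.asarray(y_n, dtype=float)
    return 0.5 * np.sum(diff ** 2)


def batch_loss(x, y):
    """Loss of N samples: sum of squared distances divided by 2N.

    x, y: arrays whose first axis indexes the N samples.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.sum((x - y) ** 2) / (2.0 * x.shape[0])
```

The batch loss is the mean of the per-sample losses, which is why the gradient derivation can proceed on a single sample.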

$u^l=W^lx^{l-1}+b^l$

$\frac{\partial E}{\partial b}=\frac{\partial E}{\partial u}\frac{\partial u}{\partial b}=\delta$

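$x^l$ denotes the output of the current layer. Because $\frac{\partial u}{\partial b}=1$, the partial derivative of $E$ with respect to $b$ equals the partial derivative of the error with respect to the total input of a node. This derivative is obtained by backpropagation from the higher layers to the lower layers; for layer $l$ it is: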

$\delta^l=\left(W^{l+1}\right)^T\delta^{l+1}\circ f'\left(u^l\right)$

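$\circ$ denotes element-wise multiplication. The sensitivity of the output layer takes a different form ($L$ denotes the output layer):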

$\delta^L=f'\left(u^L\right)\circ\left(x_n-y_n\right)$

$\Delta W^l=-\eta\frac{\partial E}{\partial W^l}$
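The sensitivity recursion and the weight update for fully connected layers can be sketched as follows. This is an illustration under assumptions: $f$ is taken to be tanh so that a finite-difference check is well behaved (the network in this paper uses Relu), and the weight gradient is taken as the outer product $\delta^l\left(x^{l-1}\right)^T$, which follows from $u^l=W^lx^{l-1}+b^l$ by the chain rule. All function names are hypothetical.

```python
import numpy as np


def f(u):
    return np.tanh(u)


def f_prime(u):
    return 1.0 - np.tanh(u) ** 2


def forward(x0, params):
    """Return pre-activations u^l and outputs x^l for every layer."""
    xs, us = [x0], []
    for W, b in params:
        u = W @ xs[-1] + b
        us.append(u)
        xs.append(f(u))
    return us, xs


def backward(us, xs, params, y):
    """Backpropagate sensitivities; return (dE/dW, dE/db) per layer."""
    delta = f_prime(us[-1]) * (xs[-1] - y)           # output-layer sensitivity
    grads = [None] * len(params)
    for l in range(len(params) - 1, -1, -1):
        grads[l] = (np.outer(delta, xs[l]), delta)   # dE/db equals delta
        if l > 0:
            delta = (params[l][0].T @ delta) * f_prime(us[l - 1])
    return grads


def sgd_step(params, grads, eta):
    """Apply Delta W = -eta * dE/dW (and likewise for the biases)."""
    return [(W - eta * gW, b - eta * gb)
            for (W, b), (gW, gb) in zip(params, grads)]
```

Comparing one entry of the returned gradient with a finite-difference estimate of the loss is a quick way to confirm that the recursion is implemented consistently.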

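$\delta_j^l=\beta_j^{l+1}\left(f'\left(u_j^l\right)\circ \mathrm{up}\left(\delta_j^{l+1}\right)\right)$

$\mathrm{up}(x)=x\otimes 1_{n\times n}$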
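The upsampling operator is a Kronecker product with an all-ones block, which NumPy provides as `np.kron`. The sketch below also applies it in the sensitivity expression for a convolution layer, assuming Relu as the activation $f$. The function names `up`, `relu_prime`, and `conv_delta` are hypothetical.

```python
import numpy as np


def up(x, n):
    """Repeat every element of x into an n-by-n block: x ⊗ 1_{n×n}."""
    return np.kron(x, np.ones((n, n)))


def relu_prime(u):
    """Derivative of Relu: 1 where u > 0, otherwise 0."""
    return (u > 0).astype(float)


def conv_delta(u, delta_next, beta, n):
    """Sensitivity of a convolution map from the downsampling layer above it."""
    return beta * (relu_prime(u) * up(delta_next, n))


example = up(np.array([[1.0, 2.0], [3.0, 4.0]]), 2)
# example is the 4x4 array
# [[1, 1, 2, 2],
#  [1, 1, 2, 2],
#  [3, 3, 4, 4],
#  [3, 3, 4, 4]]
```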

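$\frac{\partial E}{\partial b_j}=\sum_{u,v}\left(\delta_j^l\right)_{uv}$

$\frac{\partial E}{\partial k_{ij}^l}=\sum_{u,v}\left(\delta_j^l\right)_{uv}\left(p_i^{l-1}\right)_{uv}$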

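$\left(p_i^{l-1}\right)_{uv}$ is the patch of $x_i^{l-1}$ that was multiplied element-wise by $k_{ij}^l$ to compute the element at $(u,v)$ of the feature map $x_j^l$. Intuitively, keeping track of these patches and of the corresponding sensitivity map may seem difficult, but Eq. (16) can be implemented in a single line of MATLAB code: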

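$\frac{\partial E}{\partial k_{ij}^l}=\mathrm{rot180}\left(\mathrm{conv2}\left(x_i^{l-1},\ \mathrm{rot180}\left(\delta_j^l\right),\ \text{'valid'}\right)\right)$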
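A NumPy counterpart of the MATLAB one-liner can be sketched as below, together with a direct evaluation of the patch sum for comparison. It assumes the forward pass uses a true 2-D convolution (kernel flipped), as MATLAB's `conv2` does; in that case the patch that meets the kernel element-wise is the input window rotated by 180 degrees. The helper names are hypothetical, and `conv2_valid` is a plain-loop stand-in for `conv2(..., 'valid')`.

```python
import numpy as np


def rot180(a):
    return a[::-1, ::-1]


def conv2_valid(x, k):
    """True 2-D convolution (kernel flipped), 'valid' region only."""
    kf = rot180(k)
    kh, kw = k.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for u in range(out.shape[0]):
        for v in range(out.shape[1]):
            out[u, v] = np.sum(x[u:u + kh, v:v + kw] * kf)
    return out


def kernel_grad(x_prev, delta):
    """dE/dk = rot180(conv2(x_prev, rot180(delta), 'valid'))."""
    return rot180(conv2_valid(x_prev, rot180(delta)))


def kernel_grad_patches(x_prev, delta, ksize):
    """Same gradient as an explicit sum over patches: sum_{u,v} delta_uv * p_uv."""
    g = np.zeros((ksize, ksize))
    for u in range(delta.shape[0]):
        for v in range(delta.shape[1]):
            g += delta[u, v] * rot180(x_prev[u:u + ksize, v:v + ksize])
    return g


def bias_grad(delta):
    """dE/db_j: sum of the sensitivity map over all positions."""
    return delta.sum()
```

For a 6×6 input map and a 4×4 sensitivity map, both routines return the same 3×3 kernel gradient; the convolution form simply avoids the explicit bookkeeping of patches.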

### 3.1 Experimental Parameters and Platform

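The DCNN was trained with the following parameters: the maximum number of iterations MaxIter was set to 4 000, the learning rate α to 0.001, the momentum factor μ to 0.9, and the regularization coefficient WeightDecay to 0.004; the Nesterov optimization algorithm was used [18].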
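To make the roles of these parameters concrete, the following sketch runs a Nesterov momentum update with L2 weight decay on a toy quadratic objective. It uses a common "lookahead" reformulation of Nesterov's method and is not claimed to match Caffe's solver line by line. The function name `nesterov_sgd` and the toy gradient are hypothetical; only the four hyperparameter values come from the text.

```python
import numpy as np


def nesterov_sgd(grad, w0, lr=0.001, momentum=0.9, weight_decay=0.004, max_iter=4000):
    """Minimize a function given its gradient with Nesterov momentum.

    grad: callable returning the gradient of the data loss at w.
    weight_decay adds the gradient of (weight_decay / 2) * ||w||^2.
    """
    w = np.array(w0, dtype=float)
    v = np.zeros_like(w)
    for _ in range(max_iter):
        g = grad(w) + weight_decay * w         # regularized gradient
        v_prev = v
        v = momentum * v - lr * g              # velocity update
        w = w - momentum * v_prev + (1.0 + momentum) * v
    return w


# Toy usage: minimize 0.5 * ||w||^2, whose gradient is w itself.
w_final = nesterov_sgd(lambda w: w, [1.0, -2.0])
```

On this toy problem the iterate shrinks toward zero; on the real network the same update would be applied to every weight tensor using gradients from backpropagation.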

### Figure 4

Fig.4   Convolution kernel of conv1 layer

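Figure 5 shows the convolution kernels that extract truck color information. The upper part shows the convolution kernels and the lower part shows the corresponding feature maps. The left half extracts truck information: the outline of the truck body is clearly visible because the color of the truck body matches the color of the kernel. The right half extracts background information: the color of the kernel is close to the background color of the image, so the corresponding feature map shows the background.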

### Figure 5

Fig.5   Extracting truck information and background information

### Figure 6

Fig.6   Loss curves during training and testing

### Figure 7

Fig.7   Error distribution of test set samples

## References

[1] Gao Ruxin, Wang Junmeng. Volume measurement of coal based on binocular stereo vision[J]. Computer Systems and Applications, 2014, 23(5): 126-133.

[2] Mao Linlin. Research on Measurement Method for Piles of Material Volume Based on Binocular Stereo Vision[D]. Hangzhou: China Jiliang University, 2015.

[3] Duan Huapeng. Research and Application of Physics Engine Key Techniques in Virtual Reality[D]. Qingdao: Shandong University of Science and Technology, 2010.

[4] Kang Yu. Design and Implementation of 3D Game Based on the Irrlicht Engine[D]. Changchun: Jilin University, 2012.

[5] Lécun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.

[6] Hinton G E, Osindero S, Teh Y W. A fast learning algorithm for deep belief nets[J]. Neural Computation, 2006, 18(7): 1527-1554.

[7] Ciresan D C, Meier U, Gambardella L M, et al. Deep, big, simple neural nets for handwritten digit recognition[J]. Neural Computation, 2010, 22(12): 3207-3220.

[8] Pitts W. A Logical Calculus of the Ideas Immanent in Nervous Activity[M]//Neurocomputing: Foundations of Research. Cambridge: The Massachusetts Institute of Technology Press, 1988: 115-133.

[9] Hagan M T, Demuth H B, Beale M. Neural Network Design[M]. Beijing: China Machine Press, 2002.

[10] Wang T, Wu D J, Coates A, et al. End-to-end text recognition with convolutional neural networks[C]//International Conference on Pattern Recognition. IEEE, 2013: 3304-3308.

[11] Li H, Lin Z, Shen X, et al. A convolutional neural network cascade for face detection[C]//Computer Vision and Pattern Recognition. IEEE, 2015: 5325-5334.

[12] Lecun Y, Boser B, Denker J S, et al. Backpropagation applied to handwritten zip code recognition[J]. Neural Computation, 2014, 1(4): 541-551.

[13] Gulcehre C, Moczulski M, Denil M, et al. Noisy activation functions[C]//International Conference on Machine Learning. New York: Journal of Machine Learning Research, 2016: 3059-3068.

[14] Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift[C]//International Conference on Machine Learning. Lille: Journal of Machine Learning Research, 2015: 448-456.

[15] Bouvrie J. Notes on convolutional neural networks[R]. Massachusetts: Center for Biological and Computational Learning, 2006: 38-44.

[16] Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks[J]. Journal of Machine Learning Research, 2010, 9: 249-256.

[17] Nair V, Hinton G E. Rectified linear units improve restricted boltzmann machines[C]//International Conference on Machine Learning. Haifa: Journal of Machine Learning Research, 2010: 807-814.

[18] Su W, Boyd S, Candes E J. A differential equation for modeling Nesterov's accelerated gradient method: Theory and insights[J]. Advances in Neural Information Processing Systems, 2015, 3(1): 2510-2518.

[19] Wallach I, Dzamba M, Heifets A. AtomNet: A deep convolutional neural network for bioactivity prediction in structure-based drug discovery[J]. Mathematische Zeitschrift, 2015, 47(1): 34-46.

[20] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the Association for Computing Machinery, 2017, 60(6): 84-90.

[21] Zeiler M D, Fergus R. Visualizing and understanding convolutional networks[C]//European Conference on Computer Vision. Switzerland: Springer, 2014: 818-833.
