Study on the Estimation of Ore Loading Quantity of Truck Based on Deep Convolutional Neural Network

doi:10.11872/j.issn.1005-2518.2019.01.112

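Abstract

Abstract (translated from the Chinese):

Truck ore loads are usually tallied manually, but manual counts are not objective and may affect truck drivers' performance evaluations. Laser scanning or truck scales can measure the load accurately, but the equipment is too expensive. To save cost and improve measurement accuracy, this study estimates the truck ore load with a deep convolutional neural network. Because images of loaded mine trucks are difficult to obtain in real scenes, the three-dimensional physics engine Chrono is used to simulate an ore pile falling into a truck, generating truck images that differ in both ore quantity and ore distribution area. A deep convolutional neural network is built to fit the generated samples, with the Euclidean distance between the predicted value of the last-layer neuron and the true value as the cost function. The convolution kernels and feature maps are then visualized to analyze how the network arrives at its ore quantity estimate. The experimental results show that the network achieves good accuracy on the experimental test set, with the prediction error of most test samples within 4%. This indicates that estimating truck ore loads in natural scenes with deep learning is feasible and has good application prospects.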
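
The cost function described above can be illustrated with a minimal sketch. The abstract only states that the Euclidean distance between predicted and true values is used; the 1/(2N) scaling and the function name `euclidean_loss` are assumptions chosen for illustration, not details confirmed by the paper.

```python
def euclidean_loss(predictions, targets):
    """Half the mean squared Euclidean distance between predictions and labels.

    Assumption: the 1/(2N) scaling is one common convention; the source
    only says that a Euclidean distance is used as the cost.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    n = len(predictions)
    squared_error = sum((p - t) ** 2 for p, t in zip(predictions, targets))
    return squared_error / (2 * n)


# Two hypothetical normalized ore-quantity labels and their predictions.
loss = euclidean_loss([0.50, 0.20], [0.40, 0.20])
```

Because the network has a single output neuron, the Euclidean distance per sample reduces to the absolute difference between the predicted and the true normalized ore quantity.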

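Keywords (translated from the Chinese): ore quantity estimation, artificial intelligence, deep learning, convolutional network, physics engine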

Abstract:

In the daily production and management of mines, measuring truck loads is an important task. The ore load of a truck is usually counted manually, but the subjectivity of manual statistics may affect the performance evaluation of truck drivers. Some mines use laser scanning or truck scales to measure the ore load accurately, but the equipment is expensive. In China, binocular stereo vision has been used to measure the volume of stacked material: photos of the pile are taken from two angles in the same scene, feature points are matched between the two views, and their three-dimensional coordinates are computed to obtain the volume. Factors affecting the accuracy of this measurement include the accuracy of camera calibration, the accuracy of stereo matching, and the error introduced by the discretization used in the volume calculation. When a truck is loaded with ore, the truck body wall obscures the lower part of the ore pile, and the image background is relatively complicated. To save cost and improve measurement accuracy, this paper uses a deep convolutional neural network to estimate the ore load. Because images are inconvenient to obtain in natural scenes, the three-dimensional physics engine Chrono was used to simulate a pile of ore falling into the truck, generating truck images with different ore quantities and different ore distribution areas. The truck model was made in 3ds Max and imported into Chrono, and the ore pile was made up of cubes randomly generated within a certain size range. A total of 2 800 samples were obtained for the entire experiment. The parameters were adjusted starting from the network structure used to test the CIFAR-10 data set in Caffe. The training parameters were set as follows: the maximum number of iterations MaxIter is 4 000, the learning rate α is 0.001, the momentum factor μ is 0.9, the regularization coefficient WeightDecay is 0.004, and the optimization algorithm is Nesterov. A deep convolutional neural network was then constructed. The generated samples were divided into training and test sets at a ratio of 3:1, and the sample labels were normalized. The Euclidean distance between the predicted value of the last-layer neuron and the true value was used as the cost function to fit the generated sample data. Finally, the convolution kernels and feature maps were visualized to analyze how the convolutional neural network estimates ore quantity. The visualizations showed that each convolution kernel extracts different features, and that the kernels extracting ore information underpin the reliability of the model for ore quantity estimation. The network constructed in this paper achieved good accuracy on the experimental test set: the prediction error is less than 4% for most test samples and less than 10% for almost all test samples, which is acceptable in practical applications. This indicates not only that the network model fits the experimental data set well, but also that using deep learning to estimate the ore load in real scenes is feasible and has good application prospects.
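
To make the training settings above concrete, the following is a minimal sketch of a Nesterov momentum update with L2 weight decay, using the stated values (α = 0.001, μ = 0.9, WeightDecay = 0.004, MaxIter = 4 000). The look-ahead formulation, the function name `nesterov_sgd`, and the one-dimensional toy objective are assumptions for illustration; the solver actually used in Caffe may arrange the update differently.

```python
def nesterov_sgd(grad_fn, w0, lr=0.001, momentum=0.9,
                 weight_decay=0.004, max_iter=4000):
    """Minimize a scalar objective with Nesterov momentum and L2 weight decay.

    grad_fn returns the gradient of the data loss at a given weight.
    """
    w, v = w0, 0.0
    for _ in range(max_iter):
        # Evaluate the gradient at the look-ahead point, not at w itself.
        lookahead = w + momentum * v
        g = grad_fn(lookahead) + weight_decay * lookahead
        v = momentum * v - lr * g
        w = w + v
    return w


# Toy objective (an assumption): data loss (w - 3)^2, gradient 2 * (w - 3).
w_final = nesterov_sgd(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

With weight decay, the stationary point of this toy problem satisfies 2(w − 3) + 0.004w = 0, so the iterate settles slightly below 3 rather than at 3 exactly.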

Key words: ore quantity estimation, artificial intelligence, deep learning, convolutional neural network, physics engine

CLC number: TD57

Cite this article: Lin BI, Yalong LI, Zhaohong GUO. Study on the Estimation of Ore Loading Quantity of Truck Based on Deep Convolutional Neural Network[J]. Gold Science and Technology (黄金科学技术), 2019, 27(1): 112-120.

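Module	Function
Core	Core engine classes, data structures, and custom structure types
Gui	Common graphical user interface classes that implement the usual controls
Io	Input/output interfaces, including reading and writing xml, zip, and ini files
Scene	Manages the scene, including scene nodes, cameras, particle systems, billboards, meshes, lights, animators, terrain, and most other 3D functions
Video	Configures the video driver, renders 2D and 3D scenes, and controls rendering attributes such as textures, lights, materials, vertices, and images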

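Object	Size (x, y, z)	Position (x, y, z)
Container floor	(10, 0.1, 24)	(0, 0, 0)
Container side panel 1	(0.1, 5.5, 24.01)	(-5, 2.75, 0)
Container side panel 2	(0.1, 5.5, 24.01)	(5, 2.75, 0)
Container side panel 3	(10.1, 5.5, 0.1)	(0, 2.75, -12)
Container side panel 4	(10.1, 5.5, 0.1)	(0, 2.75, 12)
Ore unit	(0.8~0.9, 0.8~0.9, 0.8~0.9)	(-3~3, 5~15, -10~10)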
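
The ore-unit ranges in the table above can be sampled as in the following sketch. It uses only the Python standard library, not the Chrono API. Uniform sampling, independent sampling of each dimension, and the names `sample_ore_unit` and `sample_ore_pile` are assumptions for illustration; the table gives only the ranges.

```python
import random

# Ranges transcribed from the ore-unit row of the table above.
SIZE_RANGE = (0.8, 0.9)
POSITION_RANGES = ((-3.0, 3.0), (5.0, 15.0), (-10.0, 10.0))  # x, y, z


def sample_ore_unit(rng, position_ranges=POSITION_RANGES):
    """Draw one cuboid ore unit as (size_xyz, position_xyz).

    Assumption: every value is drawn uniformly and independently.
    """
    size = tuple(rng.uniform(*SIZE_RANGE) for _ in range(3))
    position = tuple(rng.uniform(lo, hi) for lo, hi in position_ranges)
    return size, position


def sample_ore_pile(n_units, seed=0):
    """Draw n_units ore units with a reproducible random generator."""
    rng = random.Random(seed)
    return [sample_ore_unit(rng) for _ in range(n_units)]


pile = sample_ore_pile(100)
```

The x spawn range of -3 to 3 lies inside the side panels at x = ±5, and the z range of -10 to 10 lies inside the panels at z = ±12, so units sampled this way start above the interior of the container.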

Data batch	Coordinate range	Number of ore units	Number of samples
1	(-3~3, 5~15, -10~10)	0~999	1 000
2	(-3~3, 5~15, -10~0)	0~599	600
3	(-3~3, 5~15, -5~5)	0~599	600
4	(-3~3, 5~15, 0~10)	0~599	600
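
The batch sizes in the table above add up to the 2 800 samples reported in the abstract, and the 3:1 ratio stated there gives the training and test set sizes. The sketch below checks this arithmetic. Reading each batch as one sample per ore-unit count, and the helper names `samples_per_batch` and `split_sizes`, are assumptions.

```python
# Batch data transcribed from the table above:
# (z-range of the spawn volume, largest ore-unit count, stated sample count).
BATCHES = {
    "one": ((-10, 10), 999, 1000),
    "two": ((-10, 0), 599, 600),
    "three": ((-5, 5), 599, 600),
    "four": ((0, 10), 599, 600),
}


def samples_per_batch(max_units):
    # Assumption: one rendered image per ore-unit count from 0 to max_units.
    return max_units + 1


def split_sizes(total, train_parts=3, test_parts=1):
    """Return (train, test) sizes for a train_parts:test_parts split."""
    test = total * test_parts // (train_parts + test_parts)
    return total - test, test


total_samples = sum(stated for _, _, stated in BATCHES.values())
train_size, test_size = split_sizes(total_samples)
print(total_samples, train_size, test_size)  # 2800 2100 700
```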

Data batch	Number of ore units	Number of samples
1	0~49, 200~249, 400~449, 600~649, 800~850	250
2	100~149, 300~349, 500~549	150
3	50~99, 250~299, 450~499	150
4	150~199, 250~399, 550~599	150

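Layer	Output feature size	Kernel size/stride	Padding	Number of weight parameters
Convolution layer 1	256×256×32	5×5/1	2	2 400
Max pooling layer 1	177×177×32	3×3/2	0	9 220
Convolution layer 2	177×177×32	5×5/1	2	25 600
Average pooling layer 2	87×87×32	3×3/2	0	9 220
Convolution layer 3	87×87×64	5×5/1	2	51 200
Average pooling layer 3	47×47×64	3×3/2	0	18 430
Fully connected layer	1	-	-	141 380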
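
The weight counts listed for the convolution layers can be reproduced with a short calculation, sketched below. It assumes a three-channel (RGB) input image and counts weights only, without biases; both are assumptions that happen to match the 2 400 weights listed for the first convolution layer. The helper name `conv_weight_count` is illustrative.

```python
def conv_weight_count(kernel, in_channels, out_channels):
    """Number of weights in a square convolution layer, biases excluded."""
    return kernel * kernel * in_channels * out_channels


# (name, kernel size, input channels, output channels).
# Assumption: the network input has 3 channels (RGB).
CONV_LAYERS = [
    ("conv1", 5, 3, 32),
    ("conv2", 5, 32, 32),
    ("conv3", 5, 32, 64),
]

conv_counts = {
    name: conv_weight_count(k, c_in, c_out)
    for name, k, c_in, c_out in CONV_LAYERS
}

# The fully connected layer maps the final 47 x 47 x 64 feature map
# to a single output neuron.
fc_weights = 47 * 47 * 64 * 1

print(conv_counts)  # {'conv1': 2400, 'conv2': 25600, 'conv3': 51200}
print(fc_weights)   # 141376
```

The fully connected count of 141 376 agrees with the 141 380 listed in the table if that entry is rounded to the nearest ten.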