
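Potato detection in complex environments using an improved YoloV4 model
Cite this article: Zhang Zhaoguo, Zhang Zhendong, Li Jianian, Wang Haiyi, Li Yanbin, Li Donghao. Potato detection in complex environments using an improved YoloV4 model[J]. Transactions of the Chinese Society of Agricultural Engineering, 2021, 37(22): 170-178.
Authors: Zhang Zhaoguo  Zhang Zhendong  Li Jianian  Wang Haiyi  Li Yanbin  Li Donghao
Affiliations: 1. Faculty of Modern Agricultural Engineering, Kunming University of Science and Technology, Kunming 650500; 2. Yunnan Provincial University Engineering Research Center for Mechanization of Chinese Medicinal Materials, Kunming 650500
Funding: Yunnan Province Major Science and Technology Special Project (2018ZC001); Chongqing Performance Incentive and Guidance Special Project for Scientific Research Institutions (cstc2019jxj100002)
Abstract: To address grading and cleaning during the operation of a potato combine harvester, and to monitor and evaluate harvest status in real time, this study proposes a machine learning model that detects potatoes and quickly and accurately obtains their number and damage status under large variations in illumination brightness, occlusion by soil and other tubers, machine vibration, and dust interference. A lightweight attention mechanism is introduced into the convolutional residual neural network to improve the YoloV4 detection network, and the CSP-Darknet53 network in the YoloV4 structure is replaced with the MobilenetV3 network for feature extraction. Experimental results show that the deep learning method based on convolutional neural networks improves potato recognition accuracy compared with traditional OpenCV recognition. Compared with other conventional machine learning models, MobilenetV3-YoloV4 recognizes faster, with a mean average precision over all classes of 91.4% for potato recognition and a transmission speed of 23.01 frames/s on embedded devices. The model is robust and can detect normal potatoes and mechanically damaged potatoes in a variety of environments, providing technical support for intelligent cleaning and intelligent harvesting with potato combine harvesters.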

Keywords: machine vision  target detection  deep learning  potato  YoloV4  MobilenetV3
Received: 2021/5/30
Revised: 2021/6/29

Potato detection in complex environment based on improved YoloV4 model
Zhang ZhaoGuo,Zhang Zhendong,Li Jianian,Wang Haiyi,Li Yanbin,Li Donghao.Potato detection in complex environment based on improved YoloV4 model[J].Transactions of the Chinese Society of Agricultural Engineering,2021,37(22):170-178.
Authors:Zhang ZhaoGuo  Zhang Zhendong  Li Jianian  Wang Haiyi  Li Yanbin  Li Donghao
Institution:1. Faculty of Modern Agricultural Engineering, Kunming University of Science and Technology, Kunming 650500, China; 2. Yunnan province university of Chinese medicine mechanization research center, Kunming 650500, China
Abstract: As the fourth largest food crop in China, the potato contributes to national food security. However, relatively low harvesting efficiency and a low level of intelligent operation are currently serious bottlenecks in the potato industry. The state of the potatoes needs to be detected and evaluated in real time during harvesting, particularly for grading and cleaning in a combine harvester. In this study, a machine learning model was proposed to quickly and accurately identify the number of potatoes and their damage under various working conditions, such as changes in light brightness, occlusion by soil and other tubers, machine vibration, and dust interference. A lightweight attention mechanism was introduced into the convolutional residual neural network. The attention mechanism, which acts on the fully connected layer and assigns a different weight to each channel, was then added to YoloV4. The original K-means clustering was abandoned because the potatoes were relatively consistent in size. The three output layers of YoloV4 were combined into one large output layer, and CSP-Darknet53 was replaced by the MobilenetV3 network structure for feature extraction. MobilenetV3 uses an inverted residual structure with depthwise separable convolution blocks and linear bottlenecks. The amount of computation and the number of parameters were reduced to 1/4 of the original, with the H-swish activation function used instead of the swish function, which significantly improved detection speed without loss of potato recognition rate. To improve the generalization ability of the trained model, the collected images were processed with operations including horizontal flipping, vertical flipping, mirroring, and added noise. Among them, there were 1,296 high-quality images, 322 images of mechanically damaged potatoes, and 231 images with interference for comparison.
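The swish and H-swish activations and the depthwise separable convolution mentioned above can be illustrated with a small sketch. It uses the commonly published definitions, swish(x) = x * sigmoid(x) and h-swish(x) = x * ReLU6(x + 3) / 6, and the usual parameter counts for a convolution layer without bias. The function names and the example layer sizes are illustrative assumptions and are not taken from the authors' implementation.

```python
import math


def swish(x: float) -> float:
    """Swish: x * sigmoid(x). Requires an exponential per evaluation."""
    return x / (1.0 + math.exp(-x))


def relu6(x: float) -> float:
    """ReLU clipped to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def h_swish(x: float) -> float:
    """Hard swish: x * ReLU6(x + 3) / 6, a piecewise approximation of swish."""
    return x * relu6(x + 3.0) / 6.0


def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


if __name__ == "__main__":
    # Compare the two activations at a few sample points.
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(f"x={x:+.1f}  swish={swish(x):+.4f}  h_swish={h_swish(x):+.4f}")

    # Hypothetical layer: 3 x 3 kernel, 32 input channels, 64 output channels.
    std = standard_conv_params(3, 32, 64)
    sep = depthwise_separable_params(3, 32, 64)
    print(f"standard={std}  depthwise separable={sep}  ratio={sep / std:.3f}")
```

H-swish replaces the exponential with a clip and a multiply, which is cheaper on embedded hardware. The ratio printed for the hypothetical layer only illustrates the general effect and is not meant to reproduce the 1/4 figure reported in the abstract.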
The collected image data set was used to train the model on a workstation, and the loss values for the training set and the test set were recorded. The trained network was then deployed on embedded equipment, and comparative and field tests were carried out. The evaluation indexes were the precision-recall curve, AP (average precision, the detection accuracy for a single class), mAP (the mean of the AP values over all classes), and detection speed. The tests showed that deep learning improved potato recognition accuracy compared with the traditional OpenCV model. MobilenetV3-YoloV4 also showed a higher recognition speed and excellent target extraction performance compared with YoloV4, YoloV3, VGG16, and traditional OpenCV models. The mean average precision of potato recognition was 91.4%, indicating strong robustness in detecting normal and mechanically damaged potatoes in various environments. The model performed well under illumination at 30°, 45°, 60°, and 90°, and achieved a transmission speed of 23.01 frames per second when deployed on embedded devices. A field experiment showed that MobilenetV3-YoloV4 could detect the potato flow in real time during actual harvesting. According to the detected flow, the separation speed of the vertical annular separation device was adjusted to avoid excessive accumulation when too many potatoes were fed in; otherwise, linear scratching between potatoes and between potatoes and soil would increase the skin-breaking rate. When the feeding amount decreased, the rotating speed of the vertical annular separation device was adjusted to reduce the damage caused by vibration of the device, with lower energy consumption and less linear scratching between the potatoes and the grid. These findings can provide technical support for the intelligent cleaning and grading of potatoes in a combine harvester.
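As a rough illustration of how AP and mAP relate to the precision-recall curve, the following sketch computes AP as the area under an all-point interpolated precision-recall curve and mAP as the mean of the per-class AP values. The abstract does not state which interpolation scheme the authors used, so the all-point variant here is an assumption, and the function names and sample detections are hypothetical.

```python
def average_precision(detections, num_ground_truth):
    """Area under the interpolated precision-recall curve for one class.

    detections: list of (confidence, is_true_positive) pairs, where the
    true/false positive decision (for example by IoU matching) is assumed
    to have been made beforehand.
    num_ground_truth: number of annotated objects of this class.
    """
    if num_ground_truth == 0 or not detections:
        return 0.0

    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))

    # Replace each precision with the maximum precision at any higher recall,
    # which makes the curve monotonically non-increasing.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangles under the interpolated curve.
    ap, prev_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(ap_per_class):
    """Mean of the AP values over all classes."""
    return sum(ap_per_class.values()) / len(ap_per_class)


if __name__ == "__main__":
    # Hypothetical detections for the two classes named in the abstract.
    ap = {
        "normal": average_precision([(0.9, True), (0.8, True)], 2),
        "damaged": average_precision([(0.9, True), (0.8, False), (0.7, True)], 2),
    }
    print(ap, mean_average_precision(ap))
```

With the two classes described in the abstract (normal and mechanically damaged potatoes), mAP is simply the average of two AP values.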
Keywords:machine vision  target detection  deep-learning  potato  YoloV4  MobilenetV3
