Detecting grape in an orchard using improved YOLOv5s
Cite this article: SUN Jun, WU Zhaoqi, JIA Yilin, GONG Dongjian, WU Xiaohong, SHEN Jifeng. Detecting grape in an orchard using improved YOLOv5s[J]. Transactions of the Chinese Society of Agricultural Engineering, 2023, 39(18): 192-200.
Authors: SUN Jun  WU Zhaoqi  JIA Yilin  GONG Dongjian  WU Xiaohong  SHEN Jifeng
Institution: School of Electrical and Information Engineering, Jiangsu University, Zhenjiang 212013, China
Funding: General Program of the National Natural Science Foundation of China (31971788); Project of the Faculty of Agricultural Equipment of Jiangsu University (NZXB20210210)
Abstract: To rapidly and accurately identify grape targets in complex orchard environments, this study proposed an improved grape detection model (MRW-YOLOv5s) based on YOLOv5s. First, to reduce the number of model parameters, the lightweight network MobileNetv3 was adopted as the feature extraction network, and a coordinate attention (CA) module was embedded into the bneck structure of MobileNetv3 to strengthen its feature extraction capability. Second, the RepVGG Block was introduced into the neck network to fuse multi-branch features and improve detection accuracy, and the structural reparameterization of the RepVGG Block was used to further accelerate inference. Finally, a loss based on a dynamic non-monotonic focusing mechanism (wise intersection over union loss, WIoU Loss) was adopted as the bounding box regression loss function to speed up network convergence and improve detection accuracy. The results showed that the improved MRW-YOLOv5s model contained only 7.56 M parameters and reached a mean average precision (mAP) of 97.74% on the test set, 2.32 percentage points higher than the original YOLOv5s model; the average detection time per image was 10.03 ms, 6.13 ms less than that of the original YOLOv5s model. Compared with mainstream object detection models such as S...
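For readers unfamiliar with the coordinate attention block mentioned above, the following is a minimal PyTorch sketch of a CA module of the kind typically inserted into a MobileNetv3 bneck. The reduction ratio, the Hardswish activation, and the placement inside the bneck are assumptions for illustration, not the authors' published configuration.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Coordinate attention: two 1-D global pools (along H and along W) replace the
    usual global average pool, so the attention weights keep positional information."""
    def __init__(self, channels, reduction=32):  # reduction=32 is an assumed default
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))   # (B, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))   # (B, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        _, _, h, w = x.shape
        xh = self.pool_h(x)                         # (B, C, H, 1)
        xw = self.pool_w(x).permute(0, 1, 3, 2)     # (B, C, W, 1)
        y = torch.cat([xh, xw], dim=2)              # shared 1x1 conv over both directions
        y = self.act(self.bn1(self.conv1(y)))
        yh, yw = torch.split(y, [h, w], dim=2)
        yw = yw.permute(0, 1, 3, 2)                 # back to (B, mid, 1, W)
        ah = torch.sigmoid(self.conv_h(yh))         # attention along the height axis
        aw = torch.sigmoid(self.conv_w(yw))         # attention along the width axis
        return x * ah * aw
```

As a quick check, `CoordinateAttention(64)(torch.randn(1, 64, 80, 80))` returns a tensor of the same shape, so the block can be dropped between the depthwise and projection convolutions of a bneck without changing channel counts.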

Keywords: image processing  fruit identification  YOLOv5s  attention mechanism  RepVGG  Wise IoU
Received: 2023-06-20
Revised: 2023-08-17

Detecting grape in an orchard using improved YOLOv5s
SUN Jun, WU Zhaoqi, JIA Yilin, GONG Dongjian, WU Xiaohong, SHEN Jifeng. Detecting grape in an orchard using improved YOLOv5s[J]. Transactions of the Chinese Society of Agricultural Engineering, 2023, 39(18): 192-200.
Authors:SUN Jun  WU Zhaoqi  JIA Yilin  GONG Dongjian  WU Xiaohong  SHEN Jifeng
Institution:School of Electrical and Information Engineering, Jiangsu University, Zhenjiang 212013, China
Abstract: Grape is one of the most popular fruits, with great nutritional value and economic benefits. Manual picking of mature grapes can no longer fully meet the demands of large-scale production, particularly with the expansion of planting areas in recent years. A picking robot can be expected to monitor the growth of grapes in orchards in real time, and automatic grape picking can be promoted to realize intelligent agricultural production. In this study, an improved YOLOv5s model (MRW-YOLOv5s) was proposed to rapidly and accurately identify grapes in orchards. Firstly, the lightweight network MobileNetv3 was used as the feature extraction network in order to reduce the number of model parameters, and a coordinate attention (CA) module was embedded into the bneck structure of MobileNetv3 to strengthen the feature extraction capability of the network. Secondly, the RepVGG Block was introduced into the neck network, where multi-branch features were integrated to improve the detection accuracy of the model; the structural reparameterization of the RepVGG Block was then applied to further accelerate the inference speed of the model. Finally, Wise Intersection over Union Loss (WIoU Loss), which uses a dynamic non-monotonic focusing mechanism, was taken as the bounding box regression loss function to accelerate network convergence and improve detection accuracy. Gradient-weighted class activation mapping (Grad-CAM) showed that the backbone embedded with the CA module focused on the grape targets better than the same backbone embedded with Efficient Channel Attention (ECA) or the Convolutional Block Attention Module (CBAM). Among the loss functions compared, EIoU produced the slowest bounding box regression and the highest loss value after convergence; CIoU and Wise-IoU v1 showed similar convergence speeds and slightly lower loss values than EIoU; Wise-IoU v3 converged fastest and reached the lowest loss value. Therefore, Wise-IoU v3 can be expected to accelerate convergence and improve model accuracy. The results showed that the improved MRW-YOLOv5s model contained only 7.56 M parameters; its mean Average Precision (mAP) on the test set reached 97.74%, 2.32 percentage points higher than that of the original YOLOv5s model, and its average detection time per image was 10.03 ms, 6.13 ms less than that of the original YOLOv5s model. Compared with mainstream object detection models, namely SSD, RetinaNet, YOLOv4, YOLOv7, and YOLOX, the mAP of the MRW-YOLOv5s model was 9.89, 7.53, 2.12, 0.91, and 2.42 percentage points higher, respectively; its 7.56 M parameters were 68.2%, 79.2%, 88.2%, 79.7%, and 15.4% fewer than those of the five models, respectively; and its average detection time of 10.03 ms per image was 2.64, 13.19, 10.59, 4.14, and 5.46 ms less than those of the five models, respectively. Furthermore, the weight file of the improved model was only 26.97 MB, which is more conducive to model deployment. Therefore, the MRW-YOLOv5s model achieves a good balance between detection accuracy, parameter size, and detection speed.
These findings can provide technical support for intelligent orchard management and mechanized picking.
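To illustrate the structural reparameterization credited above for the faster inference, here is a minimal PyTorch sketch of a stride-1 RepVGG-style block whose 3×3, 1×1, and identity branches are folded into a single 3×3 convolution for deployment. The SiLU activation, equal input/output channels, and helper names are assumptions for the sketch, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fuse_conv_bn(conv, bn):
    """Fold a BatchNorm2d into the preceding bias-free conv (inference only)."""
    std = (bn.running_var + bn.eps).sqrt()
    weight = conv.weight * (bn.weight / std).reshape(-1, 1, 1, 1)
    bias = bn.bias - bn.running_mean * bn.weight / std
    return weight, bias

class RepVGGBlock(nn.Module):
    """Training-time block with 3x3, 1x1, and identity branches (stride 1, equal channels)."""
    def __init__(self, channels):
        super().__init__()
        self.branch3 = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels))
        self.branch1 = nn.Sequential(
            nn.Conv2d(channels, channels, 1, bias=False),
            nn.BatchNorm2d(channels))
        self.bn_id = nn.BatchNorm2d(channels)   # the identity branch is just a BN
        self.act = nn.SiLU()                    # activation choice is an assumption

    def forward(self, x):
        return self.act(self.branch3(x) + self.branch1(x) + self.bn_id(x))

    @torch.no_grad()
    def to_deploy_conv(self):
        """Merge the three branches into one equivalent 3x3 conv for inference."""
        c = self.branch3[0].out_channels
        w3, b3 = fuse_conv_bn(self.branch3[0], self.branch3[1])
        w1, b1 = fuse_conv_bn(self.branch1[0], self.branch1[1])
        w1 = F.pad(w1, [1, 1, 1, 1])            # place the 1x1 kernel at the centre of a 3x3
        # identity + BN equals a 3x3 conv whose kernel is gamma/std at its own channel's centre
        std = (self.bn_id.running_var + self.bn_id.eps).sqrt()
        wid = torch.zeros_like(w3)
        for i in range(c):
            wid[i, i, 1, 1] = self.bn_id.weight[i] / std[i]
        bid = self.bn_id.bias - self.bn_id.running_mean * self.bn_id.weight / std
        fused = nn.Conv2d(c, c, 3, padding=1)
        fused.weight.copy_(w3 + w1 + wid)
        fused.bias.copy_(b3 + b1 + bid)
        return fused
```

At deployment, each trained block would be replaced by `nn.Sequential(block.to_deploy_conv(), block.act)`, which produces the same outputs (up to numerical precision) with a single convolution per block, which is what makes the multi-branch design essentially free at inference time.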
Keywords:image processing  fruit identification  YOLOv5s  attention mechanism  RepVGG  Wise IoU
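The Wise IoU listed in the keywords is the bounding box regression loss compared in the abstract above. The sketch below is a minimal PyTorch rendering of WIoU v3 as described in Tong et al. (2023), written from memory and therefore an assumption rather than the authors' code: the IoU loss is scaled by a distance term computed from the enclosing box (detached from the gradient), and v3 additionally weights each box by a non-monotonic focusing factor based on its outlier degree β = L*_IoU / mean(L_IoU). The hyperparameter defaults and the running-mean update are also assumptions.

```python
import torch

def wiou_v3_loss(pred, target, iou_mean, alpha=1.9, delta=3.0, momentum=0.01):
    """Wise-IoU v3 for axis-aligned boxes in (x1, y1, x2, y2) format.

    pred, target: tensors of shape (N, 4); iou_mean: running mean of L_IoU (scalar).
    Returns the per-box loss and the updated running mean.
    """
    # plain IoU loss, L_IoU = 1 - IoU
    inter_wh = (torch.min(pred[:, 2:], target[:, 2:]) -
                torch.max(pred[:, :2], target[:, :2])).clamp(min=0)
    inter = inter_wh[:, 0] * inter_wh[:, 1]
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + 1e-7)
    l_iou = 1.0 - iou

    # WIoU v1: scale by the normalized centre distance over the enclosing box (detached)
    cxp = (pred[:, :2] + pred[:, 2:]) / 2
    cxt = (target[:, :2] + target[:, 2:]) / 2
    rho2 = ((cxp - cxt) ** 2).sum(dim=1)
    enc_wh = torch.max(pred[:, 2:], target[:, 2:]) - torch.min(pred[:, :2], target[:, :2])
    diag2 = (enc_wh ** 2).sum(dim=1).detach()       # the "no gradient" denominator
    r_wiou = torch.exp(rho2 / (diag2 + 1e-7))
    l_v1 = r_wiou * l_iou

    # v3: non-monotonic focusing via the outlier degree beta = L_IoU* / mean(L_IoU)
    beta = l_iou.detach() / (iou_mean + 1e-7)
    r = beta / (delta * alpha ** (beta - delta))
    iou_mean = (1 - momentum) * iou_mean + momentum * l_iou.mean().detach()
    return r * l_v1, iou_mean
```

A caller would keep `iou_mean` as a buffer (for example initialized to 1.0) and feed the updated value back on the next training step, so that boxes of average quality receive the largest gradient gain while extreme outliers are down-weighted.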