
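Rotated-box wheat ear detection and counting model based on improved Oriented R-CNN
Cite this article: YU Junwei, CHEN Weiwei, GUO Yuansen, MU Yashuang, FAN Chao. Rotated-box wheat ear detection and counting model based on improved Oriented R-CNN [J]. Transactions of the Chinese Society of Agricultural Engineering, 2024, 40(6): 248-257.
Authors: YU Junwei  CHEN Weiwei  GUO Yuansen  MU Yashuang  FAN Chao
Affiliations: Key Laboratory of Grain Information Processing and Control, Ministry of Education, Henan University of Technology, Zhengzhou 450001; Henan Key Laboratory of Grain Photoelectric Detection and Control, Henan University of Technology, Zhengzhou 450001; College of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou 450001; Key Laboratory of Grain Information Processing and Control, Ministry of Education, Henan University of Technology, Zhengzhou 450001; Henan Key Laboratory of Grain Photoelectric Detection and Control, Henan University of Technology, Zhengzhou 450001; College of Information Science and Engineering, Henan University of Technology, Zhengzhou 450001
Funding: National Natural Science Foundation of China, Young Scientists Fund (62006071); 2021 Henan Province Science and Technology Research Program (212102210152); Open Project of the Grain Information Processing Center, Henan University of Technology (KFJJ2023004)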
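Abstract (translated from the Chinese): To accurately locate and count wheat ears in complex field environments with interference and occlusion, this study proposes an improved Oriented R-CNN method for rotated-box wheat ear detection and counting. First, a spatial pyramid pooling cross stage partial networks (SPPCSPC) module is introduced into the backbone network to enlarge the model's receptive field and strengthen the network's perception ability. Second, a path aggregation network (PANet) and a hybrid attention mechanism (E2CBAM, efficient two convolutional block attention module) are combined in the neck network to enrich the feature information contained in the feature maps. Finally, the soft non-maximum suppression algorithm (Soft-NMS) is used to optimize the screening of predicted boxes. The experimental results show that the improved model detects wheat ears well in complex environments. Compared with the original model, the mean average precision (mAP) increased by 2.02 percentage points; compared with the mainstream rotated object detection models Gliding vertex, R3det, Rotated Faster R-CNN, S2anet, and Rotated Retinanet, mAP increased by 4.99, 2.49, 3.94, 2.25, and 4.12 percentage points, respectively. The method uses rotated boxes to locate wheat ears accurately, which greatly reduces the background area inside each box and provides an effective way to observe wheat ear growth and count ears in practice.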

Keywords: image recognition  crops  attention mechanism  wheat ear  Oriented R-CNN
Received: 2023/12/8
Revised: 2024/1/14

Improved Oriented R-CNN-based model for oriented wheat ears detection and counting
YU Junwei,CHEN Weiwei,GUO Yuansen,MU Yashuang,FAN Chao.Improved Oriented R-CNN-based model for oriented wheat ears detection and counting[J].Transactions of the Chinese Society of Agricultural Engineering,2024,40(6):248-257.
Authors:YU Junwei  CHEN Weiwei  GUO Yuansen  MU Yashuang  FAN Chao
Institution:Key Laboratory of Grain Information Processing and Control, Ministry of Education, Henan University of Technology, Zhengzhou 450001, China;Henan Key Laboratory of Grain Photoelectric Detection and Control, Henan University of Technology, Zhengzhou 450001, China;College of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou 450001, China;Key Laboratory of Grain Information Processing and Control, Ministry of Education, Henan University of Technology, Zhengzhou 450001, China;Henan Key Laboratory of Grain Photoelectric Detection and Control, Henan University of Technology, Zhengzhou 450001, China;College of Information Science and Engineering, Henan University of Technology, Zhengzhou 450001, China
Abstract: Accurate detection of wheat ears in field environments supports growth monitoring and ear counting. Traditional object detection models with horizontal bounding boxes cannot accurately detect densely distributed wheat ears, particularly when ears and stalks heavily occlude each other. Missed detections are frequent under varying illumination, dense distribution, and small target scales, because the predicted bounding boxes overlap. Boxes that follow the orientation of each ear and enclose less background noise are therefore needed for high detection performance. In this study, an improved Oriented R-CNN (region-based convolutional neural network) model was proposed to detect and count rotated wheat ears. First, a spatial pyramid pooling cross-stage partial networks (SPPCSPC) module was added to the backbone network to generate the last layer of the output feature map, which enlarged the receptive field and enhanced the perceptual ability of the network. Second, a feature aggregation network and the efficient two convolutional block attention module (E2CBAM) hybrid attention mechanism were introduced into the neck network to enrich the feature information in the feature map. Finally, soft non-maximum suppression (Soft-NMS) was used to optimize the screening of the predicted bounding boxes. The E2CBAM module was derived from the convolutional block attention module (CBAM) by replacing its channel attention module (CAM) with an E2CA module. The E2CA module consisted of two parallel ECA branches, one using maximum pooling and the other average pooling. The outputs of the two adaptive convolution kernels were then summed, and the result was used to weight the channels so that important channel information was emphasized. In this way, key features were captured to improve the detection performance of the model.
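The E2CA channel attention described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the kernel-size rule and its defaults (`gamma=2`, `b=1`) follow the original ECA-Net formulation, the order of operations (sum the two branches, then apply a sigmoid) is an assumption borrowed from CBAM's channel attention, and the uniform kernel weights are placeholders for values that would be learned during training. The function names `eca_kernel_size`, `conv1d_same`, and `e2ca` are hypothetical.

```python
import math

def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1-D kernel size as in ECA-Net: nearest odd value of (log2(C) + b) / gamma."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1

def conv1d_same(values, kernel):
    """1-D convolution across channels with zero padding; assumes an odd kernel length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(values) + [0.0] * pad
    return [sum(w * padded[i + j] for j, w in enumerate(kernel))
            for i in range(len(values))]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def e2ca(feature_map, avg_kernel, max_kernel):
    """Two parallel ECA branches (average and max pooling), summed, then used as channel weights.

    feature_map: list of C channels, each a list of H rows of W floats.
    Returns (reweighted feature map, per-channel weights).
    """
    # Global average pooling and global max pooling give one descriptor per channel.
    avg_desc = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]
    max_desc = [max(max(row) for row in ch) for ch in feature_map]
    # Each branch has its own 1-D kernel; summing before the sigmoid is an assumption.
    summed = [a + m for a, m in zip(conv1d_same(avg_desc, avg_kernel),
                                    conv1d_same(max_desc, max_kernel))]
    weights = [sigmoid(s) for s in summed]
    out = [[[w * v for v in row] for row in ch] for w, ch in zip(weights, feature_map)]
    return out, weights

# Toy input: 8 channels of size 2x2, channel c filled with the value c.
channels = 8
k = eca_kernel_size(channels)
toy = [[[float(c)] * 2 for _ in range(2)] for c in range(channels)]
placeholder_kernel = [1.0 / k] * k  # stands in for learned weights
reweighted, channel_weights = e2ca(toy, placeholder_kernel, placeholder_kernel)
print(k, [round(w, 3) for w in channel_weights])
```

Because each branch uses only a small 1-D kernel over the channel descriptors, the channel attention itself adds very few parameters, which is consistent with the small parameter difference between CBAM and E2CBAM reported in the abstract.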
To verify the E2CBAM hybrid attention module, the path aggregation network (PANet) was first introduced into the neck network to enrich the semantic and target location information in the feature map, which improved the detection accuracy of the model by 0.19 percentage points. Adding the CBAM and E2CBAM hybrid attention modules then improved the detection accuracy by a further 0.16 and 0.31 percentage points and increased the number of parameters by 3.58 M and 3.54 M, respectively, while the floating-point computation remained unchanged. Compared with CBAM, the E2CBAM hybrid attention module therefore improved the detection accuracy of the model by 0.15 percentage points and reduced the number of parameters by 0.04 M at the same computational cost. The experimental results show that the improved Oriented R-CNN model accurately represented the head direction of wheat ears and achieved better detection performance. The mean average precision (mAP) was 2.02 percentage points higher than that of the original model. Moreover, the mAP was 4.99, 2.49, 3.94, 2.25, and 4.12 percentage points higher than that of the mainstream rotated object detection models Gliding vertex, R3det, Rotated Faster R-CNN, S2anet, and Rotated Retinanet, respectively. Because the oriented boxes follow the head direction of the wheat ears, the background area inside the predicted bounding boxes was reduced and the detection results were visually clearer. The findings provide an effective way to observe the growth status of wheat ears and to count them in practice.
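The Soft-NMS screening step and the subsequent ear count can be sketched as follows. This is an illustration under stated assumptions, not the model's actual post-processing: it uses the Gaussian score-decay variant of Soft-NMS with a hypothetical `sigma=0.5`, and it computes IoU on axis-aligned boxes as a stand-in, whereas the model described here would need a rotated-box IoU (which can be passed in through the `iou_fn` parameter). The confidence threshold used for counting is also hypothetical.

```python
import math

def iou_xyxy(a, b):
    """IoU of two axis-aligned boxes (x1, y1, x2, y2); a stand-in for rotated-box IoU."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, iou_fn=iou_xyxy, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of deleting them.

    Returns a list of (box index, decayed score) in selection order.
    """
    remaining = [(i, s) for i, s in enumerate(scores) if s >= score_thresh]
    kept = []
    while remaining:
        # Select the highest-scoring remaining box.
        best = max(range(len(remaining)), key=lambda k: remaining[k][1])
        bi, bs = remaining.pop(best)
        kept.append((bi, bs))
        # Decay every other score by exp(-IoU^2 / sigma) with respect to the selected box.
        decayed = [(i, s * math.exp(-(iou_fn(boxes[bi], boxes[i]) ** 2) / sigma))
                   for i, s in remaining]
        remaining = [(i, s) for i, s in decayed if s >= score_thresh]
    return kept

def count_ears(kept, conf_thresh=0.3):
    """Count detections whose decayed score is still above the confidence threshold."""
    return sum(1 for _, s in kept if s >= conf_thresh)

# Toy example: two heavily overlapping boxes and one isolated box.
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
print(kept, count_ears(kept))
```

Unlike hard NMS, the overlapping box is not discarded outright; its score is only reduced, so in dense scenes a genuinely separate ear with a high enough score can survive the screening.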
Keywords:image recognition  crops  attention mechanism  wheat ear  Oriented R-CNN