Recognition of unstructured field road scene based on semantic segmentation model

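Citation:	Meng Qingkuan, Yang Xiaoxia, Zhang Man, Guan Haiou. Recognition of unstructured field road scene based on semantic segmentation model[J]. Transactions of the Chinese Society of Agricultural Engineering, 2021, 37(22): 152-160.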

Authors:	Meng Qingkuan (孟庆宽), Yang Xiaoxia (杨晓霞), Zhang Man (张漫), Guan Haiou (关海鸥)

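Institution:	1. College of Automation and Electrical Engineering, Tianjin University of Technology and Education, Tianjin Key Laboratory of Information Sensing and Intelligent Control, Tianjin 300222; 2. Key Laboratory of Modern Precision Agriculture System Integration Research, Ministry of Education, China Agricultural University, Beijing 100083; 3. College of Electrical and Information, Heilongjiang Bayi Agricultural University, Daqing 163319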

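Funding:	National Natural Science Foundation of China (31571570, 62001329); Natural Science Foundation of Tianjin (18JCQNJC04500, 19JCQNJC01700); University-level pre-research projects of Tianjin University of Technology and Education (KJ2009, KYQD1706)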

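Abstract:	Environmental perception is one of the key technologies for autonomous navigation of intelligent agricultural equipment. Field roads are complex and variable; quickly and accurately identifying passable areas and distinguishing obstacle categories provides the basis for efficient, safe path planning and decision control. This study takes unstructured agricultural field road scenes as its research object, divides environmental objects into categories according to their dynamic and static attributes, and proposes a lightweight semantic segmentation model that combines channel attention with multi-scale feature fusion. First, the lightweight convolutional neural network MobileNetV2 extracts image features, and hybrid dilated convolution is integrated into the last two stages of the feature extraction network, enlarging the receptive field while preserving feature-map resolution and keeping the information continuous and complete. Then, a channel attention module recalibrates the feature channels at each stage of the feature extraction network according to their importance. Finally, a spatial pyramid pooling module fuses multi-scale pooled features to capture more effective global scene context and improve recognition accuracy in complex road scenes. Semantic segmentation experiments show that the model effectively recognizes and parses scene objects in different road environments, with pixel accuracy and mean pixel accuracy of 94.85% and 90.38%, respectively, demonstrating high accuracy and strong robustness. In comparison experiments against FCN-8S, SegNet, DeeplabV3+, and BiseNet on the same test set, the model achieves a mean intersection over union of 85.51%, a detection speed of 8.19 frames/s, and a parameter count of 2.41×10^6. Compared with the other models, it offers high accuracy, fast inference, and a small parameter count, striking a good balance between accuracy and speed. The results can serve as a technical reference for the safe and reliable operation of intelligent agricultural equipment in unstructured road environments.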
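The channel recalibration step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the internal design of the channel attention module, so the version below assumes a squeeze-and-excitation-style gate (global average pooling, one linear layer, sigmoid). The function name `channel_attention` and the `weights` and `biases` parameters are hypothetical stand-ins for learned values, not the authors' implementation.

```python
import math


def channel_attention(feature_map, weights, biases):
    """Rescale each channel of a C x H x W feature map by a gate in (0, 1).

    feature_map: list of C channels, each a list of H rows of W floats.
    weights, biases: hypothetical learned parameters (C x C matrix and
    length-C vector); in a real network they would come from training.
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excite: one linear layer plus a sigmoid gives a per-channel importance gate.
    gates = []
    for weight_row, bias in zip(weights, biases):
        z = sum(w * p for w, p in zip(weight_row, pooled)) + bias
        gates.append(1.0 / (1.0 + math.exp(-z)))
    # Recalibrate: every value in a channel is multiplied by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates
```

With all weights and biases set to zero, every gate equals 0.5 and each channel is simply halved; trained parameters would instead push gates toward 1 for informative channels and toward 0 for uninformative ones.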
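Keywords:	machine vision; semantic segmentation; environmental perception; unstructured roads; lightweight convolution; attention mechanism; feature fusion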
Received:	2021/6/1
Revised:	2021/9/16
Recognition of unstructured field road scene based on semantic segmentation model

Meng Qingkuan, Yang Xiaoxia, Zhang Man, Guan Haiou. Recognition of unstructured field road scene based on semantic segmentation model[J]. Transactions of the Chinese Society of Agricultural Engineering, 2021, 37(22): 152-160.

Authors:	Meng Qingkuan, Yang Xiaoxia, Zhang Man, Guan Haiou

Institution:	1. College of Automation and Electrical Engineering, Tianjin University of Technology and Education, Tianjin Key Laboratory of Information Sensing and Intelligent Control, Tianjin 300222, China; 2. Key Laboratory of Modern Precision Agriculture System Integration Research, Ministry of Education, China Agricultural University, Beijing 100083, China; 3. College of Electrical and Information, Heilongjiang Bayi Agricultural University, Daqing 163319, China

Abstract:	Environmental perception is one of the key technologies for autonomous navigation of agricultural machinery in tasks such as fertilization, crop disease detection, automatic harvesting, and cultivation. Field roads are a complex environment characterized by fuzzy road edges, uneven surfaces, and irregular shapes, so agricultural machinery must identify passable areas and obstacles accurately and rapidly for path planning and decision control. In this study, a lightweight semantic segmentation model was proposed to recognize unstructured field roads using a channel attention mechanism combined with multi-scale feature fusion. Environmental objects were classified into 12 categories according to their static and dynamic properties: building, person, vehicles, sky, waters, plants, road, soil, pole, sign, coverings, and background. The mobile architecture MobileNetV2 was adopted to extract image features, reducing the number of model parameters and increasing inference speed. Specifically, an inverted residual structure with lightweight depth-wise convolutions was used to filter features in the intermediate expansion layer. The last two stages of the backbone network were combined with hybrid dilated convolution (HDC) to enlarge the receptive field while maintaining the resolution of the feature map. With dilation rates of 1, 2, and 3, the HDC expanded the receptive field effectively and alleviated the "gridding problem" caused by standard dilated convolution. A channel attention block (CAB) was also introduced to reweight the features of each stage and enhance class consistency, strengthening both the higher- and lower-level features of each stage for better prediction.
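The "gridding problem" and the effect of dilation rates 1, 2, and 3 can be checked with a small calculation. The sketch below is an illustration only, not the authors' code: it considers a stack of 3-tap convolutions along one axis and lists which input offsets can influence a single output position. The function names `covered_offsets` and `receptive_field` are hypothetical, and the comparison stack with a repeated rate of 2 is an assumed example of standard dilated convolution.

```python
from itertools import product


def covered_offsets(rates, kernel_size=3):
    """Input offsets (along one axis) that reach one output position
    after stacking dilated convolutions with the given dilation rates."""
    half = kernel_size // 2
    # A kernel with dilation r samples offsets -r*half, ..., 0, ..., r*half.
    taps = [[k * rate for k in range(-half, half + 1)] for rate in rates]
    # Stacking layers composes the offsets: one tap is chosen per layer.
    return sorted({sum(combo) for combo in product(*taps)})


def receptive_field(rates, kernel_size=3):
    """Width of the span between the outermost reachable offsets."""
    return 1 + sum((kernel_size - 1) * rate for rate in rates)


hybrid = covered_offsets([1, 2, 3])    # rates named in the abstract
repeated = covered_offsets([2, 2, 2])  # assumed baseline with a fixed rate
print(len(hybrid), "of", receptive_field([1, 2, 3]), "positions covered")
print(len(repeated), "of", receptive_field([2, 2, 2]), "positions covered")
```

Both stacks span 13 positions, but the rates 1, 2, 3 reach all 13 of them, while a repeated rate of 2 reaches only the 7 even offsets and leaves holes between them, which is the gridding pattern. The same argument applies independently to each axis of a 2-D feature map.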
In addition, some semantic segmentation errors are partially or completely attributable to contextual relationships. A pyramid pooling module was therefore adopted to fuse feature maps at three scales into a global contextual prior. The first pyramid level captured global context by producing a feature vector with global average pooling. The remaining levels divided the feature map into sub-regions and generated a pooled representation for each location. The outputs of the pyramid levels thus had different sizes; they were upsampled and concatenated to form the final output. The results showed that objects on complex roads were effectively segmented, with pixel accuracy (PA) and mean pixel accuracy (MPA) of 94.85% and 90.38%, respectively. The single-category pixel accuracy exceeded 90% for road, plants, building, waters, sky, and soil, indicating high accuracy, strong robustness, and good generalization. To verify the efficiency and superiority of the model, mean intersection over union (MIoU), segmentation speed, and parameter count were adopted as evaluation indexes, and the FCN-8S, SegNet, DeeplabV3+, and BiseNet networks were trained and tested on the same datasets. The MIoU of the proposed model was 85.51%, higher than that of the other models. Its parameter count was 2.41×10^6, smaller than those of FCN-8S, SegNet, DeeplabV3+, and BiseNet. For images with a resolution of 512×512 pixels, the inference speed reached 8.19 frames per second, indicating a good balance between speed and accuracy. Consequently, the lightweight semantic segmentation model can accurately and rapidly segment multiple road scenes in the field environment. 
The findings can provide a technical reference for the safe and reliable operation of intelligent agricultural machinery on unstructured roads.
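The three accuracy indexes reported above (PA, MPA, and MIoU) are conventionally computed from a confusion matrix. The sketch below uses those standard definitions; the function name `segmentation_metrics` and the 2-class matrix in the usage lines are made up for illustration and are not data from the paper.

```python
def segmentation_metrics(conf):
    """Return (PA, MPA, MIoU) from a square confusion matrix.

    conf[i][j] is the number of pixels whose true class is i and whose
    predicted class is j.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    correct = [conf[i][i] for i in range(n)]
    true_counts = [sum(conf[i]) for i in range(n)]                       # row sums
    pred_counts = [sum(conf[i][j] for i in range(n)) for j in range(n)]  # column sums

    # Pixel accuracy: correctly labelled pixels over all pixels.
    pa = sum(correct) / total

    # Mean pixel accuracy: per-class accuracy averaged over classes present.
    class_acc = [correct[i] / true_counts[i] for i in range(n) if true_counts[i] > 0]
    mpa = sum(class_acc) / len(class_acc)

    # Mean IoU: intersection over union per class, averaged over classes.
    unions = [true_counts[i] + pred_counts[i] - correct[i] for i in range(n)]
    ious = [correct[i] / unions[i] for i in range(n) if unions[i] > 0]
    miou = sum(ious) / len(ious)
    return pa, mpa, miou


# Hypothetical 2-class example (for instance road vs. background).
pa, mpa, miou = segmentation_metrics([[3, 1], [1, 5]])
print(f"PA={pa:.4f}  MPA={mpa:.4f}  MIoU={miou:.4f}")
```

Because MIoU penalizes both missed pixels and false positives for every class, it is typically lower than PA on the same predictions, which is consistent with the reported 85.51% MIoU against 94.85% PA.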

Keywords:	machine vision; semantic segmentation; environmental perception; unstructured field roads; lightweight convolution; attention mechanism; feature fusion
