
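Depth estimation method for apple trees from a single image based on improved HRNet
Cite this article: Long Yan, Gao Yan, Zhang Guangben. Depth estimation method for apple trees from a single image based on improved HRNet[J]. Transactions of the Chinese Society of Agricultural Engineering, 2022, 38(23): 122-129.
Authors: Long Yan  Gao Yan  Zhang Guangben
Affiliations: 1. College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100; 2. Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs, Yangling 712100; 3. Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Yangling 712100
Funding: Shaanxi Province Key Research and Development Program, General Project, Agriculture (2020NY-144)
Abstract: To meet the practical need for tree depth information in automated apple harvesting, and to address the low spatial resolution and blurred edges of current single-image depth estimation algorithms, a single-image depth estimation model for apple trees based on an improved High-Resolution Net (HRNet) is proposed. First, a multi-branch parallel encoder network is built on HRNet to extract multi-scale features, and a dense connection mechanism is introduced to strengthen continuity during feature transfer. To reduce noise interference from redundant features, a convolutional block attention module recalibrates the fused features at the channel and pixel levels, strengthening the structural information of the feature maps. In the decoder network, a stripe refinement module adaptively refines the boundary details of the feature maps, highlights edge features, and mitigates edge blur; up-sampling then produces the depth map. Experiments were conducted on the public NYU Depth V2 dataset and an orchard depth dataset. The results show that introducing the dense connection mechanism and adding the convolutional block attention module and the stripe refinement module each improve model performance. On the orchard depth dataset, the average relative error, root mean square error, logarithmic mean error, depth edge accuracy error, and edge integrity error of the improved HRNet were 0.123, 0.547, 0.051, 3.90, and 10.76, respectively, and the accuracy at different thresholds reached 0.850, 0.975, and 0.993. Visually, the depth maps generated by the improved HRNet have clear edges and more texture detail. The method performs well in both objective metrics and subjective quality.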

Keywords: apple tree  single-image depth estimation  dense connection mechanism  convolutional block attention module  stripe refinement module
Received: 2022/9/26
Revised: 2022/11/26

Depth estimation of apple tree in single image using improved HRNet
Long Yan,Gao Yan,Zhang Guangben.Depth estimation of apple tree in single image using improved HRNet[J].Transactions of the Chinese Society of Agricultural Engineering,2022,38(23):122-129.
Authors:Long Yan  Gao Yan  Zhang Guangben
Institution:1. College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, China; 2. Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs, Yangling 712100, China; 3. Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Service, Yangling 712100, China
Abstract:Estimating apple tree depth from a single RGB image supports precise fruit positioning and autonomous robotic harvesting. To meet the practical need for depth information in mechanized apple picking, an improved High-Resolution Network (HRNet) was used to study monocular depth estimation of apple trees in natural scenes. First, a multi-branch parallel encoder network was built on HRNet to extract multi-scale features, and a dense connection mechanism was introduced to strengthen continuity during feature transfer. To reduce noise interference from redundant features, the Convolutional Block Attention Module (CBAM) was used to recalibrate the fused feature maps at the channel and pixel levels, learning the weight distribution of the feature maps and enhancing their structural information. In the decoder network, the Stripe Refinement Module (SRM) gathered boundary pixels along the orthogonal horizontal and vertical directions, adaptively refined the boundary details of the feature map, highlighted edge features, and reduced edge blur in the predictions. Finally, up-sampling generated predicted depth maps of the same size as the input RGB images. An image acquisition platform was set up to collect RGB and depth images of apple orchards at different times, and the data were augmented by horizontal mirroring, color jitter, and random rotation. After augmentation, a total of 3374 orchard RGB images and depth images were obtained to build the orchard depth dataset. Experiments were conducted on the NYU Depth V2 dataset and the orchard depth dataset. First, ablation experiments were performed on HRNet variants with different degrees of improvement.
Compared with the original HRNet, each improved variant achieved better predictive performance, indicating that introducing the dense connection mechanism and adding CBAM and SRM each improve the model. Second, the proposed algorithm was compared with current mainstream networks. On the orchard depth dataset, the average relative error (REL), root mean square error (RMS), logarithmic mean error (log10), depth edge accuracy error, and edge integrity error of the improved HRNet were 0.123, 0.547, 0.051, 3.90, and 10.76, respectively, and the accuracy at different thresholds reached 0.850, 0.975, and 0.993. Visually, the depth maps generated by the improved HRNet had higher spatial resolution and better presented the distribution of depth information in the image, with clear edges and more texture detail. The depth of some small objects was also recovered, and the overall result was the best of the compared methods and closest to the ground-truth depth map. The ablation analysis and the subjective and objective comparisons with other depth estimation algorithms demonstrate the effectiveness of the proposed algorithm. The proposed network performed well in both visual quality and objective metrics on the NYU Depth V2 dataset and the orchard depth dataset, and offers automatic apple-picking machines a new way to obtain depth information.
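For readers unfamiliar with the objective metrics named above, the following is a minimal sketch of how REL, RMS, log10 error, and threshold accuracy are commonly computed for depth maps. It is not the authors' evaluation code. The function name `depth_metrics`, the flat-list input format, and the thresholds 1.25, 1.25^2, and 1.25^3 are assumptions; the abstract only states "different thresholds", and those values are the usual convention in the monocular depth literature. The two edge metrics are omitted because the abstract does not define them.

```python
import math

def depth_metrics(pred, gt):
    """Common depth-estimation metrics over paired depth values.

    pred, gt: flat sequences of predicted and ground-truth depths.
    Hypothetical helper for illustration, not the paper's implementation.
    """
    # Keep only pixels where both depths are valid (strictly positive).
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not pairs:
        raise ValueError("no valid depth pixels to evaluate")
    n = len(pairs)

    # REL: mean absolute error relative to the ground-truth depth.
    rel = sum(abs(p - g) / g for p, g in pairs) / n
    # RMS: root of the mean squared error.
    rms = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    # log10: mean absolute difference of base-10 log depths.
    log10 = sum(abs(math.log10(p) - math.log10(g)) for p, g in pairs) / n

    # Threshold accuracy: share of pixels with max(p/g, g/p) < 1.25**k.
    # The 1.25 base is an assumed convention, not stated in the abstract.
    ratios = [max(p / g, g / p) for p, g in pairs]
    acc = {f"delta{k}": sum(r < 1.25 ** k for r in ratios) / n for k in (1, 2, 3)}

    return {"rel": rel, "rms": rms, "log10": log10, **acc}


if __name__ == "__main__":
    # Toy values in metres, invented purely to exercise the function.
    print(depth_metrics([1.0, 2.0, 4.0], [1.0, 2.5, 2.0]))
```

Lower is better for the three error terms, while the threshold accuracies approach 1.0 as predictions improve, which matches the direction of the figures reported in the abstract.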
Keywords:apple fruit tree  single image depth estimation  dense connection mechanism  convolutional block attention module  stripe refinement module
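The data augmentation mentioned in the abstract (horizontal mirroring, color jitter, random rotation) has one practical subtlety for depth datasets: geometric transforms must be applied identically to the RGB image and its depth map, whereas color changes apply to the RGB image only. The sketch below illustrates that pairing under stated assumptions. The names `hflip`, `color_jitter`, and `augment_pair`, the nested-list image format, and the brightness-only jitter are all hypothetical simplifications; random rotation is left out because it needs interpolation, and the authors' actual pipeline and parameters are not given in the abstract.

```python
import random

def hflip(image):
    """Mirror a row-major image (list of rows) left to right."""
    return [row[::-1] for row in image]

def color_jitter(rgb, brightness, rng):
    """Scale all RGB values by one random factor in [1 - b, 1 + b].

    Brightness-only jitter is an assumed simplification; results are
    rounded and clamped to the 0..255 range.
    """
    factor = rng.uniform(1.0 - brightness, 1.0 + brightness)
    return [
        [tuple(min(255, max(0, round(c * factor))) for c in px) for px in row]
        for row in rgb
    ]

def augment_pair(rgb, depth, rng, flip_prob=0.5, brightness=0.2):
    """Augment an RGB image and its depth map consistently.

    rgb:   rows of (r, g, b) tuples; depth: rows of floats, same height/width.
    """
    # Geometric transform: apply the same mirror to BOTH images so that
    # every RGB pixel still lines up with its depth value.
    if rng.random() < flip_prob:
        rgb, depth = hflip(rgb), hflip(depth)
    # Photometric transform: touch the RGB image only; depth is unchanged.
    rgb = color_jitter(rgb, brightness, rng)
    return rgb, depth


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    rgb = [[(10, 20, 30), (40, 50, 60)]]
    depth = [[1.5, 2.5]]
    print(augment_pair(rgb, depth, rng))
```

Passing the random generator explicitly keeps the RGB and depth decisions tied to a single draw, which is the property that matters when a rotation or other geometric step is added later.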