
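Multi-scale behavior recognition method for dairy cows based on an improved YOLOV5s network
Cite this article: Bai Qiang, Gao Ronghua, Zhao Chunjiang, Li Qifeng, Wang Rong, Li Shuqin. Multi-scale behavior recognition method for dairy cows based on improved YOLOV5s network[J]. Transactions of the Chinese Society of Agricultural Engineering, 2022, 38(12): 163-172.
Authors: Bai Qiang  Gao Ronghua  Zhao Chunjiang  Li Qifeng  Wang Rong  Li Shuqin
Affiliations: 1. College of Information Engineering, Northwest A&F University, Yangling 712100; 2. Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097; 3. National Engineering Research Center for Information Technology in Agriculture, Beijing 100097
Funding: National Key Research and Development Program of China (2019YFE0125400); Science and Technology Innovation Capacity Building Project of the Beijing Academy of Agriculture and Forestry Sciences (KJCX20220404)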
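Abstract: Daily behaviors of dairy cows such as standing, drinking, walking, and lying are closely related to their physiological health, so recognizing these behaviors efficiently and accurately is important for tracking cow health in a timely manner and improving the economic returns of farms. To address the interference with behavior recognition caused by complex scenes, large variations in target scale, and diverse behaviors in data from group breeding environments, this study proposes a multi-scale cow behavior recognition method based on an improved YOLOV5s. The method introduces a channel-based Transformer attention mechanism at the top layer of the backbone network so that the model attends to cow target regions. For multi-scale behavior targets, it adds a branch to the path aggregation structure together with a detector to capture low-level detail features, and it introduces the SE (Squeeze-and-Excitation Networks) attention mechanism to optimize the detectors, building an SEPH (SE Prediction Head) that identifies important features and improves multi-scale behavior recognition. Experiments show that, without a surge in weights, the improved model raises the mean average precision of multi-scale target recognition by 1.2 percentage points over YOLOV5s; in particular, the average precision for cow walking recognition improves by 4.9 percentage points. The results provide a reference for real-time, around-the-clock monitoring of cow behavior in group breeding environments.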
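
The SE attention step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' SEPH implementation: it only shows the generic squeeze (global average pooling), excitation (gating), and scale steps on a toy feature map. The function name `se_reweight`, the single linear layer, and the hand-picked `weights` and `biases` are assumptions made for illustration. The original Squeeze-and-Excitation design uses two fully connected layers with a bottleneck, whereas one layer is used here because the English abstract refers to a single-layer MLP.

```python
import math


def se_reweight(feature_map, weights, biases):
    """Reweight channels of a feature map, SE style (illustrative only).

    feature_map: list of C channels, each an H x W list of floats.
    weights, biases: hypothetical learned parameters (C x C matrix, length-C list).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: one linear layer followed by a sigmoid gate in (0, 1).
    gates = []
    for w_row, b in zip(weights, biases):
        z = sum(w * s for w, s in zip(w_row, squeezed)) + b
        gates.append(1.0 / (1.0 + math.exp(-z)))
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy input: channel 0 is active everywhere, channel 1 is silent.
fmap = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
weights = [[2.0, 0.0], [0.0, 2.0]]  # assumed values, not trained
biases = [0.0, 0.0]
scaled, gates = se_reweight(fmap, weights, biases)
print(gates)
```

With these assumed parameters the active channel receives a larger gate than the silent one, which is the sense in which important channels are emphasized and noisy ones suppressed.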

Keywords: machine vision  image recognition  dairy cow behavior recognition  YOLOV5s  Transformer  multi-scale  attention mechanism
Received: 2022/3/8
Revised: 2022/6/8

Multi-scale behavior recognition method for dairy cows based on improved YOLOV5s network
Bai Qiang, Gao Ronghua, Zhao Chunjiang, Li Qifeng, Wang Rong, Li Shuqin. Multi-scale behavior recognition method for dairy cows based on improved YOLOV5s network[J]. Transactions of the Chinese Society of Agricultural Engineering, 2022, 38(12): 163-172.
Authors: Bai Qiang  Gao Ronghua  Zhao Chunjiang  Li Qifeng  Wang Rong  Li Shuqin
Abstract: Daily behaviors of dairy cows, such as standing, drinking, walking, and lying down, are closely related to their physical health. Efficient and accurate recognition of these behaviors is important for tracking herd health in a timely manner and improving the economic returns of the farm. In a group breeding environment, however, cow behavior data vary considerably because of complex scenes, changing conditions, and diverse cow behaviors. Recognition is further constrained by the wide range of target scales that arises when cows are spread over a large area and viewed from different perspectives. In this study, a multi-scale behavior recognition method was proposed for dairy cows using an improved YOLOV5s network. First, a channel-based Transformer attention mechanism was introduced at the top layer of the backbone network. Learnable location parameters were added to all channels of the top-level feature map, which carries high-level semantics. A relationship between feature channels and regional information was thereby established, with the size of the location parameter represented by the region. Second, a correlation analysis was performed on the channel sequences at different levels using the multi-head self-attention mechanism of the Transformer. The resulting degree of importance was used to strengthen the expression of feature information between channels. Long-range dependencies between regions and feature channels were thus built so that the model focuses on the cow target area during training. Third, the PAN Neck structure was used to transfer feature information between levels through up-sampling and down-sampling. For the multi-scale behavioral targets of dairy cows, a PAN Neck branch was added at the feature map obtained by twice down-sampling. A target detector was also attached to these underlying features, into which the high-level semantic information of the top layer was integrated through the PAN Neck. The result was a feature map combining detailed features with high-level semantic information. The SE attention mechanism was then introduced: global pooling extracted the global information of each feature-map channel, and a single-layer MLP determined the importance of each channel. The weights of important features were increased to suppress the propagation of noise at the channel level. The detectors at four scales were optimized in this way to construct the SEPH for multi-scale targets, improving multi-scale behavior recognition of dairy cows. Experimental verification showed that the improved recognition model did not incur a surge in weights. The mAP of multi-scale target recognition increased by 1.2 percentage points compared with the original YOLOV5s, and the AP for two similar behaviors (cow walking and standing) increased by 0.8 and 4.9 percentage points, respectively. However, the drinking and eating behaviors of cows cannot be detected directly by the network. A second-level behavior judgment based on a greedy strategy was therefore proposed, in which the spatial information of auxiliary judgment labels was used to jointly determine the drinking and eating behaviors of cows. The numbers of errors were two for drinking and five for eating, indicating the better performance of the improved YOLOV5s network.
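
The abstract does not specify how the second-level, greedy judgment of drinking and eating works, so the following is only one plausible reading, sketched under stated assumptions. It assumes each detected cow has a bounding box and a first-level label, that auxiliary labels mark regions such as a water trough or feed area, and that overlap between the two decides the behavior. The helper names `overlap_ratio` and `second_level_behavior`, the box format `(x1, y1, x2, y2)`, and the `threshold` of 0.2 are all hypothetical.

```python
def overlap_ratio(box, region):
    """Fraction of `box` area covered by `region`; boxes are (x1, y1, x2, y2)."""
    ix1, iy1 = max(box[0], region[0]), max(box[1], region[1])
    ix2, iy2 = min(box[2], region[2]), min(box[3], region[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = (box[2] - box[0]) * (box[3] - box[1])
    return inter / area if area > 0 else 0.0


def second_level_behavior(cows, aux_regions, threshold=0.2):
    """Relabel cows that sufficiently overlap an auxiliary region.

    cows: list of (first_level_label, box).
    aux_regions: list of (behavior, box), e.g. ("drinking", trough_box).
    threshold: assumed minimum overlap, not taken from the paper.
    """
    candidates = []
    for cow_index, (_, cow_box) in enumerate(cows):
        for behavior, region_box in aux_regions:
            score = overlap_ratio(cow_box, region_box)
            if score >= threshold:
                candidates.append((score, cow_index, behavior))
    # Greedy step: accept the strongest overlaps first, one decision per cow.
    candidates.sort(key=lambda item: item[0], reverse=True)
    labels = [label for label, _ in cows]
    decided = set()
    for _, cow_index, behavior in candidates:
        if cow_index not in decided:
            labels[cow_index] = behavior
            decided.add(cow_index)
    return labels


# Made-up detections and regions, for illustration only.
cows = [
    ("standing", (0, 0, 10, 10)),
    ("standing", (50, 0, 60, 10)),
    ("lying", (100, 100, 120, 110)),
]
aux_regions = [
    ("drinking", (5, 0, 15, 10)),
    ("eating", (48, 0, 70, 4)),
]
labels = second_level_behavior(cows, aux_regions)
print(labels)
```

Cows that overlap no auxiliary region keep their first-level label, so the second-level step only refines the detector output and never discards it.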
Keywords: machine vision  image recognition  dairy cow behavior recognition  YOLOV5s  Transformer  multi-scale  attention module