
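Early prediction of maize yield by integrating GWRFR and crop phenological information
Cite this article: PEI Jie, TAN Shaofeng, GUO Han, LIU Yibo, FANG Huajun. Early prediction of maize yield by integrating GWRFR and crop phenological information[J]. Transactions of the Chinese Society of Agricultural Engineering, 2024, 40(1): 161-169.
Authors: PEI Jie, TAN Shaofeng, GUO Han, LIU Yibo, FANG Huajun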
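Affiliations: School of Geospatial Engineering and Science, Sun Yat-sen University, Zhuhai 519082; Key Laboratory of Natural Resources Monitoring in Tropical and Subtropical Area of South China, Ministry of Natural Resources, Zhuhai 519082; Key Laboratory of Ecosystem Observation and Modeling, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101; Zhongke-Ji'an Institute for Eco-Environmental Sciences, Ji'an 343000
Funding: Provincial Science and Technology Special Project ("open competition" mechanism) of the Jinggangshan Agricultural High-tech Zone (20222-051244); Guangdong Basic and Applied Basic Research Foundation (2021A1515110442)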
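Abstract: Timely and accurate estimation of crop yield is of great importance for ensuring food security and keeping the world food supply stable. Many researchers have applied machine learning methods to crop yield estimation. However, few studies take the spatial distribution of crops into account or use local models for the analysis; moreover, many studies model at an annual time scale, without resolving the individual stages of crop growth, and therefore cannot deliver early yield prediction. To address these problems, this study combined multi-source remote sensing data with random forest (RF) and geographically weighted random forest regression (GWRFR) models to model county-level maize yield in the United States, and examined the performance of global and local models in maize yield prediction. By applying the GWRFR model to each phenological stage of maize, it also obtained the best lead time for maize yield prediction. The results show that the local GWRFR model (R²=0.87, RMSE=864.21 kg/hm²) is more accurate than the traditional global RF model (R²=0.83, RMSE=994.75 kg/hm²) and copes well with the non-stationarity of spatial data; even when latitude and longitude are added to the global model as variables, the RF model (R²=0.85, RMSE=890.88 kg/hm²) still performs worse than the GWRFR model. Maize yield can be predicted 2 to 3 months before harvest: a fairly accurate prediction (R²=0.90, RMSE=748.39 kg/hm²) is already available around the milk-ripe (dough) stage. The results offer a new approach to large-scale crop yield estimation and may also guide yield prediction for other crops at regional or global scales.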
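The abstract reports accuracy as R² and RMSE without stating the formulas. For reference, the conventional definitions are given below, assuming $y_i$ is the observed county yield, $\hat{y}_i$ the predicted yield, $\bar{y}$ the mean observed yield, and $n$ the number of samples. These symbols are not from the paper, and the authors may have computed R² differently (for example, as a squared correlation).

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
```

RMSE carries the unit of yield (kg/hm², equivalent to kg/ha), so lower is better, while R² is unitless and higher is better.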

Keywords: yield; prediction; remote sensing; machine learning; crop phenology
Received: 2023-08-03
Revised: 2023-12-19

Early prediction of maize yield by integrating GWRFR and crop phenological information
PEI Jie, TAN Shaofeng, GUO Han, LIU Yibo, FANG Huajun. Early prediction of maize yield by integrating GWRFR and crop phenological information[J]. Transactions of the Chinese Society of Agricultural Engineering, 2024, 40(1): 161-169.
Authors: PEI Jie, TAN Shaofeng, GUO Han, LIU Yibo, FANG Huajun
Abstract: Precise estimation of crop yields is essential for global food security, particularly in the face of challenges such as climate change, population growth, and unequal food distribution. Although machine learning combined with remote sensing data is widely used for large-scale yield prediction, the integration of crop spatial position information and local models remains underexplored. This gap matters because crop yield prediction is inherently spatial and spatial factors are highly influential. Previous studies, conducted predominantly on an annual or full-growing-season basis, have not provided predictions for each phenological stage of maize growth. Consequently, they cannot pinpoint the most effective prediction time for maize yield or the impact of environmental factors at each stage. This research addresses two key questions: 1) Does including spatial location information through the geographically weighted random forest regression (GWRFR) model improve yield prediction accuracy over the traditional random forest model? 2) Which phenological stage of maize provides the optimal window for yield prediction? To answer them, the study combined multi-source remote sensing data with machine learning algorithms to predict maize yield at the county level in the United States. It investigated the relationship between yield prediction and the spatial location of sample points, assessing the relevance of including latitude and longitude as independent variables. It then introduced the local GWRFR model for maize yield prediction and compared its performance with the global random forest (RF) model. In addition, the study examined two methodological approaches for determining the best prediction time.
The first approach, referred to as the accumulated environmental variables (AEV) approach, integrated data from all phenological stages up to the one under analysis. The second approach, known as the current stage variables (CSV) approach, used data exclusively from the specific growth stage under analysis. The seven key growth stages of maize were planted, emerged, silking, dough, dent, mature, and harvest, covering the crop's full lifecycle. By evaluating the results from both schemes together, the study identified the optimal prediction time for maize yield. The findings indicate that incorporating latitude and longitude into the model enhanced yield prediction accuracy. Without these spatial factors, the RF model achieved a coefficient of determination (R²) of 0.83 and a root mean squared error (RMSE) of 994.75 kg/hm²; including them improved these metrics to an R² of 0.85 and an RMSE of 890.88 kg/hm². The local GWRFR model improved prediction accuracy further (R²=0.87, RMSE=864.21 kg/hm²), outperforming the traditional RF model and effectively addressing the non-stationarity of spatial data. In terms of optimal prediction time, accuracy under the AEV scheme increased from the first stage (planted) to the fourth stage (dough), peaked at R²=0.90 and RMSE=748.39 kg/hm², and then stabilized. In contrast, accuracy under the CSV scheme improved from the first stage to the third stage (silking), reached its peak (R²=0.88, RMSE=827.85 kg/hm²), and then decreased. This suggests that the best prediction time was around the dough stage, approximately 2 to 3 months before harvest.
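The difference between the two schemes can be illustrated with a minimal sketch. It assumes that "accumulated" means concatenating the variables of every stage up to the current one; the abstract does not say whether the authors concatenated, summed, or averaged them. The function name `select_features` and the variable names `ndvi` and `precip` are hypothetical and the values are made up.

```python
# Phenological stages named in the abstract, in seasonal order.
STAGES = ["planted", "emerged", "silking", "dough", "dent", "mature", "harvest"]


def select_features(stage_features, stage, scheme):
    """Return the flat feature dict available when predicting at `stage`.

    stage_features: {stage_name: {variable_name: value}} for one county-year.
    scheme: "AEV" uses every stage up to and including `stage`
            (assumed here to mean concatenation);
            "CSV" uses only `stage` itself.
    """
    idx = STAGES.index(stage)
    if scheme == "AEV":
        used = STAGES[: idx + 1]
    elif scheme == "CSV":
        used = [stage]
    else:
        raise ValueError(f"unknown scheme: {scheme!r}")
    # Prefix each variable with its stage so columns from different stages stay distinct.
    return {
        f"{s}_{var}": value
        for s in used
        for var, value in stage_features[s].items()
    }


# Hypothetical values for one county-year (illustrative only, not data from the study).
example = {
    s: {"ndvi": 0.1 * (i + 1), "precip": 10.0 * (i + 1)}
    for i, s in enumerate(STAGES)
}

aev = select_features(example, "dough", "AEV")  # planted..dough
csv_ = select_features(example, "dough", "CSV")  # dough only
print(len(aev), len(csv_))
```

Under this reading, the AEV feature set grows with each stage while the CSV set stays the same size, which is consistent with AEV accuracy rising and then levelling off while CSV accuracy depends on how informative each single stage is.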
Additionally, the strong correlation observed between early prediction results and those covering the entire growth season underscores the reliability of maize yield predictions made during the dough stage. In conclusion, this study introduces a novel method for large-scale crop yield prediction that integrates spatial data and phenological stages with advanced modeling techniques. The findings can contribute to enhancing food security and stabilizing the global food supply chain. The research provides insights for agricultural practice and a foundation for future studies in crop yield prediction, which could extend to other crops and regions and incorporate a broader range of environmental factors.
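Geographically weighted models fit a separate local model around each location using nearby samples, which is how they respond to spatial non-stationarity. The sketch below shows only the neighbourhood selection and weighting step, under assumptions the abstract does not confirm: an adaptive bandwidth set by the k-th nearest sample, a bi-square kernel, and great-circle distance between county centroids. The function names and coordinates are hypothetical. A local random forest would then be trained on the selected samples, for example by passing the weights as `sample_weight` to scikit-learn's `RandomForestRegressor.fit`; that step is omitted here.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def local_weights(target, coords, k):
    """Adaptive bi-square weights for the k nearest samples around `target`.

    Returns {sample_index: weight}. Samples beyond the k-th neighbour are
    omitted, and the k-th neighbour itself sits on the bandwidth edge (weight 0).
    """
    dists = sorted((haversine_km(*target, *c), i) for i, c in enumerate(coords))
    nearest = dists[:k]
    bandwidth = nearest[-1][0]  # distance to the k-th nearest sample
    if bandwidth == 0:
        # All selected samples coincide with the target location.
        return {i: 1.0 for _, i in nearest}
    return {i: (1 - (d / bandwidth) ** 2) ** 2 for d, i in nearest}


# Hypothetical county centroids as (latitude, longitude); not data from the study.
coords = [(41.0, -93.0), (41.5, -93.5), (42.0, -94.0), (35.0, -100.0)]
weights = local_weights((41.0, -93.0), coords, k=3)
print(weights)
```

The distant fourth sample is excluded entirely, so the local model for this target never sees it; a global RF, by contrast, pools all samples regardless of location.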
Keywords: yield; prediction; remote sensing; machine learning; crop phenology