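As the literature on crop diseases and pests grows rapidly, text mining of this literature is becoming increasingly important. An effective and accurate named entity recognition (NER) system for crop diseases and pests helps extract research findings from related reports and provides useful guidance for disease and pest control. To address the lack of Chinese datasets for crop diseases and pests, this paper proposes a stop-and-wait algorithm based on semi-distant supervision and uses it to build a Chinese corpus for the crop disease and pest domain, greatly reducing the labor and time costs of annotation. It also proposes a Chinese NER model for crop diseases and pests, Agr-IE (Agricultural information extraction). The model is based on BERT-BILSTM-CRF and enriches the character vectors through multi-source information fusion (multi-source word segmentation information and global lexical embedding information), so that character-level and word-level information are fully combined and the model captures context better. Experiments show that the model effectively recognizes disease, pest, pesticide, and crop entities, with F1 scores of 96.56%, 95.12%, 94.48%, and 95.54%, respectively. It also performs well on pathogen entities, which are harder to recognize, with an F1 score of 81.48%, higher than the corresponding values of models such as BERT-BILSTM-CRF and BERT. On datasets from other domains, such as MSRA and Weibo, the proposed model was compared with models such as CAN-NER and Lattice-LSTM-CRF and achieved the best recognition results, with F1 scores of 95.80% and 94.57%, respectively, indicating that the algorithm has a certain generalization ability.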
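The per-entity F1 scores reported above are typically computed at the entity level: a predicted entity counts as correct only if both its span and its type match the gold annotation. The abstract does not state the tagging scheme or the evaluation script, so the following is only a minimal sketch that assumes BIO tags; the label names (`DISEASE`, `CROP`, `DRUG`) and the example sequences are hypothetical.

```python
from collections import defaultdict


def extract_entities(tags):
    """Return a set of (type, start, end) spans from a BIO tag sequence."""
    spans = set()
    start, etype = None, None
    # A trailing "O" sentinel closes an entity that runs to the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if continues:
            continue
        if etype is not None:
            spans.add((etype, start, i))
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
        else:
            # "O" or a stray "I-" tag: no entity is open.
            start, etype = None, None
    return spans


def f1_by_type(gold_tags, pred_tags):
    """Entity-level F1 per entity type (exact span and type match)."""
    gold = extract_entities(gold_tags)
    pred = extract_entities(pred_tags)
    counts = defaultdict(lambda: [0, 0, 0])  # [true positives, predicted, gold]
    for etype, _, _ in pred:
        counts[etype][1] += 1
    for etype, _, _ in gold:
        counts[etype][2] += 1
    for etype, _, _ in gold & pred:
        counts[etype][0] += 1
    scores = {}
    for etype, (tp, n_pred, n_gold) in counts.items():
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_gold if n_gold else 0.0
        total = precision + recall
        scores[etype] = 2 * precision * recall / total if total else 0.0
    return scores


# Hypothetical gold and predicted tags for one seven-character sentence.
gold = ["B-DISEASE", "I-DISEASE", "O", "B-CROP", "O", "B-DRUG", "I-DRUG"]
pred = ["B-DISEASE", "I-DISEASE", "O", "B-CROP", "O", "B-DRUG", "O"]
scores = f1_by_type(gold, pred)
print(scores)
```

In this example the truncated `DRUG` span counts as a miss even though its first character was tagged correctly, which is why entity-level F1 is stricter than per-character accuracy.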
Using markers closely linked to QTL for early-maturing traits in marker-assisted selection (MAS) is an effective way to improve early maturity and other properties of cotton simultaneously. In this study, two F2 populations and their F2:3 families were generated from two upland cotton (Gossypium hirsutum L.) crosses, Baimian2 × TM-1 and Baimian2 × CIR12. QTL for early-maturing traits were analyzed using the F2:3 families. A total of 54 QTL (31 suggestive and 23 significant) were detected. Fourteen of the significant QTL had LOD scores that were not only > 3 but also exceeded the permutation threshold. At least four common QTL were found in both populations: qBP-17 for bud period (BP), qGP-17a/qGP-17b (qGP-17) for growth period (GP), qYPBF-17a/qYPBF-17b (qYPBF-17) for yield percentage before frost (YPBF), and qHFFBN-17 for height of first fruiting branch node (HFFBN). These common QTL should be reliable and could be used in MAS to facilitate early maturity. The common QTL qBP-17 had a LOD score that was > 3 and exceeded the permutation threshold, and it explained 12.6% of the phenotypic variation; it should be considered preferentially in MAS. Early-maturing traits of cotton are primarily controlled by dominant and over-dominant effects.
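To illustrate what "exceeding the permutation threshold" means, the sketch below computes a LOD score and an empirical genome-wide threshold by shuffling phenotypes. It is a simplified stand-in, not the authors' analysis: it assumes single-marker linear regression on genotypes coded 0/1/2 rather than the interval-mapping software a QTL study would normally use, and the data, effect size, and function names are all made up for illustration.

```python
import math
import random


def lod_score(genotypes, phenotypes):
    """LOD for a single-marker regression: (n/2) * log10(RSS0 / RSS1)."""
    n = len(phenotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))
    rss0 = sum((y - mean_y) ** 2 for y in phenotypes)  # null model: mean only
    if sxx == 0 or rss0 == 0:
        return 0.0  # no genotype or phenotype variation, so no evidence
    rss1 = max(rss0 - sxy ** 2 / sxx, 1e-12)  # residual after fitting the marker
    return (n / 2) * math.log10(rss0 / rss1)


def permutation_threshold(markers, phenotypes, n_perm=200, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the maximum LOD under shuffled phenotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        max_lods.append(max(lod_score(m, shuffled) for m in markers))
    max_lods.sort()
    index = min(n_perm - 1, max(0, math.ceil((1 - alpha) * n_perm) - 1))
    return max_lods[index]


# Simulated data (hypothetical): 100 plants, 5 markers in a 1:2:1 ratio,
# with marker 0 given an additive effect on the trait.
rng = random.Random(42)
n_plants, n_markers = 100, 5
markers = [[rng.choice([0, 1, 1, 2]) for _ in range(n_plants)]
           for _ in range(n_markers)]
phenotypes = [2.0 * g + rng.gauss(0.0, 1.0) for g in markers[0]]

threshold = permutation_threshold(markers, phenotypes)
causal_lod = lod_score(markers[0], phenotypes)
print(f"permutation threshold: {threshold:.2f}, LOD at simulated QTL: {causal_lod:.2f}")
```

Because the threshold is taken from the distribution of the maximum LOD across all markers, it controls the false-positive rate for the whole scan, which is why a QTL can have LOD > 3 and still be judged against a separate permutation threshold.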
Soil organic carbon (SOC) in mountainous regions is characterized by strong topography-induced heterogeneity, which may contribute to large uncertainties in regional SOC stock estimation. However, the quantitative effects of topography on SOC stocks in semiarid alpine grasslands are not well understood. This study therefore aims to determine the role of topography in shaping the spatial patterns of SOC stocks.
Materials and methods
Soils from the summit, shoulder, backslope, footslope, and toeslope positions along nine toposequences within three elevation-dependent grassland types (i.e., montane desert steppe at ~2450 m, montane steppe at ~2900 m, and subalpine meadow at ~3350 m) are sampled at four depths (0–10, 10–20, 20–40, and 40–60 cm). SOC content, bulk density, soil texture, soil water content, and grassland biomass are determined. The general linear model (GLM) is employed to quantify the effects of topography on the SOC stocks. Ordinary least squares regressions are performed to explore the underlying relationships between SOC stocks and the other edaphic factors.
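The abstract does not give the formula used to turn SOC content and bulk density into a stock. A commonly used layer-wise calculation is sketched below as an assumption: stock (kg m⁻²) = SOC content (g kg⁻¹) × bulk density (g cm⁻³) × layer thickness (cm) / 100, summed over the sampled layers, with no correction for coarse fragments. The input values are invented for illustration only.

```python
def soc_stock_kg_m2(layers):
    """Sum SOC stock over soil layers.

    Each layer is (soc_g_per_kg, bulk_density_g_cm3, thickness_cm).
    The division by 100 converts g kg^-1 * g cm^-3 * cm into kg m^-2.
    Coarse fragments are ignored here (an assumption of this sketch).
    """
    return sum(soc * bd * thickness / 100.0 for soc, bd, thickness in layers)


# Hypothetical profile matching the four sampled depth intervals
# (0-10, 10-20, 20-40, and 40-60 cm).
profile = [
    (30.0, 1.0, 10.0),  # 0-10 cm
    (25.0, 1.1, 10.0),  # 10-20 cm
    (20.0, 1.2, 20.0),  # 20-40 cm
    (10.0, 1.3, 20.0),  # 40-60 cm
]
total_stock = soc_stock_kg_m2(profile)
print(f"SOC stock at 0-60 cm: {total_stock:.2f} kg m^-2")
```

Note that the two deeper layers are twice as thick as the upper ones, so they can contribute substantially to the 0–60 cm total even when their SOC content is lower.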
Results and discussion
The SOC stocks at 0–60 cm increase with elevation zone, with the highest stock of approximately 37.70 g m⁻² in the subalpine meadow, about 2.07 and 3.41 times that in the montane steppe and montane desert steppe, respectively. Along the toposequences, the SOC stocks are highest at the toeslope, reaching 14.98, 31.76, and 49.52 kg m⁻² in the montane desert steppe, montane steppe, and subalpine meadow, respectively; these values are significantly larger than those at the shoulder, by factors of 1.38, 2.31, and 1.44. In total, topography explains about 84% of the overall variation in SOC stocks, of which 70.61% and 9.74% are attributed to elevation zone and slope position, while slope aspect and slope gradient explain only about 1.84% and 0.01%, respectively.
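One common way to express the share of variation attributed to a factor is the ratio of that factor's between-group sum of squares to the total sum of squares. The sketch below shows this for a single factor only; it is not the authors' multi-factor GLM, which would partition the variation among elevation zone, slope position, aspect, and gradient together. The zone labels and stock values are made up.

```python
from collections import defaultdict


def explained_share(groups, values):
    """Fraction of total variation explained by one grouping factor
    (between-group sum of squares / total sum of squares)."""
    n = len(values)
    grand_mean = sum(values) / n
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0
    by_group = defaultdict(list)
    for group, value in zip(groups, values):
        by_group[group].append(value)
    ss_between = sum(
        len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
        for vals in by_group.values()
    )
    return ss_between / ss_total


# Hypothetical SOC stocks (kg m^-2) for three plots in each elevation zone.
zones = ["desert_steppe"] * 3 + ["steppe"] * 3 + ["meadow"] * 3
stocks = [10.0, 11.0, 12.0, 17.0, 18.0, 19.0, 36.0, 38.0, 40.0]
share = explained_share(zones, stocks)
print(f"share of variation explained by elevation zone: {share:.1%}")
```

With several correlated factors the individual shares depend on how the sums of squares are computed, so a one-factor ratio like this should be read as an illustration of the idea rather than a reproduction of the reported percentages.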
Conclusions
The elevation zone and the slope position markedly shape the spatial patterns of the SOC stocks, and thus they may be considered key indicators when constructing the optimal SOC estimation model in such semiarid alpine grasslands.