Similar articles
 20 similar articles found (search time: 31 ms)
1.
The general linear model encompasses statistical methods such as regression and analysis of variance (ANOVA), which are commonly used by soil scientists. The standard ordinary least squares (OLS) method for estimating the parameters of the general linear model is a design‐based method that requires that the data have been collected according to an appropriate randomized sample design. Soil data are often obtained by systematic sampling on transects or grids, so OLS methods are not appropriate. Parameters of the general linear model can be estimated from systematically sampled data by model‐based methods. Parameters of a model of the covariance structure of the error are estimated, then used to estimate the remaining parameters of the model with known variance. Residual maximum likelihood (REML) is the best way to estimate the variance parameters since it is unbiased. We present the REML solution to this problem. We then demonstrate how REML can be used to estimate parameters for regression and ANOVA‐type models using data from two systematic surveys of soil. We compare an efficient, gradient‐based implementation of REML (ASReml) with an implementation that uses simulated annealing. In general the results were very similar; where they differed the error covariance model had a spherical variogram function, which can have local optima in its likelihood function. The simulated annealing results were better than the gradient method in this case because simulated annealing is good at escaping local optima.
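
The spherical variogram function mentioned in this abstract has a standard textbook form, sketched below for orientation. The parameter names (`nugget`, `partial_sill`, `range_`) are illustrative choices, not taken from the paper, and the sketch does not reproduce the REML fitting itself.

```python
def spherical_variogram(h, nugget, partial_sill, range_):
    """Semivariance at lag distance h under the standard spherical model.

    nugget       : variance of the spatially uncorrelated component
    partial_sill : variance of the spatially correlated component
    range_       : distance beyond which the semivariance stays at the sill
    (All three names are illustrative assumptions.)
    """
    if h < 0:
        raise ValueError("lag distance must be non-negative")
    if h == 0:
        return 0.0  # the semivariance is zero at zero lag by definition
    if h >= range_:
        return nugget + partial_sill  # flat at the sill beyond the range
    r = h / range_
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the shape.
    for lag in (0, 2.5, 5, 10, 20):
        print(lag, spherical_variogram(lag, nugget=1.0, partial_sill=2.0, range_=10.0))
```

The function is defined piecewise around the range parameter, which is the parameter the abstract associates with local optima in the likelihood.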

2.
Abstract. Assuming that other sources of error can be neglected, the reliability of a land suitability classification depends on the homogeneity of physiographically delineated map units with regard to land qualities. The map unit homogeneity of a small area in France was estimated using 64 observation points, arranged according to a nested sampling scheme, followed by nested analysis of variance.
The analysis shows that in this area map units are too heterogeneous to accept the suitability classification as being completely reliable. However, alternative procedures using methods of optimal interpolation to map gradual change within the physiographic units are too expensive at a mapping scale of 1:25000 or smaller. It is not possible to produce completely accurate suitability maps at smaller scales. However, incorporating nested sampling and analysis of variance as standard procedures in land evaluation surveys costs little effort and yields at least an estimate of map accuracy and reliability of the suitability classification.  相似文献   
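
As a rough illustration of the nested analysis of variance this abstract relies on, the sketch below estimates variance components for the simplest case: a balanced design with two levels (groups, and observations within groups). It uses the usual method of equating mean squares to their expectations. The function name and the data are hypothetical, and the survey described above used a deeper nesting than two levels.

```python
def nested_variance_components(groups):
    """Between-group and within-group variance components, balanced design.

    groups: list of equal-length lists of observations (hypothetical layout).
    Returns (var_between, var_within).
    """
    a = len(groups)          # number of groups
    n = len(groups[0])       # observations per group
    if a < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need a balanced design: >= 2 groups of >= 2 observations")

    group_means = [sum(g) / n for g in groups]
    grand_mean = sum(group_means) / a

    ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (a - 1)
    ms_within = ss_within / (a * (n - 1))

    # E[ms_within] = var_within; E[ms_between] = var_within + n * var_between
    var_within = ms_within
    var_between = (ms_between - ms_within) / n  # can come out negative in small samples
    return var_between, var_within


if __name__ == "__main__":
    data = [[1.0, 3.0], [5.0, 7.0]]  # made-up values
    print(nested_variance_components(data))
```

A large within-group component relative to the between-group component corresponds to the heterogeneous map units the abstract reports.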

3.
Process models are commonly used in soil science to obtain predictions at a spatial scale that is different from the scale at which the model was developed, or the scale at which information on model inputs is available. When this happens, the model and its inputs require aggregation or disaggregation to the application scale, and this is a complex problem. Furthermore, the validity of the aggregated model predictions depends on whether the model describes the key processes that determine the process outcome at the target scale. Different models may therefore be required at different spatial scales. In this paper we develop a diagnostic framework which allows us to judge whether a model is appropriate for use at one or more spatial scales, both with respect to the prediction of variations at those scales and in the requirement for disaggregation of the inputs. We show that spatially nested analysis of the covariance of predictions with measured process outcomes is an efficient way to do this. This is applied to models of the processes that lead to ammonia volatilization from soil after the application of urea. We identify the component correlations at different scales of a nested scheme as the diagnostic with which to evaluate model behaviour. These correlations show how well the model emulates components of spatial variation of the target process at the scales of the sampling scheme. Aggregate correlations were identified as the most pertinent to evaluate models for prediction at particular scales since they measure how well aggregated predictions at some scale correlate with aggregated values of the measured outcome. There are two circumstances under which models are used to make predictions. In the first case only the model is used to predict, and the most useful diagnostic is the concordance aggregate correlation. In the second case model predictions are assimilated with observations, which should correct bias in the prediction and errors in the variance; the aggregate correlations would be the most suitable diagnostic.

4.
A column of soil excavated from a contaminated landscape was evaluated by means of X-ray fluorescence analysis. The measurements were intended to assess the vertical distribution of heavy metals and toxic elements in the depth profile. To judge the fitness for purpose of the analytical method, element-specific power functions were derived, yielding the minimum detectable variations of analyte concentrations in the investigated soil profile. The required measurement uncertainty components caused by both the sampling procedure and the chemical analysis were estimated empirically using a nested sampling design (duplicate method). For this purpose the full length of the soil core was divided into horizontal layers. From each selected layer (sampling target) two composite samples were taken by simple random sampling to represent the typical composition of the sampling target. The pool of measurement results obtained for the nested sampling design was finally subjected to variance analysis. The evaluation of the estimated variance components in terms of the percentage of total variance confirmed fitness for purpose for the method used.

5.
Geostatistical estimates of a soil property by kriging are equivalent to the best linear unbiased predictions (BLUPs). Universal kriging is BLUP with a fixed‐effect model that is some linear function of spatial co‐ordinates, or more generally a linear function of some other secondary predictor variable when it is called kriging with external drift. A problem in universal kriging is to find a spatial variance model for the random variation, since empirical variograms estimated from the data by method‐of‐moments will be affected by both the random variation and that variation represented by the fixed effects. The geostatistical model of spatial variation is a special case of the linear mixed model where our data are modelled as the additive combination of fixed effects (e.g. the unknown mean, coefficients of a trend model), random effects (the spatially dependent random variation in the geostatistical context) and independent random error (nugget variation in geostatistics). Statisticians use residual maximum likelihood (REML) to estimate variance parameters, i.e. to obtain the variogram in a geostatistical context. REML estimates are consistent (they converge in probability to the parameters that are estimated) with less bias than both maximum likelihood estimates and method‐of‐moment estimates obtained from residuals of a fitted trend. If the estimate of the random effects variance model is inserted into the BLUP we have the empirical BLUP or E‐BLUP. Despite representing the state of the art for prediction from a linear mixed model in statistics, the REML–E‐BLUP has not been widely used in soil science, and in most studies reported in the soils literature the variogram is estimated with methods that are seriously biased if the fixed‐effect structure is more complex than just an unknown constant mean (ordinary kriging). In this paper we describe the REML–E‐BLUP and illustrate the method with some data on soil water content that exhibit a pronounced spatial trend.  

6.
In a spatial regression context, scientists are often interested in a physical interpretation of components of the parametric covariance function. For example, spatial covariance parameter estimates in ecological settings have been interpreted to describe spatial heterogeneity or “patchiness” in a landscape that cannot be explained by measured covariates. In this article, we investigate the influence of the strength of spatial dependence on maximum likelihood (ML) and restricted maximum likelihood (REML) estimates of covariance parameters in an exponential-with-nugget model, and we also examine these influences under different sampling designs—specifically, lattice designs and more realistic random and cluster designs—at differing intensities of sampling (n=144 and 361). We find that neither ML nor REML estimates perform well when the range parameter and/or the nugget-to-sill ratio is large—ML tends to underestimate the autocorrelation function and REML produces highly variable estimates of the autocorrelation function. The best estimates of both the covariance parameters and the autocorrelation function come under the cluster sampling design and large sample sizes. As a motivating example, we consider a spatial model for stream sulfate concentration.  相似文献   

7.
The value of nested sampling for exploring the spatial structure of univariate variation of the soil has been demonstrated in several studies and applied to practical problems. This paper shows how the method can be extended to the multivariate case. While the extension is simple in theory, in practice the direct estimation of covariance components by equating mean‐square matrices with their expectation will often lead to estimates that are not positive semidefinite. This paper discusses solutions to this problem for balanced and unbalanced sample designs. In the balanced case there is a residual maximum likelihood (REML) estimator that will find estimates of covariance components that maximize an overall likelihood on the condition that all components are positive semidefinite (p.s.d.). This is possible because the condition is met if the differences of successive mean‐square matrices are positive semidefinite, and this constraint can be incorporated into an algorithm. This does not hold for unbalanced designs. In this paper the problem was solved for unbalanced designs by scaling covariance components that were not p.s.d. to the nearest p.s.d. matrix according to a Euclidean distance. These methods were applied to data from three surveys, two with balanced and one with unbalanced sampling. Different patterns of scale‐dependence of the correlation of soil properties were found. For example, at Ginninderra Experimental Station in Australia the soil water content and bulk density were correlated significantly, with the correlation increasing with distance to 56 m, but at longer distances the properties were not significantly correlated. By contrast, the pH of the soil and the available P content showed correlation that increased with distance. The implications of these results for planning more detailed sampling, both for prediction and for investigation of processes, are discussed.  相似文献   

8.
Emissions of gases from the soil are known to vary spatially in a complex way. In this paper we show how such data can be analysed with the wavelet transform. We analysed data on rates of N2O emission from soil cores collected at 4‐m intervals on a 1024‐m transect across arable land at Silsoe in England. We used a thresholding procedure to represent intermittent variation in N2O emission from the soil as a sparse wavelet process, i.e. one in which most of the wavelet coefficients are not significantly different from zero. This analysis made clear that the rate of N2O emission varied more intermittently on this transect than did soil pH, for which many more of the wavelet coefficients had to be retained. This account of intermittent variation motivated us to consider a class of random functions, which we call wavelet random functions, for the simulation of spatially intermittent variation. A wavelet random function (WRF) is an inverse wavelet transform of a set of random wavelet coefficients with specified variance at each scale. We generated intermittent variation at a particular scale in the WRF by specifying a binormal process for the wavelet coefficients at this scale. We showed by simulation that adaptive sampling schemes are more efficient than ordinary stratified random sampling to estimate the mean of a spatial variable that is intermittent at a particular scale. This is because the sampling can be concentrated in the more variable regions. When we simulated values that emulate the intermittency of our data on N2O we found that the gains in efficiency from simple adaptive sampling schemes were small. This was because the emission of N2O is intermittent over several disparate scales. More sophisticated adaptive sampling is needed for these conditions, and it should embody knowledge of the relevant soil processes.  相似文献   

9.
The purpose of this note is to propose a variance estimator under non-measurable designs that exploits the existence of an auxiliary variable well correlated with the survey variable of interest. Under non-measurable designs, the Sen–Yates–Grundy variance estimator generates a downward bias that can be reduced using a calibration weighting based on the auxiliary variable. Conditions of approximate unbiasedness for the resulting calibration estimator are given. The application to systematic sampling is considered. The proposal proves to be effective for estimating the variance of the forest cover estimator in remote sensing-based surveys, owing to the strong correlation between the reference data, available from a systematic sample, and the satellite map data, available for the whole population and hence exploited as an auxiliary variable. Supplementary materials accompanying this paper appear online.  相似文献   

10.
Quantitative predictions of ammonia volatilization from soil are useful to environmental managers and policy makers and empirical models have been used with some success. Spatial analysis of the soil properties and their relationship to the ammonia volatilization process is important as predictions will be required at disparate scales from the field to the catchment and beyond. These relationships are known to change across scales and this may affect the performance of an empirical model. This study is concerned with the variation of ammonia volatilization and some controlling soil properties: bulk density, volumetric water content, pH, CEC, soil pH buffer power, and urease activity, over distances of 2, 50, 500, and >2000 m. We sampled a 16 km × 16 km region in eastern England and analyzed the results by a nested analysis of (co)variance, from which variance components and correlations for each scale were obtained. The overall correlations between ammonia volatilization and the soil properties were generally weak: –0.09 for bulk density, 0.04 for volumetric water content, –0.22 for CEC, –0.08 for urease activity, –0.22 for pH and 0.18 for the soil pH buffer power. Variation in ammonia volatilization was scale‐dependent, with substantial variance components at the 2‐ and 500‐m scales. The results from the analysis of covariance show that the relationships between ammonia volatilization and soil properties are complex. At the >2000 m scale, ammonia volatilization was strongly correlated with pH (–0.82) and CEC (–0.55), which is probably the result of differences in parent material. We also observed weaker correlations at the 500‐m scale with bulk density (–0.61), volumetric water content (0.48), urease activity (–0.42), pH (–0.55) and soil pH buffer power (0.38). Nested analysis showed that overall correlations may mask relationships at scales of interest and the effect of soil variables on these soil processes is scale‐dependent.  相似文献   
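
The closing point of this abstract, that an overall correlation can mask relationships at particular scales, follows from how the components combine. Assuming independent components at each scale of the nested model, the overall correlation is the summed covariance components divided by the square root of the product of the summed variance components. The sketch below uses made-up component values, not the values from the study.

```python
from math import sqrt


def scale_correlations(var_x, var_y, cov_xy):
    """Per-scale and overall correlations from nested (co)variance components.

    var_x, var_y : variance components of the two variables, one per scale
    cov_xy       : covariance components, one per scale
    Assumes components at different scales are mutually independent.
    """
    per_scale = [c / sqrt(vx * vy) for vx, vy, c in zip(var_x, var_y, cov_xy)]
    overall = sum(cov_xy) / sqrt(sum(var_x) * sum(var_y))
    return per_scale, overall


if __name__ == "__main__":
    # Hypothetical two-scale example: strong but opposite relationships
    # at the two scales cancel in the overall correlation.
    per_scale, overall = scale_correlations(
        var_x=[1.0, 1.0], var_y=[1.0, 1.0], cov_xy=[0.8, -0.8]
    )
    print(per_scale, overall)
```

In this contrived case each scale shows a correlation of magnitude 0.8 while the overall correlation is zero, which is the masking effect in its most extreme form.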

11.
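Based on data for Hebei Province from the Second National Soil Survey, we compared the accuracy of spatial prediction of soil organic carbon density from two commonly used correlates of soil organic carbon, land use and soil type, before and after each was combined with ordinary kriging, and examined the use of ordinary kriging for regional spatial prediction of soil organic carbon. Land use alone explained 19.0% of the total variance of soil organic carbon density; combining it with ordinary kriging raised this significantly, to 30.2%. The soil genus (土属), a low-level category of the soil classification, alone explained 45.0% of the total variance, but after combination with ordinary kriging the proportion explained was 44.8%, which is little different. Whether ordinary kriging can further improve the spatial prediction of soil organic carbon at the regional scale therefore depends on the correlate of soil organic carbon that is chosen.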

12.
13.
Many soil properties and processes vary at different spatial scales. As a result, relationships between soil properties often depend on scale. In this paper we show this for two soil properties of biological importance, by means of a nested analysis of covariance. The variables were urease activity (UA) and soil organic carbon (SOC), sampled on an unbalanced nested design at three sites with different land uses (arable, forest and pasture). The objective of this study was to investigate the scale‐dependent relationships of UA and SOC at these three sites to exemplify the phenomenon of scale‐dependency in the covariation of biogeochemical variables. At each site the variables showed different scale dependencies, expressed in their correlations at different scales. At the pasture site, UA and SOC were uncorrelated at all scales in the sampling design (0.2 m, 1 m, 6 m and ≥15 m), and the overall product moment correlation was 0.10. A significant positive scale‐dependent correlation (0.65) was found at the 1‐m scale for the forested site. The soil properties were not spatially correlated at any of the other scales and the associated product moment correlation for this site was 0.14. Urease activity and soil organic C were found not to be correlated at the shorter scales in the arable site. However, significant positive correlation coefficients of 0.89 and 0.82 were obtained at the longer scales of 6 m and ≥15 m, respectively, for the arable site. The product moment correlation at this site was 0.65. At both the arable and forest sites, we found that correlations at particular scales were stronger than the overall product moment correlation. This approach allowed us to identify significant relationships between urease activity and soil organic carbon and the scales at which these relationships occur, and to draw conclusions about the spatial scales that must be resolved in further studies of these variables in these contrasting environments. This study highlights the pervasive effect of scale in soil biogeochemistry and shows that scale‐dependence must not be disregarded by soil scientists in their investigations of biogeochemical processes.

14.
Persistently high nitrogen (N) deposition may have caused widespread N saturation in Central Europe’s forests. Simple and inexpensive methods are required for estimating the N status. This study suggests that the current N status of forest ecosystems can be estimated by measuring CaCl2-extractable nitrate concentrations in the soil below the main rooting zone. We tested this possibility using a large number of samples (135 in total) in a nested sampling design in two homogeneous Norway spruce forests in southern Bavaria. This approach was accompanied by a small-scale survey with suction cups (N = 54) in one forest. Nitrate concentrations determined in soil extracts varied widely (coefficients of variation 95 and 125%) and were comparable with those from the simultaneous investigation of seepage water. Site and stand conditions explained only a small portion (<10%) of the total variation. Mineral soil nitrate concentrations were not spatially dependent at the medium and large scales (about 10 m to several km) in both forests. Therefore the reliability of estimates at these scales depends mainly on the sample size. At the small scale (less than about 10 m) large variation in nitrate concentrations and a considerable spatial dependency could be observed. Therefore intensive sampling is necessary at short distances in order to estimate the mean adequately. From our results, we deduce the possibilities and limitations of nitrate inventories as a tool for regional assessment of the N status of forests.

15.
A gene-by-gene mixed model analysis is a useful statistical method for assessing significance for microarray gene differential expression. While a large amount of data on thousands of genes are collected in a microarray experiment, the sample size for each gene is usually small, which could limit the statistical power of this analysis. In this report, we introduce an empirical Bayes (EB) approach for general variance component models applied to microarray data. Within a linear mixed model framework, the restricted maximum likelihood (REML) estimates of variance components of each gene are adjusted by integrating information on variance components estimated from all genes. The approach starts with a series of single-gene analyses. The estimated variance components from each gene are transformed to the “ANOVA components”. This transformation makes it possible to independently estimate the marginal distribution of each “ANOVA component.” The modes of the posterior distributions are estimated and inversely transformed to compute the posterior estimates of the variance components. The EB statistic is constructed by replacing the REML variance estimates with the EB variance estimates in the usual t statistic. The EB approach is illustrated with a real data example which compares the effects of five different genotypes of male flies on post-mating gene expression in female flies. In a simulation study, the ROC curves are applied to compare the EB statistic and two other statistics. The EB statistic was found to be the most powerful of the three. Though the null distribution of the EB statistic is unknown, a t distribution may be used to provide conservative control of the false positive rate.  相似文献   

16.
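Studying how different models differ in their performance for spatial prediction of soil organic matter provides important guidance for designing more scientific and rational sampling strategies, improving sampling efficiency and raising the accuracy of spatial soil prediction. In this study, 6496 soil sampling points were split by stratified random sampling into training and validation sets in an 8:2 ratio. Three representative digital soil mapping (DSM) models, namely ordinary kriging, random forest and random forest regression kriging, were used to predict the content and spatial distribution of organic matter in the topsoil of cultivated land in Xuchang City, Henan Province, and the performance of the three models was evaluated comprehensively. The predictions from the three models show that the topsoil organic matter content of cultivated land in the study area is moderate, with means of 18.70 to 18.81 g kg⁻¹ and coefficients of variation of 0.15 to 0.17, which indicates moderate variability. In its overall spatial pattern, topsoil organic matter content is high in parts of the mountainous cinnamon soil areas in the northwest and southwest and in the Shajiang black soil area in the southeast, and low in the 脱潮土 and calcareous fluvo-aquic soil areas in the central north. Validation showed no clear difference in performance among the three models: prediction accuracy was essentially the same, and the models explained 33% to 34% of the variation in topsoil organic matter of cultivated land in the study area, a moderate level among case studies of spatial prediction of soil organic matter at the same or similar scales. Where covariates are limited and the sampling points are fairly evenly distributed, ordinary kriging is a convenient way to obtain the spatial distribution of the target variable quickly. Where covariates are abundant and easy to collect and use, or where the relative influence of different factors on the target variable must be identified alongside spatial prediction, the random forest model is recommended. Where covariates are limited but the sampling density is high, random forest regression kriging may be a good choice for spatial prediction of the target variable.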

17.
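A remote sensing sampling survey method for regional soil erosion   Cited by: 6 (self-citations: 1; citations by others: 5)
Soil erosion is a global environmental problem, and soil erosion surveys provide the scientific basis for soil and water conservation planning and for building an ecological civilization. To improve sampling survey methods for soil erosion and to estimate actual erosion rates quickly and accurately, we studied a method for extracting high-accuracy information on land use and soil and water conservation measures by visual interpretation of high-resolution remote sensing imagery. Drawing on modern geographic information science, making full use of virtual globes and the publicly available high-resolution remote sensing data they provide, and taking account of the spatial and temporal characteristics of soil erosion and its control, we laid out sampling units by stratified, unequal-probability systematic spatial sampling and, through visual interpretation of publicly available high-resolution imagery, completed a remote sensing sampling survey of land use and soil and water conservation measures in the Pan-Third Pole region. Imagery for 20,000 sampling units was interpreted and information on land use and soil and water conservation measures was extracted; soil loss rates for typical sampling units were calculated with the CSLE model, and the accuracy and practicality of the interpretation results were analysed. The results show that, with publicly available high-resolution remote sensing imagery and stratified unequal-probability systematic spatial sampling, information on land use and soil and water conservation measures can be extracted quickly and a regional soil erosion sampling survey completed.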

18.
R. Kerry  M.A. Oliver 《Geoderma》2007,140(4):383-396
It has been generally accepted that the method of moments (MoM) variogram, which has been widely applied in soil science, requires about 100 sites at an appropriate interval apart to describe the variation adequately. This sample size is often larger than can be afforded for soil surveys of agricultural fields or contaminated sites. Furthermore, it might be a much larger sample size than is needed where the scale of variation is large. A possible alternative in such situations is the residual maximum likelihood (REML) variogram because fewer data appear to be required. The REML method is parametric and is considered reliable where there is trend in the data because it is based on generalized increments that filter trend out and only the covariance parameters are estimated. Previous research has suggested that fewer data are needed to compute a reliable variogram using a maximum likelihood approach such as REML; however, the results can vary according to the nature of the spatial variation. There remain issues to examine: how many fewer data can be used, how should the sampling sites be distributed over the site of interest, and how do different degrees of spatial variation affect the data requirements? The soil of four field sites of different size, physiography, parent material and soil type was sampled intensively, and MoM and REML variograms were calculated for clay content. The data were then sub-sampled to give different sample sizes and distributions of sites and the variograms were computed again. The model parameters for the sets of variograms for each site were used for cross-validation. Predictions based on REML variograms were generally more accurate than those from MoM variograms with fewer than 100 sampling sites. A sample size of around 50 sites at an appropriate distance apart, possibly determined from variograms of ancillary data, appears adequate to compute REML variograms for kriging soil properties for precision agriculture and contaminated sites.
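
For reference, the method-of-moments estimator discussed in this abstract averages squared differences between pairs of observations separated by a given lag and halves the result. The sketch below assumes the simplest setting, equally spaced observations on a transect, with lags counted in sampling intervals. The function name and data are hypothetical, and the REML alternative is not shown.

```python
def mom_variogram(values, max_lag):
    """Method-of-moments semivariances for a regularly spaced transect.

    values  : observations in transect order, equally spaced
    max_lag : largest lag to evaluate, in units of the sampling interval
    Returns a dict mapping lag -> estimated semivariance.
    """
    gamma = {}
    for lag in range(1, max_lag + 1):
        sq_diffs = [
            (values[i] - values[i + lag]) ** 2
            for i in range(len(values) - lag)
        ]
        if sq_diffs:  # skip lags with no pairs
            gamma[lag] = sum(sq_diffs) / (2 * len(sq_diffs))
    return gamma


if __name__ == "__main__":
    clay = [1.0, 2.0, 3.0, 4.0]  # made-up values, not data from the study
    print(mom_variogram(clay, max_lag=2))
```

The number of pairs per lag shrinks as the lag grows, which is one reason the estimator needs many sites to describe the variation adequately.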

19.
Estimates of mean values of soil properties within small rectangular blocks of land can be obtained by kriging provided the semi-variogram is known. This paper describes optimal rectangular grid sampling configurations whereby estimation variances can be minimized. For linear semi-variograms square blocks are best estimated by sampling at the nodes of a centrally placed grid with its interval equal to the block side divided by the square root of the sample size. For spherical semi-variograms the same configuration is almost optimal. The estimation variance of a bulked sample can be identical with that of a kriged estimate where the semi-variogram is linear and equal portions of soil are taken from each node on the optimally configured grid and provided the soil property is additive. For spherical semi-variograms the above is approximately true. Comparisons with estimates that take no account of known spatial dependence show that the true variances can be much less than those apparent using classical theory, and the necessary sampling effort much less. Within block-variances are often needed for planning, and an appendix gives two-dimensional auxiliary functions from which they can be calculated for linear and spherical semi-variograms.  相似文献   
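
The sampling rule stated in this abstract for linear semi-variograms (grid interval equal to the block side divided by the square root of the sample size) can be sketched as follows. The sketch assumes that "centrally placed" means each node sits at the centre of its own square cell, half an interval in from the block edges; that reading, the function name and the coordinate convention are assumptions rather than details taken from the paper.

```python
from math import isqrt


def centred_grid(block_side, n_samples):
    """Grid interval and node coordinates for a square block.

    block_side : side length of the square block
    n_samples  : number of sampling points; must be a perfect square
    Returns (interval, list of (x, y) nodes), with the origin at a block corner.
    """
    k = isqrt(n_samples)
    if k == 0 or k * k != n_samples:
        raise ValueError("n_samples must be a positive perfect square")

    interval = block_side / k  # block side / sqrt(sample size)
    offset = interval / 2      # assumed: half an interval in from each edge
    coords = [offset + i * interval for i in range(k)]
    return interval, [(x, y) for y in coords for x in coords]


if __name__ == "__main__":
    interval, nodes = centred_grid(block_side=10.0, n_samples=4)
    print(interval, nodes)
```

For a 10 m block and four samples this gives a 5 m interval, with the four nodes at the centres of the block's quadrants.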

20.
The plausibility of the assumption that soil variation can be treated as a realization of a random spatial process that is stationary in the variance can break down in various ways. It is possible to test the assumption using methods based on the wavelet transform. To date these approaches have been applied using the discrete wavelet transform. A drawback of this approach is that it uses a partition of the spatial frequencies represented in the data into intervals (scales) that are arbitrarily defined in advance and are not necessarily suitable for the representation of the variation of the data in question. A solution to this problem is to identify the best basis for the data from a wavelet packet library. An interesting question is whether the structure of this best basis is in itself informative about the plausibility of the stationarity assumption. In this paper, I show that this is indeed the case. The best basis for a stationary random variable from some packet library is the basis on the maximum dilation of the mother wavelet, which gives the finest resolution in the frequency domain. I propose the ratio of the entropy cost functional for this basis to that of the empirical best basis as a measure of evidence against the null hypothesis of stationarity in the variance. Critical values of this statistic may be obtained by Monte Carlo methods. I demonstrate the method using data on the clay content of soil on a transect in central England. The null hypothesis of stationarity in the variance may be rejected. Tests for the uniformity of variance can then be applied to wavelet packets in the best basis. The dominant local feature that is reflected in this behaviour is the unique pattern of variation in alluvium around a drainage channel that crosses the transect. This variation contrasts with that seen at most positions on the transect, variation that arises from a more or less regular pattern of boundaries between contrasting Jurassic strata.  相似文献   
