Similar literature
 Found 20 similar documents; search took 25 ms
1.
Monte Carlo (MC) methods have been found useful for estimating variance parameters in large data sets and complex models with many variance components (VC), with respect to both computer memory and computing time. A disadvantage has been the fluctuation in round‐to‐round values of the estimates, which makes convergence difficult to assess. Furthermore, with Newton‐type algorithms, the approximate Hessian matrix might have sufficient accuracy, but the inaccuracy in the gradient vector amplifies the round‐to‐round fluctuation to an intolerable level. In this study, the same random numbers were reused within each MC sample to remove the MC fluctuation. Simulated data with six VC parameters were analysed by four different MC REML methods: expectation‐maximization (EM), Newton–Raphson (NR), average information (AI) and Broyden's method (BM). In addition, field data with 96 VC parameters were analysed by MC EM REML. In all the analyses with reused samples, the MC fluctuations disappeared, but the final estimates by the MC REML methods differed from the analytically calculated values more than expected, especially when the number of MC samples was small. The difference depended on the random numbers generated, and based on repeated MC AI REML analyses, the VC estimates were on average unbiased. The advantage of reusing MC samples is more apparent in the NR‐type algorithms. Smooth convergence opens the possibility of using the fast‐converging Newton‐type algorithms. However, a disadvantage of reusing MC samples is a possible “bias” in the estimates. To attain acceptable accuracy, a sufficient number of MC samples needs to be generated.
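The effect of reusing random numbers can be illustrated with a toy Monte Carlo estimator. The sketch below is not the MC REML algorithm from the abstract: the function `mc_estimate`, the integrand and the seeds are hypothetical, and the example only shows why a fixed random stream makes repeated evaluations at the same parameter value identical.

```python
import random


def mc_estimate(theta, n_samples, seed):
    """Toy MC estimate of E[(theta + z)^2] with z ~ N(0, 1).

    Passing the same seed reuses the same random numbers on every call.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)
        total += (theta + z) ** 2
    return total / n_samples


# Fresh random numbers each round: the estimate at a fixed theta fluctuates.
fresh = [mc_estimate(2.0, 100, seed=r) for r in range(5)]

# Reused random numbers: every round returns exactly the same value, so any
# change between rounds would come from a change in theta alone.
reused = [mc_estimate(2.0, 100, seed=12345) for _ in range(5)]

print("fresh :", fresh)
print("reused:", reused)
```

The exact expectation here is theta^2 + 1 = 5. With only 100 samples the reused estimate is perfectly stable but can sit some distance from 5, which loosely mirrors the seed-dependent "bias" the abstract describes for small numbers of MC samples.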

2.
A simulation study was conducted to assess the influence of differences in the length of individual testing periods on estimates of (co)variance components of a random regression model for daily feed intake of growing pigs performance tested between 30 and 100 kg live weight. A quadratic polynomial in days on test with fixed regressions for sex, random regressions for additive genetic and permanent environmental effects and a constant residual variance was used for a bivariate simulation of feed intake and daily gain. (Co)variance components were estimated for feed intake only by means of a Bayesian analysis using Gibbs sampling and restricted maximum likelihood (REML). A single trait random regression model analogous to the one used for data simulation was used to analyse two versions of the data: full data sets with 18 weekly means of feed intake per animal and reduced data sets with the individual length of testing periods determined when tested animals reached 100 kg live weight. Only one significant difference between estimates from full and reduced data (REML estimate of genetic covariance between linear and quadratic regression parameters) and two significant differences from expected values (Gibbs estimates of permanent environmental variance of quadratic regression parameters) occurred. These differences are believed to be negligible, as the number lies within the expected range of type I error when testing at the 5% level. The course of test day variances calculated from estimates of additive genetic and permanent environmental covariance matrices also supports the conclusion that no bias in estimates of (co)variance components occurs due to the individual length of testing periods of performance‐tested growing pigs. A lower number of records per tested animal only results in more variation among estimates of (co)variance components from reduced compared with full data sets. 
Compared with the full data, the effective sample size of Gibbs samples from the reduced data decreased to 18% for residual variance and increased up to five times for other (co)variances. The data structure seems to influence the mixing of Gibbs chains.  相似文献   

3.
The genetic evaluation using the carcass field data in Japanese Black cattle has been carried out employing an animal model, implementing the restricted maximum likelihood (REML) estimation of additive genetic and residual variances. Because of rapidly increasing volumes of the official data sets and therefore larger memory spaces required, an alternative approach like the REML estimation could be useful. The purpose of this study was to investigate Gibbs sampling conditions for the single-trait variance component estimation using the carcass field data. As prior distributions, uniform and normal distributions and independent scaled inverted chi-square distributions were used for macro-environmental effects, breeding values, and the variance components, respectively. Using the data sets of different sizes, the influences of Gibbs chain length and thinning interval were investigated, after the burn-in period was determined using the coupling method. As would be expected, the chain lengths had obviously larger effects on the posterior means than those of thinning intervals. The posterior means calculated using every 10th sample from 90 000 of samples after 10 000 samples discarded as burn-in period were all considered to be reasonably comparable to the corresponding estimates by REML.  相似文献   
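The sampling scheme described above (10 000 burn-in samples discarded, then every 10th of the following 90 000 samples kept) can be sketched as plain list slicing. The chain below is filled with independent normal draws as a stand-in for real Gibbs output, and `posterior_mean` is a hypothetical helper, not part of any software named in the abstract.

```python
import random
import statistics


def posterior_mean(chain, burn_in, thin):
    """Discard the first `burn_in` draws, keep every `thin`-th draw after that."""
    kept = chain[burn_in::thin]
    return statistics.fmean(kept), len(kept)


# Stand-in for a Gibbs chain of one variance component (assumed values).
rng = random.Random(0)
chain = [rng.gauss(50.0, 5.0) for _ in range(100_000)]

mean, n_kept = posterior_mean(chain, burn_in=10_000, thin=10)
print(n_kept, round(mean, 2))
```

Under this reading, 90 000 post-burn-in draws thinned at an interval of 10 leave 9 000 samples for the posterior mean.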

4.
The purpose of this study is to present guidelines in selection of statistical and computing algorithms for variance components estimation when computing involves software packages. For this purpose two major methods are to be considered: residual maximal likelihood (REML) and Bayesian via Gibbs sampling. Expectation‐Maximization (EM) REML is regarded as a very stable algorithm that is able to converge when covariance matrices are close to singular, however it is slow. However, convergence problems can occur with random regression models, especially if the starting values are much lower than those at convergence. Average Information (AI) REML is much faster for common problems but it relies on heuristics for convergence, and it may be very slow or even diverge for complex models. REML algorithms for general models become unstable with larger number of traits. REML by canonical transformation is stable in such cases but can support only a limited class of models. In general, REML algorithms are difficult to program. Bayesian methods via Gibbs sampling are much easier to program than REML, especially for complex models, and they can support much larger datasets; however, the termination criterion can be hard to determine, and the quality of estimates depends on a number of details. Computing speed varies with computing optimizations, with which some large data sets and complex models can be supported in a reasonable time; however, optimizations increase complexity of programming and restrict the types of models applicable. Several examples from past research are discussed to illustrate the fact that different problems required different methods.  相似文献   

5.
This data set consisted of over 29 245 field records from 24 herds of registered Nelore cattle born between 1980 and 1993, with calves sired by 657 sires and born to 12 151 dams. The records were collected in south‐eastern and midwestern Brazil and animals were raised on pasture in a tropical climate. Three growth traits were included in these analyses: 205‐ (W205), 365‐ (W365) and 550‐day (W550) weight. The linear model included fixed effects for contemporary groups (herd‐year‐season‐sex) and age of dam at calving. The model also included random effects for direct genetic, maternal genetic and maternal permanent environmental (MPE) contributions to observations. The analyses were conducted using single‐trait and multiple‐trait animal models. Variance and covariance components were estimated by restricted maximum likelihood (REML) using a derivative‐free algorithm (DFREML) for multiple traits (MTDFREML). Bayesian inference was obtained by a multiple‐trait Gibbs sampling algorithm (GS) for (co)variance component inference in animal models (MTGSAM). Three different sets of prior distributions for the (co)variance components were used: flat, symmetric, and sharp. The shape parameters (ν) were 0, 5 and 9, respectively. The results suggested that the shape of the prior distributions did not affect the estimates of (co)variance components. From the REML analyses, for all traits, direct heritabilities obtained from single‐trait analyses were smaller than those obtained from bivariate analyses and by the GS method. Estimates of genetic correlations between direct and maternal effects obtained using REML were positive but very low, indicating that genetic selection programs should consider both components jointly. GS produced similar but slightly higher estimates of genetic parameters than REML; however, the greater robustness of GS makes it the method of choice for many applications.

6.
This work focuses on the effects of a variable amount of genomic information on the Bayesian estimation of unknown variance components associated with single‐step genomic prediction. We propose a quantitative criterion for the amount of genomic information included in the model and use it to study the relative effect of genomic data on the efficiency of sampling from the posterior distribution of the parameters of the single‐step model in a Bayesian analysis in which unknown variances are estimated. The rate of change of estimated variances depended on the amount of genomic information involved in the analysis, but did not depend on the Gibbs updating schemes applied for sampling realizations of the posterior distribution. Simulation revealed a gradual deterioration of convergence rates for the location parameters when new genomic data were gradually added to the analysis. In contrast, the convergence of variance components showed continuous improvement under the same conditions. The sampling efficiency increased proportionally to the amount of genomic information. In addition, an optimal amount of genomic information in the variance–covariance matrix that guarantees the most (computationally) efficient analysis was found to correspond to a proportion of animals genotyped ***0.8. The proposed criterion yields a characterization of the expected performance of the Gibbs sampler when the amount of genomic data in the analysis is adjusted, and can be used to guide researchers on how large a proportion of animals should be genotyped in order to attain an efficient analysis.

7.
Volumes of official data sets have been increasing rapidly in the genetic evaluation using the Japanese Black routine carcass field data. Therefore, an alternative approach with smaller memory requirement to the current one using the restricted maximum likelihood (REML) and the empirical best linear unbiased prediction (EBLUP) is desired. This study applied a Bayesian analysis using Gibbs sampling (GS) to a large data set of the routine carcass field data and practically verified its validity in the estimation of breeding values. A Bayesian analysis like REML‐EBLUP was implemented, and the posterior means were calculated using every 10th sample from 90 000 of samples after 10 000 samples discarded. Moment and rank correlations between breeding values estimated by GS and REML‐EBLUP were very close to one, and the linear regression coefficients and the intercepts of the GS on the REML‐EBLUP estimates were substantially one and zero, respectively, showing a very good agreement between breeding value estimation by the current GS and the REML‐EBLUP. The current GS required only one‐sixth of the memory space with REML‐EBLUP. It is confirmed that the current GS approach with relatively small memory requirement is valid as a genetic evaluation procedure using large routine carcass data.  相似文献   

8.
(Co)variance component estimates were computed for retail cuts per day of age (kilograms per day), cutability (percentage of carcass weight), and marbling score (1 through 11) using a multiple-trait sire model. Restricted maximum likelihood estimates of (co)variance components were obtained via an expectation-maximization algorithm. Carcass data consisted of 8,265 progeny records collected by U.S. Simmental producers. Growth trait information (birth weight, weaning weight, and[or] postweaning gain) for those progeny with carcass data and an additional 5,405 contemporaries formed the complete data set for analysis. A total of 420 sires were represented. Three models differing in number of traits were investigated: 1) carcass traits with growth traits, 2) carcass traits only, and 3) single trait. The final models did not include postweaning gain because of convergence problems. Parameter estimates for all three models were essentially the same. Heritability estimates were .30, .18, and .23 for retail cuts per day, cutability, and marbling score, respectively. Correlations between growth and carcass traits were low except for those with retail cuts per day, which were moderate and positive. The additional information gained by adding growth traits to the carcass-traits-only evaluation lowered prediction error variances most for retail cuts per day. Little change in prediction error variances was found for cutability and marbling score. Inclusion of growth traits in future sire evaluations for carcass traits will benefit the evaluation of retail cuts per day but have considerably less effect on cutability and marbling score.  相似文献   

9.
The amount of variance captured in genetic estimations may depend on whether a pedigree‐based or genomic relationship matrix is used. The purpose of this study was to investigate the genetic variance as well as the variance of predicted genetic merits (PGM) using pedigree‐based or genomic relationship matrices in Brown Swiss cattle. We examined a range of traits in six populations amounting to 173 population‐trait combinations. A main aim was to determine how using different relationship matrices affect variance estimation. We calculated ratios between different types of estimates and analysed the impact of trait heritability and population size. The genetic variances estimated by REML using a genomic relationship matrix were always smaller than the variances that were similarly estimated using a pedigree‐based relationship matrix. The variances from the genomic relationship matrix became closer to estimates from a pedigree relationship matrix as heritability and population size increased. In contrast, variances of predicted genetic merits obtained using a genomic relationship matrix were mostly larger than variances of genetic merit predicted using pedigree‐based relationship matrix. The ratio of the genomic to pedigree‐based PGM variances decreased as heritability and population size rose. The increased variance among predicted genetic merits is important for animal breeding because this is one of the factors influencing genetic progress.  相似文献   

10.
In variance component quantitative trait loci (QTL) analysis, a mixed model is used to detect the most likely chromosome position of a QTL. The putative QTL is included as a random effect and a method is needed to estimate the QTL variance. The standard estimation method used is an iterative method based on the restricted maximum likelihood (REML). In this paper, we present a novel non-iterative variance component estimation method. This method is based on Henderson's method 3, but relaxes the condition of unbiasedness. Two similar estimators were compared, which were developed from two different partitions of the sum of squares in Henderson's method 3. The approach was compared with REML on data from a European wild boar × domestic pig intercross. A meat quality trait was studied on chromosome 6 where a functional gene was known to be located. Both partitions resulted in estimated QTL variances close to the REML estimates. From the non-iterative estimates, we could also compute good approximations of the likelihood ratio curve on the studied chromosome.  相似文献   

11.
The multiple-trait derivative-free REML set of programs was written to handle partially missing data for multiple-trait analyses as well as single-trait models. Standard errors of genetic parameters were reported for univariate models and for multiple-trait analyses only when all traits were measured on animals with records. In addition to estimating (co)variance components for multiple-trait models with partially missing data, this paper shows how the multiple-trait derivative-free REML set of programs can also estimate SE by augmenting the data file when not all animals have all traits measured. Although the standard practice has been to eliminate records with partially missing data, that practice uses only a subset of the available data. In some situations, the elimination of partial records can result in elimination of all the records, such as one trait measured in one environment and a second trait measured in a different environment. An alternative approach requiring minor modifications of the original data and model was developed that provides estimates of the SE using an augmented data set that gives the same residual log likelihood as the original data for multiple-trait analyses when not all traits are measured. Because the same residual vector is used for the original data and the augmented data, the resulting REML estimators along with their sampling properties are identical for the original and augmented data, so that SE for estimates of genetic parameters can be calculated.  相似文献   

12.
Most genomic prediction studies fit only additive effects in models to estimate genomic breeding values (GEBV). However, if dominance genetic effects are an important source of variation for complex traits, accounting for them may improve the accuracy of GEBV. We investigated the effect of fitting dominance and additive effects on the accuracy of GEBV for eight egg production and quality traits in a purebred line of brown layers using pedigree or genomic information (42K single‐nucleotide polymorphism (SNP) panel). Phenotypes were corrected for the effect of hatch date. Additive and dominance genetic variances were estimated using genomic‐based [genomic best linear unbiased prediction (GBLUP)‐REML and BayesC] and pedigree‐based (PBLUP‐REML) methods. Breeding values were predicted using a model that included both additive and dominance effects and a model that included only additive effects. The reference population consisted of approximately 1800 animals hatched between 2004 and 2009, while approximately 300 young animals hatched in 2010 were used for validation. Accuracy of prediction was computed as the correlation between phenotypes and estimated breeding values of the validation animals divided by the square root of the estimate of heritability in the whole population. The proportion of dominance variance to total phenotypic variance ranged from 0.03 to 0.22 with PBLUP‐REML across traits, from 0 to 0.03 with GBLUP‐REML and from 0.01 to 0.05 with BayesC. Accuracies of GEBV ranged from 0.28 to 0.60 across traits. Inclusion of dominance effects did not improve the accuracy of GEBV, and differences in their accuracies between genomic‐based methods were small (0.01–0.05), with GBLUP‐REML yielding higher prediction accuracies than BayesC for egg production, egg colour and yolk weight, while BayesC yielded higher accuracies than GBLUP‐REML for the other traits. 
In conclusion, fitting dominance effects did not impact accuracy of genomic prediction of breeding values in this population.  相似文献   
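The accuracy measure defined above (correlation between phenotypes and estimated breeding values of the validation animals, divided by the square root of heritability) can be written out in a few lines. The data and the heritability value below are invented for illustration, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def prediction_accuracy(phenotypes, ebvs, heritability):
    """corr(phenotype, EBV) in the validation set, scaled by sqrt(h^2)."""
    return pearson(phenotypes, ebvs) / math.sqrt(heritability)


# Invented validation data: corrected phenotypes and breeding values.
phenotypes = [1.0, 2.0, 3.0, 4.0]
ebvs = [1.0, 4.0, 2.0, 3.0]

accuracy = prediction_accuracy(phenotypes, ebvs, heritability=0.64)
print(round(accuracy, 3))
```

Dividing by the square root of heritability rescales the correlation with noisy phenotypes toward a correlation with true breeding values, which is why a low-heritability trait can show a modest raw correlation yet a reasonable accuracy.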

13.
Summary: Restricted maximum likelihood (REML) was used to examine the choice of statistical model, additive maternal genetic and common litter effects, and the consequences of ignoring these effects for estimates of variance–covariance components under random and phenotypic selection in swine, using computer simulation. Two closed herds of different size and two traits, (i) pre‐weaning average daily gain and (ii) litter size at birth, were considered. Three levels of additive direct and maternal genetic correlations (rdm) were assumed for each trait. Four mixed models (denoted as GRM1 through GRM4) were used to generate data sets. Model GRM1 included only additive direct genetic effects, GRM2 included only additive direct genetic and common litter effects, GRM3 included only additive direct and maternal genetic effects and GRM4 included all the random effects. Four mixed animal models (defined as EPM1 through EPM4), analogous to the GRM, were defined for estimating genetic parameters. Data from each GRM were fitted with EPM1 through EPM4. The most biased estimates of additive genetic variance were obtained when EPM1 was fitted to data generated assuming the presence of additive maternal genetic effects, common litter effects or a combination thereof. The bias of the estimated additive direct genetic variance (VAd) increased and that of the residual variance (VE) decreased with an increase in the level of rdm when GRM3 was used. EPM1, EPM2 and EPM3 resulted in biased estimation of the direct genetic variances. EPM4 was the most accurate in each GRM. Phenotypic selection substantially increased bias of estimated additive direct genetic effect and its mean square error in trait 1, but decreased those in trait 2 when ignored in the statistical model. For trait 2, estimates under phenotypic selection were more biased than those under random selection. It was concluded that statistical models for estimating variance components should include all random effects considered to avoid bias.

14.
The present study used published quantitative trait loci (QTL) mapping data from three F2 crosses in pigs for 34 meat quality and carcass traits to derive the distribution of additive QTL effects as well as dominance coefficients. Dominance coefficients were calculated as the observed QTL dominance deviation divided by the absolute value of the observed QTL additive effect. The error variance of this ratio was approximated using the delta method. Mixtures of normal distributions (mixtures of normals) were fitted to the dominance coefficient using a modified EM‐algorithm that considered the heterogeneous error variances of the data points. The results suggested clearly to fit one component which means that the dominance coefficients are normally distributed with an estimated mean (standard deviation) of 0.193 (0.312). For the additive effects mixtures of normals and a truncated exponential distribution were fitted. Two components were fitted by the mixtures of normals. The mixtures of normals did not predict enough QTL with small effects compared to the exponential distribution and to literature reports. The estimated rate parameter of the exponential distribution was 5.81 resulting in a mean effect of 0.172.  相似文献   
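Two of the quantities in this abstract are simple enough to work through numerically: the dominance coefficient d/|a| with its delta-method error variance, and the mean implied by the exponential rate parameter. The sketch below assumes the estimates of d and a are uncorrelated (the abstract does not say how their covariance was handled), and the input values for d, a and their variances are invented.

```python
def dominance_coefficient(d, a):
    """Dominance deviation divided by the absolute additive effect."""
    return d / abs(a)


def dominance_coefficient_var(d, a, var_d, var_a):
    """First-order delta-method variance of d/|a|, assuming Cov(d, a) = 0."""
    return var_d / a**2 + (d**2) * var_a / a**4


# Invented QTL estimates and error variances, for illustration only.
k = dominance_coefficient(d=0.1, a=-0.5)
var_k = dominance_coefficient_var(d=0.1, a=-0.5, var_d=0.01, var_a=0.04)
print(round(k, 4), round(var_k, 4))

# Mean of an (untruncated) exponential distribution is 1 / rate.
rate = 5.81
mean_effect = 1.0 / rate
print(round(mean_effect, 3))
```

The reciprocal of the reported rate, 1/5.81, rounds to 0.172, which agrees with the mean effect stated in the abstract. Because the variance of each coefficient differs, points with a small |a| get a large error variance, which is presumably why the fitting procedure had to account for heterogeneous error variances.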

15.
Markov Chain Monte Carlo methods made possible estimation of parameters for complex random regression test‐day models. Models evolved from single‐trait with one set of random regressions to multiple‐trait applications with several random effects described by regressions. Gibbs sampling has been used for models with linear (with respect to coefficients) regressions and normality assumptions for random effects. Difficulties associated with implementations of Markov Chain Monte Carlo schemes include lack of good practical methods to assess convergence, slow mixing caused by high posterior correlations of parameters and long running time to generate enough posterior samples. Those problems are illustrated through comparison of Gibbs sampling schemes for single‐trait random regression test‐day models with different model parameterizations, different functions used for regressions and posterior chains of different sizes. Orthogonal polynomials showed better convergence and mixing properties in comparison with ‘lactation curve’ functions of the same number of parameters. Increasing the order of polynomials resulted in smaller number of independent samples for covariance components. Gibbs sampling under hierarchical model parameterization had a lower level of autocorrelation and required less time for computation. Posterior means and standard deviations of genetic parameters were very similar for chains of different size (from 20 000 to 1 000 000) after convergence. Single‐trait random regression models with large data sets can be analysed by Markov Chain Monte Carlo methods in relatively short time. Multiple‐trait (lactation) models are computationally more demanding and better algorithms are required.  相似文献   

16.
Multivariate estimation of genetic parameters involving more than a handful of traits can be afflicted by problems arising through substantial sampling variation. We present a review of underlying causes and proposals to improve estimates, focusing on linear mixed model‐based estimation via restricted maximum likelihood (REML). Both full multivariate analyses and pooling of results from overlapping subsets of traits are considered. It is suggested to impose a penalty on the likelihood designed to reduce sampling variances at the expense of a little additional bias. Simulation results are discussed which demonstrate that this can yield REML estimates that are on average closer to the population values than their unpenalized counterparts. Suitable penalties can be obtained based on assumed prior distributions of selected parameters. Necessary choices of penalty functions and of the stringency of penalization are examined. We argue that scale‐free penalty functions lend themselves to a simple scheme imposing a mild, default penalty which can yield “better” estimates without being likely to incur detrimental effects.  相似文献   

17.
Simulated horse data were used to compare multivariate estimation of genetic parameters and prediction of breeding values (BV) for categorical, continuous and molecular genetic data using linear animal models via residual maximum likelihood (REML) and best linear unbiased prediction (BLUP) and mixed linear-threshold animal models via Gibbs sampling (GS). Simulation included additive genetic values, residuals and fixed effects for one continuous trait, liabilities of four binary traits, and quantitative trait locus (QTL) effects and genetic markers with different recombination rates and polymorphism information content for one of the liabilities. Analysed data sets differed in the number of animals with trait records and availability of genetic marker information. Consideration of genetic marker information in the model resulted in marked overestimation of the heritability of the QTL trait. If information on 10,000 or 5,000 animals was used, bias of heritabilities and additive genetic correlations was mostly smaller, correlation between true and predicted BV was always higher and identification of genetically superior and inferior animals was - with regard to the moderately heritable traits, in many cases - more reliable with GS than with REML/BLUP. If information on only 1,000 animals was used, neither GS nor REML/BLUP produced genetic parameter estimates with relative bias 50% for all traits. Selection decisions for binary traits should rather be based on GS than on REML/BLUP breeding values.  相似文献   

18.
Components of (co)variance for weaning weight were estimated from field data provided by the American Simmental Association. These components were obtained for the observational components of variance corresponding to a sire, maternal grandsire, and dam within maternal grandsire model. From these estimates, direct additive genetic variance ($\sigma^2_A$), maternal additive genetic variance ($\sigma^2_M$), covariance between direct and maternal additive genetic effects ($\sigma_{AM}$), permanent environment variance ($\sigma^2_{pe}$) and temporary environment variance ($\sigma^2_{te}$) were determined. A procedure to approximate restricted maximum likelihood (REML) estimates of the observational components of variance based on the expectation-maximization (EM) algorithm is described. From these results, phenotypic variance ($\sigma^2_P$) of weaning weight was 667.88 kg$^2$. Values for $\sigma^2_A$, $\sigma^2_M$, $\sigma^2_{pe}$ and $\sigma^2_{te}$ were 79.30, 58.38, 49.45, and 469.97 kg$^2$, respectively. Genetic correlation between direct and maternal additive genetic effects was .16.
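The reported components can be checked for internal consistency with a little arithmetic. The sketch below reads the four variance values as 79.30, 58.38, 49.45 and 469.97 kg² (an assumption about the run of digits in the listing) and assumes the usual maternal-effects decomposition, in which phenotypic variance is the sum of the direct, maternal, permanent and temporary environment variances plus the direct-maternal covariance. Neither assumption is stated in the abstract itself.

```python
import math

# Values as read from the abstract (kg^2); the decimal placement is assumed.
sigma2_a = 79.30    # direct additive genetic variance
sigma2_m = 58.38    # maternal additive genetic variance
sigma2_pe = 49.45   # permanent environment variance
sigma2_te = 469.97  # temporary environment variance
sigma2_p = 667.88   # phenotypic variance

# Assumed decomposition: P = A + M + cov(A, M) + PE + TE.
implied_cov_am = sigma2_p - (sigma2_a + sigma2_m + sigma2_pe + sigma2_te)

# Correlation implied by that covariance.
implied_r_am = implied_cov_am / math.sqrt(sigma2_a * sigma2_m)

print(round(implied_cov_am, 2), round(implied_r_am, 2))
```

Under these assumptions the implied covariance is about 10.78 kg² and the implied correlation rounds to 0.16, matching the genetic correlation of .16 given in the abstract, which supports this reading of the numbers.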

19.
The objective was to compare the performance of a recently derived, new method of estimating variances and covariances with any mixed linear model and any pattern of missing data with that of restricted maximum likelihood. For each of 96 combinations of six three-herd x four-sire unbalanced designs of 39 offspring each, four heritability values, two ratios of sire variance to interaction variance, and two distributions (multivariate normal and multivariate chi2, 3 df), 15,000 vectors (n = 39) were generated. Least squares Lehmann-Scheffé (LSLS) estimators of sire variance, interaction variance, and heritability were compared to those of REML with the performance measures of percentage of estimates (of the 15,000) that were positive, mean square error, variance, percentage of estimates within +/- 50% of the parameter, bias, maximum value, skewness, and kurtosis. The LSLS method vastly outperformed REML in almost all 96 combinations. Averaged over the 48 combinations with multivariate normal data, the average percentage that REML estimators of heritability performed relative to those of LSLS for the first five of the above listed eight performance measures was -100%. The number of times LSLS was better than REML was 235 out of 240. The analogous values for the 48 combinations with multivariate chi2, 3-df data were -90% and 230 out of 240. The REML maximum values were always larger than the LSLS values. The LSLS skewness and kurtosis values were about the same as those for REML, with the exception of LSLS heritability kurtosis values, which were notably less than those for REML. The explicit expectations of the LSLS estimators showed that the LSLS estimators were surprisingly unbiased given the paucity of data. Explicit coefficients for calculating mean square errors, variances, and biases squared of the LSLS estimators of the three variances were obtained for each design. 
The LSLS advantage was not quite so large with the multivariate chi2, 3-df data as with the multivariate normal data. Results with a symmetric multinomial distribution were the same as with the multivariate normal. The overall result was that the LSLS estimators produced substantially more non-zero estimates than REML estimators and these more abundant positive estimates were substantially grouped closer to their respective parameters. Results justify efforts to make the LSLS procedure computationally available.  相似文献   

20.
The widespread use of the set of multiple-trait derivative-free REML programs for prediction of breeding values and estimation of variance components has led to significant improvement in traits of economic importance. The initial version of this software package, however, was generally limited to pedigree-based relationships. With continued advances in genomic research and the increased availability of genotyping, relationships based on molecular markers are obtainable and desirable. The addition of a new program to the set of multiple-trait derivative-free REML programs is described that allows users the flexibility to calculate relationships using standard pedigree files or an arbitrary relationship matrix based on genetic marker information. The strategy behind this modification and its design is described. An application is illustrated in a QTL association study for canine hip dysplasia.  相似文献   
