Similar literature
 18 similar articles found
1.
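A study of genotype imputation strategies   Total citations: 1, self-citations: 1, citations by others: 0
Genomic data are increasingly used in livestock and poultry breeding. Genotype imputation is an important tool for processing genomic data, and the quality of imputation directly affects downstream analyses, so a sound imputation strategy is needed to obtain good results. Using simulated data, this study examined how reference population size, the genetic relationship (distance) between the target and reference populations, the number (proportion) of target sites, minor allele frequency, and the imputation algorithm affect imputation performance. The number of target sites was significantly and positively correlated with imputation performance (P<0.05) and was the main factor affecting imputation accuracy. Reference population size was the main factor affecting the imputation error rate of Beagle5.1, whereas the number of target sites was the main factor affecting that of Minimac4. Genetic distance between the target and reference populations affected Beagle5.1 more strongly than Minimac4. In general, sites with a higher minor allele frequency had a higher imputation error rate. With few reference individuals and many target sites, Minimac4 was faster than Beagle5.1, but this trend reversed as the number of reference individuals increased. For a given imputation quality, Beagle5.1 was less demanding with respect to the factors studied. Beagle5.1 performed better when the target population had few sites and the reference population was large, whereas Minimac4 was better suited to imputation with a small reference population and many target sites. This study formulated different strategies for different imputation purposes and provides a reference for genotype imputation standards.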

2.
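The aim was to examine the practicality of a low-density liquid-phase chip in production and to reduce breeding costs. The trial used 3 761 healthy Large White pigs of about 160 days of age and about 110 kg. One hundred pigs were drawn at random and, following the marker information of the 10K chip, markers were extracted from their 50K chip genotypes to generate 10K genotypes; these pigs formed the imputation (target) population. From the remaining animals, 800, 2 000, or 3 600 individuals were drawn at random as the reference population, and Beagle 4.1 was used to impute the 100 target pigs up to the 50K chip, with 10 replicates. Imputation accuracy was evaluated by genotype concordance and the genotype correlation coefficient. The mean linkage disequilibrium (r²) of the 10K and 50K chips was 0.227 and 0.258, respectively, a small difference. A minor allele frequency (MAF) of 0.05 was the inflection point for imputation accuracy: accuracy rose markedly after markers with MAF<0.05 were removed. Accuracy increased with reference population size: as the reference population grew from 800 to 3 600 pigs, accuracy rose from 0.90 to 0.95 and the standard deviation across the 10 replicates fell from 0.006 to 0.002. With smaller reference populations, per-chromosome imputation accuracy fluctuated considerably; as the reference population grew, accuracy differed little among chromosomes. The results indicate that imputing pig liquid-phase chip genotypes from 10K to 50K is feasible and can be used at scale in genomic selection to lower its cost.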
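
The two accuracy measures named in the abstract above, genotype concordance and the genotype correlation coefficient, can be illustrated with a minimal sketch. It assumes genotypes are coded as allele counts (0, 1, 2) and that the masked true genotypes are available for comparison. The function names and the toy data are hypothetical and are not taken from the study or from Beagle.

```python
from math import sqrt


def concordance(true_genos, imputed_genos):
    """Share of genotype calls (coded 0/1/2) that match exactly."""
    if len(true_genos) != len(imputed_genos) or not true_genos:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == i for t, i in zip(true_genos, imputed_genos))
    return matches / len(true_genos)


def genotype_correlation(true_genos, imputed_genos):
    """Pearson correlation between true and imputed allele counts."""
    n = len(true_genos)
    if n != len(imputed_genos) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_t = sum(true_genos) / n
    mean_i = sum(imputed_genos) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_genos, imputed_genos))
    var_t = sum((t - mean_t) ** 2 for t in true_genos)
    var_i = sum((i - mean_i) ** 2 for i in imputed_genos)
    if var_t == 0 or var_i == 0:
        # Correlation is undefined for a monomorphic marker.
        return float("nan")
    return cov / sqrt(var_t * var_i)


# Hypothetical calls for one marker across eight animals.
true_calls = [0, 1, 2, 1, 0, 2, 1, 1]
imputed_calls = [0, 1, 2, 2, 0, 2, 1, 0]

print(concordance(true_calls, imputed_calls))           # 0.75
print(round(genotype_correlation(true_calls, imputed_calls), 4))  # 0.8165
```

Concordance counts only exact matches, while the correlation also reflects how far off a wrong call is, so the two measures can rank markers differently, especially at low MAF.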

3.
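The aim was to propose a new genomic relationship matrix and to verify its performance in a simulated multi-breed combined population. Phenotypic and genotypic data for cattle were simulated with QMsim; the conventional G matrix was built with Gmatrix; the new G matrix was built in R and, on top of the conventional G matrix, takes into account loci that are not in Hardy-Weinberg equilibrium in the multi-breed combined population; genomic estimated breeding values (GEBV) were computed with DMU using a single-step model; and the GEBV prediction accuracy of the two G matrices was compared under different scenarios. Across heritabilities and numbers of QTL, the new G matrix without A22 weighting reached the GEBV prediction accuracy of the conventional G matrix with A22 weighting. When the pedigree was partly missing, the unweighted new G matrix gave higher GEBV prediction accuracy than the weighted conventional G matrix. This shows that, when part of the pedigree is missing, the new G matrix has some advantage for multi-breed GEBV prediction.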
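
For orientation, a conventional G matrix can be sketched as below. This assumes VanRaden's first method (centring allele counts by twice the allele frequency and scaling by 2Σp(1−p)), which is one common construction. The abstract does not state which formula Gmatrix applied, and the new matrix's handling of non-Hardy-Weinberg loci is not reproduced here. The function name and the toy genotypes are illustrative only.

```python
def vanraden_g(genotypes):
    """Genomic relationship matrix, assuming VanRaden's first method.

    genotypes: one list per individual, each holding allele counts (0/1/2)
    for the same markers in the same order.
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])
    if any(len(row) != n_snp for row in genotypes):
        raise ValueError("all individuals need the same number of markers")

    # Allele frequency of the counted allele at each marker,
    # estimated from the genotyped animals themselves.
    freqs = [sum(row[j] for row in genotypes) / (2 * n_ind)
             for j in range(n_snp)]

    # Scaling factor 2 * sum(p * (1 - p)).
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all markers are monomorphic")

    # Centre each allele count by its expectation 2p.
    z = [[row[j] - 2 * freqs[j] for j in range(n_snp)] for row in genotypes]

    # G = Z Z' / scale
    return [[sum(a * b for a, b in zip(z[i], z[k])) / scale
             for k in range(n_ind)]
            for i in range(n_ind)]


# Hypothetical genotypes: three animals, three markers.
toy_genotypes = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 1, 1],
]
g_matrix = vanraden_g(toy_genotypes)
for row in g_matrix:
    print([round(value, 3) for value in row])
```

In a multi-breed population, allele frequencies differ among breeds, so the single set of frequencies used here is exactly the kind of simplification that a breed-aware matrix would revisit.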

4.
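Joint breeding is an important part of China's national swine genetic improvement program. It enlarges the population, increases within-population genetic variation, improves the accuracy of breeding value estimation, and, compared with conventional breeding methods, is particularly effective for low-heritability reproductive traits. This study collected 6 790 reproductive trait records of Large White pigs from three breeding farms: Hebei Dahaoheshan Breeding Co., Ltd. (河北大好河山养殖有限公司), Hebei Yufeng Jing'an Breeding Co., Ltd. (河北裕丰京安养殖有限公司), and Shijiazhuang Qingliangshan Breeding Co., Ltd. (石家庄清凉山养殖有限公司), hereafter Dahaoheshan, Jing'an, and Qingliangshan. A combined reference population for genomic selection was built, genotypes from the Neogen 50K (Geneseek) chip were imputed to the liquid-phase 50K chip, and a joint genomic evaluation was carried out with the single-step method. Qingliangshan and Yufeng Jing'an had similar genetic backgrounds, whereas Dahaoheshan was more distantly related to the other two farms. For Dahaoheshan animals, the accuracy of breeding values for total number born predicted from pedigree information was 0.170, and the genomic prediction accuracy was 0.324; with joint genomic evaluation, the genomic prediction accuracy for total number born rose further to 0.347, 104% higher than that based on single-farm pedigree information. The study shows that it is feasible to unify the SNP chip types of the farms through genotype imputation, build a genomic selection reference population for reproductive traits of Large White pigs in Hebei Province, and carry out joint genomic selection, which is especially valuable for reproductive traits that progress slowly under conventional breeding.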

5.
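Genomic selection (GS) is a breeding technology developed in recent years that is now applied in animal and plant breeding. This study compared the performance of genomic selection using SNP chips of different densities in a population of 1 068 Duroc boars. Genomic estimated breeding values (GEBV) obtained from imputed chip genotypes and from the high-density SNP chip had a correlation of 99%, and the closeness of relationships among individuals had little effect on imputation accuracy within the population. Low-density SNP chips closely associated with the target trait can therefore be used in practical breeding, lowering cost without compromising genomic selection, and offer a feasible route to practical molecular breeding in pigs.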

6.
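This study collected growth performance test records from 1 267 Large White pigs and reproduction records from 800 Large White pigs. The pigs were genotyped with a pig 50K solid-phase chip (Geneseek) and a pig 50K liquid-phase chip based on targeted capture sequencing (liquid-phase 50K), and the genomic selection accuracy of the two chips was compared. A single-step model was used to estimate genomic breeding values for three traits: age at 100 kg body weight, backfat thickness at 100 kg live weight, and total number born. The effect of reference population size was examined with reference populations of 500 and 1 000 for the growth traits and 400 and 700 for the reproductive trait. The liquid-phase 50K chip was highly compatible with the Geneseek chip; its genomic selection accuracy averaged 1.7% higher than that of Geneseek for the two growth traits and was slightly higher for total number born, though the gain was small. The union of the two chips raised the marker count to 62 039 SNPs and improved genomic selection accuracy by 2.9% on average over either chip alone. For single chips and for the union alike, accuracy rose markedly as the reference population grew. The study shows that liquid-phase chip technology can be used for genomic selection, with some advantage, and the results provide a reference for pig molecular breeding in China.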

7.
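The aim was to explore methods for selecting markers for low-density chips and to evaluate the accuracy of different low-density chips. Using the BovineHD high-density chip, SNPs in the Simmental cattle genome were tested for association with saturated fatty acid content; markers were selected by P value or by effect size to form low-density chips; and cross-validation with IBS-based cluster grouping and with random grouping was used to estimate genomic breeding values and evaluate their accuracy. Five loci near the MYC gene on chromosome 14 were significantly associated with saturated fatty acid traits, and MYC may be considered a candidate gene for fatty acid content in Simmental cattle in follow-up studies. The accuracy of estimated genomic breeding values was highest when markers were selected by P value and cross-validation used IBS-based clustering, and it levelled off at a chip density of 7K. The loci identified may therefore affect fatty acid content in Simmental cattle, and the study provides reference material for selecting markers for low-density chips.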

8.
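Research progress on high-density SNP chips and their impact on beef cattle breeding   Total citations: 1, self-citations: 1, citations by others: 0
In recent years, advanced sequencing and genotyping technologies have transformed beef cattle breeding methods. From the low-throughput, time-consuming restriction fragment length polymorphism (RFLP) markers of the past to today's high-throughput, high-density single nucleotide polymorphism (SNP) markers, the efficiency of genetic testing has risen sharply. With the completion of the bovine genome sequence map and SNP map, genomic selection in cattle based on high-density SNP markers has become a new focus of cattle breeding. Focusing on the impact of high-density SNP chips on beef cattle breeding, the authors review research progress on high-density SNP chips, next-generation sequencing technology, and genomic selection in beef cattle, and argue that high-density SNP chips are extremely important for building multi-breed genomic selection models and for accurately predicting genomic breeding values.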

9.
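The aim was to compare the accuracy of genomic estimated breeding values (GEBV) when genomic selection is implemented with reduced-representation genome sequencing versus a SNP chip. From the F2 generation of the AH broiler resource population, 395 individuals were randomly selected (212 males and 183 females from 8 half-sib families) and genotyped with both 10× SLAF sequencing and the Illumina Chicken 60K SNP chip. Genomic best linear unbiased prediction (GBLUP) and BayesCπ were used to compare GEBV accuracy for six traits (body weight at 6 weeks, body weight at 12 weeks, average daily gain, average daily feed intake, feed conversion ratio, and residual feed intake), with 5-fold cross-validation. Within the same genotyping platform, GEBV accuracy did not differ significantly between the two estimation methods (P>0.05). The preferred genotyping platform differed by trait: for body weight at 6 weeks, the SNP chip gave higher GEBV accuracy (P<0.05), whereas for residual feed intake, reduced-representation sequencing gave higher GEBV accuracy (P<0.05). Averaged over the six traits, the two platforms differed by less than 0.01 in GEBV accuracy, so both high-throughput sequencing and SNP chips can be used for genomic selection in yellow-feathered broilers.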

10.
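Single nucleotide polymorphisms (SNPs) are important material for genetic research. In recent years, advances in methods for developing genome-wide SNP markers have allowed researchers to obtain abundant genomic markers at low cost, greatly advancing genome-level research. Genomic prediction builds a training model from individuals with known genotypes and phenotypes and predicts genotypes and phenotypes for individuals with unknown phenotypes, which is of great importance in breeding. Genome-wide SNP genotyping strategies combined with genomic prediction methods form the frontier of animal genomic selection. This paper reviews these two aspects to provide a reference for researchers in molecular genetics, especially those studying complex traits.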

11.
Genomic data are increasingly used in livestock breeding. Genotype imputation is an important tool for handling missing values in genotypic data, and the quality of imputation results directly affects subsequent analysis. To obtain good imputation results, a comprehensive imputation strategy needs to be formulated. We used simulation to study the effects of several factors on genotype imputation: reference population size, the genetic relationship (distance) between the target population and the reference population, the number (proportion) of target sites, the minor allele frequency (MAF), and the imputation algorithm. The results showed that the number of target sites was the main factor affecting genotype imputation and was significantly and positively correlated with imputation quality (P<0.05). The reference population size was the main factor affecting the imputation error rate of Beagle5.1, whereas the number of target sites was the main factor affecting the imputation error rate of Minimac4. Genetic distance between the target population and the reference population had a more significant effect on the imputation quality of Beagle5.1 than on that of Minimac4. In general, the imputation error rate increased with the MAF of a site. When the number of individuals in the reference population was small and the number of target sites was large, Minimac4 was faster than Beagle5.1, but the trend reversed as the reference population size increased. For a given imputation quality, Beagle5.1 had relatively lower requirements for the above factors. When the number of target sites was low and the reference population was large, Beagle5.1 imputed better, while Minimac4 was more suitable for imputation with a small reference population and a higher number of target sites. 
In this study, different strategies were formulated for different imputation purposes, and the study results would provide a reference for genotype imputation.

12.
This study investigated the effect of including Nordic Holsteins in the reference population on the imputation accuracy and prediction accuracy for Chinese Holsteins. The data used in this study include 85 Chinese Holstein bulls genotyped with both the 54K chip and the 777K (HD) chip, 2862 Chinese cows genotyped with the 54K chip, 510 Nordic Holstein bulls genotyped with the HD chip, and 4398 Nordic Holstein bulls genotyped with the 54K chip and with deregressed proofs for five milk production traits. Based on these data, the accuracy of imputation from 54K to HD marker data and the accuracy of genomic predictions in Chinese Holstein were assessed. The allele correct rate increased by around 2.7 and 1.7% in imputation from the 54K to the HD marker data for Chinese Holstein bulls and cows, respectively, when the Nordic HD‐genotyped bulls were included in the reference data for imputation. However, the prediction accuracy improved only slightly when using the marker data imputed based on the combined HD reference data, compared with using the marker data imputed based on the Chinese HD reference data only. On the other hand, when using the combined reference population including 4398 Nordic Holstein bulls, the accuracy of genomic predictions increased by 6.5 percentage points together with a reduction of prediction bias. The HD markers did not outperform the 54K markers in genomic prediction based on the present data. The results indicate that for Chinese Holsteins, it is necessary to genotype more individuals with the 54K chip to increase the reference population rather than to increase marker density.

13.
Background: Genome-wide association studies and genomic predictions are thought to be optimized by using whole-genome sequence (WGS) data. However, sequencing thousands of individuals of interest is expensive. Imputation from SNP panels to WGS data is an attractive and less expensive approach to obtain WGS data. The aims of this study were to investigate the accuracy of imputation and to provide insight into the design and execution of genotype imputation. Results: We genotyped 450 chickens with a 600 K SNP array, and sequenced 24 key individuals by whole genome re-sequencing. Accuracy of imputation from putative 60 K and 600 K array data to WGS data was 0.620 and 0.812 for Beagle, and 0.810 and 0.914 for FImpute, respectively. By increasing the sequencing cost from 24 X to 144 X, the imputation accuracy increased from 0.525 to 0.698 for Beagle and from 0.654 to 0.823 for FImpute. With fixed sequence depth (12 X), increasing the number of sequenced animals from 1 to 24 improved accuracy from 0.421 to 0.897 for FImpute and from 0.396 to 0.777 for Beagle. Using optimally selected key individuals resulted in a higher imputation accuracy compared with using randomly selected individuals as a reference population for resequencing. With fixed reference population size (24), imputation accuracy increased from 0.654 to 0.875 for FImpute and from 0.512 to 0.762 for Beagle as the sequencing depth increased from 1 X to 12 X. With a given total cost of genotyping, accuracy increased with the size of the reference population for FImpute, but the pattern was not valid for Beagle, which showed the highest accuracy at sixfold coverage for the scenarios used in this study. Conclusions: In conclusion, we comprehensively investigated the impacts of several key factors on genotype imputation. Generally, increasing sequencing cost gave a higher imputation accuracy. But with a fixed sequencing cost, optimal imputation enhances the performance of WGP and GWAS. 
An optimal imputation strategy should take size of reference population, imputation algorithms, marker density, and population structure of the target population and methods to select key individuals into consideration comprehensively. This work sheds additional light on how to design and execute genotype imputation for livestock populations.

14.
The influence of genotype imputation using low‐density single nucleotide polymorphism (SNP) marker subsets on the genomic relationship matrix (G matrix), genetic variance explained, and genomic prediction (GP) was investigated for carcass weight and marbling score in Japanese Black fattened steers, using genotype data of approximately 40,000 SNPs. Genotypes were imputed using equally spaced SNP subsets of different densities. Two different linear models were used. The first (model 1) incorporated one G matrix, while the second (model 2) used two different G matrices constructed using the selected and remaining SNPs. When using model 1, the estimated additive genetic variance was always larger when using all SNPs obtained via genotype imputation than when using only equally spaced SNP subsets. The correlations between the genomic estimated breeding values obtained using genotype imputation with at least 3,000 SNPs and those using all available SNPs without imputation were higher than 0.99 for both traits. While additive genetic variance was likely to be partitioned with model 2, it did not enhance the accuracy of GP compared with model 1. These results indicate that genotype imputation using an equally spaced low‐density panel of an appropriate size can be used to produce a cost‐effective, valid GP.

15.
Using target and reference fattened steer populations, the performance of genotype imputation using lower‐density marker panels in Japanese Black cattle was evaluated. Population imputation was performed using BEAGLE software. Genotype information for approximately 40 000 single nucleotide polymorphism (SNP) markers by Illumina BovineSNP50 BeadChip was available, and imputation accuracy was assessed based on the average concordance rates of the genotypes, varying equally spaced SNP densities, and the number of individuals in the reference population. Two additional statistics were also calculated as indicators of imputation performance. The concordance rates tended to be lower for SNPs with greater minor allele frequencies, or those located near the ends of the chromosomes. Longer autosomes yielded greater imputation accuracies than shorter ones. When SNPs were selected based on linkage disequilibrium information, relative imputation accuracy was slightly improved. When 3000 and 10 000 equally spaced SNPs were used, the imputation accuracies were greater than 90% and approximately 97%, respectively. These results indicate that combining genotyping using a lower‐density SNP chip with genotype imputation based on a population of individuals genotyped using a higher‐density SNP chip is a cost‐effective and valid approach for genomic prediction.

16.
The objective of this paper was to investigate, for various scenarios at low and high marker density, the accuracy of imputing genotypes when using a multivariate mixed model framework using information from 2, 4, or 10 surrounding markers. This model predicts genotypes at a locus, using genotypes at nearby loci as correlated traits, and the additive genetic relationship matrix to use information from genotyped relatives. For 2 scenarios this method was compared with the population-based imputation algorithms FastPHASE and Beagle. Accuracies of imputation were obtained with Monte Carlo simulation and predicted with selection index theory, using input from the simulated data. Five different scenarios of missing genotypes were considered: 1) genotypes of some loci are missing due to genotyping errors, 2) juvenile selection candidates are genotyped using a smaller SNP panel, 3) some animals in the pedigree of a breeding population are not genotyped, 4) juvenile selection candidates are not genotyped, and 5) 1 generation of animals in the top of the pedigree are not genotyped. Surrounding marker information did not improve accuracy of imputation when animals whose genotypes were imputed were not genotyped for those surrounding markers. When those animals were genotyped for surrounding markers, results indicated a limited gain when linkage disequilibrium (LD) between SNP was low, but a substantial increase in accuracy when LD between SNP was high. For scenario 1, using 1 vs. 11 SNP, accuracy was, respectively, 0.75 and 0.81 at low, and 0.75 and 0.93 at high density. For scenario 2, using 1 vs. 11 SNP, accuracy was, respectively, 0.70 and 0.73 at low, and 0.71 and 0.84 at high density. Beagle outperformed the other methods at high SNP density, whereas the multivariate mixed model was clearly superior when SNP density was low and animals were genotyped with a reduced SNP panel. 
The results showed that extending the univariate gene content method to a multivariate BLUP model with inclusion of surrounding marker information only yields greater imputation accuracy when the animals with imputed loci are at least genotyped for some SNP that are in LD with the SNP to be imputed. The equation derived from selection index theory accurately predicted the accuracy of imputation using the multivariate mixed model framework.

17.
A major obstacle in applying genomic selection (GS) to uniquely adapted local breeds in less-developed countries has been the cost of genotyping at high densities of single-nucleotide polymorphisms (SNP). Cost reduction can be achieved by imputing genotypes from lower to higher densities. Locally adapted breeds tend to be admixed and exhibit a high degree of genomic heterogeneity thus necessitating the optimization of SNP selection for downstream imputation. The aim of this study was to quantify the achievable imputation accuracy for a sample of 1,135 South African (SA) Drakensberger cattle using several custom-derived lower-density panels varying in both SNP density and how the SNP were selected. From a pool of 120,608 genotyped SNP, subsets of SNP were chosen (1) at random, (2) with even genomic dispersion, (3) by maximizing the mean minor allele frequency (MAF), (4) using a combined score of MAF and linkage disequilibrium (LD), (5) using a partitioning-around-medoids (PAM) algorithm, and finally (6) using a hierarchical LD-based clustering algorithm. Imputation accuracy to higher density improved as SNP density increased; animal-wise imputation accuracy defined as the within-animal correlation between the imputed and actual alleles ranged from 0.625 to 0.990 when 2,500 randomly selected SNP were chosen vs. a range of 0.918 to 0.999 when 50,000 randomly selected SNP were used. At a panel density of 10,000 SNP, the mean (standard deviation) animal-wise allele concordance rate was 0.976 (0.018) vs. 0.982 (0.014) when the worst (i.e., random) as opposed to the best (i.e., combination of MAF and LD) SNP selection strategy was employed. A difference of 0.071 units was observed between the mean correlation-based accuracy of imputed SNP categorized as low (0.01 < MAF ≤ 0.1) vs. high MAF (0.4 < MAF ≤ 0.5). Greater mean imputation accuracy was achieved for SNP located on autosomal extremes when these regions were populated with more SNP. 
The presented results suggested that genotype imputation can be a practical cost-saving strategy for indigenous breeds such as the SA Drakensberger. Based on the results, a genotyping panel consisting of ~10,000 SNP selected based on a combination of MAF and LD would suffice in achieving a <3% imputation error rate for a breed characterized by genomic admixture on the condition that these SNP are selected based on breed-specific selection criteria.

18.
The objective of this study was to investigate the accuracy of genomic prediction of body weight and eating quality traits in a numerically small sheep population (Dorper sheep). Prediction was based on a large multi-breed/admixed reference population and using (a) 50k or 500k single nucleotide polymorphism (SNP) genotypes, (b) imputed whole-genome sequencing data (~31 million), (c) selected SNPs from whole genome sequence data and (d) 50k SNP genotypes plus selected SNPs from whole-genome sequence data. Furthermore, the impact of using a breed-adjusted genomic relationship matrix on accuracy of genomic breeding value was assessed. The selection of genetic variants was based on an association study performed on imputed whole-genome sequence data in an independent population, which was chosen either randomly from the base population or according to higher genetic proximity to the target population. Genomic prediction was based on genomic best linear unbiased prediction (GBLUP), and the accuracy of genomic prediction was assessed according to the correlation between genomic breeding value and corrected phenotypes divided by the square root of trait heritability. The accuracy of genomic prediction was between 0.20 and 0.30 across different traits based on common 50k SNP genotypes, which improved on average by 0.06 (absolute value) when prioritized genetic markers from whole-genome sequence data were used. Using prioritized genetic markers from a genetically more related GWAS population resulted in slightly higher prediction accuracy (0.02 absolute value) compared to genetic markers derived from a random GWAS population. 
Using high-density SNP genotypes or imputed whole-genome sequence data in GBLUP showed almost no improvement in genomic prediction accuracy; however, accounting for different marker allele frequencies in the reference population with a breed-adjusted GRM resulted in an average increase of 0.024 (absolute value) in the accuracy of genomic prediction.
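
The accuracy measure described in this abstract, the correlation between genomic breeding values and corrected phenotypes divided by the square root of the trait heritability, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the toy values, and the heritability of 0.36 are hypothetical and do not come from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def prediction_accuracy(gebv, corrected_phenotypes, heritability):
    """Correlation of GEBV with corrected phenotypes, scaled by 1/sqrt(h2)."""
    if not 0 < heritability <= 1:
        raise ValueError("heritability must lie in (0, 1]")
    return pearson(gebv, corrected_phenotypes) / sqrt(heritability)


# Hypothetical validation animals.
toy_gebv = [0.2, -0.1, 0.4, 0.0, -0.5]
toy_phenotypes = [1.0, -0.5, 0.5, 0.8, -1.8]
print(prediction_accuracy(toy_gebv, toy_phenotypes, heritability=0.36))
```

Dividing by the square root of heritability rescales the correlation with a noisy phenotype towards the correlation with the true breeding value. With small validation sets or an underestimated heritability, the result can exceed 1, so it is best read as an estimate.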
