Similar Literature
 Found 18 similar articles; search time: 703 ms
1.
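To investigate the factors that affect the accuracy of genotype imputation software and to illustrate the imputation workflow, this study used two widely used imputation programs, Beagle 5.1 and Minimac 3, to impute dairy cattle 50K chip data to 150K. Imputation concordance was calculated by comparing each individual's imputed genotypes with its true genotypes, and the two programs were compared for imputation accuracy, concordance, and the main factors influencing both. Minimac 3 requires the genotypes to be phased with other software before imputation, whereas Beagle 5.1 performs phasing and imputation in a single run. The correlation between the imputation concordances of Beagle 5.1 and Minimac 3 was 0.98. For Beagle 5.1, the mean imputation accuracy (r²) was 0.9841 and the mean concordance was 0.9914, with a correlation of 0.39 between accuracy and concordance; for Minimac 3, the mean accuracy (r²) was 0.9782 and the mean concordance was 0.9911, with a correlation of 0.36 between the two. Because of the way the software calculates imputation accuracy, accuracy (r²) is strongly affected by the minor allele. Concordance declined as both minor allele frequency and locus heterozygosity increased, dropping markedly once locus heterozygosity reached 0.6 (concordance below 0.8), but Beagle 5.1 outperformed Minimac 3 at any given minor allele frequency and heterozygosity. The study found that imputation accuracy (r²) is strongly affected by the heterozygosity of the imputed locus, that Beagle 5.1 imputes genomic data more accurately, and that using concordance as the criterion for judging imputation quality avoids discarding too many validly imputed loci.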
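The difference between concordance and r² reported above can be illustrated with a minimal sketch. It assumes genotypes coded as 0/1/2 allele counts and uses the empirical squared Pearson correlation between true and imputed genotypes; this is not the internal accuracy estimate reported by Beagle or Minimac, and the genotype vectors are made-up illustrative data.

```python
def concordance(true_g, imputed_g):
    """Fraction of genotypes (coded 0/1/2) that were imputed exactly right."""
    matches = sum(t == i for t, i in zip(true_g, imputed_g))
    return matches / len(true_g)


def r_squared(true_g, imputed_g):
    """Squared Pearson correlation between true and imputed genotypes."""
    n = len(true_g)
    mean_t = sum(true_g) / n
    mean_i = sum(imputed_g) / n
    cov = sum((t - mean_t) * (i - mean_i) for t, i in zip(true_g, imputed_g))
    var_t = sum((t - mean_t) ** 2 for t in true_g)
    var_i = sum((i - mean_i) ** 2 for i in imputed_g)
    if var_t == 0 or var_i == 0:
        return float("nan")  # undefined when either vector has no variation
    return cov * cov / (var_t * var_i)


# Hypothetical data: each SNP has exactly one wrong call among ten animals.
true_rare = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]       # low minor allele frequency
imputed_rare = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
true_common = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]     # intermediate allele frequency
imputed_common = [0, 0, 0, 1, 1, 1, 1, 2, 2, 1]

for name, t, i in [("rare", true_rare, imputed_rare),
                   ("common", true_common, imputed_common)]:
    print(name, round(concordance(t, i), 3), round(r_squared(t, i), 3))
```

Both toy SNPs have the same concordance, yet the rare-allele SNP has a much lower r², which is one reason filtering on r² alone can discard low-frequency loci that were mostly imputed correctly.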

2.
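This study examined how different reference-population selection methods and reference-population size affect genotype imputation accuracy. The methods compared were maximizing the expected genetic relationship for matrix A (RELA), minimizing the target population genetic variance for matrix A (MCA), the highest mean kinship coefficients (KIN), random selection (RAN), and common ancestor selection (CA). Dwarf yellow-feathered broilers served as the experimental population and were genotyped with the chicken 600K SNP chip (Affymetrix Axiom HD genotyping array); body weight was recorded for 435 male offspring at 45, 56, 70, 84, and 91 days of age. Beagle was used to impute low-density SNP chip data to high density, and the effects of selection method and reference-population size on imputation accuracy, together with the genomic prediction accuracy obtained from the imputed chip data, were compared. Imputation performed best with Beagle 4.0 combined with pedigree information, followed by Beagle 4.0 alone, and worst with Beagle 5.1. Selecting the reference population by MCA gave the highest imputation accuracy and RAN the lowest; MCA, RELA, and CA differed only slightly. Compared with the other methods, using MCA-selected individuals as the reference population to impute low-density to high-density SNP data gave higher genomic prediction accuracy, differing only marginally from that obtained with the true high-density chip. Imputation accuracy rose as the reference population grew, but at a diminishing rate, and eventually plateaued. In summary, choosing the reference population with a selection method and controlling its size can maintain imputation and genomic prediction accuracy while saving cost. The study provides a technical reference for applying genotype imputation in livestock and poultry breeding.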

3.
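Joint breeding is an important part of China's national swine genetic improvement program. It enlarges the population, increases within-population genetic variation, and improves the accuracy of estimated breeding values, and compared with traditional breeding methods it is especially effective for low-heritability reproductive traits. This study collected 6,790 reproductive trait records of Large White pigs from three breeding farms, 河北大好河山养殖有限公司, 河北裕丰京安养殖有限公司, and 石家庄清凉山养殖有限公司 (hereafter Dahaoheshan, Jing'an, and Qingliangshan), and built a combined reference population for genomic selection. Genotypes from the Neogen (GeneSeek) 50K chip were imputed to the liquid-phase 50K chip, and a joint genomic genetic evaluation was carried out with the single-step method. The Qingliangshan and Jing'an farms had similar genetic backgrounds, whereas Dahaoheshan was only distantly connected to the other two. The accuracy of breeding values for total number born predicted from pedigree information for Dahaoheshan individuals was 0.170, and the genomic prediction accuracy was 0.324; with joint genomic evaluation, the genomic prediction accuracy for total number born rose further to 0.347, 104% higher than with single-farm pedigree information. The study shows that it is feasible to unify SNP chip types across farms through genotype imputation, build a genomic selection reference population for reproductive traits of Large White pigs in Hebei Province, and carry out joint genomic selection, which matters most for reproductive traits that progress slowly under conventional breeding.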
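The 104% figure above can be reproduced from the three reported accuracies. The sketch below simply computes relative gains; the function name `relative_gain` is an illustrative choice, not something taken from the study.

```python
def relative_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline`, as a fraction."""
    return (improved - baseline) / baseline


# Accuracies for total number born, as reported in the abstract.
pedigree_single_farm = 0.170
genomic_single_farm = 0.324
genomic_joint = 0.347

print(f"joint genomic vs pedigree:       {relative_gain(pedigree_single_farm, genomic_joint):.1%}")
print(f"single-farm genomic vs pedigree: {relative_gain(pedigree_single_farm, genomic_single_farm):.1%}")
print(f"joint vs single-farm genomic:    {relative_gain(genomic_single_farm, genomic_joint):.1%}")
```

Most of the gain over pedigree-based prediction comes from adding genomic information; pooling the farms adds a smaller further increase on top of the single-farm genomic accuracy.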

4.
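This study assessed the practicality of a low-density liquid-phase chip in production, with the aim of reducing breeding costs. A total of 3,761 healthy Large White pigs, about 160 days old and weighing about 110 kg, were used. One hundred pigs were drawn at random, and markers were extracted from their 50K chip data according to the 10K chip marker list to generate 10K genotypes; these animals formed the imputation (target) population. From the remaining animals, 800, 2,000, and 3,600 individuals were then drawn at random as reference populations, and Beagle 4.1 was used to impute the 100 target animals to the 50K chip. This was repeated 10 times, and imputation accuracy was evaluated by genotype concordance and genotype correlation. The mean linkage disequilibrium (r²) of the 10K and 50K chips was 0.227 and 0.258, respectively, a small difference. A minor allele frequency (MAF) of 0.05 was the inflection point for imputation accuracy: accuracy rose clearly after markers with MAF < 0.05 were removed. Accuracy increased with reference-population size: enlarging the reference population from 800 to 3,600 animals raised imputation accuracy from 0.90 to 0.95, and the standard deviation across the 10 replicates fell from 0.006 to 0.002. With small reference populations, imputation accuracy varied considerably among chromosomes; as the reference population grew, accuracy differed little from chromosome to chromosome. The results indicate that imputing the pig liquid-phase chip from 10K to 50K is feasible and can be applied at scale in genomic selection to reduce its cost.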

5.
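This study examined the effect of pedigree errors on genomic selection in pigs. In simulated data, the accuracy, unbiasedness, and rank correlation of genomic estimated breeding values all decreased steadily as the pedigree error rate increased: going from a 0% to a 20% error rate, accuracy, unbiasedness, and rank correlation fell from 0.4233, 0.1749, and 0.4099 to 0.3582, 0.1031, and 0.3469, respectively. As the number of genotyped individuals increased, the gap in accuracy between the 0% and 20% error rates narrowed. In the breeding herds analyzed, the pedigree error rate was about 3.2%. Using single-step genomic selection, the accuracy of estimated breeding values for Large White pigs with phenotype and pedigree records was 0.4471, and accuracy improved by 0.42% after part of the pedigree errors were corrected with genotype data. These results show that pedigree errors reduce the accuracy and unbiasedness of breeding values estimated by genomic selection, and that genotyping more animals can lessen the negative effect of pedigree errors on genomic selection.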

6.
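Genomic selection (GS) is a breeding technology developed in recent years that is now applied in practical animal and plant breeding. This study compared the performance of genome-wide selection using SNP chips of different densities in a population of 1,068 Duroc boars. The genomic estimated breeding values (GEBV) obtained from the imputed chip data and from the high-density SNP chip reached a correlation of 99%, and the degree of kinship between individuals had little effect on imputation accuracy within the same population. Low-density SNP chips closely associated with the target traits can therefore be used in practical breeding, lowering cost without compromising the effectiveness of genome-wide selection, and offer a feasible route to applied molecular breeding in pigs.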

7.
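Current measures of genetic distance between populations are all based on allele frequencies. One drawback of these methods is that, when loci are few and polymorphism is low, they require a large number of sampled individuals per population. In addition, the dendrograms they produce cannot show whether a population arose through admixture or through long-term evolution. To address these problems, this paper proposes measures of between-population genetic distance and within-population heterozygosity based on comparing multi-locus protein genotypes between individuals. An analysis of the genetic relationships among ten major Chinese yellow cattle populations is used as an example and resolves the problems above reasonably well.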

8.
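Microsatellite markers were used to analyze the genetic diversity and genetic structure of the Wugu Dagu chicken (乌骨大骨鸡). Seven microsatellite loci on seven chicken chromosomes were selected, and 24 randomly chosen individuals were tested for polymorphism. A total of 23 alleles were detected, with 2 to 5 alleles per locus and a mean of 3.3 alleles. The mean polymorphism information content and mean heterozygosity of the population were 0.6603 and 0.7176, respectively. The results indicate that the Wugu Dagu chicken is a population with relatively rich polymorphism.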
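For readers unfamiliar with the two diversity statistics, the sketch below shows how expected heterozygosity and polymorphism information content (PIC, in the commonly used Botstein formulation) are computed from the allele frequencies at one locus. The abstract does not state which heterozygosity (observed or expected) or which software was used, so this is an assumption-laden illustration with made-up frequencies, not the study's procedure.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return expected_heterozygosity(freqs) - cross


# Hypothetical loci: two equally frequent alleles, and four equally frequent alleles.
two_alleles = [0.5, 0.5]
four_alleles = [0.25, 0.25, 0.25, 0.25]

for freqs in (two_alleles, four_alleles):
    print(len(freqs), expected_heterozygosity(freqs), pic(freqs))
```

PIC is always somewhat lower than expected heterozygosity at the same locus, and both rise with the number of alleles and the evenness of their frequencies.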

9.
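To characterize the molecular genetics of Xiangxi cattle and to find molecular markers associated with growth traits, this study used PCR-SSCP to detect polymorphism at the g.934G>A site, located at 126 bp in exon 1 of the somatostatin (SST) gene, and analyzed the association between the polymorphism and growth traits. Xiangxi cattle carried two alleles, G and A, at g.934G>A, with G predominant; three genotypes (GG, AG, and AA) were detected, the site was moderately polymorphic, and the population was in Hardy-Weinberg equilibrium (P>0.05). Body height and body length of GG individuals were significantly greater than those of AA individuals (P<0.05); chest girth and body weight of GG and AG individuals were highly significantly greater than those of AA individuals (P<0.01). The G allele had a positive effect on all four traits: body height, body length, chest girth, and body weight. These results suggest that the g.934G>A site of the SST gene may be a molecular genetic marker for growth traits in Xiangxi cattle.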
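The Hardy-Weinberg check mentioned above is typically a chi-square goodness-of-fit test on the three genotype counts. The sketch below shows one common form with 1 degree of freedom; the genotype counts are hypothetical, since the abstract does not report them, and the study may have used a different test.

```python
from math import erfc, sqrt


def hwe_chi_square(n_gg, n_ag, n_aa):
    """Chi-square test of Hardy-Weinberg equilibrium for a biallelic site.

    Returns (frequency of G, chi-square statistic, p-value with 1 df).
    """
    n = n_gg + n_ag + n_aa
    p = (2 * n_gg + n_ag) / (2 * n)   # frequency of allele G
    q = 1.0 - p                       # frequency of allele A
    if p <= 0.0 or p >= 1.0:
        raise ValueError("site is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_gg, n_ag, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return p, chi2, p_value


# Hypothetical counts: the first set matches HWE proportions, the second lacks heterozygotes.
print(hwe_chi_square(49, 42, 9))
print(hwe_chi_square(60, 20, 20))
```

A p-value above 0.05, as reported for the g.934G>A site, means the observed genotype counts do not differ significantly from Hardy-Weinberg expectations.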

10.
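Assessing the genetic structure of the Mashen pig conservation population using SNP chips   Total citations: 1 (self-citations: 1, citations by others: 0)
This study aimed to characterize the genetic diversity, kinship, and family structure of the Mashen pig conservation population. Single nucleotide polymorphisms (SNPs) of 39 conserved Mashen pigs were genotyped with the Illumina CAUPorcine 50K SNP chip. Plink was used to calculate minor allele frequency, polymorphism information content, and observed and expected heterozygosity to assess the genetic diversity of the conservation population. To analyze kinship, Plink was used to build the identity-by-state (IBS) distance matrix and to detect runs of homozygosity (ROH), and Gmatrix was used to build the G matrix. MEGA X was used to construct a phylogenetic tree to analyze family structure. A total of 43,832 SNPs were detected in the 39 pigs, with a mean genotype call rate of 0.9801; 28,859 SNPs passed quality control, of which 72.4% were polymorphic, so this SNP chip is suitable for analyzing the genetic diversity of Mashen pigs. The effective number of alleles was 1.5634, the polymorphism information content was 0.412, and the minor allele frequency was 0.258, indicating relatively rich genetic diversity. The mean observed heterozygosity was 0.3541 and the mean expected heterozygosity was 0.3499, indicating that differentiation had occurred in the population. The mean IBS genetic distance was 0.2842 (0.2852 among boars), and both the IBS distance matrix and the G matrix showed that some of the breeding pigs were related. A total of 8,131 ROH were found, 46.15% of them 400 to 600 Mb in length, and the mean inbreeding coefficient was 0.237, indicating a high level of inbreeding. The phylogenetic tree showed that the conservation population derived from 3 families that differed markedly in size. In conclusion, the Mashen pig conservation population has relatively rich genetic diversity, but its high inbreeding, small number of families, and uneven family sizes make loss of genetic diversity likely; new bloodlines should therefore be introduced from the foundation farm, the conservation population enlarged, and the inbreeding coefficient reduced.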

11.
Genomic data are increasingly used in livestock breeding. Genotype imputation is an important tool for handling missing values in genotypic data, and the quality of imputation results directly affects subsequent analyses. To obtain good imputation results, a comprehensive imputation strategy needs to be formulated. We studied the effects of several factors on genotype imputation by simulation. The factors included reference population size, genetic relationship (distance) between the target population and the reference population, the number (proportion) of target sites, the minor allele frequency (MAF), and the imputation algorithm. The results showed that the number of target sites was the main factor affecting genotype imputation and was significantly positively correlated with imputation quality (P<0.05). The reference population size was the main factor affecting the imputation error rate in Beagle 5.1, whereas the number of target sites was the main factor affecting the imputation error rate in Minimac4. Genetic distance between the target population and the reference population had a more significant effect on the imputation quality of Beagle 5.1 than on that of Minimac4. In general, the imputation error rate increased with the MAF of a site. When the number of individuals in the reference population was small and the number of target sites was large, Minimac4 was faster than Beagle 5.1, but the trend reversed as the reference population size increased. On the premise of ensuring imputation quality, Beagle 5.1 had relatively lower requirements for the above factors. In particular, when the number of target sites was low and the reference population was large, Beagle 5.1 imputed better, while Minimac4 was more suitable for a small reference population and a larger number of target sites. In this study, different strategies were formulated for different imputation purposes, and the results provide a reference for genotype imputation.

12.
Background: Genome-wide association studies and genomic predictions are thought to be optimized by using whole-genome sequence (WGS) data. However, sequencing thousands of individuals of interest is expensive. Imputation from SNP panels to WGS data is an attractive and less expensive approach to obtain WGS data. The aims of this study were to investigate the accuracy of imputation and to provide insight into the design and execution of genotype imputation. Results: We genotyped 450 chickens with a 600K SNP array and sequenced 24 key individuals by whole-genome re-sequencing. Accuracy of imputation from putative 60K and 600K array data to WGS data was 0.620 and 0.812 for Beagle, and 0.810 and 0.914 for FImpute, respectively. By increasing the sequencing cost from 24X to 144X, the imputation accuracy increased from 0.525 to 0.698 for Beagle and from 0.654 to 0.823 for FImpute. With fixed sequence depth (12X), increasing the number of sequenced animals from 1 to 24 improved accuracy from 0.421 to 0.897 for FImpute and from 0.396 to 0.777 for Beagle. Using optimally selected key individuals resulted in a higher imputation accuracy compared with using randomly selected individuals as a reference population for re-sequencing. With fixed reference population size (24), imputation accuracy increased from 0.654 to 0.875 for FImpute and from 0.512 to 0.762 for Beagle as the sequencing depth increased from 1X to 12X. With a given total cost of genotyping, accuracy increased with the size of the reference population for FImpute, but this pattern did not hold for Beagle, which showed the highest accuracy at six-fold coverage for the scenarios used in this study. Conclusions: We comprehensively investigated the impacts of several key factors on genotype imputation. Generally, increasing sequencing cost gave a higher imputation accuracy. With a fixed sequencing cost, however, an optimal imputation strategy can enhance the performance of WGP and GWAS. An optimal imputation strategy should comprehensively consider the size of the reference population, the imputation algorithm, marker density, the population structure of the target population, and the method used to select key individuals. This work sheds additional light on how to design and execute genotype imputation for livestock populations.

13.
A major obstacle in applying genomic selection (GS) to uniquely adapted local breeds in less-developed countries has been the cost of genotyping at high densities of single-nucleotide polymorphisms (SNP). Cost reduction can be achieved by imputing genotypes from lower to higher densities. Locally adapted breeds tend to be admixed and exhibit a high degree of genomic heterogeneity thus necessitating the optimization of SNP selection for downstream imputation. The aim of this study was to quantify the achievable imputation accuracy for a sample of 1,135 South African (SA) Drakensberger cattle using several custom-derived lower-density panels varying in both SNP density and how the SNP were selected. From a pool of 120,608 genotyped SNP, subsets of SNP were chosen (1) at random, (2) with even genomic dispersion, (3) by maximizing the mean minor allele frequency (MAF), (4) using a combined score of MAF and linkage disequilibrium (LD), (5) using a partitioning-around-medoids (PAM) algorithm, and finally (6) using a hierarchical LD-based clustering algorithm. Imputation accuracy to higher density improved as SNP density increased; animal-wise imputation accuracy defined as the within-animal correlation between the imputed and actual alleles ranged from 0.625 to 0.990 when 2,500 randomly selected SNP were chosen vs. a range of 0.918 to 0.999 when 50,000 randomly selected SNP were used. At a panel density of 10,000 SNP, the mean (standard deviation) animal-wise allele concordance rate was 0.976 (0.018) vs. 0.982 (0.014) when the worst (i.e., random) as opposed to the best (i.e., combination of MAF and LD) SNP selection strategy was employed. A difference of 0.071 units was observed between the mean correlation-based accuracy of imputed SNP categorized as low (0.01 < MAF ≤ 0.1) vs. high MAF (0.4 < MAF ≤ 0.5). Greater mean imputation accuracy was achieved for SNP located on autosomal extremes when these regions were populated with more SNP. 
The presented results suggested that genotype imputation can be a practical cost-saving strategy for indigenous breeds such as the SA Drakensberger. Based on the results, a genotyping panel consisting of ~10,000 SNP selected based on a combination of MAF and LD would suffice in achieving a <3% imputation error rate for a breed characterized by genomic admixture on the condition that these SNP are selected based on breed-specific selection criteria.

14.
Using target and reference fattened steer populations, the performance of genotype imputation using lower-density marker panels in Japanese Black cattle was evaluated. Population imputation was performed using BEAGLE software. Genotype information for approximately 40 000 single nucleotide polymorphism (SNP) markers by Illumina BovineSNP50 BeadChip was available, and imputation accuracy was assessed based on the average concordance rates of the genotypes, varying equally spaced SNP densities, and the number of individuals in the reference population. Two additional statistics were also calculated as indicators of imputation performance. The concordance rates tended to be lower for SNPs with greater minor allele frequencies, or those located near the ends of the chromosomes. Longer autosomes yielded greater imputation accuracies than shorter ones. When SNPs were selected based on linkage disequilibrium information, relative imputation accuracy was slightly improved. When 3000 and 10 000 equally spaced SNPs were used, the imputation accuracies were greater than 90% and approximately 97%, respectively. These results indicate that combining genotyping using a lower-density SNP chip with genotype imputation based on a population of individuals genotyped using a higher-density SNP chip is a cost-effective and valid approach for genomic prediction.

15.
This study investigated the effect of including Nordic Holsteins in the reference population on the imputation accuracy and prediction accuracy for Chinese Holsteins. The data used in this study included 85 Chinese Holstein bulls genotyped with both the 54K chip and the 777K (HD) chip, 2862 Chinese cows genotyped with the 54K chip, 510 Nordic Holstein bulls genotyped with the HD chip, and 4398 Nordic Holstein bulls genotyped with the 54K chip and with deregressed proofs for five milk production traits. Based on these data, the accuracy of imputation from 54K to HD marker data and the accuracy of genomic predictions in Chinese Holsteins were assessed. The allele correct rate increased by around 2.7% and 1.7% in imputation from the 54K to the HD marker data for Chinese Holstein bulls and cows, respectively, when the Nordic HD-genotyped bulls were included in the reference data for imputation. However, the prediction accuracy improved only slightly when using the marker data imputed based on the combined HD reference data, compared with using the marker data imputed based on the Chinese HD reference data only. On the other hand, when using the combined reference population including 4398 Nordic Holstein bulls, the accuracy of genomic predictions increased by 6.5 percentage points, together with a reduction of prediction bias. The HD markers did not outperform the 54K markers in genomic prediction based on the present data. The results indicate that for Chinese Holsteins, it is necessary to genotype more individuals with the 54K chip to enlarge the reference population rather than to increase marker density.

16.
Missing genotypes are a common feature of high-density SNP datasets obtained using SNP chip technology, and this is likely to decrease the accuracy of genomic selection. This problem can be circumvented by imputing the missing genotypes with estimated genotypes. When implementing imputation, the criteria used for SNP data quality control and whether to perform imputation before or after data quality control need to be considered. In this paper, we compared six strategies of imputation and quality control using different imputation methods, different quality control criteria and different orders of imputation and quality control, against a real dataset of milk production traits in Chinese Holstein cattle. The results demonstrated that, no matter what imputation method and quality control criteria were used, strategies with imputation before quality control performed better than strategies with imputation after quality control in terms of accuracy of genomic selection. The different imputation methods and quality control criteria did not significantly influence the accuracy of genomic selection. We concluded that performing imputation before quality control could increase the accuracy of genomic selection, especially when the rate of missing genotypes is high and the reference population is small.

17.
The influence of genotype imputation using low-density single nucleotide polymorphism (SNP) marker subsets on the genomic relationship matrix (G matrix), genetic variance explained, and genomic prediction (GP) was investigated for carcass weight and marbling score in Japanese Black fattened steers, using genotype data of approximately 40,000 SNPs. Genotypes were imputed using equally spaced SNP subsets of different densities. Two different linear models were used. The first (model 1) incorporated one G matrix, while the second (model 2) used two different G matrices constructed using the selected and remaining SNPs. When using model 1, the estimated additive genetic variance was always larger when using all SNPs obtained via genotype imputation than when using only equally spaced SNP subsets. The correlations between the genomic estimated breeding values obtained using genotype imputation with at least 3,000 SNPs and those using all available SNPs without imputation were higher than 0.99 for both traits. While additive genetic variance was likely to be partitioned with model 2, it did not enhance the accuracy of GP compared with model 1. These results indicate that genotype imputation using an equally spaced low-density panel of an appropriate size can be used to produce a cost-effective, valid GP.

18.
Boar reproductive traits are economically important for the pig industry. Here we conducted a genome-wide association study (GWAS) for 13 reproductive traits measured on 205 F2 boars at day 300 using 60K single nucleotide polymorphism (SNP) data imputed from a reference panel of 1200 pigs in a White Duroc × Erhualian F2 intercross population. We identified 10 significant loci for seven traits on eight pig chromosomes (SSC). Two loci surpassed the genome-wide significance level, including one for epididymal weight around 60.25 Mb on SSC7 and one for semen temperature around 43.69 Mb on SSC4. Four of the 10 significant loci that we identified were consistent with previously reported quantitative trait loci for boar reproduction traits. We highlighted several interesting candidate genes at these loci, including APN, TEP1, PARP2, SPINK1 and PDE1C. To evaluate the imputation accuracy, we further genotyped nine GWAS top SNPs using PCR restriction fragment length polymorphism or Sanger sequencing. We found an average genotype concordance of 91.44%, an allelic concordance of 95.36% and an r² correlation of 0.85 between imputed and real genotype data. This indicates that our GWAS mapping results based on imputed SNP data are reliable, providing insights into the genetic basis of boar reproductive traits.
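The gap between genotype concordance and allelic concordance reported above follows from how each is counted. The sketch below uses one common convention, assumed here because the abstract does not give its definitions: genotypes are coded as 0/1/2 copies of one allele, a genotype counts as concordant only if it matches exactly, and a genotype that is off by one copy still has one of its two alleles correct. The genotype vectors are invented for illustration.

```python
def genotype_concordance(true_g, imputed_g):
    """Share of genotypes (coded 0/1/2) imputed exactly right."""
    return sum(t == i for t, i in zip(true_g, imputed_g)) / len(true_g)


def allelic_concordance(true_g, imputed_g):
    """Share of alleles imputed right: each genotype contributes two alleles,
    and |true - imputed| of them are wrong."""
    correct = sum(2 - abs(t - i) for t, i in zip(true_g, imputed_g))
    return correct / (2 * len(true_g))


# Hypothetical calls for nine animals at one SNP; two calls are off by one allele.
true_calls = [0, 1, 2, 1, 0, 2, 1, 1, 2]
imputed_calls = [0, 1, 2, 2, 0, 2, 1, 0, 2]

print(genotype_concordance(true_calls, imputed_calls))
print(allelic_concordance(true_calls, imputed_calls))
```

Under this convention allelic concordance can never be lower than genotype concordance, which is consistent with the ordering of the two percentages in the abstract.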
