Genomic imputation and evaluation using high-density Holstein genotypes.
Genomic evaluations for 161,341 Holsteins were computed by using 311,725 of 777,962 markers on the Illumina BovineHD Genotyping BeadChip (HD). Initial edits with 1,741 HD genotypes from 5 breeds revealed that 636,967 markers were usable but that half were redundant. Holstein genotypes were from 1,510 animals with HD markers, 82,358 animals with 45,187 (50K) markers, 1,797 animals with 8,031 (8K) markers, 20,177 animals with 6,836 (6K) markers, 52,270 animals with 2,683 (3K) markers, and 3,229 nongenotyped dams (0K) with >90% of haplotypes imputable because they had 4 or more genotyped progeny. The Holstein HD genotypes were from 1,142 US, Canadian, British, and Italian sires, 196 other sires, 138 cows in a US Department of Agriculture research herd (Beltsville, MD), and 34 other females. Percentages of correctly imputed genotypes were tested by applying the programs findhap and FImpute to a simulated chromosome for an earlier population that had only 1,112 animals with HD genotypes and none with 8K genotypes. For each chip, 1% of the genotypes were missing and 0.02% were incorrect initially. After imputation of missing markers with findhap, percentages of genotypes correct were 99.9% from HD, 99.0% from 50K, 94.6% from 6K, 90.5% from 3K, and 93.5% from 0K. With FImpute, 99.96% were correct from HD, 99.3% from 50K, 94.7% from 6K, 91.1% from 3K, and 95.1% from 0K genotypes. Accuracy for the 3K and 6K genotypes further improved by approximately 2 percentage points if imputed first to 50K and then to HD instead of imputing all genotypes directly to HD. Evaluations were tested by using imputed actual genotypes and August 2008 phenotypes to predict deregressed evaluations of US bulls proven after August 2008. For 28 traits tested, the estimated genomic reliability averaged 61.1% when using 311,725 markers vs. 60.7% when using 45,187 markers vs. 29.6% from the traditional parent average. 
Squared correlations with future data were slightly greater for 16 traits and slightly less for 12 with HD than with 50K evaluations. The observed 0.4 percentage point average increase in reliability was less favorable than the 0.9 expected from simulation but was similar to actual gains from other HD studies. The largest HD and 50K marker effects were often located at very similar positions. The single-breed evaluation tested here and previous single-breed or multibreed evaluations have not produced large gains. Increasing the number of HD genotypes used for imputation above 1,074 did not improve the reliability of Holstein genomic evaluations.
VanRaden PM, Null DJ, Sargolzaei M, Wiggans GR, Tooker ME, Cole JB, Sonstegard TS, Connor EE, Winters M, van Kaam JB, Valentini A, Van Doormaal BJ, Faust MA, Doak GA, ...
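The "percentage of genotypes correct" reported in the abstract above can be illustrated with a minimal sketch. It assumes genotypes are coded as allele counts (0, 1, or 2) and that a call is "correct" when the imputed value equals the true value; the abstract does not spell out the exact scoring rule, and the function name and toy data are hypothetical.

```python
def percent_correct(true_genotypes, imputed_genotypes):
    """Percentage of genotype calls that match exactly.

    Assumption: genotypes are coded 0, 1, or 2 (copies of one allele),
    and both lists cover the same animal/marker positions in order.
    """
    if len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype lists must have the same length")
    if not true_genotypes:
        raise ValueError("no genotypes to compare")
    matches = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return 100.0 * matches / len(true_genotypes)


# Toy data (hypothetical): 10 simulated HD genotypes and their imputed values.
true_hd = [0, 1, 2, 1, 0, 2, 1, 1, 0, 2]
imputed_hd = [0, 1, 2, 1, 0, 2, 1, 2, 0, 2]  # one heterozygote imputed as 2

print(percent_correct(true_hd, imputed_hd))  # 9 of 10 match -> 90.0
```

In a simulation such as the one described, the true genotypes are known by construction, so the same comparison can be repeated per chip density (HD, 50K, 6K, 3K, 0K).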
《-》
Assets of imputation to ultra-high density for productive and functional traits.
The aim of this study was to evaluate different-density genotyping panels for genotype imputation and genomic prediction. Genotypes from customized Golden Gate Bovine3K BeadChip [LD3K; low-density (LD) 3,000-marker (3K); Illumina Inc., San Diego, CA] and BovineLD BeadChip [LD6K; 6,000-marker (6K); Illumina Inc.] panels were imputed to the BovineSNP50v2 BeadChip [50K; 50,000-marker; Illumina Inc.]. In addition, LD3K, LD6K, and 50K genotypes were imputed to a BovineHD BeadChip [HD; high-density 800,000-marker (800K) panel], and predictive ability was subsequently evaluated and compared. Comparisons of prediction accuracy were carried out using random boosting and genomic BLUP. Four traits under selection in the Spanish Holstein population were used: milk yield, fat percentage (FP), somatic cell count, and days open (DO). Training sets at 50K density for imputation and prediction included 1,632 genotypes. Testing sets for imputation from LD to 50K contained 834 genotypes, and testing sets for genomic evaluation included 383 bulls. The reference population genotyped at HD included 192 bulls. Imputation using BEAGLE software (http://faculty.washington.edu/browning/beagle/beagle.html) was effective for reconstruction of dense 50K and HD genotypes, even when a small reference population was used, with 98.3% of SNP correctly imputed. Random boosting outperformed genomic BLUP in terms of prediction reliability, mean squared error, and selection effectiveness of top animals in the case of FP. For other traits, however, no clear differences existed between methods. No differences were found between imputed LD and 50K genotypes, whereas evaluation of genotypes imputed to HD was, on average across data sets, methods, and traits, 4% more accurate than 50K prediction and showed a smaller (2%) mean squared error of prediction.
Similar bias in regression coefficients was found across data sets but regressions were 0.32 units closer to unity for DO when genotypes were imputed to HD density. Imputation to HD genotypes might produce higher stability in the genomic proofs of young candidates. Regarding selection effectiveness of top animals, more (2%) top bulls were classified correctly with imputed LD6K genotypes than with LD3K. When the original 50K genotypes were used, correct classification of top bulls increased by 1%, and when those genotypes were imputed to HD, 3% more top bulls were detected. Selection effectiveness could be slightly enhanced for certain traits such as FP, somatic cell count, or DO when genotypes are imputed to HD. Genetic evaluation units may consider a trait-dependent strategy in terms of method and genotype density for use in the genome-enhanced evaluations.
Jiménez-Montero JA, Gianola D, Weigel K, Alenda R, González-Recio O, ...
《-》
Use of the Illumina Bovine3K BeadChip in dairy genomic evaluation.
Genomic evaluations using genotypes from the Illumina Bovine3K BeadChip (3K) became available in September 2010 and were made official in December 2010. The majority of 3K-genotyped animals have been Holstein females. Approximately 5% of male 3K genotypes and between 3.7 and 13.9%, depending on registry status, of female genotypes had sire conflicts. The chemistry used for the 3K is different from that of the Illumina BovineSNP50 BeadChip (50K) and causes greater variability in the accuracy of the genotypes. Approximately 2% of genotypes were rejected due to this inaccuracy. A single nucleotide polymorphism (SNP) was determined to be not usable for genomic evaluation based on percentage missing, percentage of parent-progeny conflicts, and Hardy-Weinberg equilibrium discrepancies. Those edits left 2,683 of the 2,900 3K SNP for use in genomic evaluations. The mean minor allele frequencies (MAF) for Holstein, Jersey, and Brown Swiss were 0.32, 0.28, and 0.29, respectively. Eighty-one SNP had both a large number of missing genotypes and a large number of parent-progeny conflicts, suggesting a correlation between call rate and accuracy. To calculate a genomic predicted transmitting ability (GPTA), the genotype of an animal tested on a 3K is imputed to the 45,187 SNP included in the current genomic evaluation based on the 50K. The accuracy of imputation increases as the number of genotyped parents increases from none to 1 to both. The average percentage of imputed genotypes that matched the corresponding actual 50K genotypes was 96.3%. The correlation between a GPTA calculated from a 3K genotype that had been imputed to 50K and the GPTA from its actual 50K genotype averaged 0.959 across traits for Holsteins and was slightly higher for Jerseys at 0.963. The average difference in GPTA from the 50K- and 3K-based genotypes across traits was close to 0. The evaluation system has been modified to accommodate the characteristics of the 3K.
The low cost of the 3K has greatly increased genotyping of females. Prior to the availability of the 3K (August 2010), female genotyping accounted for 38.7% of the genotyped animals. In the past year, the portion of total genotypes from females across all chip types rose to 59.0%.
Wiggans GR, Cooper TA, VanRaden PM, Olson KM, Tooker ME, ...
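Two of the per-SNP statistics mentioned in the abstract above, percentage missing and minor allele frequency (MAF), can be sketched as follows. This is only an illustration under stated assumptions: genotypes are coded as counts of one allele (0, 1, or 2) with `None` for a missing call, the function name is hypothetical, and no editing thresholds are shown because the abstract does not give them.

```python
def snp_summary(genotypes):
    """Return (percent_missing, minor_allele_frequency) for one SNP.

    Assumption: each entry is 0, 1, or 2 (copies of one allele),
    or None when the genotype call is missing.
    """
    if not genotypes:
        raise ValueError("no genotypes supplied")
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("all genotypes are missing")
    percent_missing = 100.0 * (len(genotypes) - len(called)) / len(genotypes)
    # Each called animal carries 2 alleles at the locus.
    allele_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is less frequent.
    maf = min(allele_freq, 1.0 - allele_freq)
    return percent_missing, maf


# Toy data (hypothetical): 10 animals, 2 with missing calls.
calls = [0, 1, 2, 1, None, 0, 1, 1, None, 0]
percent_missing, maf = snp_summary(calls)
print(percent_missing, maf)  # 20.0 0.375
```

Averaging the per-SNP MAF over all retained markers within a breed would give a breed mean comparable in form to the values quoted for Holstein, Jersey, and Brown Swiss.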
《-》
Reliability of genomic prediction for German Holsteins using imputed genotypes from low-density chips.
With the availability of single nucleotide polymorphism (SNP) marker chips, such as the Illumina BovineSNP50 BeadChip (50K), genomic evaluation has been routinely implemented in dairy cattle breeding. However, for an average dairy producer, total costs associated with the 50K chip are still too high to have all the cows genotyped and genomically evaluated. To study the accuracy of cheaper low-density chips, genotypes were simulated for 2 low-density chips, the Illumina Bovine3K BeadChip (3K) and BovineLD BeadChip (6K), according to their original marker maps. Simulated missing genotypes of the 50K chip were imputed using the programs Beagle and Findhap. Three genotype data sets were used to study imputation accuracy: the EuroGenomics data set, with 14,405 reference bulls (data set I); the smaller EuroGenomics data set, with 11,670 older reference bulls (data set II); and the data set of all genotyped German Holsteins, with 31,597 reference animals (data set III). Imputed genotypes were compared with their original ones to calculate allele error rate for validation animals in the 3 data sets. To evaluate the loss in accuracy of genomic prediction when using imputed genotypes, a genomic evaluation was conducted only for EuroGenomics data set II. Furthermore, combined genome-enhanced breeding values calculated from the original and imputed genotypes were compared. Allele error rate for EuroGenomics data set II was highest for the Findhap program on the 3K chip (3.3%) and lowest for the Beagle program on the 6K chip (0.6%). Across the data sets, Beagle was shown to be about 2 times as accurate as Findhap. Compared with the real 50K genotypes, the reduction in reliability of the genomic prediction when using the imputed genotypes was highest for Findhap on the 3K chip (5.3%) and lowest for Beagle on the 6K chip (1%) when averaged over the 12 evaluated traits. 
Differences in genome-enhanced breeding values of the original and imputed genotypes were largest for Findhap on the 3K chip, whereas Beagle on the 6K chip had the smallest difference. The low-density chip, 6K, gave markedly higher imputation accuracy and more accurate genomic prediction than the 3K chip. On the basis of the relatively small reduction in accuracy of genomic prediction, we would recommend the BovineLD 6K chip for large-scale genotyping as long as its costs are acceptable to breeders.
Segelke D, Chen J, Liu Z, Reinhardt F, Thaller G, Reents R, ...
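The allele error rate used in the abstract above counts wrong alleles, not wrong genotypes, so a heterozygote imputed as a homozygote costs one error and an opposite homozygote costs two. The sketch below assumes one common definition (sum of absolute differences in allele counts divided by twice the number of genotypes); the abstract does not state the exact formula, and the function name and data are hypothetical.

```python
def allele_error_rate(true_genotypes, imputed_genotypes):
    """Percentage of alleles imputed incorrectly.

    Assumption: genotypes are coded 0, 1, or 2 (copies of one allele),
    so |true - imputed| is the number of wrong alleles at that position.
    """
    if len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype lists must have the same length")
    if not true_genotypes:
        raise ValueError("no genotypes to compare")
    wrong_alleles = sum(abs(t - i) for t, i in zip(true_genotypes, imputed_genotypes))
    total_alleles = 2 * len(true_genotypes)
    return 100.0 * wrong_alleles / total_alleles


# Toy data (hypothetical): original 50K genotypes vs. genotypes imputed from a low-density chip.
original = [0, 1, 2, 1]
imputed = [0, 2, 0, 1]  # one allele wrong at position 2, two alleles wrong at position 3

print(allele_error_rate(original, imputed))  # 3 wrong of 8 alleles -> 37.5
```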
《-》
-
Comparison of genomic predictions using medium-density (∼54,000) and high-density (∼777,000) single nucleotide polymorphism marker panels in Nordic Holstein and Red Dairy Cattle populations.
This study investigated genomic prediction using medium-density (∼54,000; 54K) and high-density marker panels (∼777,000; 777K), based on data from Nordic Holstein and Red Dairy Cattle (RDC). The Holstein data comprised 4,539 progeny-tested bulls, and the RDC data 4,403 progeny-tested bulls. The data were divided into reference data and test data using October 1, 2001, as a cut-off date (birth date of the bulls). This resulted in about 25% genotyped bulls in the Holstein test data and 20% in the RDC test data. For each breed, 3 data sets of markers were used to predict breeding values: (1) 54K data set with missing genotypes, (2) 54K data set where missing genotypes were imputed, and (3) imputed high-density (HD) marker data set created by imputing the 54K data to the HD data based on 557 bulls genotyped using a 777K single nucleotide polymorphism chip in Holstein, and 706 bulls in RDC. Based on the 3 marker data sets, direct genomic breeding values (DGV) for protein, fertility, and udder health were predicted using a genomic BLUP model (GBLUP) and a Bayesian mixture model with 2 normal distributions. Reliability of DGV was measured as squared correlations between deregressed proofs (DRP) and DGV corrected for reliability of DRP. Unbiasedness was assessed by regression of DRP on DGV, based on the bulls in the test data sets. Averaged over the 3 traits, reliability of DGV based on the HD markers was 0.5% higher than that based on the 54K data in Holstein, and 1.0% higher than that in RDC. In addition, the HD markers led to an improvement of unbiasedness of DGV. The Bayesian mixture model led to 0.5% higher reliability than the GBLUP model in Holstein, but not in RDC. Imputing missing genotypes in the 54K marker data did not improve genomic predictions for most of the traits.
Su G, Brøndum RF, Ma P, Guldbrandtsen B, Aamand GP, Lund MS, ...
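The two validation statistics described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes that "corrected for reliability of DRP" means dividing the squared DRP-DGV correlation by the mean DRP reliability of the test bulls (a common convention that the abstract does not confirm), and it takes unbiasedness to be the slope of the regression of DRP on DGV, where a slope near 1 indicates little bias. All names and numbers are hypothetical.

```python
def validation_stats(drp, dgv, mean_drp_reliability):
    """Return (reliability_of_dgv, regression_slope_of_drp_on_dgv).

    Assumptions: drp and dgv are paired values for the same test bulls;
    reliability = squared correlation / mean DRP reliability.
    """
    if len(drp) != len(dgv) or len(drp) < 2:
        raise ValueError("need at least two paired observations")
    if not 0.0 < mean_drp_reliability <= 1.0:
        raise ValueError("mean DRP reliability must be in (0, 1]")
    n = len(drp)
    mean_drp = sum(drp) / n
    mean_dgv = sum(dgv) / n
    # Sums of squares and cross-products around the means.
    s_xy = sum((x - mean_dgv) * (y - mean_drp) for x, y in zip(dgv, drp))
    s_xx = sum((x - mean_dgv) ** 2 for x in dgv)
    s_yy = sum((y - mean_drp) ** 2 for y in drp)
    if s_xx == 0.0 or s_yy == 0.0:
        raise ValueError("DRP and DGV must both vary")
    squared_correlation = (s_xy * s_xy) / (s_xx * s_yy)
    reliability = squared_correlation / mean_drp_reliability
    slope = s_xy / s_xx  # regression of DRP (y) on DGV (x)
    return reliability, slope


# Toy data (hypothetical) for four test bulls.
dgv_values = [1.0, 2.0, 3.0, 4.0]
drp_values = [1.5, 1.0, 3.5, 4.0]
reliability, slope = validation_stats(drp_values, dgv_values, mean_drp_reliability=0.9)
print(round(reliability, 3), round(slope, 3))
```

Running the same function on DGV from the 54K and imputed HD marker sets would give the kind of side-by-side reliability comparison the abstract reports.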
《-》