-
Accuracy of direct genomic values derived from imputed single nucleotide polymorphism genotypes in Jersey cattle.
The objective of the present study was to evaluate the predictive ability of direct genomic values for economically important dairy traits when genotypes at some single nucleotide polymorphism (SNP) loci were imputed rather than measured directly. Genotypic data consisted of 42,552 SNP genotypes for each of 1,762 Jersey sires. Phenotypic data consisted of predicted transmitting abilities (PTA) for milk yield, protein percentage, and daughter pregnancy rate from May 2006 for 1,446 sires in the training set and from April 2009 for 316 sires in the testing set. The SNP effects were estimated using the Bayesian least absolute shrinkage and selection operator (LASSO) method with data of sires in the training set, and direct genomic values (DGV) for sires in the testing set were computed by multiplying these estimates by corresponding genotype dosages for sires in the testing set. The mean correlation across traits between DGV (before progeny testing) and PTA (after progeny testing) for sires in the testing set was 70.6% when all 42,552 SNP genotypes were used. When genotypes for 93.1, 96.6, 98.3, or 99.1% of loci were masked and subsequently imputed in the testing set, mean correlations across traits between DGV and PTA were 68.5, 64.8, 54.8, or 43.5%, respectively. When genotypes were also masked and imputed for a random 50% of sires in the training set, mean correlations across traits between DGV and PTA were 65.7, 63.2, 53.9, or 49.5%, respectively. Results of this study indicate that if a suitable reference population with high-density genotypes is available, a low-density chip comprising 3,000 equally spaced SNP may provide approximately 95% of the predictive ability observed with the BovineSNP50 Beadchip (Illumina Inc., San Diego, CA) in Jersey cattle. However, if fewer than 1,500 SNP are genotyped, the accuracy of DGV may be limited by errors in the imputed genotypes of selection candidates.
Weigel KA, de Los Campos G, Vazquez AI, Rosa GJ, Gianola D, Van Tassell CP
... -
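The DGV calculation described in the abstract above (estimated SNP effects multiplied by genotype dosages, then correlated with later PTA) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the 0/1/2 dosage coding, and the toy numbers are assumptions, and in practice the effect estimates would come from a model such as the Bayesian LASSO fitted on the training set.

```python
from math import sqrt

def direct_genomic_values(dosages, snp_effects):
    """Return one DGV per animal: sum over loci of dosage * estimated SNP effect.

    dosages: list of rows, one per animal, each a list of allele dosages
             (assumed coding: 0, 1, or 2 copies; imputed values may be fractional).
    snp_effects: list of estimated effects, one per locus (hypothetical values here).
    """
    return [sum(d * b for d, b in zip(row, snp_effects)) for row in dosages]

def pearson_correlation(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Toy example: 3 animals, 3 loci (made-up numbers, not data from the study).
effects = [0.5, -1.0, 2.0]
dosages = [[0, 1, 2], [2, 0, 1], [1, 1, 1]]
dgv = direct_genomic_values(dosages, effects)
```

The predictive ability reported in the abstract corresponds to `pearson_correlation(dgv, pta)` evaluated on testing-set sires, where `pta` would hold their post-progeny-test PTA.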
-
Predictive ability of direct genomic values for lifetime net merit of Holstein sires using selected subsets of single nucleotide polymorphism markers.
The objective of the present study was to assess the predictive ability of subsets of single nucleotide polymorphism (SNP) markers for development of low-cost, low-density genotyping assays in dairy cattle. Dense SNP genotypes of 4,703 Holstein bulls were provided by the USDA Agricultural Research Service. A subset of 3,305 bulls born from 1952 to 1998 was used to fit various models (training set), and a subset of 1,398 bulls born from 1999 to 2002 was used to evaluate their predictive ability (testing set). After editing, data included genotypes for 32,518 SNP and August 2003 and April 2008 predicted transmitting abilities (PTA) for lifetime net merit (LNM$), the latter resulting from progeny testing. The Bayesian least absolute shrinkage and selection operator method was used to regress August 2003 PTA on marker covariates in the training set to arrive at estimates of marker effects and direct genomic PTA. The coefficient of determination (R²) from regressing the April 2008 progeny test PTA of bulls in the testing set on their August 2003 direct genomic PTA was 0.375. Subsets of 300, 500, 750, 1,000, 1,250, 1,500, and 2,000 SNP were created by choosing equally spaced and highly ranked SNP, with the latter based on the absolute value of their estimated effects obtained from the training set. The SNP effects were re-estimated from the training set for each subset of SNP, and the 2008 progeny test PTA of bulls in the testing set were regressed on corresponding direct genomic PTA. The R² values for subsets of 300, 500, 750, 1,000, 1,250, 1,500, and 2,000 SNP with largest effects (evenly spaced SNP) were 0.184 (0.064), 0.236 (0.111), 0.269 (0.190), 0.289 (0.179), 0.307 (0.228), 0.313 (0.268), and 0.322 (0.291), respectively.
These results indicate that a low-density assay comprising selected SNP could be a cost-effective alternative for selection decisions and that significant gains in predictive ability may be achieved by increasing the number of SNP allocated to such an assay from 300 or fewer to 1,000 or more.
Weigel KA, de los Campos G, González-Recio O, Naya H, Wu XL, Long N, Rosa GJ, Gianola D
... -
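The two subset strategies compared in the abstract above (SNP ranked by absolute estimated effect versus evenly spaced SNP) can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, SNP are assumed to be listed in map order, and "evenly spaced" is approximated by even spacing in index rather than in physical distance, which may differ from what the authors did.

```python
def top_effect_subset(effects, k):
    """Indices of the k SNP with the largest absolute estimated effects.

    effects: list of estimated SNP effects from the training set (hypothetical input).
    Returns indices sorted in map order.
    """
    if not 0 < k <= len(effects):
        raise ValueError("k must be between 1 and the number of SNP")
    ranked = sorted(range(len(effects)), key=lambda i: abs(effects[i]), reverse=True)
    return sorted(ranked[:k])

def evenly_spaced_subset(n_snp, k):
    """Indices of k SNP spread evenly across n_snp map-ordered SNP.

    Assumption: spacing is by index (the midpoint of each of k equal bins),
    not by base-pair position.
    """
    if not 0 < k <= n_snp:
        raise ValueError("k must be between 1 and n_snp")
    return [((2 * i + 1) * n_snp) // (2 * k) for i in range(k)]

# Toy example with made-up effects for 5 SNP.
toy_effects = [0.1, -0.9, 0.3, 0.05, -0.4]
largest_two = top_effect_subset(toy_effects, 2)
spaced_five = evenly_spaced_subset(10, 5)
```

After either selection, the abstract's procedure re-estimates SNP effects using only the chosen loci before computing direct genomic PTA for the testing set.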
-
Imputation of genotypes from different single nucleotide polymorphism panels in dairy cattle.
Imputation of missing genotypes is important to join data from animals genotyped on different single nucleotide polymorphism (SNP) panels. Because of the evolution of available technologies, economic reasons, or coexistence of several products from competing organizations, animals might be genotyped on different SNP chips. Combined analysis of all the data increases accuracy of genomic selection or fine-mapping precision. In the present study, real data from 4,738 Dutch Holstein animals genotyped with custom-made 60K Illumina panels (Illumina, San Diego, CA) were used to mimic imputation of genotypes between 2 SNP panels of approximately 27,500 markers each and with 9,265 SNP markers in common. Imputation efficiency increased with number of reference animals (genotyped for both chips), when animals genotyped on a single chip were included in the training data, with regional higher marker densities, with greater distance to chromosome ends, and with a closer relationship between imputed and reference animals. With 0 to 2,000 animals genotyped for both chips, the mean imputation error rate ranged from 2.774 to 0.415% and accuracy ranged from 0.81 to 0.96. Then, imputation was applied in the Dutch Holstein population to predict alleles from markers of the Illumina Bovine SNP50 chip with markers from a custom-made 60K Illumina panel. A cross-validation study performed on 102 bulls indicated that the mean error rate per bull was approximately equal to 1.0%. This study showed the feasibility of imputing markers in dairy cattle with the current marker panels and with error rates below 1%.
Druet T, Schrooten C, de Roos AP
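An imputation error rate of the kind reported in the abstract above can be illustrated by comparing imputed calls against known genotypes that were masked beforehand. The sketch below is an assumption-laden example: the function name, the 0/1/2 genotype coding, and the choice to count errors per genotype call (rather than per allele, which the study may have used) are all illustrative.

```python
def imputation_error_rate(true_genotypes, imputed_genotypes):
    """Fraction of imputed genotype calls that differ from the true calls.

    Both arguments are lists of rows (one per animal) covering the masked loci only.
    A true genotype of None (uncalled) is skipped, since it cannot be checked.
    """
    compared = 0
    errors = 0
    for true_row, imputed_row in zip(true_genotypes, imputed_genotypes):
        for true_call, imputed_call in zip(true_row, imputed_row):
            if true_call is None:
                continue
            compared += 1
            if true_call != imputed_call:
                errors += 1
    if compared == 0:
        raise ValueError("no comparable genotype calls")
    return errors / compared

# Toy example: 2 animals, 4 masked loci, one wrong call out of 7 checkable calls.
truth = [[0, 1, 2, 1], [2, 2, 0, None]]
imputed = [[0, 1, 1, 1], [2, 2, 0, 1]]
rate = imputation_error_rate(truth, imputed)
```

Applying the same function to one animal at a time would give a per-animal error rate comparable in spirit to the per-bull figure in the cross-validation.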
-
Effect of imputing markers from a low-density chip on the reliability of genomic breeding values in Holstein populations.
The purpose of this study was to investigate the imputation error and loss of reliability of direct genomic values (DGV) or genomically enhanced breeding values (GEBV) when using genotypes imputed from a 3,000-marker single nucleotide polymorphism (SNP) panel to a 50,000-marker SNP panel. Data consisted of genotypes of 15,966 European Holstein bulls from the combined EuroGenomics reference population. Genotypes with the low-density chip were created by erasing markers from 50,000-marker data. The studies were performed in the Nordic countries (Denmark, Finland, and Sweden) using a BLUP model for prediction of DGV and in France using a genomic marker-assisted selection approach for prediction of GEBV. Imputation in both studies was done using a combination of the DAGPHASE 1.1 and Beagle 2.1.3 software. Traits considered were protein yield, fertility, somatic cell count, and udder depth. Imputation of missing markers and prediction of breeding values were performed using 2 different reference populations in each country: either a national reference population or a combined EuroGenomics reference population. Validation for accuracy of imputation and genomic prediction was done based on national test data. Mean imputation error rates when using national reference animals were 5.5 and 3.9% in the Nordic countries and France, respectively, whereas imputation based on the EuroGenomics reference data set gave mean error rates of 4.0 and 2.1%, respectively. Prediction of GEBV based on genotypes imputed with a national reference data set gave an absolute loss of 0.05 in mean reliability of GEBV in the French study, whereas a loss of 0.03 was obtained for reliability of DGV in the Nordic study. When genotypes were imputed using the EuroGenomics reference, a loss of 0.02 in mean reliability of GEBV was detected in the French study, and a loss of 0.06 was observed for the mean reliability of DGV in the Nordic study.
Consequently, the reliability of DGV using the imputed SNP data was 0.38 based on national reference data, and 0.48 based on EuroGenomics reference data in the Nordic validation, and the reliability of GEBV using the imputed SNP data was 0.41 based on national reference data, and 0.44 based on EuroGenomics reference data in the French validation.
Dassonneville R, Brøndum RF, Druet T, Fritz S, Guillaume F, Guldbrandtsen B, Lund MS, Ducrocq V, Su G
... -
-
Selection of single-nucleotide polymorphisms and quality of genotypes used in genomic evaluation of dairy cattle in the United States and Canada.
Nearly 57,000 single-nucleotide polymorphisms (SNP) genotyped with the Illumina BovineSNP50 BeadChip (Illumina Inc., San Diego, CA) were investigated to determine usefulness of the associated SNP for genomic prediction. Genotypes were obtained for 12,591 bulls and cows, and SNP were selected based on 5,503 bulls with genotypes from a larger set of SNP. The following SNP were deleted: 6,572 that were monomorphic, 3,213 with scoring problems (primarily because of poor definition of clusters and excess number of clusters), and 3,649 with a minor allele frequency of <2%. Number of SNP for each minor allele frequency class (≥2%) was fairly uniform (777 to 1,004). For 5 contiguous SNP assigned to chromosome 7, no bulls were heterozygous, which indicated that those SNP are actually on the nonpseudoautosomal portion of the X chromosome. Another 178 SNP that were not assigned to a chromosome but that had many fewer heterozygotes than expected were also assigned to the X chromosome. Existence of Hardy-Weinberg equilibrium was investigated by comparing observed with expected heterozygosity. For 11 SNP, the observed percentage of heterozygous individuals differed from the expected by >15%; therefore, those SNP were deleted. For 2,628 SNP, the genotype at another SNP was highly correlated (i.e., genotypes were identical for >99.5% of bulls), and those were deleted. After edits, 40,874 SNP remained. A parent-progeny conflict was declared when the genotypes were alternate homozygotes. Mean number of conflicts was 2.3 when pedigree was correct and 2,411 when it was incorrect. The sire was genotyped for >93% of animals. Maternal grandsire genotype was similarly checked; however, because alternate homozygotes could be valid, a conflict threshold of 16% was used to indicate a need for further investigation. Genotyping consistency was investigated for 21 bulls genotyped twice with differences primarily from SNP that were not scored in one of the genotypes.
Concordance for readable SNP was extremely high (99.96-100%). Thousands of SNP that were polymorphic in Holsteins were monomorphic in Jerseys or Brown Swiss, which indicated that breed-specific SNP sets are required or that all breeds need to be considered in the SNP selection process. Genotypes from the Illumina BovineSNP50 BeadChip are of high accuracy and provide the basis for genomic evaluations in the United States and Canada.
Wiggans GR, Sonstegard TS, VanRaden PM, Matukumalli LK, Schnabel RD, Taylor JF, Schenkel FS, Van Tassell CP
... -
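Two of the genotype edits described in the abstract above, the minor allele frequency cutoff of 2% and the parent-progeny conflict count based on alternate homozygotes, can be sketched as follows. The function names and the genotype coding (0, 1, 2 copies of one allele, None for an unscored SNP) are assumptions made for illustration and are not taken from the study's software.

```python
def minor_allele_frequency(genotypes):
    """Minor allele frequency at one SNP from 0/1/2 genotype codes.

    Unscored genotypes (None) are ignored; returns 0.0 if nothing was scored.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    allele_freq = sum(called) / (2 * len(called))
    return min(allele_freq, 1 - allele_freq)

def passes_maf_edit(genotypes, threshold=0.02):
    """True if the SNP would be kept under a MAF edit (2% cutoff, as in the abstract)."""
    return minor_allele_frequency(genotypes) >= threshold

def parent_progeny_conflicts(parent, progeny):
    """Count SNP where parent and progeny are alternate homozygotes (0 vs 2).

    SNP unscored in either animal are skipped.
    """
    return sum(
        1
        for p, o in zip(parent, progeny)
        if p is not None and o is not None and {p, o} == {0, 2}
    )

# Toy example with made-up genotypes.
maf = minor_allele_frequency([0, 0, 1, 2, None])
conflicts = parent_progeny_conflicts([0, 2, 1, 0, 2, None], [2, 0, 2, 0, 2, 0])
```

A low conflict count is consistent with a correct pedigree and a very high count with an incorrect one, in line with the means of 2.3 and 2,411 reported in the abstract; the decision threshold itself is not given there and is left out of this sketch.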