-
Single-step genomic model improved reliability and reduced the bias of genomic predictions in Danish Jersey.
A bias in the trend of genomic estimated breeding values (GEBV) was observed in the Danish Jersey population, where the trend of GEBV was smaller than that of the deregressed proofs for individuals in the validation population. This study attempted to improve the prediction reliability and reduce the bias of predicted genetic trend in Danish Jersey. The data consisted of 1,238 Danish Jersey bulls and 611,695 cows. All bulls were genotyped with the 54K chip, and 1,744 cows were genotyped with either 7K chips (1,157 individuals) or 54K chips (587 individuals). The trait used in the analysis was protein yield. All cows with EBV were used in a single-step approach. Deregressed proofs were used as the response variable. Four alternative approaches were compared with a genomic best linear unbiased prediction (GBLUP) model with bulls in the reference data (GBLUPBull): (1) GBLUP with both bulls and genotyped cows in the reference data; (2) GBLUP including a year of birth effect; (3) GEBV from a GBLUP model that accounted for the difference of EBV between dams and maternal grandsires; and (4) a single-step approach. The results indicated that all 4 alternatives could reduce the bias of predicted genetic trend and that the single-step approach performed best. However, not all these approaches improved reliability or reduced inflation of GEBV. The reliability was 0.30 and the regression coefficient of deregressed proofs on GEBV was 0.69 in the scenario GBLUPBull. When genotyped cows were included in the reference population, the regression coefficient decreased to 0.59 but the reliability increased to 0.35. If a year effect was included in the model, the prediction reliability decreased to 0.29 and the regression coefficient improved to 0.75. The method in which GEBV were adjusted for the difference between dam EBV and maternal grandsire EBV led to a much lower regression coefficient, though the reliability increased to 0.4.
The single-step approach improved both the reliability, to 0.38, and the regression coefficient, to 0.78. Therefore, the bias in genetic trend was reduced. The results suggest that implementing the single-step approach is an effective way to improve genomic prediction in Danish Jersey cattle.
Ma P, Lund MS, Nielsen US, Aamand GP, Su G
... -
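The two validation statistics reported in the abstract above (the regression coefficient of deregressed proofs on GEBV, and the prediction reliability) can be illustrated with a minimal sketch. The function name and the toy numbers are hypothetical, and the reliability definition used here (squared correlation between GEBV and DRP, divided by the mean reliability of the DRP) is a common convention that is assumed rather than taken from the abstract.

```python
def validation_metrics(drp, gebv, mean_drp_reliability=1.0):
    """Return (slope, reliability) for a set of validation animals.

    slope: regression coefficient of DRP on GEBV; values below 1
           indicate inflated GEBV.
    reliability: squared correlation between GEBV and DRP divided by
           the mean reliability of the DRP (assumed definition).
    """
    n = len(drp)
    if n != len(gebv) or n < 2:
        raise ValueError("drp and gebv must have the same length (>= 2)")
    mean_d = sum(drp) / n
    mean_g = sum(gebv) / n
    sxy = sum((g - mean_g) * (d - mean_d) for g, d in zip(gebv, drp))
    sxx = sum((g - mean_g) ** 2 for g in gebv)
    syy = sum((d - mean_d) ** 2 for d in drp)
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, r_squared / mean_drp_reliability


# Toy data (hypothetical): DRP are exactly half the GEBV, so the slope
# is 0.5 (strong inflation) while the correlation is perfect.
toy_gebv = [1.0, 2.0, 3.0, 4.0]
toy_drp = [0.5 * g for g in toy_gebv]
toy_slope, toy_reliability = validation_metrics(toy_drp, toy_gebv)
```

The toy case shows why the abstract reports both numbers: the correlation-based measure can be high even when the slope shows the predictions are badly scaled.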
-
Including different groups of genotyped females for genomic prediction in a Nordic Jersey population.
Including genotyped females in a reference population (RP) is an obvious way to increase the RP in genomic selection, especially for dairy breeds of limited population size. However, the incorporation of these females must be conducted cautiously because of the potential preferential treatment of the genotyped cows and lower reliabilities of phenotypes compared with the proven pseudo-phenotypes of bulls. Breeding organizations in Denmark, Finland, and Sweden have implemented a female-genotyping project with the possibility of genotyping entire herds using the low-density (LD) chip. In the present study, 5 scenarios for building an RP were investigated in the Nordic Jersey population: (1) bulls only, (2) bulls with females from the LD project, (3) bulls with females from the LD project plus non-LD project females genotyped before their first calving, (4) bulls with females from the LD project plus non-LD project females genotyped after their first calving, and (5) bulls with all genotyped females. The genomically enhanced breeding value (GEBV) was predicted for 8 traits in the Nordic total merit index through a genomic BLUP model using deregressed proof (DRP) as the response variable in all scenarios. In addition, (daughter) yield deviation and raw phenotypic data were studied as response variables for comparison with the DRP, using stature as a model trait. The validation population was formed using a cut-off birth year of 2005 based on the genotyped Nordic Jersey bulls with DRP. The average increment in reliability of the GEBV across the 8 traits investigated was 1.9 to 4.5 percentage points compared with using only bulls in the RP (scenario 1). The addition of all the genotyped females to the RP resulted in the highest gain in reliability (scenario 5), followed by scenario 3, scenario 2, and scenario 4. All scenarios led to inflated GEBV, as indicated by regression coefficients below 1.
However, scenario 2 and scenario 3 led to less bias of genomic predictions than scenario 5, with regression coefficients showing less deviation from scenario 1. For the study on stature, the (daughter) yield deviation performed slightly better than the DRP as the response variable in the genomic BLUP (GBLUP) model. Therefore, adding unselected females to the RP could significantly improve the reliabilities and tended to reduce the prediction bias compared with adding selectively genotyped females. Although the DRP has performed robustly so far, the use of raw data is recommended with a single-step model as an optimal solution for future genomic evaluations.
Gao H, Madsen P, Nielsen US, Aamand GP, Su G, Byskov K, Jensen J
... -
-
Sharing reference data and including cows in the reference population improve genomic predictions in Danish Jersey.
Small reference populations limit the accuracy of genomic prediction in numerically small breeds, such as Danish Jersey. The objective of this study was to investigate two approaches to improve genomic prediction by increasing the size of the reference population in Danish Jersey. The first approach was to include North American Jersey bulls in the Danish Jersey reference population. The second was to genotype cows and use them as reference animals. The validation of genomic prediction was carried out on bulls and cows, respectively. In the validation on bulls, about 300 Danish bulls (depending on trait) born in 2005 and later were used as validation data, and the reference populations were: (1) about 1050 Danish bulls, (2) about 1050 Danish bulls and about 1150 US bulls. In the validation on cows, about 3000 Danish cows from 87 young half-sib families were used as validation data, and the reference populations were: (1) about 1250 Danish bulls, (2) about 1250 Danish bulls and about 1150 US bulls, (3) about 1250 Danish bulls and about 4800 cows, (4) about 1250 Danish bulls, 1150 US bulls and 4800 Danish cows. A genomic best linear unbiased prediction model was used to predict breeding values. De-regressed proofs were used as response variables. In the validation on bulls for eight traits, the joint DK-US bull reference population led to higher reliability of genomic prediction than the DK bull reference population for six traits, but not for fertility and longevity. Averaged over the eight traits, the gain was 3 percentage points. In the validation on cows for six traits (fertility and longevity were not available), the gain from inclusion of US bulls in the reference population was 6.6 percentage points on average over the six traits, and the gain from inclusion of cows was 8.2 percentage points. However, the gains from cows and US bulls were not fully additive: the total gain from including both US bulls and Danish cows was 10.5 percentage points.
The results indicate that sharing reference data and including cows in the reference population are efficient approaches to increase the reliability of genomic prediction. Therefore, genomic selection is promising for numerically small populations.
Su G, Ma P, Nielsen US, Aamand GP, Wiggans G, Guldbrandtsen B, Lund MS
... -
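As a small worked check of the gains reported for the validation on cows in the abstract above, the following sketch compares the sum of the two separate gains with the gain from the combined reference population. Only the three reported figures come from the abstract; the variable names are illustrative.

```python
# Reported gains in reliability (percentage points), validation on cows
gain_us_bulls = 6.6   # adding US bulls to the Danish bull reference
gain_cows = 8.2       # adding Danish cows to the Danish bull reference
gain_combined = 10.5  # adding both US bulls and Danish cows

# If the two sources of information were fully additive, the combined
# gain would equal the sum of the separate gains.
sum_of_separate = gain_us_bulls + gain_cows
shortfall = sum_of_separate - gain_combined

print(f"sum of separate gains: {sum_of_separate:.1f} percentage points")
print(f"combined gain:         {gain_combined:.1f} percentage points")
print(f"shortfall:             {shortfall:.1f} percentage points")
```

The combined gain falls short of the sum of the separate gains, which is what the abstract means when it says the gains do not accumulate.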
-
Use of a Bayesian model including QTL markers increases prediction reliability when test animals are distant from the reference population.
Relatedness between reference and test animals has an important effect on the reliability of genomic prediction for test animals. Because genomic prediction has been widely applied in practical cattle breeding and bulls have been selected according to genomic breeding value without progeny testing, the sires or grandsires of candidates might not have phenotypic information and might not be in the reference population when the candidates are selected. The objective of this study was to investigate the decreasing trend of the reliability of genomic prediction given distant reference populations, using genomic best linear unbiased prediction (GBLUP) and Bayesian variable selection models with or without including the quantitative trait locus (QTL) markers detected from sequencing data. The data used in this study consisted of 22,242 bulls genotyped using the 54K SNP array from EuroGenomics. Among them, 1,444 Danish bulls born from 2006 to 2010 were selected as test animals. Different reference populations with varying relationships to test animals were created according to pedigree-based relationships. The reference individuals having a relationship with one or more test animals higher than 0.4 (scenario ρ < 0.4), 0.2 (ρ < 0.2), or 0.1 (ρ < 0.1, where ρ = relationship coefficient) were removed from reference sets; these scenarios represented a distance between reference and test animals of 2 generations, 3 generations, and 4 generations, respectively. Imputed whole-genome sequencing data of bulls from Denmark were used to conduct a genome-wide association study (GWAS). A small number of significant variants (QTL markers) from the GWAS were added to the array data. To compare the effects of different models, the basic GBLUP model, a Bayesian variable selection model, a GBLUP model with 2 components of genetic effects, and a Bayesian model with pooled array data and QTL markers were used for estimating genomic estimated breeding values (GEBV) of test animals.
The reliability of genomic prediction decreased when the test animals were more generations away from the reference population. The reliability of genomic prediction was 0.461 for 1 generation away and 0.396 for 3 generations away, with the same number of individuals in the reference set, using a GBLUP model with chip markers only. The results showed that using the Bayesian method and QTL markers improved the reliability of genomic prediction in all scenarios of relationship between test and reference animals, with gains ranging from 1.3% to 65.1% (the latter for 4 generations away with only 841 individuals in the reference set). However, most gains were for predictions of milk yield and fat yield. There was little improvement for predictions of protein yield and mastitis, and no improvement for prediction of fertility, except for scenario ρ < 0.1, in which there was a large improvement for predictions of all traits. On the other hand, models including more than 10% polygenic effect decreased prediction reliability when the relationship between test and reference animals was distant.
Ma P, Lund MS, Aamand GP, Su G
... -
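The rule used in the abstract above to build distant reference populations (remove any reference individual whose relationship with one or more test animals exceeds the threshold) can be sketched as follows. This is an illustration only: the function name, the matrix layout (reference animals in rows, test animals in columns), and the toy relationship values are assumptions, not details from the study.

```python
import numpy as np


def keep_distant_reference(rel_ref_by_test, threshold):
    """Return a boolean mask over reference animals.

    rel_ref_by_test: array of shape (n_reference, n_test) holding
        pedigree-based relationship coefficients (assumed layout).
    A reference animal is kept only if its relationship with every
    test animal is not higher than `threshold`.
    """
    rel = np.asarray(rel_ref_by_test, dtype=float)
    max_rel = rel.max(axis=1)  # closest test relative per reference animal
    return max_rel <= threshold


# Toy relationships (hypothetical): 3 reference animals, 2 test animals
toy_rel = np.array([
    [0.50, 0.00],    # e.g. a sire of a test animal
    [0.25, 0.10],    # e.g. a grandsire
    [0.05, 0.0625],  # a distant relative
])

masks = {rho: keep_distant_reference(toy_rel, rho) for rho in (0.4, 0.2, 0.1)}
```

Lowering the threshold removes progressively more of the close relatives, which is how the three scenarios in the abstract end up with smaller and more distant reference sets.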
-
Comparison of genomic predictions using genomic relationship matrices built with different weighting factors to account for locus-specific variances.
Various models have been used for genomic prediction. Bayesian variable selection models often predict more accurate genomic breeding values than genomic BLUP (GBLUP), but GBLUP is generally preferred for routine genomic evaluations because of its low computational demand. The objective of this study was to achieve the benefits of both models by using results from Bayesian models and genome-wide association studies as weights on single nucleotide polymorphism (SNP) markers when constructing the genomic relationship matrix (G-matrix) for genomic prediction. The data comprised 5,221 progeny-tested bulls from the Nordic Holstein population. The animals were genotyped using the Illumina Bovine SNP50 BeadChip (Illumina Inc., San Diego, CA). Five weighting factors were investigated. Three were obtained from the analysis using a Bayesian mixture model with 4 normal distributions: the posterior SNP variance, the square of the posterior SNP effect, and the corresponding negative base-10 logarithm of the marker association P-value [-log10(P)] of a t-test. Two were obtained from the analysis using a classical genome-wide association study model (linear regression model): the square of the estimated SNP effect and the corresponding -log10(P) of a t-test. The weights were derived from the analysis based on data sets that were 0, 1, 3, or 5 yr before performing genomic prediction. In building a G-matrix, the weights were assigned either to each marker (single-marker weighting) or to each group of approximately 5 to 150 markers (group-marker weighting). The analysis was carried out for milk yield, fat yield, protein yield, fertility, and mastitis. Deregressed proofs (DRP) were used as response variables to predict genomic estimated breeding values (GEBV). Averaging over the 5 traits, the Bayesian model led to 2.0% higher reliability of GEBV than the GBLUP model with an original unweighted G-matrix.
The superiority of GBLUP with a weighted G-matrix over GBLUP with an original unweighted G-matrix was largest when using the posterior variance as the weighting factor, resulting in 1.7 percentage points higher reliability. The second best weighting factors were the -log10(P) of a t-test corresponding to the square of the posterior SNP effect from the Bayesian model and the -log10(P) of a t-test corresponding to the square of the estimated SNP effect from the linear regression model, followed by the square of the estimated SNP effect and the square of the posterior SNP effect. In addition, group-marker weighting performed better than single-marker weighting in terms of reducing bias of GEBV, and also slightly increased prediction reliability. The differences between weighting factors and scenarios were larger in prediction bias than in prediction accuracy. Finally, weights derived from a data set having a lag of up to 3 yr did not reduce reliability of GEBV. The results indicate that posterior SNP variance estimated from a Bayesian mixture model is a good alternative weighting factor, and that common weights for groups of 30 markers are a good strategy when using markers of the 50,000-marker (50K) chip. In a population with gradually increasing reference data, the weights can be updated once every 3 yr.
Su G, Christensen OF, Janss L, Lund MS
... -
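The idea of a weighted G-matrix with group-marker weighting, as described in the abstract above, can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes genotypes coded 0/1/2, a VanRaden-style construction G = Z D Z' / (2 Σ p(1 − p)), weights rescaled to a mean of 1, and a common weight per group taken as the mean of the single-marker weights in consecutive blocks. All function names and toy values are hypothetical.

```python
import numpy as np


def group_weights(weights, group_size=30):
    """Replace each marker weight by the mean weight of its group of
    consecutive markers (assumed form of group-marker weighting)."""
    w = np.asarray(weights, dtype=float)
    out = np.empty_like(w)
    for start in range(0, w.size, group_size):
        out[start:start + group_size] = w[start:start + group_size].mean()
    return out


def weighted_g_matrix(genotypes, weights):
    """Build a weighted genomic relationship matrix.

    genotypes: (n_animals, n_markers) array coded 0/1/2.
    weights:   per-marker weights, e.g. posterior SNP variances.
    With equal weights this reduces to the unweighted G-matrix.
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0            # allele frequencies
    z = m - 2.0 * p                     # centred genotypes
    d = np.asarray(weights, dtype=float)
    d = d * d.size / d.sum()            # rescale weights to mean 1 (assumption)
    denom = 2.0 * np.sum(p * (1.0 - p))
    return (z * d) @ z.T / denom


# Toy example (hypothetical data): 3 animals, 4 markers, groups of 2
toy_genotypes = np.array([[0, 1, 2, 1],
                          [1, 1, 0, 2],
                          [2, 0, 1, 0]])
toy_weights = np.array([1.0, 3.0, 5.0, 7.0])
toy_grouped = group_weights(toy_weights, group_size=2)
toy_g = weighted_g_matrix(toy_genotypes, toy_grouped)
```

Rescaling the weights to a mean of 1 keeps the weighted matrix on the same scale as the unweighted one, so the two can be compared directly; other scalings are possible.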