Imputation for sequencing variants preselected to a customized low-density chip

Open archive

Liu, Aoxing | Lund, Mogens, Sandø | Boichard, Didier | Mao, Xiaowei | Karaman, Emre | Fritz, Sebastien | Aamand, Gert, Pedersen | Wang, Yachun | Su, Guosheng

Published by CCSD; Nature Publishing Group

International audience. Sequencing variants preselected from association analyses and bioinformatics analyses could improve genomic prediction. In this study, the imputation of sequencing SNPs preselected from major dairy breeds in Denmark-Finland-Sweden (DFS) and France (FRA) was investigated for both contemporary animals and old bulls in Danish Jersey. For contemporary animals, a two-step imputation, first to 54 K and then to 54 K + DFS + FRA SNPs, achieved the highest accuracy. Correlations between observed and imputed genotypes were 91.6% for DFS SNPs and 87.6% for FRA SNPs, while concordance rates were 96.6% for DFS SNPs and 93.5% for FRA SNPs. SNPs with lower minor allele frequency (MAF) tended to have lower correlations but higher concordance rates. For old bulls, imputation of DFS and FRA SNPs was relatively accurate even for bulls without progeny (correlations higher than 97.2% and concordance rates higher than 98.4%). For contemporary animals, given the limited imputation accuracy of preselected sequencing SNPs, especially those with low MAF, directly genotyping the preselected sequencing SNPs with a customized SNP chip would be a good strategy. For old bulls, given the high imputation accuracy of preselected sequencing SNPs across all MAF ranges, re-genotyping the preselected sequencing SNPs would be unnecessary.

In dairy cattle, with the availability of whole-genome sequencing (WGS) data (~27 million variants), a large number of causative loci or single nucleotide polymorphisms (SNPs) tightly linked to causative loci have been detected through association analyses 1,2 and bioinformatics analyses 3,4. Integrating these preselected sequencing SNPs into the genotype data of the standard SNP chip is expected to improve genomic prediction, as has been well documented in various studies 5-7.
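The two accuracy measures reported above, the correlation between observed and imputed genotypes and the concordance rate, can be illustrated with a small sketch. It is not the study's own code. It assumes genotypes are coded as allele counts (0, 1, 2), that concordance is the fraction of genotypes imputed exactly, and that both measures are computed per SNP across animals; the toy genotype vectors and all function names are hypothetical.

```python
from math import sqrt


def concordance_rate(observed, imputed):
    """Fraction of genotypes (coded 0/1/2) that were imputed exactly right."""
    matches = sum(1 for o, i in zip(observed, imputed) if o == i)
    return matches / len(observed)


def genotype_correlation(observed, imputed):
    """Pearson correlation between observed and imputed genotypes."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_i = sum(imputed) / n
    cov = sum((o - mean_o) * (i - mean_i) for o, i in zip(observed, imputed))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_i = sum((i - mean_i) ** 2 for i in imputed)
    if var_o == 0 or var_i == 0:
        return float("nan")  # undefined when either vector has no variation
    return cov / sqrt(var_o * var_i)


def minor_allele_frequency(genotypes):
    """MAF from allele counts: each animal carries two alleles."""
    freq = sum(genotypes) / (2 * len(genotypes))
    return min(freq, 1 - freq)


# Hypothetical low-MAF SNP: 20 animals, 2 heterozygotes, one of them missed.
rare_observed = [0] * 18 + [1, 1]
rare_imputed = [0] * 18 + [1, 0]

# Hypothetical common SNP: 20 animals, two homozygotes imputed as heterozygotes.
common_observed = [0] * 5 + [1] * 10 + [2] * 5
common_imputed = [0] * 4 + [1] * 12 + [2] * 4

for name, obs, imp in [
    ("rare", rare_observed, rare_imputed),
    ("common", common_observed, common_imputed),
]:
    print(
        name,
        "MAF=%.3f" % minor_allele_frequency(obs),
        "concordance=%.3f" % concordance_rate(obs, imp),
        "correlation=%.3f" % genotype_correlation(obs, imp),
    )
```

In this toy data the rare SNP has the higher concordance rate (one error in 20 genotypes against two) but the lower correlation, because a single missed heterozygote removes a large share of the little genotype variance a low-MAF SNP has. This is consistent with the pattern described for low-MAF SNPs, although the numbers themselves are purely illustrative.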
On the one hand, compared with using the standard SNP chip, integrating preselected sequencing SNPs could better capture the information of causative loci instead of relying on extensive linkage disequilibrium. On the other hand, compared with using (imputed) WGS SNPs, integrating only preselected sequencing SNPs could reduce the computational burden and avoid the noise that comes from including a large number of non-causative loci 8. In addition, using sequencing SNPs preselected from bioinformatics analyses could benefit association studies. For example, in three French dairy breeds (Holstein, Montbéliarde, and Normande), a set of sequencing SNPs preselected from functional annotations was confirmed to have significant associations with milk production, fertility, and embryo mortality in both within-breed association analyses and across-breed meta-analyses 9. For both genomic prediction and association studies, a sufficiently large population genotyped for the preselected sequencing SNPs is essential to benefit from them. Although the costs of WGS keep decreasing, obtaining genotypes of preselected sequencing SNPs by directly sequencing a large number of animals remains economically infeasible. An alternative is to use a customized SNP chip, which can include SNPs defined by the customer. Under the EuroGenomics project, a customized low-density chip 10 was designed to combine SNPs of the standard low-density chip 11 together with thousands
