School of Agriculture, Food and Ecosystem Sciences - Research Publications

  • Item
    Sharing of either phenotypes or genetic variants can increase the accuracy of genomic prediction of feed efficiency
    Bolormaa, S ; MacLeod, IM ; Khansefid, M ; Marett, LC ; Wales, WJ ; Miglior, F ; Baes, CF ; Schenkel, FS ; Connor, EE ; Manzanilla-Pech, CI ; Stothard, P ; Herman, E ; Nieuwhof, GJ ; Goddard, ME ; Pryce, JE (BMC, 2022-09-06)
    BACKGROUND: Sharing individual phenotype and genotype data between countries is complex and fraught with potential errors, while sharing summary statistics of genome-wide association studies (GWAS) is relatively straightforward, and thus would be especially useful for traits that are expensive or difficult-to-measure, such as feed efficiency. Here we examined: (1) the sharing of individual cow data from international partners; and (2) the use of sequence variants selected from GWAS of international cow data to evaluate the accuracy of genomic estimated breeding values (GEBV) for residual feed intake (RFI) in Australian cows. RESULTS: GEBV for RFI were estimated using genomic best linear unbiased prediction (GBLUP) with 50k or high-density single nucleotide polymorphisms (SNPs), from a training population of 3797 individuals in univariate to trivariate analyses where the three traits were RFI phenotypes calculated using 584 Australian lactating cows (AUSc), 824 growing heifers (AUSh), and 2526 international lactating cows (OVE). Accuracies of GEBV in AUSc were evaluated by either cohort-by-birth-year or fourfold random cross-validations. GEBV of AUSc were also predicted using only the AUS training population with a weighted genomic relationship matrix constructed with SNPs from the 50k array and sequence variants selected from a meta-GWAS that included only international datasets. The genomic heritabilities estimated using the AUSc, OVE and AUSh datasets were moderate, ranging from 0.20 to 0.36. The genetic correlations (rg) of traits between heifers and cows ranged from 0.30 to 0.95 but were associated with large standard errors. The mean accuracies of GEBV in Australian cows were up to 0.32 and almost doubled when either overseas cows, or both overseas cows and AUS heifers were included in the training population. They also increased when selected sequence variants were combined with 50k SNPs, but with a smaller relative increase. 
    CONCLUSIONS: The accuracy of RFI GEBV increased when international data were used or when selected sequence variants were combined with 50k SNP array data. This suggests that if direct sharing of data is not feasible, a meta-analysis of summary GWAS statistics could provide selected SNPs for custom panels to use in genomic selection programs. However, since this finding is based on a small cross-validation study, confirmation through a larger study is recommended.
  • Item
    A conditional multi-trait sequence GWAS discovers pleiotropic candidate genes and variants for sheep wool, skin wrinkle and breech cover traits
    Bolormaa, S ; Swan, AA ; Stothard, P ; Khansefid, M ; Moghaddar, N ; Duijvesteijn, N ; van der Werf, JHJ ; Daetwyler, HD ; MacLeod, IM (BMC, 2021-07-08)
    BACKGROUND: Imputation to whole-genome sequence is now possible in large sheep populations. It is therefore of interest to use this data in genome-wide association studies (GWAS) to investigate putative causal variants and genes that underpin economically important traits. Merino wool is globally sought after for luxury fabrics, but some key wool quality attributes are unfavourably correlated with the characteristic skin wrinkle of Merinos. In turn, skin wrinkle is strongly linked to susceptibility to "fly strike" (Cutaneous myiasis), which is a major welfare issue. Here, we use whole-genome sequence data in a multi-trait GWAS to identify pleiotropic putative causal variants and genes associated with changes in key wool traits and skin wrinkle. RESULTS: A stepwise conditional multi-trait GWAS (CM-GWAS) identified putative causal variants and related genes from 178 independent quantitative trait loci (QTL) of 16 wool and skin wrinkle traits, measured on up to 7218 Merino sheep with 31 million imputed whole-genome sequence (WGS) genotypes. Novel candidate gene findings included the MAT1A gene that encodes an enzyme involved in the sulphur metabolism pathway critical to production of wool proteins, and the ESRP1 gene. We also discovered a significant wrinkle variant upstream of the HAS2 gene, which in dogs is associated with the exaggerated skin folds in the Shar-Pei breed. CONCLUSIONS: The wool and skin wrinkle traits studied here appear to be highly polygenic with many putative candidate variants showing considerable pleiotropy. Our CM-GWAS identified many highly plausible candidate genes for wool traits as well as breech wrinkle and breech area wool cover.
  • Item
    Mutant alleles differentially shape fitness and other complex traits in cattle
    Xiang, R ; Breen, EJ ; Bolormaa, S ; Vander Jagt, CJ ; Chamberlain, AJ ; Macleod, IM ; Goddard, ME (NATURE PORTFOLIO, 2021-12-02)
    Mutant alleles (MAs) that have been classically recognised have large effects on phenotype and tend to be deleterious to traits and fitness. Is this the case for mutations with small effects? We infer MAs for 8 million sequence variants in 113k cattle and quantify the effects of MAs on 37 complex traits. Heterozygosity for variants at genomic sites conserved across 100 vertebrate species increases fertility, stature, and milk production, positively associating these traits with fitness. MAs decrease stature and fat and protein concentration in milk, but increase gestation length and somatic cell count in milk (the latter indicative of mastitis). However, the frequency of MAs decreasing stature and fat and protein concentration, and increasing gestation length and somatic cell count, was lower than the frequency of MAs with the opposite effect. These results suggest bias in the mutations' direction of effect (e.g. towards reduced protein in milk), but selection operating to reduce the frequency of these MAs. Taken together, our results imply two classes of genomic sites subject to long-term selection: sites conserved across vertebrates show hybrid vigour while sites subject to less long-term selection show a bias in mutation towards undesirable alleles.
  • Item
    New loci and neuronal pathways for resilience to heat stress in cattle
    Cheruiyot, EK ; Haile-Mariam, M ; Cocks, BG ; MacLeod, IM ; Xiang, R ; Pryce, JE (NATURE PORTFOLIO, 2021-08-17)
    While understanding the genetic basis of heat tolerance is crucial in the context of global warming's effect on humans, livestock, and wildlife, the specific genetic variants and biological features that confer thermotolerance in animals are still not well characterized. We used dairy cows as a model to study heat tolerance because they are lactating, and therefore often prone to thermal stress. The data comprised almost 0.5 million milk records (milk, fat, and proteins) of 29,107 Australian Holsteins, each having around 15 million imputed sequence variants. Dairy animals often reduce their milk production when temperature and humidity rise; thus, the phenotypes used to measure an individual's heat tolerance were defined as the rate of milk production decline (slope traits) with a rising temperature-humidity index. With these slope traits, we performed a genome-wide association study (GWAS) using different approaches, including conditional analyses, to correct for the relationship between heat tolerance and level of milk production. The results revealed multiple novel loci for heat tolerance, including 61 potential functional variants at sites highly conserved across 100 vertebrate species. Moreover, it was interesting that specific candidate variants and genes are related to the neuronal system (ITPR1, ITPR2, and GRIA4) and neuroactive ligand-receptor interaction functions for heat tolerance (NPFFR2, CALCR, and GHR), providing a novel insight that can help to develop genetic and management approaches to combat heat stress.
  • Item
    Genome-wide fine-mapping identifies pleiotropic and functional variants that predict many traits across global cattle populations
    Xiang, R ; MacLeod, IM ; Daetwyler, HD ; de Jong, G ; O'Connor, E ; Schrooten, C ; Chamberlain, AJ ; Goddard, ME (NATURE RESEARCH, 2021-02-08)
    The difficulty in finding causative mutations has hampered their use in genomic prediction. Here, we present a methodology to fine-map potentially causal variants genome-wide by integrating the functional, evolutionary and pleiotropic information of variants using GWAS, variant clustering and Bayesian mixture models. Our analysis of 17 million sequence variants in 44,000+ Australian dairy cattle for 34 traits suggests, on average, one pleiotropic QTL existing in each 50 kb chromosome-segment. We selected a set of 80k variants representing potentially causal variants within each chromosome segment to develop a bovine XT-50K genotyping array. The custom array contains many pleiotropic variants with biological functions, including splicing QTLs and variants at conserved sites across 100 vertebrate species. This biology-informed custom array outperformed the standard array in predicting genetic value of multiple traits across populations in independent datasets of 90,000+ dairy cattle from the USA, Australia and New Zealand.
  • Item
    Expression quantitative trait loci in sheep liver and muscle contribute to variations in meat traits
    Yuan, Z ; Sunduimijid, B ; Xiang, R ; Behrendt, R ; Knight, MI ; Mason, BA ; Reich, CM ; Prowse-Wilkins, C ; Vander Jagt, CJ ; Chamberlain, AJ ; MacLeod, IM ; Li, F ; Yue, X ; Daetwyler, HD (BMC, 2021-01-18)
    BACKGROUND: Variants that regulate transcription, such as expression quantitative trait loci (eQTL), have shown enrichment in genome-wide association studies (GWAS) for mammalian complex traits. However, no study has reported eQTL in sheep, although it is an important agricultural species for which many GWAS of complex meat traits have been conducted. Using RNA sequence data produced from liver and muscle from 149 sheep and imputed whole-genome single nucleotide polymorphisms (SNPs), our aim was to dissect the genetic architecture of the transcriptome by associating sheep genotypes with three major molecular phenotypes including gene expression (geQTL), exon expression (eeQTL) and RNA splicing (sQTL). We also examined these three types of eQTL for their enrichment in GWAS of multi-meat traits and fatty acid profiles. RESULTS: Whereas a relatively small number of molecular phenotypes were significantly heritable (h2 > 0, P < 0.05), their mean heritability ranged from 0.67 to 0.73 for liver and from 0.71 to 0.77 for muscle. Association analysis between molecular phenotypes and SNPs within ± 1 Mb identified many significant cis-eQTL (false discovery rate, FDR < 0.01). The median distance between the eQTL and transcription start sites (TSS) ranged from 68 to 153 kb across the three eQTL types. The number of common variants between geQTL, eeQTL and sQTL within each tissue, and the number of common variants between liver and muscle within each eQTL type were all significantly (P < 0.05) larger than expected by chance. The identified eQTL were significantly (P < 0.05) enriched in GWAS hits associated with 56 carcass traits and fatty acid profiles. For example, several geQTL in muscle mapped to the FAM184B gene, hundreds of sQTL in liver and muscle mapped to the CAST gene, and hundreds of sQTL in liver mapped to the C6 gene. These three genes are associated with body composition or fatty acid profiles. 
    CONCLUSIONS: We detected a large number of significant eQTL and found that the overlap of variants between eQTL types and tissues was prevalent. Many eQTL were also QTL for meat traits. Our study fills a gap in the knowledge on the regulatory variants and their role in complex traits for the sheep model.
  • Item
    Improving Genomic Prediction of Crossbred and Purebred Dairy Cattle
    Khansefid, M ; Goddard, ME ; Haile-Mariam, M ; Konstantinov, K ; Schrooten, C ; de Jong, G ; Jewell, EG ; O'Connor, E ; Pryce, JE ; Daetwyler, HD ; MacLeod, IM (FRONTIERS MEDIA SA, 2020-12-14)
    This study assessed the accuracy and bias of genomic prediction (GP) in purebred Holstein (H) and Jersey (J) as well as crossbred (H and J) validation cows using different reference sets and prediction strategies. The reference sets were made up of different combinations of 36,695 H and J purebreds and crossbreds. Additionally, the effect of using different sets of marker genotypes on GP was studied (conventional panel: 50k, custom panel enriched with, or close to, causal mutations: XT_50k, and conventional high-density with a limited custom set: pruned HDnGBS). We also compared the use of genomic best linear unbiased prediction (GBLUP) and Bayesian (emBayesR) models, and the traits tested were milk, fat, and protein yields. On average, by including crossbred cows in the reference population, the prediction accuracies increased by 0.01-0.08 and were less biased (regression coefficient closer to 1 by 0.02-0.16), and the benefit was greater for crossbreds compared to purebreds. The accuracy of prediction increased by 0.02 using XT_50k compared to 50k genotypes without affecting the bias. Although using pruned HDnGBS instead of 50k also increased the prediction accuracy by about 0.02, it increased the bias for purebred predictions in emBayesR models. Generally, emBayesR outperformed GBLUP for prediction accuracy when using 50k or pruned HDnGBS genotypes, but the benefits diminished with XT_50k genotypes. Crossbred predictions derived from a joint pure H and J reference were similar in accuracy to crossbred predictions derived from the two separate purebred reference sets and combined proportional to breed composition. However, the latter approach was less biased by 0.13. Most interestingly, using an equalized breed reference instead of an H-dominated reference, on average, reduced the bias of prediction by 0.16-0.19 and increased the accuracy by 0.04 for crossbred and J cows, with a little change in the H accuracy. 
    In conclusion, we observed improved genomic predictions for both crossbreds and purebreds by equalizing breed contributions in a mixed-breed reference that included crossbred cows. Furthermore, we demonstrate that, compared to the conventional 50k or high-density panels, our customized set of 50k sequence markers improved or matched the prediction accuracy and reduced bias with both GBLUP and Bayesian models.
  • Item
    Inferring Demography from Runs of Homozygosity in Whole-Genome Sequence, with Correction for Sequence Errors
    MacLeod, IM ; Larkin, DM ; Lewin, HA ; Hayes, BJ ; Goddard, ME (OXFORD UNIV PRESS, 2013-09)
    Whole-genome sequence is potentially the richest source of genetic data for inferring ancestral demography. However, full sequence also presents significant challenges to fully utilize such large data sets and to ensure that sequencing errors do not introduce bias into the inferred demography. Using whole-genome sequence data from two Holstein cattle, we demonstrate a new method to correct for bias caused by hidden errors and then infer stepwise changes in ancestral demography up to present. There was a strong upward bias in estimates of recent effective population size (Ne) if the correction method was not applied to the data, both for our method and the Li and Durbin (Inference of human population history from individual whole-genome sequences. Nature 475:493-496) pairwise sequentially Markovian coalescent method. To infer demography, we use an analytical predictor of multiloci linkage disequilibrium (LD) based on a simple coalescent model that allows for changes in Ne. The LD statistic summarizes the distribution of runs of homozygosity for any given demography. We infer a best fit demography as one that predicts a match with the observed distribution of runs of homozygosity in the corrected sequence data. We use multiloci LD because it potentially holds more information about ancestral demography than pairwise LD. The inferred demography indicates a strong reduction in the Ne around 170,000 years ago, possibly related to the divergence of African and European Bos taurus cattle. This is followed by a further reduction coinciding with the period of cattle domestication, with Ne of between 3,500 and 6,000. The most recent reduction of Ne to approximately 100 in the Holstein breed agrees well with estimates from pedigrees. Our approach can be applied to whole-genome sequence from any diploid species and can be scaled up to use sequence from multiple individuals.
  • Item
    Exploiting biological priors and sequence variants enhances QTL discovery and genomic prediction of complex traits
    MacLeod, IM ; Bowman, PJ ; Vander Jagt, CJ ; Haile-Mariam, M ; Kemper, KE ; Chamberlain, AJ ; Schrooten, C ; Hayes, BJ ; Goddard, ME (BMC, 2016-02-27)
    BACKGROUND: Dense SNP genotypes are often combined with complex trait phenotypes to map causal variants, study genetic architecture and provide genomic predictions for individuals with genotypes but no phenotype. A single method of analysis that jointly fits all genotypes in a Bayesian mixture model (BayesR) has been shown to competitively address all 3 purposes simultaneously. However, BayesR and other similar methods ignore prior biological knowledge and assume all genotypes are equally likely to affect the trait. While this assumption is reasonable for SNP array genotypes, it is less sensible if genotypes are whole-genome sequence variants which should include causal variants. RESULTS: We introduce a new method (BayesRC) based on BayesR that incorporates prior biological information in the analysis by defining classes of variants likely to be enriched for causal mutations. The information can be derived from a range of sources, including variant annotation, candidate gene lists and known causal variants. This information is then incorporated objectively in the analysis based on evidence of enrichment in the data. We demonstrate the increased power of BayesRC compared to BayesR using real dairy cattle genotypes with simulated phenotypes. The genotypes were imputed whole-genome sequence variants in coding regions combined with dense SNP markers. BayesRC increased the power to detect causal variants and increased the accuracy of genomic prediction. The relative improvement for genomic prediction was most apparent in validation populations that were not closely related to the reference population. We also applied BayesRC to real milk production phenotypes in dairy cattle using independent biological priors from gene expression analyses. 
    Although current biological knowledge of which genes and variants affect milk production is still very incomplete, our results suggest that the new BayesRC method was equal to or more powerful than BayesR for detecting candidate causal variants and for genomic prediction of milk traits. CONCLUSIONS: BayesRC provides a novel and flexible approach to simultaneously improving the accuracy of QTL discovery and genomic prediction by taking advantage of prior biological knowledge. Approaches such as BayesRC will become increasingly useful as biological knowledge accumulates regarding functional regions of the genome for a range of traits and species.
  • Item
    Population structure and history of the Welsh sheep breeds determined by whole genome genotyping
    Beynon, SE ; Slavov, GT ; Farre, M ; Sunduimijid, B ; Waddams, K ; Davies, B ; Haresign, W ; Kijas, J ; MacLeod, IM ; Newbold, CJ ; Davies, L ; Larkin, DM (BMC, 2015-06-20)
    BACKGROUND: One of the most economically important areas within the Welsh agricultural sector is sheep farming, contributing around £230 million to the UK economy annually. Phenotypic selection over several centuries has generated a number of native sheep breeds, which are presumably adapted to the diverse and challenging landscape of Wales. Little is known about the history, genetic diversity and relationships of these breeds with other European breeds. We genotyped 353 individuals from 18 native Welsh sheep breeds using the Illumina OvineSNP50 array and characterised the genetic structure of these breeds. Our genotyping data were then combined with, and compared to, those from a set of 74 worldwide breeds, previously collected during the International Sheep Genome Consortium HapMap project. RESULTS: Model-based clustering of the Welsh and European breeds indicated shared ancestry. This finding was supported by multidimensional scaling analysis (MDS), which revealed separation of the European, African and Asian breeds. As expected, the commercial Texel and Merino breeds appeared to have extensive co-ancestry with most European breeds. Consistently high levels of haplotype sharing were observed between native Welsh and other European breeds. The Welsh breeds did not, however, form a genetically homogeneous group, with pairwise FST between breeds averaging 0.107 and ranging between 0.020 and 0.201. Four subpopulations were identified within the 18 native breeds, with high homogeneity observed amongst the majority of mountain breeds. Recent effective population sizes estimated from linkage disequilibrium ranged from 88 to 825. CONCLUSIONS: Welsh breeds are highly diverse with low to moderate effective population sizes and form at least four distinct genetic groups. Our data suggest common ancestry between the native Welsh and European breeds.
    These findings provide the basis for future genome-wide association studies and a first step towards developing genomics-assisted breeding strategies in the UK.