Detecting low frequent loss-of-function alleles in genome wide association studies with red hair color as example.
Liu F., Struchalin MV., Duijn KV., Hofman A., Uitterlinden AG., Duijn CV., Aulchenko YS., Kayser M.
Multiple loss-of-function (LOF) alleles at the same gene may influence a phenotype not only in the homozygous state, when alleles are considered individually, but also in the compound heterozygous (CH) state. Such LOF alleles typically have low frequencies and moderate to large effects. Detecting such variants is of interest to the genetics community, and statistical methods for detecting them and quantifying their effects are sorely needed. We present a collapsed double heterozygosity (CDH) test to detect the presence of multiple LOF alleles at a gene. When causal SNPs are available, which may be the case in next-generation genome sequencing studies, the CDH test has overwhelmingly higher power than single-SNP analysis. When causal SNPs are not directly available, as in current GWA settings, we show that the CDH test has higher power than standard single-SNP analysis if the tagging SNPs are in at least moderate linkage disequilibrium with the underlying causal SNPs (r² > 0.1). The test is implemented for genome-wide analysis, using a sliding-window approach, in the publicly available software package GenABEL. As a proof of principle, we conducted a genome-wide CDH analysis of red hair color, a trait known to be influenced by multiple LOF alleles, in a total of 7,732 Dutch individuals with ascertained hair color. The association signals at the MC1R gene locus from the CDH test were uniformly more significant than those from traditional GWA analysis (most significant P for CDH = 3.11×10⁻¹⁴² vs. P for rs258322 = 1.33×10⁻⁶⁶). The CDH test will contribute towards finding rare LOF variants in GWAS and sequencing studies.
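The collapsing idea can be illustrated with a minimal Python sketch. It is not the GenABEL implementation, and the abstract does not spell out the collapsing rule, so the rule used here is an assumption: for a pair of SNPs coded as minor-allele counts (0, 1, 2), an individual is flagged if homozygous for the minor allele at either SNP or heterozygous at both, and the flag is tested against a binary phenotype with a 1-df chi-square test. The function names (`cdh_indicator`, `cdh_test`, `sliding_window_cdh`), the `window` parameter, and the toy data are all hypothetical.

```python
from math import erfc, sqrt

def cdh_indicator(g1, g2):
    """Assumed collapsing rule: 1 if minor-allele homozygote at either SNP
    or heterozygote at both SNPs (a proxy for compound heterozygosity)."""
    return int(g1 == 2 or g2 == 2 or (g1 == 1 and g2 == 1))

def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the
    2x2 table [[a, b], [c, d]]; returns 0.0 if any margin is empty."""
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        return 0.0
    return n * (a * d - b * c) ** 2 / (margins[0] * margins[1] * margins[2] * margins[3])

def cdh_test(snp1, snp2, phenotype):
    """Test the collapsed indicator of one SNP pair against a 0/1 phenotype.
    Returns (chi-square statistic, p-value with 1 degree of freedom)."""
    a = b = c = d = 0
    for g1, g2, y in zip(snp1, snp2, phenotype):
        x = cdh_indicator(g1, g2)
        if x and y:
            a += 1          # flagged, affected
        elif x:
            b += 1          # flagged, unaffected
        elif y:
            c += 1          # not flagged, affected
        else:
            d += 1          # not flagged, unaffected
    stat = chi2_2x2(a, b, c, d)
    # Survival function of a chi-square variable with 1 df.
    return stat, erfc(sqrt(stat / 2))

def sliding_window_cdh(genotypes, phenotype, window=2):
    """Run cdh_test on every SNP pair at most `window` positions apart.
    `genotypes` is a list of per-SNP genotype lists in map order."""
    results = []
    for i in range(len(genotypes)):
        for j in range(i + 1, min(i + window + 1, len(genotypes))):
            stat, p = cdh_test(genotypes[i], genotypes[j], phenotype)
            results.append((i, j, stat, p))
    return results

# Toy data (made up for illustration only).
snp1 = [0, 1, 2, 0, 1, 0, 0, 1]
snp2 = [0, 1, 0, 2, 0, 1, 0, 1]
red_hair = [0, 1, 1, 1, 0, 0, 0, 1]
print(cdh_test(snp1, snp2, red_hair))
```

In a genome-wide setting the pairwise test would be repeated across all windows, so the resulting P values would need a multiple-testing threshold that accounts for the larger number of tests than in single-SNP analysis.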