One current challenge in medical genetics is that of translation. In particular, we want to know whether GWAS can inform the development of new therapeutics. GWAS have been successful in identifying disease-associated loci. However, they provide little information about the causal alleles, pathways, complexes or cell types that are involved. Nevertheless, many known druggable targets are associated with GWAS hits. We therefore expect that GWAS has great potential in guiding therapeutic development.
A new tool in our search for greater insight into genetic perturbations is next-generation sequencing (NGS). NGS has made sequencing an individual’s genome a much less costly and time-consuming task. NGS has several uses in the context of medical genetics, including exome/genome sequencing of rare and severe diseases, as well as exome/genome sequencing to complete the allelic architecture at GWAS loci. However, NGS has in turn brought about new challenges in computation and interpretation.
One application of NGS to the study of human disease is in the identification and characterization of loss-of-function (LoF) variants. LoF variants are predicted to severely disrupt protein-coding genes, for example by shifting the reading frame, and are therefore expected to be of scientific and clinical interest. However, the identification of these variants is complicated by errors in automated variant-calling and gene annotation. Many putative LoF variants are therefore likely to be false positives. In 2012, MacArthur et al. set out to describe a stringent set of LoF variants. Their results suggest that a typical human genome contains about 100 LoF variants. They also presented a method to prioritize candidate genes as a function of their functional and evolutionary characteristics.
The MacArthur lab is also involved in an ongoing effort by the Exome Aggregation Consortium to assemble a catalog of human protein-coding variation for data mining. Currently, the catalog includes sequencing data from over 60,000 individuals. Such data allow for the identification of genes that are significantly lacking in functional coding variation. This is important because mutations in genes under exceptional constraint are expected to be deleterious. Based on this principle, Samocha et al. were able to identify 1000 genes involved in autism spectrum disorders that were significantly lacking in functional coding variation.
This was done using a statistical framework that described a model of de novo mutation. Similarly, De Rubeis et al. were able to identify 107 genes under exceptional evolutionary constraint that were mutated in 5% of autistic subjects. Many of these genes were found to encode proteins involved in transcription and splicing, chromatin remodelling and synaptic function, thus advancing our understanding of the disease mechanism of these variants.
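The idea of a gene "significantly lacking in functional coding variation" can be illustrated with a small sketch. It compares the number of functional variants observed in a gene with the number expected under a mutation model, and flags genes with a large deficit. The Poisson-style z-score, the threshold, the function names and the toy counts are all illustrative assumptions; this is not the framework published by Samocha et al.

```python
import math

def constraint_z(observed: int, expected: float) -> float:
    """Signed z-score; positive when fewer variants are observed than expected.

    Assumes (for illustration only) that counts are roughly Poisson,
    so the standard deviation is sqrt(expected).
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    return (expected - observed) / math.sqrt(expected)

def constrained_genes(counts, z_threshold=3.0):
    """Return genes whose deficit z-score meets the threshold, most constrained first.

    counts maps gene name -> (observed variants, expected variants).
    The threshold is an arbitrary illustrative choice.
    """
    scored = [(gene, constraint_z(obs, exp)) for gene, (obs, exp) in counts.items()]
    hits = [(gene, z) for gene, z in scored if z >= z_threshold]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return [gene for gene, _ in hits]

# Hypothetical toy data: gene -> (observed, expected) functional variant counts.
toy_counts = {
    "GENE_A": (2, 40.0),   # strong deficit
    "GENE_B": (38, 40.0),  # close to expectation
    "GENE_C": (10, 36.0),  # moderate deficit
}

print(constrained_genes(toy_counts))
```

In practice the expected count would come from a calibrated per-gene mutation model (accounting for sequence context, gene length and coverage), and the significance threshold would be corrected for the number of genes tested.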
NGS can also be used to study rare and severe diseases, as in the case of the DGAT1 mutation. In a study by Haas et al., exome sequencing was used to identify a rare splice site mutation in the DGAT1 gene. This mutation had resulted in congenital diarrheal disorders in the children of a family of Ashkenazi Jewish descent. In this case, sequencing not only had therapeutic applications for the surviving child but also provided insight into an ongoing DGAT1 inhibition clinical trial.
While NGS allows us to study highly penetrant variants that result in severe Mendelian diseases, there are also genetic studies that deliver hypotheses for intervention. One example of this is the discovery of SCN9A. The complete loss of function of SCN9A, also known as NaV1.7, results in congenital indifference to pain. This has resulted in the development of novel analgesics with efficacy exceeding that of morphine, as in the case of μ-SLPTX-Ssm6a, a selective NaV1.7 inhibitor. Another example is the loss-of-function variant of PCSK9, which lowers LDL and protects against coronary artery disease. This has led to the development of the PCSK9 inhibitor REGN727, which has been shown to be safe and effective in phase 1 clinical trials.
NGS is also important for fine-mapping loci identified in GWAS. For example, GWAS from 2010 looking at Crohn’s disease implicated a region on chromosome 15 containing multiple genes. After fine-mapping, the International Inflammatory Bowel Disease Genetics Consortium (IIBDGC) was able to refine the association to a noncoding functional element of SMAD3. Another example is a study by Farh et al. that looked at candidate causal variants for 21 autoimmune diseases. They showed that 90% of causal variants are non-coding, but only 10-20% alter transcription factor binding motifs, implying that current gene regulatory models cannot explain the mechanism of these variants. Finally, a study by Rivas et al. that performed deep resequencing of GWAS loci associated with inflammatory bowel disease found not only new risk factors but also protective variants. For example, a protective splice variant in CARD9 that causes premature truncation of the protein was shown to strongly protect against the development of Crohn’s disease.