Ending diagnostic odyssey using clinical whole-exome sequencing (CWES)

Objectives: Most rare diseases are genetic diseases. Because rare diseases are so diverse and patients with rare diseases are likely to be undiagnosed or misdiagnosed, it is not unusual for these patients to undergo a long diagnostic odyssey before they receive a definitive diagnosis. This situation presents a clear need for a dedicated clinical service to end the diagnostic odyssey of patients with rare diseases. Methods: In 2014, we started an Undiagnosed Diseases Program in Hong Kong that uses clinical whole-exome sequencing (CWES) to end the diagnostic odyssey of patients and families with rare diseases who have not received a definitive diagnosis after extensive investigation. Results: In this program, we have shown that the genetic diseases diagnosed by CWES differed from those diagnosed using traditional approaches, indicating that CWES is an essential tool for diagnosing rare diseases and ending diagnostic odysseys. In addition, we identified several novel genes responsible for monogenic diseases. These include the TOP2B gene for autism spectrum disorder, the DTYMK gene for severe cerebral atrophy, the KIF13A gene for a new mosaic ectodermal syndrome associated with hypomelanosis of Ito, and the CDC25B gene for a new syndrome of cardiomyopathy and endocrinopathy. Conclusions: With the incorporation of CWES in an Undiagnosed Diseases Program, we have ended diagnostic odysseys of patients with rare diseases in Hong Kong over the past 7 years. With the declining cost of next-generation sequencers and reagents, CWES set-ups are now affordable for clinical laboratories. Indeed, owing to the increasing availability of CWES and of treatment modalities for rare diseases, priority can now be given to rare as well as common medical conditions.


Introduction
According to the World Health Organization, a rare disease is one that affects a small percentage of the population (i.e., 0.65 to 1 affected persons per 1,000 individuals). According to the Rare List (http://globalgenes.org/rarelist/), 300 million people worldwide are affected by approximately 7,000 rare diseases. Patients with rare diseases typically go through a long diagnostic odyssey because these diseases are challenging to diagnose. Over the past 30 years, most of the patients with rare diseases referred to us for genetic testing were either undiagnosed or misdiagnosed. For example, we encountered a patient with Wilson disease, which is a treatable liver disease, who was undiagnosed for 18 years [1]. Another patient did not receive a definitive diagnosis of citrin deficiency as the cause of neonatal jaundice until the age of 14 [2]. We also ended the diagnostic odyssey of a family with three adults who have dysferlinopathy that had been misdiagnosed as polymyositis and connective tissue diseases for more than 10 years [3]. Further, we identified a POLG-related mutation in a patient with mitochondrial recessive ataxia syndrome (includes SANDO and SCAE) that had been misdiagnosed as mitochondrial myopathy, encephalopathy, lactic acidosis, and stroke-like episodes (MELAS) [4]. The patient first presented at 8 years of age and finally received the correct diagnosis at 18 years of age. In a case such as this, a correct diagnosis allows appropriate genetic counseling to be provided to the family (here, inheritance is autosomal recessive, not the maternal inheritance observed in MELAS). The correct diagnosis also benefited the patient and her family because they were advised to avoid valproic acid, a common anti-epileptic drug that can cause liver toxicity in individuals carrying POLG mutations.
To end the diagnostic odyssey of patients with rare diseases, it is necessary to set up an Undiagnosed Diseases Program. Such a program can provide important insights into the pathogenesis of rare diseases based on novel gene-disease associations. Guided management can then be provided to the patients, and the accuracy of prognosis prediction can be enhanced. In the same vein, genetic counseling can be more specific, recurrence risk calculation can be more accurate, and direct screening for complications can be initiated for early detection and treatment. With the advent of next-generation sequencing (NGS) technology, many patients with undiagnosed diseases can now benefit from a diagnostic tool such as clinical whole-exome sequencing (CWES). Since DNA extraction and the generation of NGS libraries can be automated by liquid handlers, an NGS facility for molecular diagnosis can be managed by a single operator. Since de-multiplexing and alignment of NGS reads are performed during sequencing runs and the processing of data files for variant detection is automated by the sequencers, lack of expertise in bioinformatic analysis is no longer a barrier to the implementation of CWES in clinical laboratories. Taken together, advancements in NGS and bioinformatics technologies have enabled the setting up of Undiagnosed Diseases Programs in clinical laboratories.
In 2014, we started the Undiagnosed Diseases Program in Hong Kong with the aim of ending the diagnostic odysseys of patients and families with rare diseases who do not have a definitive diagnosis after extensive investigation. In this program, we have shown that CWES is an essential tool to end diagnostic odysseys, as evidenced by the minimal overlap between the monogenic diseases diagnosed by CWES and those diagnosed by non-CWES/traditional approaches over the past 30 years in the author's laboratory.

Materials and methods
Patients were referred by clinicians specialized in cardiology, endocrinology, genetics, hepatology, metabolic medicine, nephrology, neurology, obstetrics, pathology, and pediatrics. Blood samples were collected from the patients and their family members after informed consent had been obtained. Genomic DNA was extracted from whole blood samples using the QIAamp blood kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. Clinical whole-exome sequencing (CWES) was performed using SureSelectXT Human All Exon V4 target kits (Agilent Technologies, Santa Clara, USA) and Nextera kits (Illumina, San Diego, USA) according to the manufacturers' protocols. Sequencing was performed using an Illumina sequencer with a 100-bp paired-end module (Illumina). Image analysis and base-calling were performed using the standard Illumina data analysis pipeline. CWES data filtering was performed using VariantStudio (version 2.2.1, Illumina). The damaging effect of each single-nucleotide variant was predicted in silico using PolyPhen and SIFT. Variants were filtered based on population frequency, inheritance patterns, and in silico predictions. Resources such as the Human Gene Mutation Database, 1,000 Genomes database, Exome Aggregation Consortium, OMIM, PubMed, and ClinVar were used to evaluate the sequence variants of interest. Variants were interpreted in accordance with the 2015 American College of Medical Genetics and Genomics standards and guidelines for the interpretation of sequence variants [5]. The pathogenic variants were described according to the Human Genome Variation Society guidelines on nomenclature for the description of sequence variants (http://www.hgvs.org/mutnomen).
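The three filtering criteria named above (population frequency, inheritance pattern, and in silico prediction) can be illustrated with a minimal Python sketch. It is not the VariantStudio workflow used in this study: the `Variant` record, the inheritance model names, the cut-off values, and the toy variants are all assumptions introduced for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative cut-offs (assumptions, not values reported by the study).
MAX_POPULATION_AF = 0.01       # keep variants at or below 1% allele frequency
SIFT_DAMAGING_MAX = 0.05       # SIFT: lower scores are predicted more damaging
POLYPHEN_DAMAGING_MIN = 0.85   # PolyPhen: higher scores are predicted more damaging


@dataclass
class Variant:
    gene: str
    hgvs_c: str
    population_af: Optional[float]  # None if absent from population databases
    zygosity: str                   # "het" or "hom"
    de_novo: bool                   # absent from both parents
    sift: Optional[float]           # None if no score (e.g., truncating variants)
    polyphen: Optional[float]


def is_rare(v: Variant) -> bool:
    return v.population_af is None or v.population_af <= MAX_POPULATION_AF


def fits_inheritance(v: Variant, model: str) -> bool:
    if model == "de_novo_dominant":
        return v.zygosity == "het" and v.de_novo
    if model == "recessive_homozygous":
        return v.zygosity == "hom"
    raise ValueError(f"unknown inheritance model: {model}")


def predicted_damaging(v: Variant) -> bool:
    # Variants without missense scores are kept for manual review.
    if v.sift is None and v.polyphen is None:
        return True
    sift_hit = v.sift is not None and v.sift <= SIFT_DAMAGING_MAX
    polyphen_hit = v.polyphen is not None and v.polyphen >= POLYPHEN_DAMAGING_MIN
    return sift_hit or polyphen_hit


def filter_variants(variants: List[Variant], model: str) -> List[Variant]:
    return [
        v for v in variants
        if is_rare(v) and fits_inheritance(v, model) and predicted_damaging(v)
    ]


# Hypothetical toy variants; gene names and values are placeholders.
toy_variants = [
    Variant("GENE_A", "c.100C>T", None, "het", True, 0.01, 0.98),
    Variant("GENE_B", "c.200G>A", 0.12, "het", False, 0.02, 0.90),    # too common
    Variant("GENE_C", "c.300A>G", 0.0001, "het", False, 0.40, 0.10),  # inherited, benign scores
    Variant("GENE_D", "c.400G>T", None, "hom", False, None, None),    # no missense scores
]

de_novo_hits = [v.gene for v in filter_variants(toy_variants, "de_novo_dominant")]
recessive_hits = [v.gene for v in filter_variants(toy_variants, "recessive_homozygous")]
print(de_novo_hits, recessive_hits)
```

In practice, variants that survive such a filter would still need to be checked against the databases listed above and classified under the ACMG guidelines; the sketch only narrows the candidate list.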

Results and discussion
Through the Undiagnosed Diseases Program, we diagnosed 42 monogenic diseases by CWES, several of which are potentially treatable conditions [6][7][8][9][10][11][12] (Table 1). For instance, we identified a genetic defect in the thiamine pyrophosphokinase gene (TPK1) using CWES, thereby ending the 40-year diagnostic odyssey of a patient who presented with Leigh-like symptoms [6]. The TPK1 gene encodes thiamine pyrophosphokinase, an important enzyme in thiamine metabolism, and a genetic defect in the TPK1 gene can lead to episodic encephalopathy. Thus, early dietary intervention/supplementation may reverse or slow down the disease progression. In contrast, a total of 106 monogenic diseases were diagnosed by traditional/non-CWES methods over the past 30 years in the author's laboratory [6, (Table 1). Through a Venn diagram analysis, we showed that the spectrum of monogenic diseases diagnosed by CWES overlapped minimally with that of diseases diagnosed using non-CWES/traditional laboratory methods (Figure 1). Hence, the diseases diagnosed by CWES are more difficult to diagnose using clinical and traditional laboratory methods that do not involve CWES.
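The overlap underlying a two-set Venn diagram reduces to simple set arithmetic, sketched below in Python. The disease names are placeholders and the counts are illustrative; they are not the contents of Table 1 or Figure 1, and the `overlap_summary` helper is hypothetical.

```python
from typing import Dict, Set


def overlap_summary(cwes: Set[str], traditional: Set[str]) -> Dict[str, float]:
    """Summarize the three regions of a two-set Venn diagram."""
    shared = cwes & traditional
    union = cwes | traditional
    return {
        "cwes_only": len(cwes - traditional),
        "traditional_only": len(traditional - cwes),
        "shared": len(shared),
        # Jaccard index: 0 means no overlap, 1 means identical sets.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder disease labels (assumptions, not data from this study).
cwes_diagnoses = {"disease_A", "disease_B", "disease_C", "disease_D"}
traditional_diagnoses = {
    "disease_D", "disease_E", "disease_F", "disease_G", "disease_H", "disease_I",
}

summary = overlap_summary(cwes_diagnoses, traditional_diagnoses)
print(summary)
```

A small shared count relative to the two exclusive regions is what "minimal overlap" means here; the Jaccard index is one optional way to express it as a single number.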
We also identified several novel genes for monogenic diseases using CWES. The genes include the TOP2B gene for autism spectrum disorder [7], the DTYMK gene for severe cerebral atrophy [8], the KIF13A gene for a new ectodermal syndrome [9], and the CDC25B gene for a new syndrome of cardiomyopathy and endocrinopathy [10]. In the case of the TOP2B gene, the proband presented with global developmental delay and intellectual disability. CWES of the proband revealed that she was heterozygous for NM_001068.2:c.172C>T; NP_001059.2:p.His58Tyr of the TOP2B gene, and the variant had arisen de novo. TOP2B encodes topoisomerase II beta, an enzyme that is abundant in both the developing brain and the adult brain. Three years after the publication of this case, an identical de novo variant was identified in a Japanese patient with a similar phenotype [117]. In the case of the CDC25B gene, the patient was an 11-year-old Chinese girl born to consanguineous asymptomatic parents with a history of one fetal death and one infant death, both of unknown causes. The proband had intrauterine growth retardation, delayed development, bilateral cataract (at 5 years of age), primary hypothyroidism, growth hormone deficiency, and cardiomyopathy (at 9 years of age). CWES of the proband identified a novel homozygous nonsense variant in the CDC25B gene, NM_021873:c.313G>T (p.Glu105*). The c.313G>T variant was expected to produce a truncated protein that terminates at codon 105 and lacks phosphorylation sites. We posit that loss of function of the CDC25B gene results in a new syndrome characterized by cataract, dilated cardiomyopathy, and multiple endocrinopathies.
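The coding-DNA positions and protein positions reported above are linked by simple arithmetic: coding position n falls in codon (n - 1) // 3 + 1. The Python sketch below checks this for the two substitutions described in this paragraph. The parser is a deliberately minimal, hypothetical helper that handles only simple substitutions of the form c.123A>G; it is not a full HGVS parser.

```python
import re
from typing import Dict, Union

# Matches only simple coding substitutions such as "c.172C>T".
_SUBSTITUTION = re.compile(r"c\.(\d+)([ACGT])>([ACGT])$")


def codon_of(cds_position: int) -> int:
    """Return the 1-based codon number containing a 1-based coding position."""
    return (cds_position - 1) // 3 + 1


def parse_substitution(hgvs_c: str) -> Dict[str, Union[int, str]]:
    match = _SUBSTITUTION.search(hgvs_c)
    if match is None:
        raise ValueError(f"not a simple coding substitution: {hgvs_c}")
    position = int(match.group(1))
    return {
        "position": position,
        "ref": match.group(2),
        "alt": match.group(3),
        "codon": codon_of(position),
        "base_in_codon": (position - 1) % 3 + 1,  # 1, 2, or 3
    }


top2b = parse_substitution("NM_001068.2:c.172C>T")  # reported as p.His58Tyr
cdc25b = parse_substitution("NM_021873:c.313G>T")   # reported as p.Glu105*
print(top2b["codon"], cdc25b["codon"])
```

Both variants change the first base of their codon, which agrees with the reported protein changes under the standard genetic code: histidine (CAT/CAC) to tyrosine (TAT/TAC), and glutamate (GAA/GAG) to a stop codon (TAA/TAG).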
In the case of the DTYMK gene, the patients were two brothers aged 1 and 6 years who presented with microcephaly, marked cerebral atrophy with bilateral subdural hemorrhagic effusion, hypotonia, severe intellectual disability, and lactic acidosis. Two mutations in the DTYMK gene were identified in the brothers. The first is the frameshift mutation NM_012145.3:c.287_320del;p.Asp96Valfs*8, and the second is the missense mutation NM_012145.3:c.295G>A;p.Ala99Thr. In the case of the KIF13A gene, the patient was a three-year-old girl who presented with developmental delay and hypomelanosis of Ito. An exome-and-genome approach revealed a heterozygous de novo frameshift variant in the KIF13A gene (i.e., NM_022113.6:c.2357dupA). The low mutant allelic ratio suggested that the mutation occurred postzygotically, leading to embryonic mosaicism. We suggest that KIF13A may be the culprit gene in the chromosome 6p22.3-p23 microdeletion syndrome because skin hypopigmentation has been reported in patients with the syndrome [118].
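The reasoning behind the mosaicism inference can be made concrete: a constitutional heterozygous variant is expected in about half of the sequencing reads, so a much lower mutant allelic ratio is unlikely under that model. The Python sketch below applies a one-sided exact binomial test. The read counts are hypothetical, since the study's read counts are not given here, and the test is offered as one possible way to quantify the argument, not as the method used by the authors.

```python
from math import comb
from typing import Tuple


def binomial_cdf(k: int, n: int, p: float = 0.5) -> float:
    """P(X <= k) for X ~ Binomial(n, p); adequate for moderate read depths."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def mosaic_evidence(alt_reads: int, total_reads: int) -> Tuple[float, float]:
    """Return (mutant allelic ratio, one-sided p-value against a 50% ratio)."""
    if total_reads <= 0 or not 0 <= alt_reads <= total_reads:
        raise ValueError("read counts must satisfy 0 <= alt_reads <= total_reads")
    ratio = alt_reads / total_reads
    p_value = binomial_cdf(alt_reads, total_reads, 0.5)
    return ratio, p_value


# Hypothetical read counts (assumptions for illustration only).
mosaic_ratio, mosaic_p = mosaic_evidence(alt_reads=12, total_reads=100)
germline_ratio, germline_p = mosaic_evidence(alt_reads=48, total_reads=100)
print(mosaic_ratio, mosaic_p)
print(germline_ratio, germline_p)
```

With these illustrative counts, a ratio of 0.12 is strongly inconsistent with a constitutional heterozygous variant, whereas 0.48 is not. Real data would also need to account for capture and alignment biases, which can shift the expected ratio away from exactly 0.5.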

Conclusions
With the incorporation of CWES in an Undiagnosed Diseases Program, we have ended diagnostic odysseys of patients with rare diseases in Hong Kong over the past 7 years. With the technological advancement of NGS, CWES can now be completed in a clinical laboratory in 2 days. In addition, with the declining cost of next-generation sequencers and reagents, CWES set-ups are now affordable for clinical laboratories, and priority can be given to rare as well as common medical conditions. Table 1.