Genetic and epigenetic findings in anorexia nervosa

Abstract
Polygenic factors are relevant for the genetic predisposition to the eating disorder anorexia nervosa (AN). The most recent genome-wide association study (GWAS) for AN comprised almost 17,000 patients with AN as well as controls. A total of eight genome-wide significant polygenic loci associated with AN have been identified. Each single polygenic locus makes only a small contribution to the development of AN. Analyses across different traits successfully identified regions/genes for AN that had not been detected by analyses of the single traits. Functional studies of the genes derived from GWAS aim to improve the understanding of the biological mechanisms involved in eating disorders. Epigenetic studies have not yet successfully contributed to the understanding of AN.


Introduction
The hallmark of anorexia nervosa (AN) is a significantly reduced body weight, which is associated with a mortality 12 times higher than that from all other causes of death in adolescence and early adulthood in females [1]. The criteria of the Diagnostic and Statistical Manual of Mental Disorders, 5th ed. (DSM-5) [2] for AN include, in short: (A) restricted energy intake, so that a significantly low body weight ensues, (B) intense fear of gaining weight, and (C) a severely disturbed experience of body weight or shape. There are also defined subtypes: (1) restricting type (weight loss is accomplished primarily through dieting, fasting, and/or excessive exercise) and (2) binge-eating/purging type (recurrent episodes of binge-eating or purging behavior, e.g., self-induced vomiting or the misuse of laxatives and diuretics). The elevated mortality, as well as an overall lifetime prevalence of approximately 0.5 % [3], calls for a more sophisticated understanding of the biological underpinnings of AN to provide targeted therapy and alleviate the substantial impact of AN on quality of life and long-term mortality. Until recently, however, genetics has contributed only limited insights into the underlying causes of AN. While it has been known since the first family and twin studies that eating disorders (EDs) have a significant genetic background, only recently has progress been made in narrowing down the genes likely underlying AN by combining new approaches in genetic research and statistics, as well as by increasing sample sizes as a result of international collaboration. In the following, we will first discuss formal genetic studies and then turn to the most recent research methodology that has introduced a new era of insights into the genetic underpinnings of AN.

Formal genetic studies
Family studies provide evidence of a 10 to 11 times higher risk of an ED in first-degree female relatives of patients with AN. Twin studies provide heritability estimates of 38-84 % in AN [4]. Interestingly, the remaining variance associated with the risk of developing AN in twins is attributable to unique environmental factors, as the effects of shared environmental factors seem negligible [5-7].
Estimates of the heritability of AN vary considerably, which primarily reflects varying definitions of AN, with higher heritability for narrower definitions, e.g., only including patients with the restricting type of AN [4,5].
However, independently of the exact definition of AN, findings regarding the heritability of AN have been replicated in several countries relying on different samples, which underscores a genetic cause of AN even though estimates essentially build on European and Western populations.
Moreover, findings from family studies indicate a coheritability between AN and bulimia nervosa [8], which was recently confirmed by twin studies [5,9] that additionally relate AN to other psychiatric comorbidities such as major depression and obsessive-compulsive disorder [10,11].

Linkage studies and candidate gene studies
Twin studies indicate heritability but provide no information on the genes and genetic variants involved in the clinical phenotype of AN. To overcome this drawback, the historical next steps in genetics were linkage and candidate gene studies. Recently, a comprehensive overview of findings from these types of studies was published [12]. Analyzed genes were primarily related to neural signaling, either by neurotransmitters acting globally or by hormones affecting the hunger and satiety regulatory system in subcortical structures of the brain, such as the hypothalamus [5,12]. However, even though these studies were promising because of their plausible physiological mechanisms, which are at the core of candidate gene studies in particular, the majority of results could not be confirmed in larger samples, and meta-analyses provided ambiguous evidence [12,13]. Moreover, more recent genetic approaches such as genome-wide association studies (GWAS) could not confirm findings from either linkage or candidate gene studies (e.g., [14-16]). In summary, neither linkage nor candidate gene studies led to the discovery of solidly confirmed genes for AN, but these types of studies have been an important step in the chronology of gaining insights into the genetic underpinnings of AN.

Polygenic variants identified by GWAS
GWAS, relying on the analysis of single nucleotide polymorphisms (SNPs), have proven extremely successful for the identification of genetic variations related to complex traits/disorders and have so far revealed more than 50,000 loci for numerous disorders and phenotypes (http://www.ebi.ac.uk/gwas/). With about 1,000,000 analyzed SNPs per individual on average, the type I error must be controlled by accounting for multiple testing. Thus, a Bonferroni-corrected threshold of P ≤ 5 × 10⁻⁸ has emerged as the gold standard to account for the problem of multiple testing [17]. Unfortunately, a substantial number of potentially truly associated SNPs cannot be identified due to this stringent threshold [18] unless adequate sample sizes provide sufficient statistical power.
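The arithmetic behind this threshold can be sketched in a few lines of Python. The family-wise error rate of 0.05 is an assumption made here for illustration (it is the value that yields 5 × 10⁻⁸ for 1,000,000 tests), and the function name is hypothetical.

```python
# Bonferroni correction: divide the family-wise error rate by the number of tests.
ALPHA = 0.05          # assumed family-wise type I error rate (not stated in the text)
N_TESTS = 1_000_000   # number of SNP tests, as assumed in the text

threshold = ALPHA / N_TESTS  # approximately 5e-8


def is_genome_wide_significant(p_value, cutoff=threshold):
    """Return True if a SNP's P-value passes the corrected threshold."""
    return p_value <= cutoff


print(f"genome-wide significance threshold: {threshold:.0e}")
print(is_genome_wide_significant(1e-9))  # passes the threshold
print(is_genome_wide_significant(1e-6))  # nominally small, but not genome-wide significant
```

A P-value of 10⁻⁶ would be considered highly significant in a single test, but it does not pass the corrected threshold, which illustrates why truly associated SNPs with small effects are missed unless sample sizes are large.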
For AN, the first GWAS that identified a significant locus comprised a total of 3,495 patients with AN and 10,982 controls [14]. A single genome-wide significant locus was identified on chromosome 12 (lead SNP: rs4622308). Interestingly, the respective chromosomal region comprises previously reported hits for diabetes mellitus type 1 and autoimmune disorders. A GWAS focusing on low-frequency and rare variants (exome-chip) was performed in 2,158 cases with AN and 15,485 controls and revealed no genome-wide significant association. This result might be explained by the small number of analyzed individuals combined with small effect sizes [15]. Only most recently were data from almost 17,000 patients affected by AN combined through an effort of the Anorexia Nervosa Genetics Initiative (ANGI) and the Eating Disorders Working Group of the Psychiatric Genomics Consortium (PGC-ED), which led to the identification of eight chromosomal regions associated with AN at genome-wide significance [16]. One of the identified regions was also associated with type 2 diabetes mellitus. The eight chromosomal regions altogether comprised 121 genes. Most of these are primarily expressed in the brain. Subsequent analyses of these genes by different in silico tools (e.g., analyses of regulatory chromatin interactions) and by publicly available large-scale in vitro data (e.g., brain expression quantitative trait locus analysis) revealed that four of these genes (CADM1, MGMT, FOXP1, and PTBP2) might be more likely to be causally related to the etiology of AN. CADM1 protein levels are elevated in the hypothalamus. Genetic variants near CADM1 have also been implicated in BMI and age at menarche by GWAS. FOXP1 knockout mice display a reduction in body weight. Brain-specific deletion of FOXP1 causes structural defects in striatal development. MGMT and PTBP2 are implicated in epigenetic regulation and the assembly of other splicing-regulatory proteins, respectively, also in the brain [16].
Interestingly, the single significant region related to AN that had been identified by GWAS until 2019 [14] could be confirmed neither by Watson et al. [16] nor by Huckins et al. [15]. However, the number of participants with AN (n = 2,158) in the study conducted by Huckins et al. [15] might have been too low for confirmation. Moreover, Watson et al. [16] discuss several mechanisms that might explain this lack of reproducibility, such as the winner's curse, moderator variables explaining significant between-cohort heterogeneity, and differences in linkage disequilibrium structure across compiled cohorts from the ANGI and the PGC-ED.
In summary, GWAS have so far identified only a few regions related to AN. Nevertheless, irrespective of the GWAS conducted, SNP-based heritability was estimated at 20 %. Thus, by further increasing sample sizes, additional SNPs are expected to be discovered by GWAS soon.
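The winner's curse mentioned above can be illustrated with a small simulation. This is a minimal sketch and not an analysis from any of the cited studies: the true effect, the standard error, and the number of simulated studies are hypothetical values chosen for illustration, and only the positive tail of the two-sided test is considered for simplicity.

```python
import random
import statistics

random.seed(0)  # fixed seed so that the sketch is reproducible

TRUE_EFFECT = 0.02   # hypothetical true per-allele effect
STD_ERROR = 0.01     # hypothetical standard error of each effect estimate
Z_THRESHOLD = 5.45   # approximate z-score corresponding to a two-sided P of 5e-8
N_STUDIES = 100_000  # number of simulated discovery studies

# Each simulated study yields a noisy estimate of the same true effect.
estimates = [random.gauss(TRUE_EFFECT, STD_ERROR) for _ in range(N_STUDIES)]

# Keep only the estimates that would have reached genome-wide significance.
significant = [b for b in estimates if b / STD_ERROR > Z_THRESHOLD]

mean_all = statistics.mean(estimates)
mean_significant = statistics.mean(significant)

print(f"true effect:                  {TRUE_EFFECT}")
print(f"mean of all estimates:        {mean_all:.4f}")
print(f"mean of significant estimates: {mean_significant:.4f}")
print(f"studies reaching significance: {len(significant)} of {N_STUDIES}")
```

Because only estimates that happen to be far above the true effect cross the significance threshold, the significant estimates are inflated on average. A replication study that is powered for the inflated estimate will therefore tend to be underpowered for the true effect.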

Genetic variants and brain imaging
Given the findings by Watson et al. [16] implicating primarily brain-expressed genes in the etiology of AN, we provide a short overview of the genotype-brain phenotype relationship in AN, as evidenced by brain imaging studies.
Structural MRI studies in AN imply that cortical changes observed in AN recover with weight restoration and are therefore likely not primarily genetically driven [1]. Moreover, with regard to the subcortical architecture in AN, there is limited evidence of a significant effect of common genetic variants on subcortical brain volumes at a global level when different genetic methods (e.g., linkage disequilibrium score regression [LDSC], genetic risk scores, and Mendelian randomization) are combined to relate regional brain volumes to the genetic risk of AN based on GWAS summary statistics. However, there is suggestive evidence for the effects of single genetic markers on selected subcortical structures [19]. Despite preliminary evidence of a distinct pattern of changes in the white matter compartment of the brain in AN, no attempt has yet been made to relate these changes to common genetic variants [20,21]. However, considering the most recent GWAS findings in the analysis of brain imaging studies, employing new and complementary statistical methods (see above), and drawing upon larger sample sizes in MRI studies might disclose a significant genotype-brain phenotype relationship in AN.

Cross-trait analyses
LDSC is a relatively new statistical method that allows, among other things, the estimation of genetic correlations between different phenotypes from GWAS summary statistics. In doing so, LDSC accounts for linkage disequilibrium and is not biased by overlapping samples [13]. Initially, LDSC was used to estimate 276 genetic correlations among 24 traits, including obesity, AN, and educational attainment. A negative genetic correlation between BMI and AN was observed. Hence, a substantial subgroup of alleles associated with lower BMI overlaps with the SNP-based genetic predisposition to AN [22]. These findings have subsequently been confirmed [14], most recently by Watson et al. [16].
Other negative genetic correlations were found between AN and serum fasting insulin, insulin resistance, glucose levels, triglycerides, and low-density lipoproteins, that is, with metabolic traits [14,16]. Following the first genetic evidence of a distinct metabolic phenotype in AN, as provided by the GWAS conducted by Duncan et al. [14], Ilyas et al. [23] provided preliminary evidence of increased insulin sensitivity from a meta-analysis of 12 studies, and Hussain et al. [24] reported, based on meta-analyses of 31 and 10 studies, respectively, elevated total cholesterol concentrations in acutely ill AN patients that even seem to persist after partial weight restoration.
Positive LDSC genetic correlations were reported between AN and schizophrenia, neuroticism, educational attainment, and high-density lipoprotein cholesterol, which is well in line with psychiatric comorbidities reported in twin studies. Thus, AN seems to be genetically related to both mental phenotypes and metabolic traits and may constitute a metabo-psychiatric disorder [16]. Note, however, that even though findings reported by Ilyas et al. [23] and Hussain et al. [24] relied on meta-analyses and stringent statistical thresholds, aggregated sample sizes were still small. Thus, future GWAS with even larger sample sizes will have to confirm recent findings.
Complementary to LDSC, a number of so-called look-up studies (e.g., [25-27]) have been performed: genome-wide significant hits for one trait (e.g., BMI/obesity) were looked up in GWAS data for related traits (e.g., anorexia nervosa), and vice versa, to identify shared genetic variants. For example, the look-up of the 1,000 SNPs with the lowest P-values in a GWAS for AN (no genome-wide significant SNP in the respective study [28]) in the GWAS for BMI [29] revealed three genomic regions seemingly relevant for both. Risk alleles were the same for AN and lower BMI. Special attention was drawn to a locus on chromosome 10 because the SNP allele associations with lower BMI were mainly driven by females. The most recent GWAS for BMI [30] revealed that all SNPs identified in the look-up of AN have now become genome-wide significantly associated with lower BMI.
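The logic of such a look-up can be sketched as follows. This is an illustrative sketch only and not the procedure of the cited studies: the SNP identifiers, P-values, and effect sizes are made up, the function and field names are hypothetical, the Bonferroni correction for the number of looked-up SNPs is an assumption, and effect sizes in both data sets are assumed to refer to the same effect allele.

```python
def look_up(source_gwas, target_gwas, top_n=1000, alpha=0.05):
    """Look up the top_n SNPs of source_gwas (lowest P-values) in target_gwas.

    Returns the SNPs that pass a Bonferroni-corrected threshold in the target
    GWAS (assumption: correction for the number of SNPs actually looked up).
    """
    top_snps = sorted(source_gwas, key=lambda row: row["p"])[:top_n]
    target_by_snp = {row["snp"]: row for row in target_gwas}
    found = [row for row in top_snps if row["snp"] in target_by_snp]
    if not found:
        return []
    threshold = alpha / len(found)

    hits = []
    for row in found:
        target_row = target_by_snp[row["snp"]]
        if target_row["p"] <= threshold:
            hits.append({
                "snp": row["snp"],
                "p_source": row["p"],
                "p_target": target_row["p"],
                # Opposite signs: the allele raising the source-trait risk
                # lowers the target trait (same effect allele assumed).
                "opposite_direction": row["beta"] * target_row["beta"] < 0,
            })
    return hits


# Made-up summary statistics for illustration only.
an_gwas = [
    {"snp": "rsA", "p": 1e-6, "beta": 0.10},
    {"snp": "rsB", "p": 3e-5, "beta": 0.08},
    {"snp": "rsC", "p": 0.4, "beta": 0.01},
]
bmi_gwas = [
    {"snp": "rsA", "p": 1e-4, "beta": -0.02},
    {"snp": "rsB", "p": 0.3, "beta": 0.01},
    {"snp": "rsC", "p": 1e-9, "beta": -0.03},
]

hits = look_up(an_gwas, bmi_gwas, top_n=2)
for hit in hits:
    print(hit)
```

Because only a limited number of SNPs is tested in the target GWAS, the multiple-testing burden is far smaller than in a genome-wide scan, which is why a look-up can reveal shared loci that neither GWAS detects on its own.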

Epigenetics
Hübel et al. [31] recently reviewed the literature on epigenetic findings in AN, considering global methylation as well as candidate gene studies and epigenome-wide association studies (EWASs). Global methylation studies yielded an inconclusive picture, with two studies reporting hypomethylation, one study reporting hypermethylation, and a fourth study finding no evidence of methylation changes in AN in comparison with healthy controls. In candidate gene studies, most regions were only profiled once. Moreover, Hübel et al. [31] identified significant methodological drawbacks of the candidate gene studies conducted, for example, the inclusion of acutely ill as well as recovered AN patients, which introduces significant heterogeneity through confounding factors that substantially affect DNA methylation. EWASs examining CpG sites have identified a differentially methylated site annotated to TNXB, which could not be replicated when stringent corrections for multiple testing were applied. Thus, at the moment, there is no evidence of a distinct epigenetic pattern in AN. Epigenetic studies face several challenges, including the necessity of rigorous control of confounding factors and the tissue specificity of epigenetic patterns. In addition, like those from GWAS, findings from EWASs do not by themselves imply causality. Overcoming these obstacles in epigenetic research by comprehensive study designs probing different tissues, by verifying results with new genetic approaches such as Mendelian randomization, which allows for causal inference, and by increasing sample sizes will provide additional and valuable insights into the (epi)genetics of AN.

Clinical implications
Even though this is not yet within reach, insights into the genetics of AN may serve its primary or secondary prevention as well as its targeted treatment [13]. However, even today, findings on the heritability of AN from formal genetic studies make it possible to inform patients about the significant genetic background of AN as part of a treatment schedule. Such information may result in relief, but may likewise cause an unintended aggravation of psychological distress by, for example, inducing self-blame in parents whose offspring is affected by AN [4]. Thus, it is essential to convey correct, neutral information on the genetic aspects of AN. Only recently, Bulik et al. [4] provided guidance for clinicians on genetic counseling of patients affected by EDs. Summarizing their recommendations, Bulik et al. [4] point out that it is crucial to convey that EDs are not caused by either genetic or environmental factors alone, but that a vulnerability to develop an ED may be inherited. In conveying this information, they recommend emphasizing that even a (significant) genetic risk for an ED may be mitigated by environmental factors that can be altered significantly by the patient with an ED, even though perfection in handling environmental aspects is neither necessary nor possible.
Compliance with ethical guidelines: For this article, no studies with human participants or animals were performed.
Conflict of interest: R. Hirtz and A. Hinney declare that they have no competing interests.