Histone proteomics implicates H3K36me2 and its regulators in mouse embryonic stem cell pluripotency exit and lineage choice

Objectives: Gene expression changes during embryonic stem cell (ESC) differentiation are regulated by epigenetic mechanisms. Understanding these can help uncover how cell fate decisions are made during early embryonic development. Comparison of chromatin of ESCs with lineage-committed cells can implicate chromatin factors functional in exit from pluripotency and the choice of proper lineages. Therefore, we quantitatively analyzed histone modifications in mouse ESC differentiation towards neuroectoderm and endoderm. Methods: We cultured mouse ESCs (mESCs) and differentiated them towards neuroectoderm or endoderm lineages and performed mass spectrometry on total histones. Subsequent Western blots verified significantly altered H3K36me2. RT-qPCR analyses showed changes in H3K36-specific methyltransferases, demethylases and readers at mESC stage or during neuroectoderm/endoderm commitment. Results: We present quantitative histone modification levels in mESCs and lineage-committed cells. H3K36me2 increased specifically in neuroectoderm compared to mESCs or endoderm-committed cells. Regulation of H3K36 methylation might have a role in pluripotency exit and/or differentiation. Nsd2, Dnmt3b and Zmynd11 increased during differentiation regardless of lineage. Conversely, mESCs had higher Kdm4c and Msh6 expression than differentiated cells. Comparing neuroectoderm and endoderm-committed cells, we revealed that Nsd1, Setd5 and Dnmt3a had lineage-specific expression patterns. Conclusions: Our results show quantitative changes in histone modifications during mESC lineage commitment and implicate H3K36me2 regulation in not only pluripotency exit but also lineage choice. Its regulatory proteins show stage (mESC vs. committed) or lineage (neuroectoderm vs. endoderm) dependent expression changes. Further work will be needed to discover their possible involvement in cell fate decisions and their target genes.


Introduction
Embryonic stem cells (ESCs) are derived from the inner cell mass of the pre-implantation blastocyst. Their unlimited self-renewal capacity and pluripotency make ESCs valuable tools for regenerative medicine and in vitro models of early embryonic development. The pluripotent state is maintained by coordinated regulation of the chromatin environment and transcription factors (TFs) such as OCT4, SOX2, NANOG and C-MYC [1,2]. During differentiation, the pluripotency network is suppressed together with the genes responsible for alternative cell fates, while the genes of the desired cell fate are activated [1].
Eukaryotic chromatin is composed of nucleosomes where DNA is wrapped around histone octamers. Each nucleosome contains two copies of core histones H2A, H2B, H3 and H4. In addition, the linker histone H1 aids the further compaction of chromatin. Several lysine, arginine and serine residues in the N-terminal tails of H3 and H4 are available for post-translational modifications such as methylation, acetylation and phosphorylation [3]. Together with other epigenetic mechanisms like DNA methylation and chromatin remodeling, histone modifications regulate gene expression by changing the chromatin environment and through recruitment of other effector proteins [1].
Methylation of the lysine 36 residue of H3 is one of the histone modifications involved in various processes such as regulation of gene expression, DNA damage repair pathways, DNA methylation, repression of cryptic transcription and alternative splicing control [4]. The abundance and distribution of different levels of H3K36 methylation are controlled by the opposing activities of methyltransferases and demethylases. In mammalian cells, SETD2 is primarily responsible for H3K36me3, while a highly homologous protein, SETD5, was recently reported to also perform trimethylation of H3K36 [5]. Lower levels of H3K36 methylation are catalyzed by a distinct set of methyltransferases including NSD1-3, SETD3, ASH1L and SETMAR [4]. Lysine demethylases KDM2A and KDM2B are specific to H3K36me1 and H3K36me2 demethylation, while the KDM4A-D family is responsible for H3K36me2 and H3K36me3 demethylation [6]. Each level of H3K36 methylation has a distinct distribution and function. H3K36me1 is considered a precursor state for higher methylation levels and shows a broad distribution in the genome. H3K36me2 is found abundantly in intergenic and regulatory regions as well as towards the transcription start site, whereas H3K36me3 is concentrated towards the 3′ end of actively transcribed genes. Diverse roles have been linked to H3K36me2 and H3K36me3, exerted mainly through reader proteins that can recognize H3K36 methylation by their PWWP domains, bromodomains or Tudor domains [4]. Readers of H3K36 include proteins involved in DNA damage and repair response (MSH6, PHF1), de novo DNA methylation (DNMT3A, DNMT3B) and transcription and mRNA splicing regulation (MRG15, PSIP1, ZMYND11, PHF19) [7,8].
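The enzyme specificities listed in this paragraph can be summarized as a small lookup table. The following Python sketch is illustrative only: the dictionary layout and the `enzymes_for` helper are assumptions introduced here, "lower levels" of methylation is read as H3K36me1 and H3K36me2, and the entries only restate the writers and erasers named above (NSD1-3 expanded to NSD1, NSD2 and NSD3; KDM4A-D expanded to KDM4A through KDM4D).

```python
# Writers are keyed by the H3K36 methylation states they can deposit,
# erasers by the states they can remove, as stated in the paragraph above.
# Assumption: "lower levels" of methylation means me1 and me2.
LOWER_STATES = {"me1", "me2"}

WRITERS = {
    "SETD2": {"me3"},
    "SETD5": {"me3"},
    "NSD1": LOWER_STATES,
    "NSD2": LOWER_STATES,
    "NSD3": LOWER_STATES,
    "SETD3": LOWER_STATES,
    "ASH1L": LOWER_STATES,
    "SETMAR": LOWER_STATES,
}

ERASERS = {
    "KDM2A": {"me1", "me2"},
    "KDM2B": {"me1", "me2"},
    "KDM4A": {"me2", "me3"},
    "KDM4B": {"me2", "me3"},
    "KDM4C": {"me2", "me3"},
    "KDM4D": {"me2", "me3"},
}


def enzymes_for(state):
    """Return (writers, erasers) acting on one H3K36 methylation state."""
    writers = sorted(name for name, states in WRITERS.items() if state in states)
    erasers = sorted(name for name, states in ERASERS.items() if state in states)
    return writers, erasers


writers_me2, erasers_me2 = enzymes_for("me2")
print("H3K36me2 writers:", ", ".join(writers_me2))
print("H3K36me2 erasers:", ", ".join(erasers_me2))
```

Querying `enzymes_for("me2")` gathers in one place the candidate regulators whose expression is examined later in the text.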
Exit from pluripotency and commitment to specific lineages require precise gene expression changes regulated by chromatin. However, it is not clear how histone modifications globally change in this narrow time window. We hypothesized that we can infer chromatin regulators of lineage choice by comparison of histone modifications at pluripotent mouse ESC (mESC) and early lineage commitment states. Semi-quantitative and single-target methods such as Western blot or immunofluorescence experiments lack the quantitative aspect required for accurate comparison. Therefore, mass spectrometric analysis of histone modifications at mESC, endoderm or neuroectoderm lineage commitment stages was employed, with the focus directed towards significantly altered histone modifications that correlate with mESC differentiation. The results showed a prominent increase in H3K36me2 methylation in neuroectoderm commitment compared to mESCs, but not in endoderm commitment. Then, mRNA levels of H3K36 methyltransferases and demethylases as well as H3K36 reader proteins were measured during neuroectoderm and endoderm commitment to provide a more comprehensive perspective on possible processes H3K36 methylation is involved in. The results implicate regulation of H3K36me2 as a critical step for not only pluripotency exit but also for lineage choice.
Materials and methods

mESC growth and differentiation
mESCs were cultured in serum-free conditions prior to neuroectoderm differentiation as previously described [9]. mESCs were differentiated into neuroectoderm following a monolayer adherent differentiation protocol [10] with minor modifications as previously described [11]. For RNA isolation, 0.5 × 10^6 cells were collected and stored in TRIzol™ (15596018; Thermo Fisher Scientific) at −20°C. Endoderm differentiation was performed as previously described [11]. For histone extraction, 2 × 10^6 cells were collected at the ESC state and on the fifth day of endoderm/neuroectoderm differentiation.

Histone acid extraction
Total histone proteins were isolated by acid extraction method as described [12]. Protein concentration was measured with Pierce™ BCA Protein Assay Kit (Thermo Fisher Scientific).

Histone proteomics
Cells collected as described in the mESC growth and differentiation section were sent to the Broad Institute Proteomics Platform. Histones were extracted and proteomic analysis was performed as described [13,14]. Results were normalized to total histone in the sample and to the unique normalization peptide of the histone on which the modification is found (the H3K4me0 peptide for normalization of H3 modifications, for example) [13]. The results are provided as a Supplementary File and were visualized in GraphPad Prism (Supplemental Figure 1).

RNA isolation, cDNA conversion and RT-qPCR
Total RNA was isolated by using Qiagen RNeasy Plus Micro Kit (Qiagen, Valencia, CA, USA) following the provided protocol. 500 ng RNA was converted to cDNA using iScript™ cDNA Synthesis Kit (Bio-Rad Laboratories, Hercules, CA, USA). Information about primer sequences can be found in Table 1.

Statistical analyses
Data normality and variance homogeneity were assessed in R with the Shapiro-Wilk and Levene's tests, respectively. GraphPad Prism 9 was used to generate graphs and to perform multiple unpaired Student's t-tests. The Spearman correlation test was performed in R to evaluate the relationship between H3K36me2 levels (quantified by Western blot) and the expression of selected genes at the mESC state and on the fifth day of neuroectoderm differentiation. Correlation coefficients and p-values are shown in Table 2.
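As an illustration of the correlation step, the following minimal Python sketch computes a Spearman coefficient as the Pearson correlation of average ranks. It is not the R code used in this study; the function names and the example values are hypothetical and are not data from this work, and no p-value is computed.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical values only (relative H3K36me2 signal and relative mRNA level).
h3k36me2 = [1.0, 1.4, 2.1, 2.9]
gene_decreasing = [1.0, 0.8, 0.5, 0.3]
gene_mostly_increasing = [1.0, 1.3, 1.2, 2.0]

print(spearman_rho(h3k36me2, gene_decreasing))
print(spearman_rho(h3k36me2, gene_mostly_increasing))
```

Because the coefficient depends only on ranks, it captures monotonic trends without assuming a linear relationship between modification level and mRNA level.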

H3K36 dimethylation levels are increased during early stages of neuroectoderm differentiation
The aim was to infer which chromatin factors play a role in the mESC exit from pluripotency and the choice between endoderm and neuroectoderm lineages. To that end, histone modification changes were quantitatively catalogued through proteomic analysis of histone proteins in wild-type mESCs and cells differentiated towards endoderm or neuroectoderm lineages. The purpose of this method is to correctly identify the chemical modifications and the amino acids they are linked to in histones. Histone proteomics is technically challenging due to the small size of histones and their abundance of lysines and arginines, which are the sites generally used for cleavage reactions prior to mass spectrometry. Chemical modifications at these residues might alter cleavage efficiency or might be labile under the sample preparation protocol. To ensure success, the analysis was performed in the proteomics facility of the Broad Institute that specializes in histone proteomics.

Table 1. Primer sequences (columns: Gene, Forward, Reverse).

Sezginmert and Terzi Cizmecioglu: Histone proteomics of mESC lineage commitment

As mESCs differentiate, they lose the activity of the pluripotency network that keeps them at the stem cell state and start expressing lineage-specific TFs dependent upon intra- and extracellular signals. These pioneer TFs then start a cascade of cell character change through activation of several downstream TFs and eventually lead to differential molecular organization and more mature cell types. The earliest emergence of lineage-specific TFs was focused on, with endoderm-committed cells collected when Foxa2 TF is first expressed, and neuroectoderm-committed cells collected when Sox1 is first expressed. These markers were identified by others as the first transcription factors that are expressed and lead the differentiation towards the respective lineage, and they have been successfully used to study mESC early differentiation [10,14,16-20]. Monitoring for the earliest signs of lineage-specific marker expression helped focus on a tight window of change, from pluripotent state to lineage commitment. This approach can highlight the earlier and more direct histone modification alterations that guide lineage choice.
The abundance of histone peptides in each sample was first normalized to total histone level and then individually to the relevant histone level. Since the growth conditions of mESCs differ for endoderm or neuroectoderm commitment, mESC samples were collected independently for these conditions and the data for differentiated cells were normalized to their respective mESC samples.
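The comparison against matched controls can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the pipeline of the proteomics facility: the function names and intensity values are hypothetical, the total-histone scaling step is omitted for brevity, and H3K4me0 is used as the reference peptide only because it is the example given for H3 modifications in the Methods.

```python
from math import log2


def relative_to_reference(intensities, reference_peptide):
    """Express each peptide intensity relative to a reference peptide
    measured in the same sample."""
    reference = intensities[reference_peptide]
    if reference <= 0:
        raise ValueError("reference peptide intensity must be positive")
    return {
        peptide: value / reference
        for peptide, value in intensities.items()
        if peptide != reference_peptide
    }


def log2_change_vs_matched_mesc(differentiated, matched_mesc):
    """Compare a differentiated sample with the mESC sample that was grown
    for the same lineage protocol (its matched control)."""
    shared = differentiated.keys() & matched_mesc.keys()
    return {
        peptide: log2(differentiated[peptide] / matched_mesc[peptide])
        for peptide in sorted(shared)
    }


# Hypothetical raw intensities, not values from this study.
mesc_for_neuroectoderm = {"H3K4me0": 100.0, "H3K36me2": 20.0}
neuroectoderm_day5 = {"H3K4me0": 50.0, "H3K36me2": 40.0}

mesc_rel = relative_to_reference(mesc_for_neuroectoderm, "H3K4me0")
ne_rel = relative_to_reference(neuroectoderm_day5, "H3K4me0")
change = log2_change_vs_matched_mesc(ne_rel, mesc_rel)
print(change)
```

Keeping a separate mESC control per lineage protocol means that differences in growth conditions are not mistaken for differentiation effects.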
No significant alterations in H2A or H2B modification levels were observed upon endoderm or neuroectoderm differentiation (Supplemental Figure 1A and B). Acetylated H4 levels decreased upon mESC differentiation (Supplemental Figure 1C and D), consistent with a role of histone deacetylases at pluripotency exit [14]. This study is focused on H3 modifications since they showed a larger overall change upon differentiation and some offered lineage-specific effects (Figure 1A-D).
H3S10 phosphorylation, a marker for active mitosis [21], was abundantly found in ESC state compared to differentiated states (Figure 1C and D), consistent with rapid cycling of mESCs. Elevated H3K9 and H3K36 methylation levels were observed in a neuroectoderm differentiation specific manner (Figure 1A, C). Considering the wider range of difference observed for H3K36 methylation, this modification was selected as the focus of this study. The increase in H3K36 dimethylation (H3K36me2) was more pronounced than that in H3K36 trimethylation (H3K36me3), and the profile was not affected by the presence or absence of H3K27 acetylation or monomethylation (Figure 1A). To validate these findings with Western blot, mESCs grown in serum-free conditions were differentiated towards neuroectoderm lineage [11]. Morphological changes associated with differentiation were observed (Figure 2A), where the round and three-dimensional mESC colonies gradually flattened and assumed epithelial-like morphology. The success of differentiation was verified with RT-qPCR analysis of key pluripotency markers (Nanog, Pou5f1 (Oct4); Figure 2B) and neuroectoderm lineage specific genes (Sox1, Pax6, N-cadherin; Figure 2C). Consistent with previous literature [19], the level of the earliest neural lineage marker Sox1 peaked on the third day of neuroectoderm differentiation, while the levels of its downstream targets Pax6 and N-cadherin showed a progressive increase through day 5 of differentiation, confirming functional outcome of SOX1 expression. These trends were accompanied by a sharp decrease in the expression of both pluripotency markers, Nanog and Pou5f1. Western blot analyses of histone extracts confirmed that the H3K36me2 level is substantially elevated on the fifth day of neuroectoderm differentiation compared to mESC state, while the H3K36me3 level remained unchanged (Figure 2D).
However, both H3K36me2 and H3K36me3 levels remain fairly similar during endoderm differentiation (Figure 2E), suggesting a unique role of H3K36me2 during neuroectoderm lineage choice rather than at pluripotency exit.

Transcriptional levels of H3K36 methyltransferases during the course of neuroectoderm differentiation
The level of H3K36 dimethylation is regulated by methyltransferases, through the addition of up to two methyl groups to H3K36, and by demethylases, through the demethylation of H3K36me3 [6]. First, the expression levels of methyltransferases that add up to two methyl groups to H3K36 were investigated, namely Nsd1-3, Setd3, Ash1l, Setmar and Smyd2.
RT-qPCR analyses through the neuroectoderm differentiation time course revealed a significant decrease in Nsd1 expression level (Figure 3A) and an increase in Nsd2 level (Figure 3B). A strong negative correlation was found between Nsd1 and H3K36me2 level and a significant positive correlation was observed between Nsd2 and H3K36me2 (Table 2). Nsd3 level was stable through neuroectoderm differentiation (Figure 3C). No significant trends were observed in the expression levels of Ash1l, Setmar and Smyd2 through the neuroectoderm differentiation time course (Figure 3D-F).
We next focused on trimethyltransferases (Setd2 and Setd5) for a possible decrease in their levels, which may lead to accumulated H3K36me2 [5,22]. Although no pronounced trend was observed in Setd2 expression (Figure 3G), Setd5 displayed a downward trend (Figure 3H) during neuroectoderm differentiation. Since H3K36me2 is the substrate for trimethyltransferases, a decrease in Setd5 might be associated with the H3K36me2 accumulation during neuroectoderm differentiation (Table 2). Expression of another SET domain containing protein, Setd3, showed a less pronounced downward trend (Figure 3I).

Transcriptional levels of H3K36 demethylases during the course of neuroectoderm differentiation
To obtain a more comprehensive perspective, H3K36 demethylases were investigated next. H3K36 demethylation was reported to have crucial roles in mESC self-renewal [23]. KDM4B (JMJD2B) and KDM4C (JMJD2C) have cell- and context-dependent functions and act as H3K36me2/3 and H3K9me2/3 demethylases. KDM2A and KDM2B are demethylases specific to H3K36me1/2 [24].
Although no significant trend in Kdm2a/b and Kdm4a/b levels was found during the course of neuroectoderm differentiation (Supplemental Figure 2A-D), Kdm4c level decreased through neuroectoderm commitment (Supplemental Figure 2E). This decrease might be associated with decreased H3K36me2 demethylation activity, which might explain the accumulation of H3K36me2 observed in neuroectoderm differentiation (Table 2).

Transcriptional levels of H3K36me2/3 readers during the course of neuroectoderm differentiation
Histone methylations exert their function through reader proteins that can specifically recognize the methylated histones and recruit downstream proteins [3].
The expression levels of H3K36me3 readers PHF1 (PCL1) and PHF19 (PCL3) were quite low in mESC state and during neuroectoderm differentiation (data not shown) [4]. DNMT3A and DNMT3B are de novo DNA methyltransferases that establish the DNA methylation patterns during early embryonic development [25]. They contain H3K36me2/3 reader domains [26]. A prominent decrease was observed in Dnmt3a (Supplemental Figure 3A), while Dnmt3b expression seemed to peak on the third day of differentiation (Supplemental Figure 3B).
H3K36me2/3 reader MSH6 is a part of the DNA mismatch repair (MMR) machinery [27]. Msh6 expression declined through neuroectoderm differentiation (Supplemental Figure 3C), which negatively correlated with the change in H3K36me2 level (Table 2). ZMYND11 is involved in regulation of transcriptional elongation and mRNA splicing. It specifically recognizes the H3.3K36me3 variant [8], enriched in actively transcribed regions. A progressive increase observed in Zmynd11 (Supplemental Figure 3D) implicates a possible role in neuroectoderm differentiation.
Some degree of increase in Psip1 (Supplemental Figure 3E) and decrease in Mrg15 (Supplemental Figure 3F) were observed, though these changes were not statistically significant.

Transcriptional levels of selected genes during the course of endoderm differentiation
The RT-qPCR analyses revealed expression level changes of genes related to H3K36 methylation during early neuroectoderm lineage commitment. To identify whether these changes are unique to the neuroectoderm lineage or are instead associated with the exit from pluripotency, we investigated the expression of these genes during endoderm differentiation using transcriptomics data (unpublished data). Visualization of RNA-seq data through IGV shows a gradual decrease in pluripotency TFs Oct4, Sox2, Nanog and Klf4 (Supplemental Figure 4A). As the earliest marker of definitive endoderm [16], Foxa2 is expressed at day 3 of differentiation, along with meso/endoderm lineage TF Gsc (Supplemental Figure 4B). Downstream targets of FOXA2 (Sox17, Gata4, Gata6 and Hnf4a) emerge at differentiation day 4 (Supplemental Figure 4B), confirming functional outcome of Foxa2 expression. IGV (Supplemental Figure 4C) and differential expression (Supplemental Figure 4D) analyses showed that, similar to neuroectoderm, Nsd2 and Zmynd11 expression increases significantly in endoderm lineage. Kdm4c and Msh6 expression levels decreased significantly in endoderm as they did in neuroectoderm lineage. However, the decrease in Nsd1 level is unique to neuroectoderm differentiation (Figure 3A), with no corresponding significant change in endoderm differentiation (Supplemental Figure 4C and D). Dnmt3b level increased and remained high during the course of endoderm differentiation (Supplemental Figure 4C and D). Thus, this protein may have functions at different time points for endoderm and neuroectoderm differentiation. While Setd5 was trending downward during neuroectoderm differentiation (Figure 3H), an upward trend was observed in endoderm differentiation (Supplemental Figure 4C). Since H3K36me2 is used as a substrate for trimethylation by SETD5, such a unique decrease encountered in the neuroectoderm differentiation time course might explain the accumulation of H3K36me2.
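The stage versus lineage reasoning in this section can be made explicit with a short Python sketch. The table below only restates the directions of change reported in the text for six genes; the dictionary layout, the labels and the `classify` helper are assumptions introduced for illustration. Dnmt3a and Dnmt3b are left out because their profiles are described in terms of timing rather than a single direction.

```python
# Direction of change relative to mESCs, as reported in the text above.
REPORTED_TRENDS = {
    "Nsd2": {"neuroectoderm": "up", "endoderm": "up"},
    "Zmynd11": {"neuroectoderm": "up", "endoderm": "up"},
    "Kdm4c": {"neuroectoderm": "down", "endoderm": "down"},
    "Msh6": {"neuroectoderm": "down", "endoderm": "down"},
    "Nsd1": {"neuroectoderm": "down", "endoderm": "unchanged"},
    "Setd5": {"neuroectoderm": "down", "endoderm": "up"},
}


def classify(trend):
    """Label a gene by comparing its direction of change in the two lineages."""
    neuro = trend["neuroectoderm"]
    endo = trend["endoderm"]
    if neuro == endo:
        # Same behaviour in both lineages: tied to leaving the mESC state.
        return "unchanged" if neuro == "unchanged" else "stage-dependent"
    # Different behaviour between lineages: tied to lineage choice.
    return "lineage-dependent"


labels = {gene: classify(trend) for gene, trend in REPORTED_TRENDS.items()}
for gene, label in sorted(labels.items()):
    print(f"{gene}: {label}")
```

Under this simple rule, Nsd2, Zmynd11, Kdm4c and Msh6 fall into the stage-dependent group, while Nsd1 and Setd5 fall into the lineage-dependent group, matching the interpretation given in the text.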

Discussion
The focus of this study was histone modification changes during a tight window of ESC differentiation with large gene expression changes. This approach made it possible to pinpoint which histone modifications are the earliest targets for loss of pluripotency and lineage choice. No significant alterations were observed in H2A or H2B modification levels upon endoderm or neuroectoderm differentiation (Supplemental Figure 1A and B), suggesting that the modifications in these histones might not play a major role in ESC fate decisions. It is also possible that alterations in H2A/H2B modifications follow those that happen in H3 and H4 and that the early differentiation time frame was not suitable to observe such changes. However, it should be noted that the absence of a global change in H2A/H2B modifications in mass spectrometry analysis does not preclude possible gene-specific changes. Alterations in acetylated H4 level were found during mESC differentiation (Supplemental Figure 1C and D). Consistently, we previously reported that histone deacetylase complexes are critical at pluripotency exit [14].
The most pronounced differences were noted in H3 modifications (Figure 1), especially in H3K36 methylations (Figure 1A and B). Thus, H3K36 methylations and related methyltransferases, demethylases and readers became the central focus of this study. The data presented here help explain the chromatin-based mechanisms regulating the mESC state and the exit from pluripotency, as well as the lineage choice.
H3K36me3 can be recognized by MSH6, a part of the MMR pathway [27]. H3K36me3 is deposited in gene bodies of actively transcribed genes and H3K36me3-mediated MMR can preferentially protect actively transcribed genes from mutations induced by mismatches [28]. Rapidly dividing ESCs can have mismatches at higher frequency [29], establishing a critical need for proper MMR. Consistently, Msh6 is expressed at higher levels in mESCs compared to differentiated cells (Supplemental Figures 3C, 4C-D).
Histone variant H3.3K36me3 marks actively transcribed genes and can be recognized by ZMYND11 [8]. Zmynd11 expression increased with differentiation towards both lineages (Supplemental Figures 3D, 4C-D), indicative of a role not in lineage choice but in the exit from pluripotency. ZMYND11 is involved in transcriptional repression of pluripotency factor c-Myc [8]. This implies a role of ZMYND11 in the suppression of the pluripotency network at the exit from pluripotency stage.
H3K36me3 prevents cryptic intragenic transcription in mammalian cells [30] and ensures proper transcription elongation. It was recently shown that H3K36me3 reader MRG15 recruits H3K4-demethylase KDM5B to gene bodies, decreasing the possibility of cryptic initiation [31]. KDM5B target genes include self-renewal-associated genes and its knockdown increases the expression of lineage-specific genes [31], presumably through shutting down the pluripotency network. Consistently, high Mrg15 expression in mESCs was observed with a downward trend during the course of differentiation (Supplemental Figures 3F, 4C-D).
H3K36me2/3-specific demethylase KDM4C recruits the PRC2 complex, which adds the repressive H3K27me3 modification. The targets of KDM4C include the neural differentiation marker gene Pax6 and the mesendoderm differentiation marker Brachyury [23]. Thus, our observation of decreasing Kdm4c expression level in both lineage differentiation processes indicates a role for H3K36 methylation and KDM4C in differentiation.
Nsd1 and Nsd2 methyltransferases showed contrasting trends during the course of neuroectoderm differentiation (Figure 3A and B), supporting their unique roles [24]. NSD1 represses differentiation-related genes in mESCs through recruitment of HDAC1 at their enhancers [32], which can explain high Nsd1 expression in mESCs compared to neuroectoderm differentiation (Figure 3A). NSD1 was previously also linked to limiting the accumulation and expansion of H3K27me3 signal at PRC2 target gene promoters and gene bodies [33,34]. Since this was observed globally rather than at a select group of genes (pluripotency or developmentally related), it is hard to argue whether this function of NSD1 is linked to mESC differentiation or lineage choice. Additionally, since NSD1 is the main methyltransferase responsible for intergenic H3K36me2, the expression trend implies that the H3K36me2 increase could be mainly intragenic rather than intergenic. NSD1-mediated H3K36me2 is necessary for DNMT3A recruitment and DNA methylation in intergenic regions [34]. The downward trend observed in neuroectoderm differentiation for Dnmt3a (Supplemental Figure 3A) also supports that the increase in H3K36me2 may be primarily intragenic. Intragenic H3K36me2 is found towards the transcription start site of genes and overlaps with promoter-proximal pausing, a mechanism proposed to tune gene expression level [35]. This might indicate a possible link between H3K36me2 accumulation in neuroectoderm differentiation and promoter pausing.
The progressive increase in Nsd2 expression is observed in both neuroectoderm and endoderm lineages. Since diseases associated with NSD2 have a strong neurodevelopmental tie [36], the activity and function of NSD2 might be different in the two lineage differentiations. NSD2 can also dimethylate H4K20, H3K4 and H3K27 along with non-histone proteins related to DNA repair, any of which might be more critical in endoderm differentiation.
Establishment of DNA methylation patterns by H3K36me2/3 readers DNMT3A and DNMT3B is crucial for proper embryonic development. Although similar in biochemical function, knockout studies and disease associations suggest DNMT3A/B might have distinct roles [37-39]. Dnmt3a and Dnmt3b expression profiles (Supplemental Figure 3A and B) are suggestive of their unique functions through neuroectoderm differentiation.
Alternative splicing (AS) plays a critical role in lineage specification [7]. This is particularly important for neuronal development and the central nervous system [40]. H3K36 readers PSIP1/SRSF1 and MRG15/PTBP1 regulate exon inclusion/exclusion. During hESC differentiation, an H3K36me3-mediated AS switch using PSIP1 helps to shut down the pluripotency network [7].
Here, we present quantitative histone modification alterations through early mESC lineage commitment towards neuroectoderm and endoderm. We also lay out the mRNA level expression of H3K36 methyltransferases, demethylases and reader proteins in mESC stage and during neuroectoderm or endoderm lineage commitment. Our data point to a role of H3K36 methylation in regulating early stages of embryonic development with a special role of H3K36me2 in neuroectoderm differentiation. We found that Nsd1 and Setd5 methyltransferases were differentially expressed between lineages, establishing a need for deeper study of their possible roles in mESC differentiation. Analysis of chromatin reader proteins revealed similar expression changes independent of lineage, underlining a need for detailed study of their mechanisms of action during mESC lineage commitment.
Coordination Unit projects (GAP-108-2021-10585 and TEZ-YL-108-2022-11040). Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.