Role of phase partitioning in coordinating DNA damage response: focus on the Apurinic Apyrimidinic Endonuclease 1 interactome

Liquid-liquid phase separation (LLPS) is a way to concentrate biochemical reactions while excluding noninteracting components. Disordered domains of proteins, as well as interaction with RNA, favor condensation but are not mandatory for modulating this process. Recent insights into phase-separation mechanisms point to new models that could explain how cells coordinate the DNA damage response (DDR), conferring both spatial and temporal fine regulation. APE1 is a multifunctional protein belonging to the Base Excision Repair (BER) pathway, bearing additional ‘non-canonical’ DNA-repair functions associated with processes like RNA metabolism. Recently, it has been highlighted that several DNA repair enzymes, such as 53BP1 and APE1, are endowed with RNA binding abilities. In this work, after reviewing the recent literature supporting a role of LLPS in DDR, we analyze, as a proof of principle, the interactome of APE1 using a bioinformatics approach to look for clues of LLPS in BER. Some of the APE1 interactors are associated with cellular processes in which LLPS has been either proved or proposed and are involved in different pathogenic events. This work might represent a paradigmatic pipeline for evaluating the relevance of LLPS in DDR.


Phase separation in nuclear organization and functions related to DNA damage response
Nuclear dynamics, among other crucial cellular processes, has been recently established to be tuned, at least in part, by the widespread phenomenon of phase separation [1]. After a decade of active research, it is now accepted that this demixing process is a thermodynamically driven phenomenon, giving rise to a variety of dynamic bodies (i.e., biomolecular condensates, BMCs), primarily composed of nucleic acids and proteins [2] interacting through quinary interactions [3], frequently involving unstructured portions of proteins, especially intrinsically disordered regions (IDRs) [4,5]. These components are thought to be under the control of effective regulatory systems, through the action of a number of cellular factors which precisely tune the assembly and the disassembly of these bodies via post-translational modifications (PTMs), thus promoting a localized induction of condensates [6]. Some examples of nuclear processes proposed to be shaped by phase separation are heterochromatin domain formation, transcription, nucleolar metabolism, and DNA damage response (DDR). Indeed, it has been shown that chromatin structure dynamics may be regulated through phase separation of several proteins (e.g., HP-1 and BRD4) involved in the reading of epigenetic marks on histone tails [7][8][9]; histone tail-DNA interactions might have a role in this process as well [9]. An outstanding example of phase separation in the nucleus is represented by the nucleolus, the cellular body devoted to ribosome biogenesis.
In recent years, it has been demonstrated that nucleoli may arise by phase separation induced by transcription of rRNAs from their genomic loci [10,11] and that their three layers constitute an example of nested phase-separated domains [12]. These domains, namely the fibrillar center, the dense fibrillar component and the granular component, each contribute to ribogenesis through different steps of rRNA maturation and ribosome assembly. Although they share rRNA as a major phase-separating agent and are in direct contact, they do not mix, and they preserve their functional specificity as a consequence of the biophysical properties of their components [12]. An additional process for which a phase-separation mechanism has been proposed is DNA transcription: recent studies have shed new light on the actual mechanism of recruitment of transcription factors, proposing cooperative kinetics to explain the effects driven by enhancers and super-enhancers, via demixing of the transcription factors themselves [13,14]. However, transcription and other processes claimed to be driven by phase separation (e.g., heterochromatinization) remain to be fully characterised because they differ from biomolecular condensates in some aspects, which are reviewed in [1]. In particular, some compartments with many phase-separating features do not strictly respect the canonical features of liquid biocondensates (namely the round shape, no shear elasticity, and internal dynamics), raising the question of whether phase separation could manifest in several different forms. For example, paraspeckles, although regarded as bodies that demix upon NEAT1 increase, show a one-axis preferential growth, unusual for LLPS-based granules.
Heterochromatic domains, instead, were proposed to form by phase separation because of their apparent ability to coalesce, to exclude inert probes, and to cause a density transition in HP-1 distribution; however, they were later shown to lose their round shape several cell cycles after phase separation occurred, which argues against the initial hypothesis [15].

Role of RNA in promoting the recruitment of DNA repair enzymes at the lesion site
RNA is an important element of biomolecular condensates, and some studies demonstrated its involvement in DNA repair. The first evidence of this involvement was provided by the DDR-related action of retrotransposons in yeast: indeed, it was shown that retrotransposon elements might replace homologous sequences and become integrated at the lesion site [16,17]. In [18], the authors suggested an additional link between retrotransposons and DDR: while reverse transcriptases could promote repair by canonical transcripts, integrases might promote cDNA insertion, and cDNA might act as a template to bridge the double-strand break (DSB), leading to repair by "in trans" or "in cis" mechanisms. Additionally, with regard to the repair of DSBs, it was recently suggested that Rad52 might promote transcript-dependent DSB repair through inverse strand exchange, likely followed by reverse transcription of a ssDNA overhang [19][20][21]. This perspective is supported by the mounting evidence that R-loop formation acts as a physiological regulator in the genome [22,23]. Notably, Rad52 (in yeast) and FUS (in human) were observed to contribute to the formation of a molecular biocondensate at DSB sites, carrying out different roles, namely organizing nuclear microtubule filaments [24], protecting the resected end of lesions and promoting DNA-damage signaling [25,26], as well as recruiting other DSB-repair-related enzymes [27]. Similarly, 53BP1, which is known to be important for DSB signaling and to affect the progression of the cell cycle, was demonstrated to localize to liquid compartments [25,26].
Recently, it was found that ncRNAs seem to play a critical role in the formation of a liquid compartment at the DSB site. In detail, a novel class of RNAs has been defined and named DDRNAs: they are produced from the processing of dilncRNAs (damage-induced long noncoding RNAs), which are transcribed at DSB foci in a bidirectional manner [28,29]. DDRNAs are guided to the lesion site by dilncRNAs, and both are thought to contribute to the recruitment of repair enzymes [28,30].
Additionally, examples of the active role of lncRNAs promoting, in trans, the recruitment of the DSB repair machinery in a demixing-mediated manner have been described. The lncRNA LINP1, for example, allows the formation of a liquid compartment where Ku70 and Ku80 can demix to effectively accomplish non-homologous end joining (NHEJ) repair [31]. In addition, other ncRNAs have also been associated with DDR signaling [32] and cell-wide effects [33,34].
A typical example showing the involvement of an RNA moiety in DDR is PARylation (and mono-ADP-ribosylation, MARylation), a post-translational modification consisting of the addition of single or multiple ADP-ribose molecules to both proteins and DNA [35]. These modifications are introduced by the PARP enzyme family and represent one of the main signals of genomic damage in cells [36,37]. In particular, PARP enzymes are now known to catalyze the addition of MAR or PAR moieties at single-strand break (SSB) and DSB loci, thus mediating the signaling of those damaging events [35,38]. These modifications are also known to direct the formation of damaged DNA-enriched compartments and to recruit demixing factors, like FUS [39,40]. To date, although it is known that: i) PARP interacts with several BER factors (e.g., XRCC1, POLβ, and LIG3); ii) PARP modulates the activity of glycosylases [41,42] and the 3'-exonuclease activity of APE1 [43]; and iii) PARP inhibition significantly hampers the efficiency of the BER pathway [44,45], there is no evidence for the formation of a demixed compartment hosting the BER-mediated DDR.

Interactomes of DNA repair enzymes: focus on RNA processing proteins
Along with the involvement of RNA molecules themselves, it is widely accepted that the interaction with RNA represents a key feature of most DDR enzymes, both in direct and indirect manners. It was shown that many RNA binding proteins (RBPs) are required to ensure proper production of DDR factors (as reviewed in [46]) via post-transcriptional regulation of their transcripts, allowing them to escape the general translational repression occurring upon DNA damage and thus indirectly influencing the repair process. RBPs were also shown to take part directly in DDR, since enzymes involved in mRNA and miRNA processing have been associated with DNA repair. For instance, RBM14 is an RBP involved in alternative splicing and is recruited to DSB sites via PARP1 [47,48]; likewise, HNRNPD is necessary for the DNA resection step in the homologous recombination pathway [49]. Helicases, for example DEAD-box helicases, are interesting RNA-interacting proteins involved in RNA metabolism [50,51] and in DDR [52,53], and some of them were shown to participate in demixing bodies [54,55].
Interestingly, the small non-coding RNA machinery, including DICER and DROSHA, is also important for DSB repair, since its products appear to be fundamental in recruiting some repairing factors [29,56].
These observations collectively suggest that a strong involvement in RNA metabolism is common among phase separating factors acting in DDR and that this might be a key feature to be investigated.

The Apurinic/Apyrimidinic Endonuclease (APE1) is a crucial BER enzyme
The Apurinic/Apyrimidinic Endonuclease (APE1) is a central enzyme in the BER pathway, acting as the main AP-endonuclease in mammals [57]. However, this enzyme was recently characterized as possessing many non-canonical functions (Figure 1) associated with RNA metabolism, including taking part in the biogenesis of ncRNAs (possibly through the interaction with DROSHA [58] or functionally interacting with other phase separating factors, like NPM1 during rRNA biogenesis) [59], as well as with several RNA species [60]. These novel functions, partly supported by the capability of APE1 to bind to different nucleic acids, considerably expand its functions toward the RNA world, which is clearly connected to LLPS. The ability of APE1 to bind and process RNA seems to be empowered by the disordered N-terminus of this protein, which is thought to be a recent evolutionary acquisition that is highly conserved in mammals, possibly constituting an important example of gain of function [61]. This domain, in fact, was found to be essential for APE1 recruitment to nuclear subcompartments, including the nucleolus [62], and for its interaction with other BER factors [63] and other partners like NPM1 [61], which, in recent years, has also been linked to several novel functions including LLPS [64].
It is still not known whether the unstructured domain of APE1 and its RNA-binding abilities might represent an evolutionary gain of function to promote BER recruitment and to coordinate the action of the different enzymes, considering that every intermediate reaction product (i.e., the abasic site generated by glycosylases, the nick generated by APE1, etc.) is even more toxic than the original lesion processed by BER itself. This is particularly important, given that APE1 is much more abundant than all the other BER proteins and, in tumor cells, it is highly overexpressed. Therefore, the assumption that APE1 is only required for DNA repair by BER is somewhat too limited.

Bioinformatics analysis of demixing proteins: the case study of the APE1 interactome suggests a novel hypothesis for triggering the Base Excision Repair pathway
The aforementioned features define APE1 as a reasonable candidate for a preliminary investigation questioning the involvement of an LLPS mechanism in its recruitment. Although there are other DDR-related, BER-involved proteins known for being central to the recruitment of other factors and even exhibiting multiple IDRs (e.g., XRCC1), APE1 represents a particularly interesting subject because of its role in RNA metabolism.
Here, we propose a bioinformatics approach that might be employed to obtain some useful insights into the LLPS world, defining a pipeline to help direct subsequent experimental activity. This pipeline employs the APE1 interactome we recently defined [65]. This pool of interactors was assessed by an unbiased pull-down approach, performed taking advantage of a FLAG-tagged APE1 recombinant form, followed by characterization of the protein complexes through an MS/MS approach. In that work, which allowed us to identify almost 500 APE1-PPIs, we intentionally used a non-targeted approach to identify components involved in direct APE1-protein interactions, as well as molecules whose interaction with APE1 is indirectly mediated by RNA/DNA or other proteins. This approach, very stringent in order to avoid misleading identification of protein complex elements, might enhance the identification of cofactors colocalizing with APE1 in biomolecular condensates, making this analysis more suited for BMC applications. In fact, the multiple biological functions attributed to APE1 and its localization in various subcellular districts further expand the list of its possible protein binding partners, whether deriving from direct or indirect interactions. The condition we observed for APE1 is similar to the one recently reported for two APE1-binding partners, namely XRCC6 and XRCC5, which have been similarly demonstrated to bind RNA/DNA and about 300 proteins [66], or previously described for proteins present in chaperone machineries [67]. Other examples of proteins having hundreds of interactors (as deduced by a single immunocapture experiment) are already present in the scientific literature [68][69][70]. At the same time, we were aware that the methodology we used lacks the specificity needed to ascertain direct PPIs. Nevertheless, this approach is similar to others already used in previous papers allowing the identification of original APE1-binding partners [59,71,72].
Finally, it should be noted that about 100 APE1-PPIs (among the ones reported in this study) were already demonstrated to be genuine APE1-PPIs in previous studies from our group or other research groups; important examples in this context are NPM1, SFPQ and hnRNPK, as proved by independent binding and functional assays [59,71,73,74].
[Figure caption fragment: ... [58,73,89]. The APE1 crystal structure bound to abasic DNA is displayed starting from the PDB deposited structure (6W0Q, [90]) and was modified using PyMOL software. Rainbow colors: APE1 sequence; black: substrate DNA.]
In the first step of our proposed bioinformatics analysis, we retrieved the disorder content of APE1-PPIs from the MobiDB database (version 3.1.0) [75], which integrates manual reviews and in silico predictions. Out of 515 interactors, we were able to retrieve data for about 350 proteins (for the remaining proteins, the disorder index was not computed). We examined the distribution of these values and most of the interactors were characterized by a low disorder content (below 0.2) (Figure 2A).
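As an illustration of this first step, the following minimal Python sketch bins per-protein disorder fractions and computes the share of interactors falling below 0.2. The protein identifiers, the disorder values and the function names are hypothetical placeholders, not data retrieved from MobiDB; proteins without a computed disorder index are represented as None and skipped.

```python
from collections import Counter

# Hypothetical disorder fractions (0-1) keyed by placeholder protein IDs.
# None marks a protein for which no disorder index was available.
disorder = {"P1": 0.05, "P2": 0.12, "P3": 0.18, "P4": 0.35, "P5": 0.72, "P6": None}


def disorder_histogram(values_by_protein, bin_width=0.1):
    """Count proteins per disorder-content bin, skipping missing values."""
    n_bins = round(1 / bin_width)
    counts = Counter()
    for value in values_by_protein.values():
        if value is None:
            continue
        # Rounding guards against floating-point artifacts at bin edges;
        # min() places a value of exactly 1.0 in the last bin.
        index = min(int(round(value / bin_width, 9)), n_bins - 1)
        counts[index] += 1
    return [counts.get(i, 0) for i in range(n_bins)]


def fraction_below(values_by_protein, cutoff=0.2):
    """Fraction of proteins with available data whose disorder is below cutoff."""
    values = [v for v in values_by_protein.values() if v is not None]
    return sum(v < cutoff for v in values) / len(values)


histogram = disorder_histogram(disorder)
low_disorder_share = fraction_below(disorder)
```

With real data, the dictionary would simply be filled from the disorder values exported for each interactor, and the histogram would correspond to the kind of distribution summarized in Figure 2A.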
Then, we collected the APE1-PPI data from PhaSePro [76], a manually curated collection of proteins characterized as demixing in vivo. The output consisted of seven APE1 interactors, namely APP, NPM1, LGALS3, HNRNPA1, FUS, SFPQ, and ESR1, fully characterized as demixing by ad hoc experiments. Their disorder content was marked as a reference in the general distribution (Figure 2A).
As a control, we also evaluated the disorder content of all the proteins profiled in the PhaSePro database for LLPS, focusing in particular on characterizing the minimum value required for partitioning and the dependence, if any, on demixing partners. Interestingly, we could define two different groups (Figure 2B): one, including the majority of the examined proteins, which showed a progressive increase in disorder content starting from about 0.2 and ramping up to 0.9, and a second, much smaller, centered on 0.05. To our surprise, the feature that mainly differentiated the latter was the requirement of a demixing partner, TIA1 representing the only exception; on the contrary, less than one-third of the proteins belonging to the first group behaved in the same way, with LAT curiously characterized by having the highest disorder content but also the necessity of a demixing partner.
Considering the two distributions, we noticed that all the reviewed demixing proteins known for not requiring additional partners were characterized by an internal disorder content greater than 0.15; thus, we focused on the APE1-PPIs having a disorder content above that threshold, defining a subset composed of 88 members. We compared them to entries in PhaSepDB [77], which aggregates a wide range of direct and indirect evidence of protein phase separation (e.g., fully demonstrated or just suggested by high-throughput data), defining a final set of 49 likely demixing interactors (Table 1).
To gain some insights into the biological processes involving these proteins, we performed a functional enrichment analysis [78] employing ClueGO [79], a Cytoscape [80] plugin that allows the use of different ontologies/databases, focusing on biological processes (Figure 3A) and intracellular localization (Figure 3B).
We first analyzed the set of 49 likely demixing interactors defined by PhaSepDB, and the results highlighted 34 significantly enriched terms associated with six major processes by ClueGO, as shown in Figure 3A. Interestingly, most of these terms were associated with gene expression and RNA processing, while the rest were associated with viral and telomeric regulation.
A second analysis of the same gene set took into consideration the intracellular localization. We obtained enriched terms related to euchromatin, spliceosome, and translation preinitiation complex, cellular compartments strictly linked to the metabolism of nucleic acids that, hence, might act as dynamic scaffolds for liquid-like structures (Figure 3B).
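The two-stage selection described above (a disorder threshold followed by a comparison with a phase-separation database) can be sketched as a filter and a set intersection. This is a minimal sketch under stated assumptions: the identifiers, the disorder values and the content of the database set are hypothetical, and only the 0.15 threshold is taken from the text.

```python
def select_likely_demixing(disorder_by_protein, phase_db_entries, threshold=0.15):
    """Return (above_threshold, likely_demixing) as two sets of protein IDs.

    above_threshold: proteins whose disorder content exceeds the threshold.
    likely_demixing: the subset also listed in the phase-separation database.
    """
    above_threshold = {
        protein
        for protein, value in disorder_by_protein.items()
        if value is not None and value > threshold
    }
    likely_demixing = above_threshold & set(phase_db_entries)
    return above_threshold, likely_demixing


# Hypothetical example data (not taken from MobiDB or PhaSepDB).
example_disorder = {"A": 0.05, "B": 0.16, "C": 0.40, "D": None, "E": 0.90}
example_phase_db = {"A", "C", "E", "Z"}

above, likely = select_likely_demixing(example_disorder, example_phase_db)
```

Returning both sets keeps the intermediate subset (88 members in the analysis above) available alongside the final one (49 members), so each filtering stage can be inspected separately.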
We repeated the same analysis on the seven interactors fully characterized as demixing by PhaSePro, adding APE1 as the ideal center of the functional network. Interestingly, one of the three enriched terms (Figure 4) pointed to the formation of amyloids, common hallmarks of neurodegenerative diseases, which have been related to liquid-demixing proteins [81,82].
Lastly, we compared the list of 49 interactors to the MSigDB database, a collection of gene sets co-expressed and/or involved in physiological and pathological processes (i.e., molecular signatures) obtained by a data mining approach that functionally complements the ontologies previously investigated using ClueGO [83,84]. For this investigation, the H, C4, C6 and C7 collections of MSigDB were chosen as terms of comparison. These signatures have different origins and meanings: the H collection is made of signatures characterizing well-defined biological processes, C4 gathers cancer-related signatures originated by data mining of large microarray data, C6 collects signatures related to pathways deregulated in cancer, while C7 collects signatures related to the immune system and its deregulation. A false discovery rate (FDR) threshold of 0.05 was set to establish significant results (Benjamini-Hochberg correction implemented by MSigDB). We took into account the top 50 enriched results (Table 2).
We obtained a significant association with different kinds of cancer signatures, namely liver, prostate, and hematological tumors; furthermore, additional signatures were also linked to the activation of PBMCs (peripheral blood mononuclear cells). Finally, enriched terms also pointed to general biological processes such as ribogenesis, protein biosynthesis, and mRNA splicing, consistent with the previous ClueGO results. These terms strongly suggest the existence of a relationship, both physical and functional, with: i) nucleic acids (especially RNA); ii) their metabolism, and iii) the compartments in which they accumulate.
We hypothesize that this connection might occur through direct or indirect recruitment to liquid demixing bodies, the impairment of which could be related to several pathological conditions. This particular association with nucleic acids might also represent evidence for a novel APE1 function related to phase separation: on this basis, further investigation is needed to elucidate the possible role of biocondensates in triggering the BER pathway, which could explain the role of these RNA-interacting partners and could also be of interest for establishing novel protocols for drug development.
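To make the significance criterion concrete, the sketch below computes overlap p-values between a query gene list and a few gene sets and then applies the Benjamini-Hochberg correction with an FDR threshold of 0.05. It assumes a one-sided hypergeometric over-representation test, which is an assumption of this sketch and not a detail stated above; the gene names, gene sets, universe size and function names are all hypothetical.

```python
from math import comb


def hypergeom_sf(k, universe, set_size, query_size):
    """P(overlap >= k) when query_size genes are drawn from a universe
    containing set_size members of the gene set."""
    upper = min(set_size, query_size)
    total = comb(universe, query_size)
    favourable = sum(
        comb(set_size, i) * comb(universe - set_size, query_size - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def enriched_sets(query, gene_sets, universe, fdr=0.05):
    """Names of gene sets whose BH-adjusted overlap p-value is below fdr."""
    names = list(gene_sets)
    pvals = [
        hypergeom_sf(len(query & gene_sets[n]), universe, len(gene_sets[n]), len(query))
        for n in names
    ]
    adjusted = benjamini_hochberg(pvals)
    return [n for n, q in zip(names, adjusted) if q < fdr]


# Hypothetical toy example: 5 query genes in a universe of 100 genes.
toy_query = {"g1", "g2", "g3", "g4", "g5"}
toy_sets = {
    "SET_A": {"g1", "g2", "g3", "g4", "g5"},
    "SET_B": {"g1", "g2"} | {f"x{i}" for i in range(48)},
}
significant = enriched_sets(toy_query, toy_sets, universe=100)
```

In the toy example, SET_A overlaps the query completely and remains significant after correction, whereas the two-gene overlap with the much larger SET_B does not.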

Concluding remarks
This review provides several pieces of evidence suggesting the involvement of new mechanisms and new experimentally validated protein candidates belonging to [...] The results obtained here through the proposed bioinformatics pipeline should be useful for further experimental validation aimed at identifying demixing cofactors that might be crucial in reproducing phase separation in vitro [85]. For example, on the basis of the close relationship existing between APE1 and NPM1, a logical follow-up of this investigation would be to explore the possible joint phase separation of these proteins in the presence of rRNA, which is known to be required for NPM1 phase separation and might also be necessary for APE1 demixing. Moreover, drugs selectively impairing LLPS might be employed to validate this hypothesis: 1,6-hexanediol, for example, has already been used in previous works to show the demixed status of some bodies. Some doubts have been raised about this technique, since this molecule might produce artifacts in living cells; in fact, its amphipathic nature is not able to impair some of the most common molecular interactions giving rise to BMCs.
[Figure 3 caption fragment: ... [66], KEGG [67], CORUM [68], ClinVar [69], Reactome [70], GOBiologicalProcess [71]). (B) Clusters of enriched terms (adjusted p-value < 0.05) associated with subcellular localization (queried database: GOCellularComponent [71]). Percent values refer to the amount of enriched terms associated with each cluster. P-values refer to the clusters.]
We suggest that our approach may contribute to uncovering new molecular strategies for the therapy of human diseases that have been recently linked to phase partitioning, especially neurodegenerative diseases (e.g., Alzheimer's disease, amyotrophic lateral sclerosis (ALS), frontotemporal dementia). In fact, well-known ALS-related mutations, such as substitutions in the mostly disordered C-terminus of TDP-43 and the hexanucleotide expansion in C9orf72, were shown to impair phase separation of these proteins, suggesting their relevance in the onset of the pathological condition [86,87]. Additionally, altered proteostasis, which leads to the formation of aggregates in neuronal cells, is a hallmark of such conditions and was related to phase separation. Nonetheless, a complete understanding of how this aggregation influences the pathological outcome is still missing and requires further investigation to elucidate the exact relationship linking fiber formation and the toxic effect [88]. Another interesting question, arising from the hypothesis of pathological misregulation of liquid compartments, concerns how these bodies could be physiologically maintained and which impaired regulatory mechanisms, if any, lead to irreversible aggregation [88]. Lastly, since anti-cancer therapies are partly based on inefficient DNA repair, further characterization of the molecular mechanisms and the associated dynamics, along with a more detailed mapping of the interactomes involved in such pathways, might uncover new oncological targets. Therefore, uncovering a possible LLPS-related mechanism would improve our knowledge of how to target deregulated processes of the DDR, with a selective impact on several human pathologies.
Author contributions: GT designed the paper outline and the research plan. DT performed the analysis. GA and ED analyzed data. DT, GA and ED wrote the manuscript. GT supervised the writing of the whole manuscript.