The biomarkers of key miRNAs and gene targets associated with extranodal NK/T-cell lymphoma

Abstract Gene expression profiling studies have shown the pathogenetic role of oncogenic pathways in extranodal natural killer/T-cell lymphoma (ENKL). In this study, we aimed to identify the microRNAs (miRNAs) playing potential roles in ENKL, and to evaluate the genes and biological pathways associated to them. Gene expression profiles of ENKL patients were acquired from the gene expression omnibus (GEO) database. Most differentially expressed (DE)-miRNAs were identified in ENKL patients using limma package. Gene targets of the DE-miRNAs were collected from online databases (miRDB, miRWalk, miRDIP, and TargetScan), and used in Gene ontology (GO) and Kyoto encyclopedia of genes and genomes (KEGG) analyses on Database for annotation, visualization, and integrated discovery database, and then used in protein–protein interaction (PPI) analysis on STRING database. Hub genes of the PPI network were identified in cytoHubba, and were evaluated in Biological networks gene ontology. According to the series GSE31377 and GSE43958 from GEO database, four DE-miRNAs were screened out: hsa-miR-363-3p, hsa-miR-296-5p, hsa-miR-155-5p, and hsa-miR-221-3p. Totally 164 gene targets were collected from the online databases, and used in the GO and KEGG pathway analyses and PPI network analysis. Ten hub genes of the PPI network were identified: AURKA, TP53, CDK1, CDK2, CCNB1, PLK1, CUL1, ESR1, CDC20, and PIK3CA. Those hub genes, as well as their correlative pathways, may be of diagnostic or therapeutic potential for ENKL, but further clinical evidence is still expected.


Introduction
Natural killer (NK) cell tumors are categorized as extranodal NK/T-cell lymphoma (ENKL), aggressive NK-cell leukemia, chronic NK-cell lymphoproliferative disorders, etc., according to the 2016 World Health Organization (WHO) classification [1,2]. Among them, ENKL is relatively more common, which include both nasal and extra-nasal ENKL categories of disease in the current 2016 WHO classification [3]. ENKL is rarely diagnosed in Western countries, but relatively more common in East Asian countries [2]. ENKL is associated with Epstein-Barr virus (EBV) because EBV can infect T and NK cells and lead to EBV-associated T/NK cell lymphoproliferative disorders [1], according to observations of EBV molecules in tumor tissue and the correlations between EBV load and disease diagnosis and prognosis [4]. ENKL is a unique clinical entity characterized by an aggressive clinical course, and the prognosis is poor [5].
The pathogenesis of this tumor is poorly understood, but in recent years gene expression profiling studies have demonstrated the pathogenetic role of several oncogenic pathways in ENKL, such as AKT, STAT3, NF-KB, Notch-1, and Aurora kinase A [6,7]. In this sense, an important task is to find novel molecular and biological candidates for diagnostic, predictive, and prognostic potential in ENKL, including microRNAs (miRNAs).
miRNAs are small non-coding RNAs, which can regulate the gene expression in many cellular processes, such as cell proliferation, differentiation, and tumor genesis [8]. Since the discovery of miRNAs in circulation, a large amount of miRNA-related biological studies have confirmed that each miRNA can affect numerous target mRNAs [9] and each mRNA can be targeted by multiple miRNAs, enabling wide regulatory complexity of gene expression adjustment [10]. Furthermore, quantification of miRNAs can have potential diagnostic and prognostic utility in lymphoma [11][12][13]. miRNAs are easily detectable, relatively stable, and tissue specific, making them attractive candidate biomarkers [14]. Enhancing our understanding of the relationships between miRNAs and gene targets can help reveal detailed mechanisms and identify novel biomarkers for lymphoma.
In this study, we aimed to identify the miRNAs playing potential roles in the ENKL patients, which might further help provide valuable advice in clinical treatment.
To that end, we acquired typical gene expression profiles of ENKL patients from the Gene Expression Omnibus (GEO) database, and identified four miRNAs that are most differentially expressed (DE) in those ENKL patients using limma package in R. After that, gene targets of these four DE-miRNAs were collected by employing online databases, and then in terms of those gene targets, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed on the Database for Annotation, Visualization, and Integrated Discovery (DAVID), protein-protein interaction (PPI) were performed on STRING, and finally hub genes of the PPI network were identified in cytoHubba and evaluated in Biological Networks Gene Ontology (BiNGO) tool, in Cytoscape. Hope our study is helpful for revealing the mechanisms of ENKL pathogenesis, and for identifying the diagnostic and therapeutic biomarkers of ENKL.

Data source
The expression profiles of ENKL patients were collected from the GEO database. We searched the GEO for series about the expressions of miRNAs in ENKL patients which must contain both the tumor and normal samples, and only two series GSE31377 [15] and GSE43958 [16] were regarded suitable for this study. In GSE31377, we regarded the samples of ENKL patients' tissues and NK tumor celllines as "ENKL" group, and those of normal NK cells from healthy individuals' blood and ENKL patients' normal tissues as "normal" group, while in GSE43958, we took the NK tumor cell-line samples as "ENKL" group, and those of normal NK cells from healthy individuals' blood as "normal" group ( Table 1). Differential expression analysis was performed between "ENKL" and "normal" groups in both series, separately.

Identification of DE-miRNAs
The limma R package v3.12 was applied and served as the processor. An R GUI application v1.77 with R v4.0.4 was installed via home-brew 3.0 with -cask option on Mac OS Catalina. The limma package was installed in the package installer tool of this application. DE-miRNAs of the normal and ENKL groups were screened out according to the criteria adjusted p < 0.05 and absolute log fold change |log 2 FC| >1, where the FDR method was used to adjust the p, with an option adjust="BH" for the toptable( ) function of limma. As both series GSE31377 and GSE43958 gave DE-miRNA lists, the common DE-miRNAs appearing in both lists were collected for subsequent analysis.

miRNA-target interactions
Investigating the gene targets of miRNAs is crucial for identifying the regulatory mechanisms and functions of miRNAs. Herein we predicted the gene target of the DE-miRNAs by employing four miRNA-target prediction tools: miRDIP, miRWalk, miRDB, and TargetScan. The gene targets were determined by overlapping results from the four websites: first, for each miRNA, gene targets from the four websites were overlapped and common ones were picked up as a candidate list; and second the candidate lists of the four miRNAs were merged. After the miRNA targets were determined, the interactions of them would be visualized in Cytoscape.

GO and KEGG pathway enrichment analyses of gene targets
GO annotation and KEGG pathway enrichment analyses of the determined gene targets were performed using the DAVID, which revealed gene enrichments associated with biological processes (BPs), cellular components (CCs), molecular functions (MFs), and KEGG pathways associated with the gene targets. Number of the gene targets enriched in each of those items would be listed.

PPI network and biological analysis of hub genes
To gain insights into the interactions between gene targets, a PPI network was constructed using the STRING tool to reveal the molecular mechanisms of ENKL. The STRING was able to give a complex full PPI network, and usually the interactions between the hub genes were more interesting. The full network from the STRING was imported into the cytoHubba plugin of the Cytoscape to identify the hub genes. For biological analysis of hub genes, another plugin of Cytoscape BiNGO was employed. BiNGO has the capability to evaluate and visualize the pathways of enriched BPs, CCs, and MFs. We only performed the analysis on enriched BPs of the ten hub genes in this study.

Statistical analysis
Statistical analysis was used only for the identification of DE-miRNAs. Specifically, when applying the limma package, first, a linear model lmfit( ) would be fit for the expression data to estimate the expression levels of genes in each group, and then contrast fit contrasts.fit( ) was performed to estimate the expression differences of genes between ENKL group and control group. After that, empirical Bayes statistics ebayes( ) was employed for t-statistics, moderated F-statistics, and logodds of differential expression, and finally top DE-miRNAs were exported via toptable( ) of limma, where p values of the t-statistics were adjusted using FDR method with the option adjust="BH". The significant DE-miRNAs of each series that we were interested in were selected according to the criterion adjusted p < 0.05, and |log 2 FC| > 1.

Statement of ethics:
Ethical approval was not required in this study because all the data were collected from public papers and databases.

164 gene targets and interactions were identified for DE-miRNAs
After searching the four online databases, finally 164 genes associated with the four DE-miRNAs were determined as the ones playing potential roles in ENKL pathogenesis ( Table 4). The bridge genes between hsa-miR-363-3p and hsa-miR-221-3p are FNIP2 and ANKIB1, and the other bridge gene between hsa-miR-221-3p and hsa-miR-155-5p is ZNF652, while the genes associated with hsa-miR-296-5p appear isolated from others.

Enrichment of 164 gene targets in BPs, CCs, MFs, and KEGG pathways
Using the 164 gene targets, the DAVID listed the top 10 enriched gene counts of 61 BPs, 15 CCs, 18 MFs, and 10 KEGG pathways ( Figure 2). The significantly enriched entries of BPs include positive and negative regulations of transcription from RNA polymerase II, and protein phosphorylation; the majority of the enriched CCs include nucleus, cytoplasm, plasma membrane, cytosol, and nucleoplasm; and the most enriched MFs is protein binding. The enriched entries of KEGG pathways include regulation of actin cytoskeleton, FcγR-mediated phagocytosis, proteoglycans in cancer, focal adhesion, ErbB signaling pathway, thyroid hormone signaling pathway, sphingolipid signaling pathway, microRNAs in cancer, rap1 signaling pathway, and pathways in cancer.
3.4 Hub genes of 164 gene targets: AURKA, TP53, CDK1, CDK2, CCNB1, PLK1, CUL1, ESR1, CDC20, and PIK3CA According to the 164 gene targets, STRING gave a complex network with 1,638 lines and 253 nodes. A reduced graph of this network is shown in Figure 3, in which only the top 100 scored nodes are kept, and the node color represents the connection degree of node, evaluated in cytoHubba. Hub genes, playing essential potential roles in the network, were distinguished in terms of the connection degree. We selected the top 10 hub genes for further analysis: AURKA, TP53, CDK1, CDK2, CCNB1, PLK1, CUL1, ESR1, CDC20, and PIK3CA ( Figure 4). Figure 5 is the biological pathways of enriched BPs of hub genes, where the nodes with associated gene count less than five have been removed for a more clear network. This biological process pathway network involves 64 nodes and 91 edges.

Discussion
ENKL is a highly aggressive tumor, and a better understanding about the molecular abnormalities could provide important insights into the biology of this disease, as well as potential new therapeutic choice. Numerous trials have indicated that miRNAs may play key roles in the development of ENKL. Accordingly, strategies for the diagnosis and treatment of ENKL might be furnished via analysis of correlative data in GEO database and construction of ENKL-associated miRNA-target interactions regulatory network.
miRNA-155 is highly expressed in ENKL and NK tumor cell-lines where it activates the AKT signaling pathway to down-regulate tumor suppressor genes, leading to tumor cell proliferation [25]. As ENKLs are closely associated with EBV, miR-155 plays an important role in the regulating immune responses and the development of viral infections, indicating that this miRNA can promote the development of tumors by regulating the expression of inflammatory factors and changing the microenvironment [24,26,27]. Furthermore, inhibition of miR-155 expression could improve the sensitivity of ENKL to doxorubicin, even reverse the drug resistance [24,28]. Therefore, miR-155 is a key factor in the emergence and development of ENKL, making it a potential clinical indicator to evaluate the diagnosis, treatment, and prognosis of patients with ENKL.
The miR-221-3p in this study was found upregulated in GSE43958 but downregulated in GSE31377. In fact such difference has been noticed in clinical practice by Guo et al. [34]. Plasma miR-221 of 79 ENKL patients were measured in that study, and were found upregulated or downregulated differently. The authors compared the upregulated and downregulated groups, and found significant differences with respect to gender (p = 0.003) and ECOG performance status (p = 0.023), but insignificant with respect to their response rate and complete remission rates. They also found the upregulation of miR-221 associated with shorter overall survival. So it is indicated that miR-221 may either be upregulated or downregulated in ENKL patients, but upregulated miR-221 may be associated with poorer prognosis. We must note, however, that as the discrepancy of regulations of hsa-miR-221-3p exists in GSE43958 and GSE31377, the role of hsa-miR-221-3p in ENKL needs still further investigation. miR-296 is located in chromosomal region 20q13.32. It has a high degree of sequence conservation among species and plays important roles in many BPs [35]. miR-296 is upregulated in various types of cancers, such as DLBCL, gastric cancer, etc. [36]. It functions as an oncogene by targeting important tumor suppressors, such as CDX1, STAT5A, SOCS2, ICAM1, CASP8, NGFR, NF2, and HGS, and thereby regulates important cancer-related processes, such as cell proliferation, apoptosis, invasion, migration, blood vessel development, and chemoresistance [36]. In this study, bioinformatics show that miRNA-296-5p was upregulated, and the TMEM135, TEAD3, NUAK2, NSD1, MACF1, KCTD15, GPC2, GAB2, FGFR1, FAM53B, EPN1, ECHDC1, DYNLL2, CMTM4, and BAHD1 are potential target genes of miRNA-296-5p, but no definitive evidence has yet been reported in literature. Our findings indicate that miR-296-5p might play roles in inflammatory and malignant tumors, but further clinical evidence is still expected.
Thus, more detailed research is needed to explore the potential functions of the identified DE-miRNAs in ENKL. We found that several hub genes, such as AURKA, TP53, CDK1, CDK2, CCNB1, PLK1, CUL1, ESR1, CDC20, and PIK3CA, and correlative pathways, including regulation of actin  Table 4: miRNA-target interactions of hsa-miR-363-3p, hsa-miR-296-5p, hsa-miR-221-3p, and hsa-miR-155-5p according to miRDIP, miRWalk, miRDB, and TargetScan cytoskeleton, FcγR-mediated phagocytosis, proteoglycans in cancer, focal adhesion, and ErbB signaling pathway, may be of diagnostic or therapeutic potential for ENKL. A wide range of diseases, including cancer, result from malfunctioning of actin cytoskeletal proteins [37] and proteoglycans [38]. So further studies and more clinical samples are underway to determine whether an actin cytoskeleton or proteoglycan panel targeted antibody could be useful in the classification and better characterization of ENKL. Focal adhesion kinase (FAK) is a promising target for the treatment of solid tumors because its expression has been linked to tumor progression, invasion, and drug resistance [39]. Several FAK inhibitors have been developed and tested for efficacy in treating advanced cancers [39]. As lymphoma's biological behavior is more inclined to solid tumor [40], and we found focal adhesion to play an important role in ENKL, so FAK may serve as an effective therapeutic target to ENKL. Two important types of ErbB inhibitor are in clinical use: humanized antibodies directed against the extracellular domain of EGFR or ErbB2, and small-molecule tyrosine-kinase inhibitors that compete with ATP in the tyrosine-kinase domain of the receptor [41]. As we lack experience of ErbB inhibitor used for ENKL, so a larger scale verification for it is still needed. Hope in the future, further factors that underlie clinical response to signaling pathway targeted therapeutics could be uncovered by continuing the combination of fundamental and clinical studies. We also hope that the network of biological process pathways will contribute not only to the development of novel  therapeutics, but also in allowing us to optimally use those already in the clinic.

Limitations
There are several limitations to our present study. (1) The number of samples we obtained from GSE31377 and GSE43958 is relatively small, and the data from the two series were not comprehensive enough, especially the amount of data in GSE43958 were very limited, even with plenty of blanks, which partially limits the result of this study. However, other series in GEO either contain fewer samples, or lack healthy samples as control group; (2) in GSE31377, the "ENKL" group contains samples of both ENKL tissue and NK tumor cell-lines, while in GSE43958, the "ENKL" group contains samples of only NK tumor cell-lines, thus uncertainties might be introduced because the miRNA and genes in NK tumor cell-lines may be different from those in ENKL tissues; (3) the normal NK cells were collected from healthy individuals' blood, while ENKL tissue samples were from tissues. This difference in their sources may also affect the results. Thus, more samples are needed for validation with quantitative reverse transcription polymerase chain reaction in further research. In addition, the functions and molecular mechanisms of genes are very complicated, thus predictions based only on bioinformatics need cellular and animal experiments for verification.

Conclusion
In conclusion, by employing the GEO series GSE31377 and GSE43958 and bioinformatic analysis, we not only found four DE-miRNAs in ENKL: hsa-miR-363-3p, hsa-miR-155-5p, hsa-miR-221-3p, and hsa-miR-296-5p but also determined ten hub genes that may serve as potential biomarkers of ENKL: AURKA, TP53, CDK1, CDK2, CCNB1, PLK1, CUL1, ESR1, CDC20, and PIK3CA. Our findings might help improve the understanding of the pathogenesis of ENKL, help provide reliable biomarkers for precise diagnosis and individualized treatment of ENKL, and help discover novel therapeutics of ENKL in the future.