Calumenin contributes to epithelial-mesenchymal transition and predicts poor survival in glioma

Abstract Background Calumenin (CALU) has been reported to be associated with invasiveness and metastasis in some malignancies. However, in glioma, the role of CALU remains unclear. Methods Clinical and transcriptome data of 998 glioma patients, including 301 from CGGA and 697 from TCGA dataset, were included. R language was used to perform statistical analyses. Results CALU expression was significantly upregulated in more malignant gliomas, including higher grade, IDH wildtype, mesenchymal, and classical subtype. Gene Ontology analysis revealed that CALU-correlated genes were mainly enriched in cell/biological adhesion, response to wounding, and extracellular matrix/structure organization, all of which were strongly correlated with the epithelial-mesenchymal transition (EMT) phenotype. GSEA further validated the profound involvement of CALU in EMT. Subsequent GSVA suggested that CALU was particularly correlated with three EMT signaling pathways, including TGFβ, PI3K/AKT, and hypoxia pathway. Furthermore, CALU played synergistically with EMT key markers, including N-cadherin, vimentin, snail, slug, and TWIST1. Survival and Cox regression analysis showed that higher CALU predicted worse survival, and the prognostic value was independent of WHO grade and age. Conclusions CALU was correlated with more malignant phenotypes in glioma. Moreover, CALU seemed to serve as a pro-EMT molecular target and could contribute to predict prognosis independently in glioma.


Introduction
In central nervous system, glioma is the most prevalent and fatal primary cancer in adults [1]. Despite a substantial body of improvements in therapy, the prognosis for most glioma patients is still dismal. Particularly for patients who suffered from higher grade glioma (WHO grade IV, glioblastoma, GBM), which is the most malignant and lethal type, the median survival remains less than 15 months [2,3]. There is a growing recognition that epithelial-mesenchymal transition (EMT) plays a key role in mediating tumorigenesis, stemness, invasiveness, resistance to radiochemotherapy, and early recurrence in glioma [4][5][6][7]. It is therefore imperative to identify novel EMT-related molecules for potential glioma diagnosis and intervention.
Calumenin (CALU) has been widely reported in a range of malignancies including head and neck cancer [8], endometrial cancer [9], colon [10] and colorectal cancer [11], lung cancer [10,12], melanoma [13], hepatocellular and pancreatic carcinoma [14], and breast cancer [15]. CALU, a calcium-binding protein localized in the endoplasmic reticulum (ER), is mainly involved in such ER functions as protein folding and sorting. Besides, CALU has recently been shown to influence cell mobility, migration, invasion, and metastasis during particular events, such as tumorigenesis, wound healing, immune response, and coagulation [16][17][18][19][20][21]. Several studies have explored the relationship between CALU expression and survival and yielded relatively consistent results. In most types of cancer, a higher level of CALU in lesions indicated a more malignant phenotype and a shorter survival for patients.
However, the expression patterns and biological functions of CALU in gliomas have rarely been described. Only one study presented by Sreekanthreddy et al. [22] investigated the prognostic potential of serum CALU in GBM, which accounted for about 40% of pan-glioma. Here, we analyzed clinical and transcriptome data of 998 patients, aiming at exploring the role of CALU in gliomas.

Sample and data collection
From Chinese Glioma Genome Atlas website (CGGA, http:// www.cgga.org.cn/), we selected 301 glioma samples with mRNA microarray data. From The Cancer Genome Atlas website (TCGA, http://cancergenome.nih.gov/), we obtained 697 glioma patients with RNA-sequencing data. Clinical data, including WHO grade, IDH mutation status, molecular subtype, and prognosis, were also available. Thus, a total of 998 samples were included in the present study. Baseline characteristics of glioma samples in both datasets were summarized in Table S1. In CGGA_301 dataset, microarray data, which had already been normalized and centered (using GeneSpring GX 11.0 platform) by data provider, were directly utilized. However, in TCGA_697 dataset, RNAseq data (RSEM normalized, level 3) were log2 transformed before data analysis. Because this study used online databases, it did not require approval of the Ethics Committee.
Ethical approval: The conducted research is not related to either human or animals use.

Statistical analysis
Statistical analyses were primarily performed with R language (version 3.6.2). A set of R packages, such as ggplot2, pROC [23], pheatmap, corrgram, circlize [24], and gsva, were used to handle corresponding calculations and to produce figures. Cox proportional hazard regression analyses were performed with coxph function of survival package. Gaussian test was performed before data analysis that required Gaussian distribution. We performed Pearson correlation to calculate the correlation coefficient between CALU and every gene. Genes that strongly correlated with CALU were screened out with Pearson |r| > 0.6 in each dataset. Gene Ontology analysis (GO) of CALU-correlated genes was implemented based on DAVID [25] website (version 6.8, https://davidd.ncifcrf.gov/). For Gene Set Enrichment Analysis (GSEA) [26] and Gene Set Variation Analysis (GSVA) [27], a series of gene sets were obtained from the GSEA network (http:// software.broadinstitute.org/). A p-value less than 0.05 was considered to be statistically significant. Two-sided significance tests were adopted throughout.

CALU was significantly upregulated in GBM, IDH wildtype, mesenchymal, and classical subtype
According to the WHO grade system, CALU expression was analyzed in both CGGA and TCGA datasets, and the results congruently showed a significantly positive correlation between WHO grade and CALU expression (Figure 1a and d). Moreover, when IDH mutation status was defined as a subclassifier, we observed that IDH wildtype GBM exhibited the highest expression pattern of CALU in both CGGA and TCGA datasets. Besides, CALU expression in IDH mutant glioma seemed to be universally lower than that in IDH wildtype, across different WHO grade, except for lower grade glioma (LGG) in CGGA, which exhibited apparent trends although not significant (Figure 1b and e). Subsequently, the distribution of CALU expression among different molecular subtypes (defined by TCGA network) was investigated. As shown in Figure 1c and f, CALU was significantly upregulated in classical and mesenchymal subtype compared to neural and proneural subtype. These findings indicated that higher CALU expression was usually accompanied by higher malignancy potential of glioma.

CALU-related biological process
In total, 621 genes in CGGA chort and 965 in TCGA cohort were identified as CALU-related genes. To ensure the accuracy of the analysis, we subsequently identified 203 genes that overlapped between two independent cohorts, all of which were positively correlated with CALU (Table S2). Based on these genes, GO analysis revealed that genes that significantly correlated with CALU were highly enriched in a set of biological processes that correlated with EMT, including cell/biological adhesion, response to wounding, extracellular matrix/ structure organization, collagen fibril organization, and collagen biosynthetic process (Figure 2a and b). Moreover, the association between CALU expression and EMT was revealed by GSEA analysis. CALU expression was found to be positively associated with the gene set of HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSI-TION in both CGGA dataset (NES = 1.897, FDR = 0.035) and TCGA dataset (NES = 1.818, FDR = 0.075) (Figure 2c and d). These findings suggested that CALU might be particularly involved in EMT process during glioma progression.

CALU-related EMT signaling pathways
To get further understanding of the association between CALU and EMT, seven gene sets, representing distinct EMT signaling pathways [28], were obtained from GSEA network ( Table S3). Through cluster analyses, we identified 3 EMT signaling pathways (TGF-β, PI3K/AKT, and hypoxia), which might be strongly correlated with CALU (Figure 3a and b). Moreover, seven gene sets were transformed into seven metagenes with GSVA analysis, which were subsequently put into Pearson correlation together with CALU. According to Pearson r among seven metagenes and CALU, Corrgrams were plotted to assess their interrelationships. CALU was found to be positively correlated with TGF-β, PI3K/AKT, and hypoxia, in line with what we observed in clusters. However, only a very weak correlation was revealed between CALU expression and four other pathways (WNT, MAPK, NOTCH, and HEDGEHOG), which might be ascribed to signal noise (Figure 3c and d).

CALU was synergistic with EMT key markers
Assuming that CALU played a vital role in regulating glioma EMT, we investigated the association between CALU and EMT markers, including N-cadherin, E-cadherin, snail, slug, and vimentin. Pearson correlation tests were performed with CALU and the above five EMT markers in both CGGA and TCGA. Circos plots were derived from Pearson r-values between CALU and five markers.
As shown in Figure 4a and b, CALU expression showed high agreement with N-cadherin, snail, slug, and vimentin. In contrast, a weak relationship between CALU and Ecadherin was found in Circos plots, which could be defined as a noise. Heretofore, some other members, including TWIST1/2, β-catenin, and ZEB1/2, have been reported as key markers in EMT [29]. Thus, we additionally put them into analysis together with CALU. CALU expression was tightly associated with TWIST1 in both CGGA and TCGA datasets (Figure 4c and d).

Higher CALU was related to a worse prognosis
To evaluate the prognostic value of CALU in glioma, Kaplan-Meier (KM) survival curves were plotted. In panglioma samples, when comparing the two groups defined by median CALU expression, we observed that higher CALU expression predicted a significantly shorter survival, as shown in Figure 5a and d. Moreover, glioma patients were further divided into LGG and GBM subgroup. In both subgroups, patients with higher CALU exhibited universally worse survival than those with relatively lower CALU (Figure 5b, c, e and f), except for TCGA GBM, which also showed an apparent trend. To identify the independent effect of CALU on glioma prognosis, Cox regression analyses were performed with covariates including CALU expression, age, and WHO grade. Multivariate analyses revealed that CALU expression was a significant prognosticator independent of age and WHO grade in both CGGA and TCGA ( Table 1).

Discussion
We explored CALU expression at transcriptional level via a cohort of 998 glioma samples and demonstrated that CALU expression was positively correlated with WHO grade. In addition, upregulation of CALU was usually paralleled with a more malignant and aggressive phenotype, such as IDH wildtype, classical subtype, and mesenchymal subtype. Survival analyses revealed that higher CALU was related to a worse prognosis, independent of age and WHO grade. These results concordantly indicated that CALU might contribute to malignant progression of glioma, which were in line with the results from a previous GBM study [22]. Thus, unveiling the regulative mechanism of CALU may facilitate to develop a novel gene for potential glioma diagnosis and treatment. CALU is one of the members of CREC protein family. This molecule family mainly consists of Cab45, Reticulocalbin 1, ERC-55, and CALU and is characterized by multiple EFhand motifs with low affinity of Ca 2+ -binding [30]. Under normal physiological conditions, CALU primarily participates in regulating Ca 2+ -dependent protein folding, sorting and maturation in the ER [31], Ca 2+ homeostasis [32,33], and muscle contraction/relaxation [34]. However, in tumor microenvironment, CALU was reported to play a critical role in promoting a series of malignant phenotypes including cancer cell survival [21], filopodia formation and cell migration [20], invasiveness [12], metastasis [15,35], cancer development [10], and resistance to chemotherapy [13]. So far, very little is known about the biological function of CALU in glioma. In the current study, GO analysis was performed to elucidate the biological function of CALU in glioma and it revealed that CALU showed high association with multiple EMT-related biological processes, including cell adhesion, biological adhesion, extracellular matrix/structure organization, collagen fibril organization, and collagen biosynthetic process. GSEA in both CGGA and TCGA further exhibited a remarkable relationship between CALU and EMT. EMT has been extensively reported to act as a critical mechanism not only in invasiveness, but also in early recurrence and resistance to therapy in glioma [5,6,36]. These findings suggested that CALU might facilitate the malignant progression of glioma primarily via modulating EMT process, which has not yet been reported previously. Despite no report with regard to the pro-EMT effect of CALU, the other two members (Cab45 [37] and EFHD2 [38]) from the same protein family have been described in EMT regulation, which indirectly supported the potential role of CALU in glioma EMT.
We then chose a panel of EMT pathways and markers and examined their interrelationships with CALU. CALU was revealed to be highly associated with TGFβ, PI3K/ AKT, as well as hypoxia pathway, indicating that CALU might regulate glioma EMT through these signaling pathways. Furthermore, most of the EMT biomarkers showed robust correlation with CALU, suggesting a synergistic effect among CALU and these members during EMT  process. These findings further validated the potential role of CALU during EMT process in glioma.
In conclusion, CALU was upregulated in more malignant gliomas and predicted much worse prognosis. Furthermore, CALU seemed to be mainly involved in EMT process of glioma, potentially through modulating TGFβ, PI3K/AKT, and hypoxia pathway. However, limitations still exist in this study. First, no biological validation was performed, which might compromise the robustness of results. Further researches focusing on in vivo/in vitro experimental studies are warranted. Second, despite the large sample in our study, data from TCGA and CGGA were mainly retrospectively collected, the control of data quality was heterogeneous, and some data were unavailable, which might lead to potential bias.