Radiation increases COL1A1, COL3A1, and COL1A2 expression in breast cancer

Abstract Background Radiotherapy-associated secondary cancer is an important issue for the treatment of breast cancer (BCa). This study aimed to investigate the molecular mechanism and genetic risk factors for radiation-associated secondary diseases in BCa. Methods The differentially expressed genes (DEGs) between preradiation and postradiation BCa samples in the GSE65505 dataset were obtained. The pathways related to the radiation-associated DEGs in the protein–protein interaction (PPI) network modules were identified. miRNAs targeted to the key genes in the PPI network were identified, and their association with BCa prognosis was analyzed. Results A total of 136 radiation-associated DEGs preradiation and postradiation BCa samples were screened out. The PPI network consisted of a significant module that consisted of 21 upregulated DEGs that were associated with “hsa04512: ECM–receptor interaction,” “hsa04151: PI3K-Akt signaling pathway,” and “hsa04115: p53 signaling pathway.” Sixteen DEGs, including three collagen genes collagen type I alpha 1 chain (COL1A1), COL3A1, and COL1A2, were enriched in 17 radiation-associated pathways. The three genes were upregulated in BCa tissues compared with controls and were also elevated by radiation. They were targeted by hsa-miR-29a/c, and the expression levels of hsa-miR-29a/c were associated with a poor prognosis of BCa. Conclusions The upregulation of COL1A1, COL3A1, and COL1A2 might be genetic risk factors for radiation-associated secondary diseases in BCa.


Introduction
Breast cancer (BCa) is the most commonly diagnosed malignant cancer and the leading cause of cancer-related premature mortality in women worldwide [1]. About 2.1 million newly diagnosed cases of BCa were estimated in 2018 [1]. Epidemiologic studies have well established a series of risk factors for BCa, including family history, race, physical inactivity, obesity, genetic variants, and instability [2,3]. However, the treatment of BCa is still an important issue due to the high recurrence and mortality of this disease.
Adjuvant radiotherapy is the standard treatment for BCa patients that are undergoing breast-conservation surgery or mastectomy. Radiotherapy reduces BCa recurrence and death significantly [4][5][6]. Overall, radiotherapy reduces the 10 year risk of first recurrence from 35.0 to 19.3% and decreases the 15 year risk of death from 25.2 to 21.4% [4]. However, there are increasing evidence showing that radiotherapy causes secondary cancers and heart diseases decades later [7][8][9][10]. For instance, patients exposed to radiotherapy for prostate cancer and BCa had an overtime increased risk of bladder, colorectal, lung, and thyroid cancers compared with patients unexposed to radiotherapy [8,10]. In addition, the chest wall symptoms in BCa patients exposed to radiotherapy were worse than that of patients unexposed to radiotherapy [9]. Hodgkin lymphoma patients that were treated with chest radiotherapy have a high risk of BCa [11]. However, the molecular mechanisms of the pathogenesis of radiation-associated secondary cancers are not clear until now.
Research has pointed out that the smoking habit is a risk factor for radiation-associated secondary primary cancer [12]. In addition, there is evidence showing genetic susceptibility to radiation-associated secondary BCa [11]. Hodgkin lymphoma patients that had a high risk of radiation-associated secondary BCa had a higher polygenic risk score that was composed of the features of nine single-nucleotide polymorphisms [11]. For radiationassociated secondary cancers in patients with BCa, however, the molecular mechanism has not been reported until now.
For filling this gap, we performed this study to identify the radiation-associated genes in BCa. The identification of radiation-related genetic susceptibility and genetic risk factors would be of great value for better understanding the molecular mechanisms underlying the radiation-associated secondary cancers in patients with BCa.

Data processing and identification of radiation-associated genes
The series matrix files from paired preradiation and postradiation tumor samples (n = 58) were downloaded and preprocessed using the Limma package (version 3.34.0; https:// bioconductor.org/packages/release/bioc/html/limma.html). Then the differentially expressed genes (DEGs) between paired preradiation and postradiation tumor samples were identified using the Limma package with the criteria of p < 0.05 and |log 2 (fold change, FC)| ≥ 0.585 (|FC| ≥ 1.5). The sample hierarchical clustering analysis was performed for DEGs using the pheatmap package (version 1.0.8; https://cran.r-project.org/package=pheatmap). The genes common to 15, 18, and 21 Gy radiotherapy-induced DEGs were identified using the Venn diagram analysis and were regarded as radiation-associated genes in BCa.

Enrichment analysis of radiationassociated genes
To illuminate the radiation-induced molecular functional changes in BCa, the Gene Ontology biological processes and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways that were associated with the radiation-associated genes DEGs were identified. Functional enrichment analysis was performed using the clusterprofiler package (https://github.com/GuangchuangYu/clusterProfiler). The threshold for selecting the significant biological processes and pathways were set at adjusted p value <0.05 and gene count ≥2.

Construction of protein-protein interaction network (PPI) of radiationassociated DEGs
The interactions between the products of radiation-associated DEGs in the GSE65505 dataset were predicted in the STRING database (Version: 10.0, http://www.stringdb.org/) with the default parameter (score >0.4). Then the PPI network of radiation-associated genes was constructed using the Cytoscape software (version 3.4.0, http://chianti.ucsd.edu/cytoscape-3.4.0/). The important module in the PPI network (score >5) was identified using the MCODE plugin in Cytoscape. Radiation-associated genes that were included in the important modules were subjected to the functional enrichment analysis using the clusterprofiler package. The significant pathways that were associated with the genes in important module were selected using the same criteria as mentioned above.

Identification of BCa-associated pathways
To select the key pathways related to the radiation-associated DEGs in the important module, the BCa-associated pathways in Comparative Toxicogenomics Database (CTD, http://ctdbase.org/) were identified using the searching term of "breast neoplasm." The items overlapped between the CTD database and the KEGG pathways were regarded as the key pathways in BCa, and the enriched genes were regarded as the key radiation-associated DEGs in BCa.

Construction of miRNA-target network
The miRNA-target pairs of the aforementioned key radiation-associated DEGs in BCa were predicted using the Webgestal online tool [13] using the overrepresentation enrichment analysis methods. Then the miRNA-target pairs with the p value of less than 0.05 were selected and used for the construction of miRNA-target network. The network was visualized using the Cytoscape software.

Identification of prognosis-associated genes and miRNAs in BCa
To investigate the association of genes and miRNAs with the prognosis of BCa, the correlation of aforementioned radiation-associated genes and miRNAs in the miRNA-target network with the overall survival (OS) of patients with breast invasive carcinoma (BRCA) in the Cancer Genome Atlas (TCGA) were analyzed using the user-friendly and interactive web resource of UALCAN (http://ualcan.path.uab.edu/index. html). Significantly difference was identified using the criterion of logRank p < 0.05.

Identification of radiation-associated genes in BCa in GSE65505
A total of 405, 204, and 325 DEGs were identified in BCa tumor samples receiving 15 Gy, 18 Gy, and 21 Gy intensity-modulated radiotherapy in the GSE65505 dataset, respectively ( Figure 1a). Hierarchical clustering analysis demonstrated the distinct expression profiles of these DEGs in samples preradiation and postradiation ( Figure 1b). Venn diagram analysis showed that common 133 DEGs were upregulated by the three radiotherapies, and only three DEGs were commonly downregulated ( Figure 1c). Then, 136 radiationassociated DEGs in BCa were identified in this study (Table A1).

Enrichment analysis of upregulated radiation-associated DEGs in BCa
The functional enrichment analysis of the 136 radiationassociated DEGs indicated that these genes were enriched into 563 biological processes that were associated with extracellular matrix (ECM) organization, inflammatory response, response to metal ion, neutrophil-mediated immunity, and cell chemotaxis, and 38 KEGG pathways including advanced glycation end-receptor for dvanced glycation end products (AGE-RAGE) signaling pathway in diabetic complication, p53 signaling pathway, PI3K-Akt signaling pathway, and ECM-receptor interaction. The top ten biological processes and KEGG pathways are shown in Table A2.

PPI network of radiation-associated DEGs in BCa
The PPI network that derived from the 136 radiation-associated genes was composed of 112 nodes (products of genes) and 529 lines (interaction pairs; Figure 2a). One module of a score of 9.9 and 21 upregulated radiationassociated DEGs was identified from the PPI network ( Figure 2b). The interaction degree of these nodes is shown in Table 1. The insulin like growth factor 1 (IGF1), matrix metallopeptidase 2 (MMP2), collagen type I alpha 1 chain (COL1A1), COL1A2, and COL3A1 gene had the interaction degrees of 30, 27, 22, 18, and 16, respectively.

Pathways that related to radiationassociated genes
The 21 upregulated radiation-associated DEGs in the module were associated with 19 KEGG pathways such as "hsa04933: AGE-RAGE signaling pathway in diabetic complications," "hsa04512: ECM-receptor interaction," "hsa04151: PI3K-Akt signaling pathway," and "hsa04115: p53 signaling pathway" ( Table 2). The function properties of the 21 radiation-associated genes were basically in line with the 136 radiation-associated DEGs in BCa. Among the 19 pathways, 17 pathways that were common to BCa-related pathways in the CTD database were regarded as the key radiation-associated pathways in BCa. Moreover, 16 upregulated radiation-associated DEGs (including IGF1, MMP2, COL1A1, COL3A1, and ZFP36) were involved in the 17 key radiation-associated pathways ( Table 2).

The association of radiation-associated genes with the prognosis of BCa
Based on the analysis in the UALCAN database, we found that two genes, BTG2 and JUNB, and four miRNAs, including hsa-miR-29a, hsa-miR-29c, hsa-miR-224, and hsa-miR-196b, were significantly associated with the prognosis in BCa (Figure 4). The high expression levels of JUNB (p = 0.031), BTG2 (p = 0.019), hsa-miR-29a (p = 0.0058), and hsa-miR-29c (p = 0.014) as well as the low expression levels of hsa-miR-224 (p = 0.033) and hsa-miR-196b (p = 0.051) were associated with a low survival probability in the TCGA BRCA patients (Figure 4). Besides, we found three genes, including COL1A1, COL3A1, and COL1A2, were significantly upregulated in TCGA BRCA tumor samples compared with normal controls ( Figure 5), whereas the other seven genes were not significantly changed in BCa tumor samples (data now shown).  *Notes the overlapping KEGG pathways between the comparative toxicogenomics database and that related to the 21 radiation-sensitive genes.

Discussion
The increasing incidence of radiation-associated secondary cancers is challenging the treatment strategies for human solid cancers. Based on the tumorigenicity [11], we speculated that the mechanisms responsible for radiation-associated secondary cancers involve the dysregulation of a cluster of genetic factors. As expected, we identified that the upregulation of three collagen genes (COL1A1, COL3A1, and COL1A2), BTG2, and JUNB might be responsible for the radiation-associated secondary cancers in patients with BCa. COL1A1 and COL3A1, respectively, encodes the α1 chain of Type I and III collagen, both of which are the fibril-forming collagens [14,15]. Type I and Type III procollagens are crucial components of ECM, which is important for tumor microenvironment [14]. They act important roles in the proliferation and/or metastasis of cancer cells. Previous study showed that the Type I and Type III procollagens were elevated in human scirrhous carcinoma of breast with a spatial specificity [16]. In situ hybridization showed that the expression levels of the Type I and Type III procollagens were decreased with the increased distance between the fibroblasts and tumor cells [16]. The upregulation and association of COL1A1, COL3A1, and COL1A2 with collagen prolyl-4-hydroxylase α subunit 2 (P4HA2) had been reported in BCa [17][18][19]. In addition, the other factors that related to ECM or collagens, including MMP2, CCL2, and tissue inhibitor of metalloproteinase-1 (TIMP1), were also identified to be upregulated in BCa tissues compared with control [18].
Here, in our present study, we identified that the exposure to radiotherapy elevated the expression of COL1A1, COL3A1, and COL1A2 in BCa tissues rather than decreased them. According to the above studies, we assumed that the consistent upregulation of the three genes might of great value for exploring the development of radiationassociated secondary nonbreast diseases in BCa.
There are tremendous evidence showing that COL1A1, COL3A1, COL1A2, or P4HA2 promotes the metastasis of BCa and contributes to a poor survival in patients with ER+ BCa [14,18]. By contrast, the inhibition of COL1A1 or P4HA2 counteracted the metastasis and suppressed the proliferation of the BCa cells [14,18]. The study by Srour et al. [20] showed that MMP2, COL1A1, COL3A1, and COL1A2 had lower expression levels in lymph node metastasis compared with triple-negative BCa (TNBC). This finding was in line with the fact that from Gan. [16] who showed that the expression of the Type I and Type III procollagens were decreased with the increased distance between fibroblasts and tumor cells. The interesting evidence was that the elevated COL1A1 expression in BCa had been reported to be correlated with a better response to cisplatin-based chemotherapy [14]. A recent study by Liu et al. [21] showed that there was an opposite conclusion that COL1A1 expression was negatively correlated with radiosensitivity in cervical cancer. Liu et al. [21] reported that the COL1A1 activation could inhibit X-ray radiation-induced apoptosis in cervical cancer. Our present study showed that COL1A1 and COL3A1 were targeted by hsa-miR-29a/b/c that were decreased in radio-resistant nasopharyngeal carcinoma cells [22]. The expression of COL1A1 exerted a radio-resistance effect in nasopharyngeal carcinoma cells [22]. Accordingly, we suspected that the consistent radiation-associated upregulation of the COL1A1, COL3A1, and COL1A2 genes in BCa tissues might be used as genetic susceptibility and risk factors for radio-resistance. This might be related to the radiation-associated secondary nonbreast diseases in patients with BCa. Figure 3: The miRNA-target regulatory network involving the radiation-associated genes in the key pathways related to BCa. Node size indicates the interaction degree. The higher the degree, the larger the node size is. The miRNA-target interaction pairs were identified using the Webgestal online tool. The miRNA-target pairs with a p < 0.05 are used for the construction of the miRNA-target network.
The association of hsa-miR-29 family members with tumorigenesis and drug-resistance in BCa cells had been proven previously [23][24][25]. hsa-miR-29a acts an oncogenic role [23,24,26], whereas the hsa-miR-29c exhibits a tumor suppressor role in BCa [25,26]. However, we found that the low expression levels of hsa-miR-29a/c were associated with a good survival outcome of BCa. Also, both of them targeted to the upregulated COL1A1 and COL3A1 genes. These results suggested that the predicted miRNA-mRNA regulatory networks, including hsa-miR-29a/c-COL1A1/COL3A1, might have crucial roles in the progression of BCa and the development of radiation-associated secondary diseases.

Conclusion
In summary, this present study showed that radiotherapy significantly upregulated the expression of COL1A1, COL3A1, and COL1A2 genes in BCa tumor tissues. The upregulation of them might be risk factors for radiation-associated secondary nonbreast diseases in patients with BCa. However,  the association of COL1A1, COL3A1, and COL1A2 upregulation with radiation-associated secondary diseases should be validated using large cohort trials. Author contributions: Yao GR: conception, study design, data acquisition, analysis and important interpretation, manuscript drafting, and approval of final manuscript; Zhao KY: study design, data acquisition, and analysis; Bao KK: data acquisition and analysis; Bu LY: conception, data analysis and interpretation, manuscript revision, and approval of final manuscript; and Li J: manuscript revision and data acquisition.

Conflict of interest:
The authors declare that they have no competing interests.

Appendix
*The three downregulated radiation-sensitive genes.