Potential biomarkers for inflammatory response in acute lung injury

Abstract
Acute lung injury (ALI) is a severe respiratory disorder encountered in critical care medicine, with high rates of mortality and morbidity. This study aimed to screen potential biomarkers for ALI. Microarray data of lung tissues from lung-specific geranylgeranyl pyrophosphate synthase large subunit 1 (GGPPS1) knockout and wild-type mice treated with lipopolysaccharide were downloaded. Differentially expressed genes (DEGs) between the knockout and wild-type ALI mice were screened. Functional enrichment and protein–protein interaction (PPI) modules were analyzed. Finally, a miRNA-transcription factor (TF)-target regulation network was constructed. In total, 421 DEGs were identified. The upregulated DEGs were mainly enriched in the peroxisome proliferator-activated receptor signaling pathway and the fatty acid metabolic process, while the downregulated DEGs were related to cytokine–cytokine receptor interaction and regulation of cytokine production. Cxcl5, Cxcl9, Ccr5, and Cxcr4 were key nodes in the PPI network. In addition, three miRNAs (miR505, miR23A, and miR23B) and three TFs (PU1, CEBPA, and CEBPB) were key molecules in the miRNA-TF-target network. Nine genes, including ADRA2A, P2RY12, ADORA1, CXCR1, and CXCR4, were predicted as potential druggable genes. In conclusion, ADRA2A, P2RY12, ADORA1, CXCL5, CXCL9, CXCR1, and CXCR4 might be novel markers and potential druggable genes in ALI that act by regulating the inflammatory response.


Introduction
Acute lung injury (ALI) is a clinically severe respiratory disorder that is common in critical care medicine. It is caused by various direct or indirect injury factors and can progress to acute respiratory distress syndrome (ARDS), with high rates of mortality and morbidity [1,2]. ALI is the most severe form of the viral infection caused by severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) [3,4]. ALI clinically manifests as progressive hypoxemia and respiratory distress due to alveolar barrier dysfunction and the formation of alveolar edema [1,4]. Despite advances in modern treatment technology and an increased understanding of its pathogenesis, the significant mortality caused by ALI remains a serious issue [5,6]. Therefore, it is important to identify novel biomarkers and therapeutic methods, as well as the underlying mechanisms of ALI.
With the development of high-throughput sequencing and the continuous updating of bioinformatics technology, a large number of noncoding RNAs such as microRNAs (miRNAs) have been found to be correlated with a wide variety of diseases [7]. miRNAs are defined as endogenous small-molecule RNAs with a length of 20-25 nucleotides that exert biological functions by impeding the translation of target mRNAs [8]. It is now well appreciated that miRNAs participate in disease progression by modulating various key cellular biological processes, including cell proliferation, differentiation, apoptosis, migration, and invasion [8]. To date, miRNAs have been shown to be abnormally expressed in various diseases, and some of them serve as biomarkers for the targeted treatment of ALI [9].
Notably, statins have been reported to be effective drugs for the treatment of ALI and coronavirus disease 2019 (COVID-19) by inhibiting the mevalonate pathway, which is a druggable target for COVID-19 [10,11]. Acting downstream in the mevalonate pathway, geranylgeranyl pyrophosphate synthase large subunit 1 (GGPPS1) is considered a target for treating lung fibrosis [12]. Recent studies have also demonstrated that lung-specific GGPPS1 knockout can attenuate lung inflammation and injury in mice with lung injury [13,14]. However, few studies have investigated the underlying mechanism of ALI under GGPPS1 knockout.
Herein, we downloaded microarray data of ALI and screened the differentially expressed genes (DEGs) between ALI lung specimens from lung-specific GGPPS1-knockout and wild-type mice treated with lipopolysaccharide (LPS). Functional analysis was then performed to investigate the biological annotations associated with these DEGs. Meanwhile, to further explore the functional network of the DEGs, the protein-protein interaction (PPI) network, the miRNA-transcription factor (TF)-target regulation network, and the drug-gene interaction network were predicted.

Data acquisition
The gene expression profile dataset GSE89311 was downloaded from the National Center for Biotechnology Information Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/geo/) on September 20, 2020. The dataset was based on the platform GPL10787 Agilent-028005 SurePrint G3 Mouse GE 8x60K Microarray (Probe Name version). The GSE89311 dataset included eight lung tissues isolated from GGPPS1-knockout C57BL/6 mice (n = 4) treated with LPS for 12 h and wild-type C57BL/6 mice (n = 4) treated with LPS for 12 h.

Data preprocessing and DEG screening
Data preprocessing was performed with the Limma package [15] of R software and mainly consisted of background correction by the MAS method, normalization by the quantile method, and expression calculation. Probes that did not map to any gene symbol were removed from the analysis. When multiple probes mapped to one gene, the mean value of those probes was taken as the expression value of the gene. Subsequently, the empirical Bayes method provided by the Limma package was employed to screen DEGs between GGPPS1-knockout and wild-type mice. DEGs were defined by the cutoffs of P value < 0.05 and |log2(fold change)| > 1.0. A bidirectional clustering heatmap and volcano plots were constructed based on the DEGs.
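The probe-collapsing rule and the DEG cutoffs described above can be illustrated with a minimal Python sketch. This is not the Limma workflow used in the study: the function names, the probe identifiers, and the toy statistics are hypothetical, and the P values are assumed to have been computed beforehand.

```python
from collections import defaultdict
from statistics import mean


def collapse_probes(probe_values, probe_to_gene):
    """Average the probes that map to the same gene; drop unmapped probes."""
    grouped = defaultdict(list)
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene:  # probes without a gene symbol are removed
            grouped[gene].append(value)
    return {gene: mean(values) for gene, values in grouped.items()}


def call_degs(stats, p_cut=0.05, lfc_cut=1.0):
    """Split genes into up/down DEGs.

    stats maps gene -> (log2 fold change, P value); both are assumed to
    come from an upstream differential expression test.
    """
    up = sorted(g for g, (lfc, p) in stats.items() if p < p_cut and lfc > lfc_cut)
    down = sorted(g for g, (lfc, p) in stats.items() if p < p_cut and lfc < -lfc_cut)
    return up, down


# Hypothetical toy input, for illustration only.
expr = collapse_probes({"p1": 2.0, "p2": 4.0, "p3": 9.0},
                       {"p1": "GeneA", "p2": "GeneA"})
up, down = call_degs({"A": (1.5, 0.01), "B": (-2.0, 0.001),
                      "C": (3.0, 0.2), "D": (0.5, 0.001)})
```

In this toy input, gene C fails the P value cutoff and gene D fails the fold-change cutoff, so neither is called as a DEG.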

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses
To assess the functions and significantly enriched pathways of the DEGs, GO biological process annotation and KEGG pathway enrichment analysis were conducted with clusterProfiler [16]. P value <0.05 and count ≥5 were regarded as the thresholds for the enrichment analyses.
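Over-representation analysis of this kind is commonly based on a one-sided hypergeometric test. The following sketch shows how the two thresholds (P value <0.05 and count ≥5) could be applied to a single term. It is an illustration under stated assumptions rather than the clusterProfiler implementation, and the function names and background sizes are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(background, annotated, selected, overlap):
    """P(X >= overlap) when drawing `selected` genes from `background`,
    of which `annotated` genes belong to the term of interest."""
    upper = min(annotated, selected)
    total = comb(background, selected)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


def is_enriched(background, annotated, selected, overlap,
                p_cut=0.05, min_count=5):
    """Apply both thresholds: P value below cutoff and enough DEGs in the term."""
    p = hypergeom_enrichment_p(background, annotated, selected, overlap)
    return overlap >= min_count and p < p_cut


# Hypothetical numbers: 10 background genes, 5 in the term, 5 DEGs, all 5 in the term.
p_value = hypergeom_enrichment_p(10, 5, 5, 5)
```

With these toy numbers only one of the 252 possible draws of five genes places every DEG in the term, so the P value is 1/252.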

PPI network construction and module analysis
The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database (Version: 10.0, http://www.string-db.org/) [17] provides an online PPI prediction function. We therefore conducted the PPI analysis of the DEGs with this database, using a PPI score of 0.7 (high confidence), to identify crucial protein pairs. Afterward, the PPI network was constructed and visualized using the Cytoscape software (version: 3.2.0, http://www.cytoscape.org/). Moreover, the CytoNCA plugin (version 2.1.6, http://apps.cytoscape.org/apps/cytonca) [18] was used to analyze the topological properties of the nodes, with the network treated as unweighted. The important nodes in the PPI network, namely the hub proteins, were obtained by ranking the topological properties of each node. Furthermore, the molecular complex detection (MCODE, version 1.4.2, http://apps.cytoscape.org/apps/MCODE) [19] plugin of Cytoscape was used to identify modules with similar functions in the original PPI network. GO biological process and KEGG pathway analyses of the modules were further performed to evaluate the functions of the sub-modules.
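The idea of filtering protein pairs by confidence score and then ranking nodes by a topological property can be sketched as follows. Degree is used here as one example of such a property; the text does not state which CytoNCA measure was used for ranking, so this choice is an assumption, and the scores in the toy input are hypothetical.

```python
from collections import Counter


def build_ppi(edges, min_score=0.7):
    """Keep unique undirected protein pairs at or above the confidence cutoff."""
    kept = set()
    for a, b, score in edges:
        if a != b and score >= min_score:
            kept.add(frozenset((a, b)))  # (a, b) and (b, a) are the same pair
    return kept


def rank_by_degree(pairs):
    """Rank nodes by degree in an unweighted network (ties broken by name)."""
    degree = Counter()
    for pair in pairs:
        for node in pair:
            degree[node] += 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical interaction scores, for illustration only.
toy_edges = [
    ("Cxcr4", "Cxcl9", 0.90),
    ("Cxcr4", "Ccr5", 0.80),
    ("Cxcl9", "Cxcr4", 0.95),  # duplicate of the first pair, reversed
    ("Cxcl5", "Ccr5", 0.40),   # below the 0.7 cutoff, dropped
]
hubs = rank_by_degree(build_ppi(toy_edges))
```

Storing each pair as a `frozenset` keeps the network undirected, so a pair reported in both orientations is counted once.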

Prediction of miRNA-TF-target regulation
The Overrepresentation Enrichment Analysis (ORA) method in WebGestalt (http://www.webgestalt.org/) [20] was used to perform TF-target and miRNA-target enrichment prediction for all module genes. Next, the miRNA-TF-target relationship pairs were obtained based on the threshold of P value <0.05, and the network was constructed using the Cytoscape software.

Prediction of drug-gene interaction
The Drug-Gene Interaction Database (DGIdb) is used to mine existing resources and generate hypotheses about how genes might be therapeutically targeted or prioritized for drug development. Based on the module genes, drug-gene interactions were predicted with DGIdb 2.0 (http://www.dgidb.org/) [21]. Only FDA-approved drugs were considered when predicting drug-gene relationship pairs. The drug-gene interaction network was then constructed and visualized with the Cytoscape software (version: 3.2.0, http://www.cytoscape.org/). Because the database defaults to human genes, the module genes were first converted into their mouse-human homologous genes, and the converted genes were used to predict drug-gene interactions.

Selection of genes associated with lung cancer prognosis
The potential association of key hub genes in ALI with the prognosis of lung cancers was analyzed to investigate the roles of these genes in lung diseases. The lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) data in the GEPIA online tool (http://gepia.cancer-pku.cn/) were used for the survival analysis. The association of gene expression with the overall survival of LUAD and LUSC patients was extracted. A logrank P value <0.05 was regarded as a significant correlation.
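For readers unfamiliar with the logrank test behind this threshold, a minimal two-group sketch is given below. It is a textbook formulation written for illustration, not GEPIA's implementation; the function name and the survival times are hypothetical, and each observation is a (time, event) pair in which event is True for a death and False for censoring.

```python
from math import erfc, sqrt


def logrank_test(group1, group2):
    """Two-group logrank test; returns (chi-square statistic, P value)."""
    event_times = sorted({t for t, event in group1 + group2 if event})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n1 = sum(1 for ti, _ in group1 if ti >= t)  # at risk in group 1
        n2 = sum(1 for ti, _ in group2 if ti >= t)  # at risk in group 2
        d1 = sum(1 for ti, event in group1 if ti == t and event)
        d2 = sum(1 for ti, event in group2 if ti == t and event)
        n, d = n1 + n2, d1 + d2
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:  # hypergeometric variance of the group 1 event count
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    statistic = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return statistic, erfc(sqrt(statistic / 2))


# Hypothetical survival times: every subject in `low` dies before any in `high`.
low = [(t, True) for t in (1, 2, 3, 4, 5)]
high = [(t, True) for t in (6, 7, 8, 9, 10)]
statistic, p_value = logrank_test(low, high)
significant = p_value < 0.05
```

When the two groups have identical survival data, the observed and expected event counts coincide at every time point, so the statistic is 0 and the P value is 1.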

Identification of DEGs
In total, 421 DEGs (224 downregulated and 197 upregulated) were identified in lung samples isolated from GGPPS1-knockout mice treated with LPS for 12 h compared with wild-type mice treated with LPS for 12 h (Figure 1a and b).

PPI network analysis
The PPI network was built with the STRING database to explore the PPI relationships of the overlapping DEGs. Overall, 216 nodes and 504 protein pairs were obtained (Figure 3a). Furthermore, module analysis of the PPI network identified three sub-modules, in which Cxcl9, Ccr5, Cxcr4, Cxcl5, Sirpb1a, Sirpb1b, Nps, Calca, and Ramp3 were key nodes (Figure 3b). Specifically, module-A (score = 14) contained 14 nodes and 91 protein pairs. Based on the analysis of KEGG pathways and GO biological processes, genes in module-A were mainly associated with viral protein interaction with cytokine and cytokine receptor, chemokine signaling pathway, leukocyte migration, and regulation of cell migration (Figure 4a). Module-B (score = 9) included one upregulated DEG and eight downregulated DEGs, which were primarily implicated in one KEGG pathway, "mmu04380:Osteoclast differentiation", and in GO biological processes related to regulation of cell-cell adhesion, endocytosis, T cell activation, and leukocytes (Figure 4b). Genes in module-C (score = 6; four upregulated and two downregulated genes) were prominently related to two KEGG pathways, "mmu04270:Vascular smooth muscle contraction" and "mmu04080:Neuroactive ligand-receptor interaction", and to GO biological processes related to regulation of smooth muscle contraction and cAMP-mediated signaling (Figure 4c).

Association of genes with lung cancer prognosis
Using the LUAD and LUSC data in the GEPIA online tool, the associations of the nine DEGs in the drug-gene interaction network with prognosis were analyzed. The results showed that high expression levels of P2RY12 (logrank p = 0.0029), CXCR4 (logrank p = 0.049), and ADRA2A (logrank p = 0.0037) were significantly correlated with a good prognosis in patients with LUAD (Figure 7a), while low expression levels of ADORA1 (logrank p = 0.019) and CXCR1 (logrank p = 0.016) were significantly related to higher survival rates in LUSC patients (Figure 7b). No other correlation was identified.

Discussion
Extensive studies have concentrated on illuminating the pathogenesis of ALI via bioinformatics analyses over the past few years [22–24]. In the present study, 421 DEGs between ALI lung specimens from lung-specific GGPPS1-knockout mice and wild-type mice treated with LPS for 12 h were identified. The upregulated DEGs were mainly enriched in the peroxisome proliferator-activated receptor (PPAR) signaling pathway, drug metabolism-cytochrome P450, fatty acid metabolic process, and lipid catabolic process, while the downregulated DEGs were prominently related to cytokine–cytokine receptor interaction, hematopoietic cell lineage, T cell activation, and regulation of cytokine production. Cxcl9, Ccr5, Cxcr4, and Cxcl5 were key nodes in the PPI network. In addition, three miRNAs (miR505, miR23A, and miR23B) and three TFs (PU1, CEBPA, and CEBPB) were key molecules in the miRNA-TF-target network. Finally, several genes, such as ADRA2A, P2RY12, ADORA1, CXCR1, and CXCR4, were predicted as potential druggable genes for ALI with GGPPS1 knockout and were associated with the prognosis of lung cancers.
[Figure caption: The PPI network and three sub-modules of DEGs, respectively. Yellow circular nodes represent upregulated genes; green prismatic nodes represent downregulated genes.]
Accumulating evidence has demonstrated that the main mechanisms of ALI are cyclic atelectasis and alveolar overdistention, which contribute to the activation of inflammatory cells and further aggravate ALI [25]. GGPPS1 is an enzyme downstream in the mevalonate pathway, which is known for the synthesis of cholesterol and is considered a target for treating COVID-19 [10,11,26]. GGPPS1 is a key enzyme that has been reported to be highly expressed and involved in the pathogenesis of inflammatory diseases, including idiopathic pulmonary fibrosis [12], LPS-induced ALI [13], cigarette smoke-induced inflammation [27], and LUAD [28], by increasing the production of inflammatory cytokines. The high expression level of GGPPS1 is considered to be responsible for the development of alveoli and airways in the fetal lung [29]. In this study, Cxcl9, Ccr5, Cxcr4, and Cxcl5 were identified as key genes for the function of GGPPS1 knockout in ALI. These genes participated in ALI by regulating inflammation and immune responses.
CXC chemokine ligands (including upregulated CXCL9 and CXCL5) and chemokine receptors (including downregulated CXCR1, CCR5, and CXCR4) are members of the chemokine family [30]. Previous research showed that the knockout of GGPPS1 attenuates lung inflammation and LPS-induced ALI [13,14]. The study by Xu et al. [13] showed that lung-specific GGPPS1 knockout decreased the interleukin (IL)-1β level, cleaved caspase-3 expression, and the percentage of apoptotic cells in lung tissues of LPS-induced ALI mice. The interactions of chemokines and their receptors participate in a variety of physiological functions, such as cell growth, development, differentiation, apoptosis, and distribution, and play important roles in a variety of pathological processes, including inflammation, pathogen infection, trauma repair, and tumor formation and metastasis [30,31]. Mevalonate promotes the differentiation and proliferation of multiple types of cells, including colon cancer cells [26], cardiomyocytes [32], vascular smooth muscle cells [33], and regulatory T (Treg) cells [34]. The mevalonate pathway is essential for the growth and proliferation of cancer cells and for energy homeostasis by controlling the uptake of glucose and amino acids [26,33], and the inhibition of mevalonate suppresses glucose and amino acid uptake in colon cancer cells [26]. The results showed that the downregulation of GGPPS1 suppressed mevalonate-mediated energy homeostasis and cell proliferation.
As reported, the Cxcl5 and Cxcl1 genes and inflammatory cytokines, including IL-6 and IL-1β, were significantly increased in ALI mice compared with controls [35]. In addition, excessive neutrophil responses result in life-threatening injury in the lung. Chemokines such as CXCL5, CXCL9, CXCL10, and CXCL11 are upregulated following SARS-CoV-2 infection, ALI, or ARDS [36–39]. Elevated CXCL5 drives neutrophil recruitment and harms lung barrier function [38,39]. Berger et al. showed that reduced neutrophil numbers resulted in an increased burden of Streptococcus pneumoniae infection in Cxcl5-/- mice, and that the absence of Cxcl5 reduced alveolar neutrophil recruitment and decreased vascular leakage compared with wild-type mice [39]. In addition, CXCR4 overexpression in mesenchymal stem cells can improve their therapeutic effect in ALI [40]. These data showed that immune responses are crucial for the development and pathogenesis of lung diseases including ALI. Targeting chemokines and chemokine receptors, including CXCL9, CXCL5, CXCR1, CCR5, and CXCR4, may therefore provide a therapeutic perspective in ALI.
The α-2A adrenergic receptor (ADRA2A) gene mainly functions in the central nervous system by regulating neurotransmitter release from adrenergic neurons [41,42]. It has been reported to be associated with attention-deficit hyperactivity disorder [43]. The adenosine A1 receptor subtype gene (ADORA1) plays a protective role against hypoxic damage in cells by regulating the level of adenosine, which controls both inflammation and neurodegeneration [44,45]. It is also involved in central nervous system diseases, including the pathogenesis of parkinsonism and cognitive dysfunction [46]. A recent study by Valasarajan et al. [47] showed that the dysregulation of ADORA1 was involved in pulmonary hypertension. The positive modulation of inflammation by adenosine and adenosine kinase has also been confirmed in endothelial cells [48]. However, little information is available on the association between these genes and inflammatory responses in lung diseases. In our study, the upregulation of ADORA1 and ADRA2A in ALI lung tissues was confirmed. They were also correlated with the prognosis of LUSC and LUAD, respectively. These results highlighted their potential involvement in ALI.
Based on the miRNA-TF-target network, TFs such as PU1, CEBPA, and CEBPB were key regulatory molecules for the role of GGPPS1 knockout in ALI in the current study. PU1, a hematopoietic lineage-specifying TF, has been reported to play important roles in regulating gene expression in various immune cells, including B cells, dendritic cells, granulocytes, and macrophages [49]. Several studies have suggested that PU1 can promote the inflammatory response in many inflammatory diseases, including allergic inflammation [50], asthmatic airway inflammation [51], and the pulmonary inflammatory response to LPS [52]. In addition, CEBPA and CEBPB were initially shown to function in adipogenesis and hematopoiesis. Recent studies have shown that methylation of the CEBPA promoter plays a positive regulatory role in lung inflammation [53]. Interestingly, both CEBPA and CEBPB can collaborate with PU1, and the binding of PU1 to CEBPA/CEBPB may be involved in the inflammatory response by activating macrophages [54,55].

Conclusion
In conclusion, 421 DEGs between ALI tissue specimens from lung-specific GGPPS1-knockout and wild-type mice were identified. Among these DEGs, genes such as ADRA2A, P2RY12, ADORA1, CXCR1, and CXCR4 might be novel markers and potential druggable genes in ALI that act by regulating the inflammatory response, and they might in turn be regulated by miRNAs (miR505, miR23A, and miR23B) and TFs (PU1, CEBPA, and CEBPB).