
# Open Life Sciences

### formerly Central European Journal of Biology

Editor-in-Chief: Ratajczak, Mariusz

IMPACT FACTOR 2018: 0.504
5-year IMPACT FACTOR: 0.583

CiteScore 2018: 0.63

SCImago Journal Rank (SJR) 2018: 0.266
Source Normalized Impact per Paper (SNIP) 2018: 0.311

ICV 2017: 154.48

Open Access
Online
ISSN
2391-5412
Volume 12, Issue 1

# Individualized identification of disturbed pathways in sickle cell disease

Chun-Juan Lu
• Department of Blood Transfusion, Heilongjiang Provincial Hospital, Haerbin 150036, Heilongjiang, China
/ Yan Wang
/ Ya-Li Huang
• Corresponding author
• Nuclear Medicine Department, Qilu Hospital of Shandong University, Jinan, 250012, Shandong PR, China
/ Xin-Hua Li
Published Online: 2017-12-29 | DOI: https://doi.org/10.1515/biol-2017-0049

## Abstract

### Background

Sickle cell disease (SCD) is one of the most common genetic blood disorders. Identifying pathway aberrance in an individual with SCD contributes to the understanding of disease pathogenesis and the promotion of personalized therapy. Here we proposed an individualized pathway aberrance method to identify the disturbed pathways in SCD.

### Methods

Based on transcriptome data and pathway data, an individualized pathway aberrance method was implemented to identify the altered pathways in SCD. The method comprised four steps: data preprocessing, gene-level statistics, pathway-level statistics, and significance analysis. The percentage of SCD individuals in which each altered pathway was changed was calculated, and a differentially expressed gene (DEG)-based pathway enrichment analysis was performed to validate the results.

### Results

We identified 618 disturbed pathways between normal and SCD conditions. Among them, 6 pathways were altered in more than 80% of SCD individuals. Meanwhile, forty-six DEGs were identified between normal and SCD conditions, and these were enriched in heme biosynthesis. Relative to DEG-based pathway analysis, the new method gave richer results and was more widely applicable.

### Conclusion

This study predicted several disturbed pathways by detecting pathway aberrance on a personalized basis. The results might provide new insights into the pathogenesis of SCD and facilitate the application of customized treatment for SCD.

## 1 Introduction

Sickle cell disease (SCD) is one of the most common and life-threatening genetic blood disorders, and features intermittent vaso-occlusive events and chronic hemolytic anemia [1]. The 2015 Global Burden of Disease reports estimate that the prevalence of sickle cell trait exceeds 400 million cases globally [2], and that SCD results in more than 100,000 deaths [3]. SCD generally involves a mutation of the hemoglobin-beta gene. A previous study presented a global map of gene variation associated with SCD by genome-wide association analysis [4]. Moreover, with the development of analytical techniques, more than one hundred blood and urine biomarkers have been implicated in the pathogenesis of SCD [5]. However, most known biomarkers have limited clinical value, and few provide useful prognostic information for managing the condition.

High-throughput genome-wide association studies have led to a paradigm shift in the way that investigators explore complex diseases. Genome-wide analyses have discovered several genes influencing the likelihood of developing SCD [4, 6, 7]. Since most biological processes arise from the integrated activities of many genes, interpreting the consequences at the pathway level helps to explain how gene perturbations account for disease [8]. Thus, the characterization of pathway changes is imperative for understanding the molecular mechanisms of SCD. Existing pathway algorithms have been classified into three categories: over-representation analysis, functional class scoring, and pathway topology-based approaches [9]. However, most existing pathway techniques focus on identifying disturbed pathways in a specific disease condition and ignore the possibility that pathway aberrance may occur in an individual subject. Individualized pathway analysis is conducive to personalized interpretation of disease data. Recently, an individualized pathway analysis method was proposed to identify disturbed pathways in disease [10]. This strategy calculates the individualized pathway aberrance score (iPAS) of one disease sample by comparing it with accumulated normal samples, making it possible to interpret disease data in a personalized way.

In this study, we employed the iPAS method to identify disturbed pathways by quantifying individual pathway aberrance relative to accumulated normal samples. Specifically, the method consisted of a series of analysis steps: gene expression data collection and preprocessing, pathway data recruitment and preprocessing, gene-level statistics, pathway-level statistics, and significance analysis of disturbed pathways. Under this framework, we expected to identify the disturbed pathways in SCD and to further understand the underlying mechanism of SCD.

## 2.1 Gene expression data

Gene expression data of SCD were retrieved from the ArrayExpress database (http://www.ebi.ac.uk/arrayexpress/) under accession number E-GEOD-35007 [4]. The data were obtained from 250 SCD patients and 61 age-matched controls using the Illumina HumanHT-12 V4.0 beadchip platform. The control samples were defined as accumulated control samples, referred to as “nRef” hereinafter. The detailed sample characteristics were presented in the previous study [4]. All raw data and the annotations were obtained from the manufacturer’s documents, and the probes were re-annotated to gene symbols. Finally, a total of 31,426 gene symbols were obtained for subsequent analysis.

## 2.2 Pathway data

In this study, all human biological pathways were retrieved from the Reactome pathway database (http://www.reactome.org/). Pathways with a very large number of genes are too complex to be understood by human experts. Thus, pathways with gene size > 100 were removed from our study. Then, by intersecting genes between pathway data and gene expression data, we obtained a total of 1,022 pathways (covering 4,928 genes) for subsequent analysis.
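The filtering described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the dictionary layout, and the choice to drop pathways left with no measured genes are all assumptions.

```python
def filter_pathways(pathways, measured_genes, max_size=100):
    """Remove pathways with more than max_size genes, then keep only
    the genes that are also present in the expression data.

    pathways: dict mapping pathway name -> iterable of gene symbols
    measured_genes: iterable of gene symbols in the expression data
    Assumption: a pathway with no measured genes left is dropped.
    """
    measured = set(measured_genes)
    kept = {}
    for name, genes in pathways.items():
        genes = set(genes)
        if len(genes) > max_size:      # size filter comes first, as in the text
            continue
        overlap = genes & measured     # intersect with expression data
        if overlap:
            kept[name] = sorted(overlap)
    return kept


# Toy input with hypothetical gene symbols.
toy_pathways = {
    "small pathway": ["g1", "g2", "g3"],
    "huge pathway": ["x%d" % i for i in range(101)],
    "unmeasured pathway": ["g9"],
}
toy_measured = ["g1", "g3", "x1"]
filtered = filter_pathways(toy_pathways, toy_measured)
```

Applying the size filter before the intersection follows the order given in the text; reversing the order could retain large pathways whose measured subset happens to be small.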

## 2.3.1 Data preprocessing

Standard pre-treatment was conducted to control the quality of the gene expression data. For normal genes, background correction and normalization were implemented with the RMA algorithm and a quantile-based algorithm to eliminate the effect of nonspecific hybridization [11, 12]. Then, Micro Array Suite 5.0 was applied to revise perfect match and mismatch values [13], and the medianpolish method was used to summarize the expression values [11].
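For readers unfamiliar with quantile normalization, the basic idea can be sketched in a few lines. This is a simplified illustration under stated assumptions (equal-length arrays, ties broken by position), not the implementation used in the study, which relied on established microarray software.

```python
def quantile_normalize(samples):
    """Force every sample to share the same value distribution.

    samples: list of equal-length lists, one list per array/sample.
    Each value is replaced by the mean, across samples, of the values
    holding the same rank. Ties are broken by position (a simplification).
    """
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Mean of the r-th smallest value across all samples.
    rank_means = [
        sum(s[r] for s in sorted_samples) / len(samples) for r in range(n)
    ]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # indices by ascending value
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two toy samples measured on three probes.
toy = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
toy_normalized = quantile_normalize(toy)
```

After normalization both toy samples contain the same set of values (1.5, 3.5, 5.5), arranged according to each sample's original ranking.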

## 2.3.2 Gene-level statistics

First, we calculated the average expression value and standard deviation of genes in normal conditions. Then, the gene expression value in individual disease samples was standardized by the average and standard deviation of nRef as reference. For each gene i in the disease group, we calculated the standardized gene expression value as follows: $z_i = \frac{g_{D_i} - \mathrm{mean}(g_{nRef})}{\mathrm{stdev}(g_{nRef})}$

where $g_{D_i}$ stood for the expression value of gene i in one SCD sample, $\mathrm{mean}(g_{nRef})$ represented the average expression value of gene i in nRef, and $\mathrm{stdev}(g_{nRef})$ was the standard deviation of gene i in nRef.
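The gene-level standardization can be illustrated with a short sketch. The function and variable names are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def gene_z_scores(disease_sample, nref):
    """Standardize one disease sample against accumulated normal samples.

    disease_sample: dict gene -> expression value in one SCD sample
    nref: dict gene -> list of expression values across normal samples
    Assumption: stdev is the sample standard deviation (n - 1).
    """
    z = {}
    for gene, value in disease_sample.items():
        ref = nref[gene]
        z[gene] = (value - mean(ref)) / stdev(ref)
    return z


# Toy values with hypothetical gene names.
toy_nref = {"geneA": [8.0, 10.0, 12.0], "geneB": [1.0, 2.0, 3.0]}
toy_sample = {"geneA": 14.0, "geneB": 1.5}
toy_z = gene_z_scores(toy_sample, toy_nref)
```

In the toy data, geneA lies two reference standard deviations above the normal mean and geneB lies half a standard deviation below it.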

## 2.3.3 Pathway-level statistics

It has been indicated that the Average Z method performed best in highlighting pathway aberrance and in further revealing clinical importance [10]. Thus, we employed the Average Z method to evaluate the pathway aberrance of individual samples in this study. A vector $Z = (z_1, z_2, \ldots, z_n)$ stood for the expression status of a pathway, where $z_i$ represented the standardized expression of the i-th gene, and n was the number of genes belonging to the given pathway. The iPAS of one pathway was defined as follows: $iPAS = \frac{\sum_{i=1}^{n} z_i}{n}$

After that the expression matrix was obtained for each pathway in each individualized sample. The mean value of iPAS values of each pathway was defined as the pathway aberrance level of each disease sample.
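A minimal sketch of the Average Z score for one pathway in one sample is given below. The names are hypothetical, and skipping pathway genes that have no z-score is an assumption made for the illustration.

```python
def ipas_average_z(z_scores, pathway_genes):
    """Average Z pathway score for a single sample.

    z_scores: dict gene -> standardized expression (z_i) in one sample
    pathway_genes: iterable of gene symbols belonging to the pathway
    Assumption: genes without a z-score are skipped.
    """
    values = [z_scores[g] for g in pathway_genes if g in z_scores]
    if not values:
        raise ValueError("pathway has no genes with z-scores")
    return sum(values) / len(values)


# Toy z-scores for one sample and one hypothetical pathway.
toy_z_scores = {"a": 2.0, "b": -1.0, "c": 0.5}
toy_score = ipas_average_z(toy_z_scores, ["a", "b"])
```

Because the score is a plain mean, strongly up-regulated and strongly down-regulated genes within the same pathway can cancel each other, which is worth keeping in mind when interpreting scores near zero.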

## 2.3.4 Disturbed pathways analysis

After obtaining the pathway aberrance of all pathways in individualized SCD samples, we performed a significance analysis to identify the disturbed pathways in SCD. In this study, the Wilcoxon test [14] was used to generate the pathway statistic values, and the false discovery rate (FDR) [15] was used to adjust the p-values. Pathways with p-value < 0.01 were considered disturbed pathways in SCD. Meanwhile, the top ten disturbed pathways, ranked by p-value, were selected for a clustering analysis.
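The p-value adjustment step can be sketched as below. The text does not name the FDR procedure, so the Benjamini-Hochberg step-up adjustment is used here as an assumption; the raw p-values would come from a rank-based test such as the Wilcoxon rank-sum test, which is not reimplemented in this sketch.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is the FDR procedure; the source only says 'FDR'.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def disturbed(pathway_names, p_values, cutoff=0.01):
    """Names of pathways whose adjusted p-value is below the cutoff."""
    adj = benjamini_hochberg(p_values)
    return [name for name, q in zip(pathway_names, adj) if q < cutoff]


# Toy p-values for four hypothetical pathways.
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_adjusted = benjamini_hochberg(toy_p)
```

With four tests, the two smallest toy p-values (0.005 and 0.01) both adjust to 0.02, and the two largest both adjust to 0.04.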

## 2.4 Changed percentage of disturbed pathways

To further validate the disturbed pathways identified by the iPAS method, we calculated, for each pathway, the percentage of SCD samples in which it was changed. To achieve this, we first determined the distribution of each pathway statistic value in normal and disease samples to establish a baseline. Then, statistical analysis was conducted on the disturbed pathways (p-value < 0.01) in the disease group to obtain the changed percentage for each pathway across all SCD cases. Pathways with a changed percentage > 80% were extracted for further analysis.
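One way to compute such a percentage is sketched below. The text does not state the rule for calling a pathway changed in a given sample, so the rule used here (score outside the normal mean plus or minus k standard deviations, with k = 2) is purely an assumption for illustration.

```python
from statistics import mean, stdev


def changed_percentage(disease_scores, normal_scores, k=2.0):
    """Percentage of disease samples whose pathway score is 'changed'.

    disease_scores: pathway scores, one per disease sample
    normal_scores: pathway scores, one per normal sample
    Assumption (hypothetical rule): a sample is changed when its score
    lies outside mean +/- k * stdev of the normal scores.
    """
    mu = mean(normal_scores)
    sd = stdev(normal_scores)
    lower, upper = mu - k * sd, mu + k * sd
    changed = sum(1 for s in disease_scores if s < lower or s > upper)
    return 100.0 * changed / len(disease_scores)


# Toy scores for one hypothetical pathway.
toy_normal = [-1.0, 0.0, 1.0]
toy_disease = [3.0, 2.5, 0.5, -2.5, 1.0]
toy_pct = changed_percentage(toy_disease, toy_normal)
```

A pathway would then be retained when the returned value exceeds 80, matching the threshold stated in the text.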

## 2.5 Differentially expressed genes (DEGs) based pathway analysis

As a validation step, we performed a DEG-based pathway analysis. DEGs between SCD patients and controls were identified with the Linear Models for Microarray Data (LIMMA) package [16], and the p-values were adjusted by FDR [17]. Genes meeting the criteria |log(FoldChange)| > 2 and p-value < 0.01 were considered DEGs between SCD patients and controls. Then we performed a pathway enrichment analysis based on the Reactome pathway database, using the online Database for Annotation, Visualization and Integrated Discovery (DAVID) [18]. Pathways with p-value < 0.01 were considered significant.
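The thresholding step can be illustrated as follows. LIMMA is an R package and is not reproduced here; this sketch only shows how the two cutoffs select DEGs from a table of results. The function name, the input layout, and the gene names are hypothetical, and the base of the logarithm is not stated in the text.

```python
def select_degs(results, lfc_cutoff=2.0, p_cutoff=0.01):
    """Split genes passing both cutoffs into up- and down-regulated lists.

    results: dict gene -> (log fold change, adjusted p-value)
    Assumption: positive log fold change means higher expression in SCD.
    """
    up, down = [], []
    for gene, (log_fc, p_adj) in results.items():
        if abs(log_fc) > lfc_cutoff and p_adj < p_cutoff:
            (up if log_fc > 0 else down).append(gene)
    return sorted(up), sorted(down)


# Toy results with hypothetical gene names.
toy_results = {
    "geneA": (2.5, 0.001),    # passes both cutoffs, up-regulated
    "geneB": (-3.1, 0.004),   # passes both cutoffs, down-regulated
    "geneC": (2.2, 0.05),     # fails the p-value cutoff
    "geneD": (1.0, 0.0001),   # fails the fold-change cutoff
}
toy_up, toy_down = select_degs(toy_results)
```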

## Ethical approval

The conducted research is not related to either human or animal use.

## 3.1 Identification of disturbed pathways

In the present study, 61 normal controls in the gene expression profile E-GEOD-35007 were defined as nRef (reference) for 250 SCD patients. Quantile normalization was performed on disease genes to evaluate their gene-level statistics using the accumulated normal data. Meanwhile, a total of 1,022 pathways were obtained from the Reactome pathway database. We extracted the gene-level statistical values of all genes in each pathway, and took their mean as the pathway-level statistic of that pathway. According to the iPAS method based on the Average Z measure, we obtained the pathway aberrance scores of individual SCD samples. The p-value of each pathway was calculated by applying the Wilcoxon test to the pathway-level statistics. Under the criterion of p-value < 0.01, a total of 618 disturbed pathways were identified in SCD compared with the normal condition. The top ten disturbed pathways are shown in Table 1. A clustering analysis was conducted based on the top ten disturbed pathways, and the resulting heatmap is shown in Figure 1. Moreover, we calculated the classification performance of the top 10 disturbed pathways for all samples. Ideally, all samples should be classified into two major clusters. Our results showed that the top 10 disturbed pathways could separate SCD patients from normal controls with an accuracy of 0.89. The most significant disturbed pathway was metabolism of porphyrins (p = 8.31E-27).
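The text does not say how the clustering accuracy was computed. One common convention, shown here as an assumption, is to cut the clustering into two groups and score the agreement with the known labels under the better of the two possible cluster-to-label mappings.

```python
def two_cluster_accuracy(true_labels, cluster_ids):
    """Agreement between known 0/1 labels and a two-cluster assignment.

    Cluster numbering is arbitrary, so the better of the two possible
    mappings (cluster 0 -> label 0, or cluster 0 -> label 1) is used.
    Assumption: this is one plausible definition, not the one confirmed
    by the source.
    """
    n = len(true_labels)
    agree = sum(1 for t, c in zip(true_labels, cluster_ids) if t == c)
    return max(agree, n - agree) / n


# Toy example: 1 = SCD, 0 = control; cluster ids come from any
# two-group clustering of the pathway score matrix.
toy_truth = [1, 1, 1, 0, 0]
toy_clusters = [0, 0, 1, 1, 1]
toy_accuracy = two_cluster_accuracy(toy_truth, toy_clusters)
```

In the toy example only one sample is misplaced once the cluster labels are swapped, so four of five samples agree.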

Figure 1

Hierarchical clustering of the top ten disturbed pathways for sickle cell disease subjects and normal samples. The colors in the matrix represented the pathway statistic values.

Table 1

The top ten disturbed pathways in sickle cell disease.

## 3.2 Changed percentage of disturbed pathways

To further validate the altered pathways in SCD, we calculated the changed percentage of each disturbed pathway across the 250 SCD samples. A total of 6 disturbed pathways changed in more than 80% of SCD individuals (Table 2). Activation of PUMA and translocation to mitochondria was the most frequently affected pathway, with changes occurring in 223 disease individuals (89.2%), and metabolism of porphyrins changed in 220 disease individuals (88.0%). Of these 6 disturbed pathways, 4 were among the top 10 disturbed pathways in SCD.

Table 2

The disturbed pathways changed in more than 80% of sickle cell disease samples.

## 3.3 DEGs based pathway analysis

After data preprocessing of the gene expression profile, the LIMMA package was used to calculate the differential expression values of genes. Under the criteria of |logFoldChange| > 2 and p-value < 0.01, we obtained forty-six DEGs in SCD, including 2 down-regulated genes and 44 up-regulated genes. Then a pathway enrichment analysis was performed on these DEGs using DAVID. Under the threshold p-value < 0.01, only one significant pathway, heme biosynthesis (p = 2.0E-04), was identified; it was also a disturbed pathway identified by the iPAS method.

## 3.4 Gene level statistics of DEGs in disturbed pathways

After quantile normalization of genes in all cases, we obtained the gene-level statistics of each gene separately. The gene-level statistics of the DEGs are shown in Figure 2. The expression levels of most DEGs were higher in the disease condition than in the normal condition. This suggests that differences at the gene level may lead to the aberrance of pathways in SCD compared with nRef.

Figure 2

Expression pattern of the differentially expressed genes in the disturbed pathway. Each line represents a sample (red: normal, blue: disease).

## 4 Discussion

In the current study, we employed the Average Z-based iPAS method to quantify individual pathway aberrance relative to accumulated normal samples, and identified several disturbed pathways in SCD that may contribute to a better understanding of the mechanism of SCD. Using the iPAS method, we identified a total of 618 disturbed pathways between normal and disease conditions. Further analysis showed that 6 of them changed in more than 80% of SCD individuals, including metabolism of porphyrins and heme biosynthesis. Furthermore, a traditional DEG-based pathway enrichment analysis was conducted to validate the new method. By functional enrichment analysis of DEGs, we obtained only one significant pathway in SCD, i.e., heme biosynthesis. This pathway was one of the disturbed pathways identified by the iPAS method. Relative to traditional pathway enrichment analysis, the iPAS method used the Average Z measure to quantify pathway aberrance across individual samples, yielding richer results and broader applicability.

Heme biosynthesis was the only disturbed pathway identified by both the iPAS method and the DEG-based pathway enrichment analysis, implying a crucial role in the pathogenesis of SCD. Heme, a complex of protoporphyrin IX with iron, is a prosthetic group of various hemoproteins and an essential cofactor in many biological processes [19, 20]. It is well known that heme biosynthesis is one of the most important metabolic pathways in mammals, and defective heme biosynthesis gives rise to severe metabolic disorders, such as erythropoietic porphyria and sideroblastic anemia [21]. SCD is a group of genetic blood disorders, and patients with SCD possess sickle hemoglobin, an oxygen-transport protein in the red blood cells [22, 23]. With increased oxidative stress, heme biosynthesis was markedly increased in circulating endothelial cells in SCD [24].

Relative to the traditional pathway analysis, the iPAS method identified more disturbed pathways, such as metabolism of porphyrins, activation of PUMA and translocation to mitochondria, and pre-NOTCH expression and processing. Metabolism of porphyrins was the most significant disturbed pathway and changed in 88.0% of SCD subjects. It is well known that hemolysis of red blood cells is an inherent characteristic of SCD, and results in the release of porphyrins and their metabolites [25, 26]. Porphyrins comprise a portion of hemoglobin, the major constituent of human red blood cells. Disorders of porphyrin metabolism might be closely associated with the physiopathology of SCD. A previous patent proposed a method for clinically screening SCD by detecting porphyrins and porphyrin metabolites in human dentition [27]. Activation of PUMA and translocation to mitochondria changed in 89.2% of SCD samples. PUMA, also known as p53 upregulated modulator of apoptosis, is a p53-dependent pro-apoptotic protein, and activated PUMA interacts with Bcl-2 family members to signal apoptosis to the mitochondria [28]. SCD is a disease of hypoxia because of insufficient numbers of erythrocytes for oxygen delivery [29]. Hypoxia and oxidative stress could induce p53-dependent cell apoptosis [30]. Recent findings indicated that PUMA participated in hypoxia-triggered cell apoptosis by interfering with the mitochondrial pathway [31]. Forster et al. [32] found that PUMA was significantly differentially expressed, with over 2-fold changes, in erythroid progenitor cells in β-thalassaemia. The disturbed pathway pre-NOTCH expression and processing changed in 82.8% of SCD subjects. Notch is a binary switch for cell-fate decisions mediated by cell-cell interactions, and aberrant Notch signal transduction is associated with cancer and other human diseases [33]. Pre-NOTCH is the nascent form of the Notch precursor. Notch proteins are expressed in hematopoietic cells and have been indicated to play a fundamental role in regulating the induction of hematopoietic stem cells and lineage cell fate decisions [34, 35]. The disturbed pathways identified by this new method might play potentially important roles in the pathogenesis of SCD. Further studies should be performed to explore the specific mechanisms linking these disturbed pathways to SCD development.

In conclusion, the iPAS method is an applicable strategy for exploring disturbed pathways based on gene expression data. Using the iPAS method, we identified several disturbed pathways in SCD, such as metabolism of porphyrins and heme biosynthesis. These pathways might play significant roles in the pathogenesis of SCD, and could be considered as potential predictive and prognostic markers for SCD.

## Acknowledgements

This work was supported by the Key Project of Research and Development of Shandong Province, China (2015GSF118131).

## References

• [1]

Rees D.C., Williams T.N., Gladwin M.T., Sickle-cell disease, Lancet, 2010, 376, 2018-2031

• [2]

GBD 2015 Disease and Injury Incidence and Prevalence Collaborators, Global, regional, and national incidence, prevalence, and years lived with disability for 310 diseases and injuries, 1990-2015: a systematic analysis for the Global Burden of Disease Study 2015, Lancet, 2016, 388, 1545-1602

• [3]

GBD 2015 Mortality and Causes of Death Collaborators, Global, regional, and national life expectancy, all-cause mortality, and cause-specific mortality for 249 causes of death, 1980-2015: a systematic analysis for the Global Burden of Disease Study 2015, Lancet, 2016, 388, 1459-1544

• [4]

Quinlan J., Idaghdour Y., Goulet J.P., Gbeha E., de Malliard T., Bruat V., et al., Genomic architecture of sickle cell disease in West African children, Front Genet, 2014, 5, 26

• [5]

Rees D.C., Gibson J.S., Biomarkers in sickle cell disease, Br J Haematol, 2012, 156, 433-445

• [6]

Raghavachari N., Xu X., Harris A., Villagra J., Logun C., Barb J., et al., Amplified expression profiling of platelet transcriptome reveals changes in arginine metabolic pathways in patients with sickle cell disease, Circulation, 2007, 115, 1551-1562

• [7]

Chang Milbauer L., Wei P., Enenstein J., Jiang A., Hillery C.A., Scott J.P., et al., Genetic endothelial systems biology of sickle stroke risk, Blood, 2008, 111, 3872-3879

• [8]

Glazko G.V., Emmert-Streib F., Unite and conquer: univariate and multivariate approaches for finding differentially expressed gene sets, Bioinformatics, 2009, 25, 2348-2354

• [9]

Khatri P., Sirota M., Butte A.J., Ten years of pathway analysis: current approaches and outstanding challenges, PLoS Comput Biol, 2012, 8, e1002375

• [10]

Ahn T., Lee E., Huh N., Park T., Personalized identification of altered pathways in cancer using accumulated normal tissue data, Bioinformatics, 2014, 30, i422-429

• [11]

Irizarry R.A., Bolstad B.M., Collin F., Cope L.M., Hobbs B., Speed T.P., Summaries of Affymetrix GeneChip probe level data, Nucleic Acids Res, 2003, 31, e15

• [12]

Bolstad B.M., Irizarry R.A., Astrand M., Speed T.P., A comparison of normalization methods for high density oligonucleotide array data based on variance and bias, Bioinformatics, 2003, 19, 185-193

• [13]

• [14]

Gehan E.A., A Generalized Wilcoxon Test for Comparing Arbitrarily Singly-Censored Samples, Biometrika, 1965, 52, 203-223

• [15]

Nichols T., Hayasaka S., Controlling the familywise error rate in functional neuroimaging: a comparative review, Stat Methods Med Res, 2003, 12, 419-446

• [16]

Smyth G.K., Linear models and empirical bayes methods for assessing differential expression in microarray experiments, Stat Appl Genet Mol Biol, 2004, 3, 3

• [17]

Reiner A., Yekutieli D., Benjamini Y., Identifying differentially expressed genes using false discovery rate controlling procedures, Bioinformatics, 2003, 19, 368-375

• [18]

Alvord G., Roayaei J., Stephens R., Baseler M.W., Lane H.C., Lempicki R.A., The DAVID Gene Functional Classification Tool: a novel biological module-centric algorithm to functionally analyze large gene lists, Genome Biol, 2007, 8, 183

• [19]

Layer G., Reichelt J., Jahn D., Heinz D.W., Structure and function of enzymes in heme biosynthesis, Protein Sci, 2010, 19, 1137-1161

• [20]

Ajioka R.S., Phillips J.D., Kushner J.P., Biosynthesis of heme in mammals, Biochim Biophys Acta, 2006, 1763, 723-736

• [21]

Fujiwara T., Harigae H., Biology of Heme in Mammalian Erythroid Cells and Related Disorders, Biomed Res Int, 2015, 2015, 278536

• [22]

Stegenga K., Burks L.M., Using photovoice to explore the unique life perspectives of youth with sickle cell disease: a pilot study, J Pediatr Oncol Nurs, 2013, 30, 269-274

• [23]

Ashley-Koch A., Yang Q., Olney R.S., Sickle hemoglobin (HbS) allele and sickle cell disease: a HuGE review, Am. J. Epidemiol., 2000, 151, 839-845

• [24]

Nath K.A., Grande J.P., Haggard J.J., Croatt A.J., Katusic Z.S., Solovey A., et al., Oxidative stress and induction of heme oxygenase-1 in the kidney in sickle cell disease, Am J Pathol, 2001, 158, 893-903

• [25]

Kato G.J., Steinberg M.H., Gladwin M.T., Intravascular hemolysis and the pathophysiology of sickle cell disease, J. Clin. Invest., 2017, 127, 750-760

• [26]

Johnson L.W., Schwartz S., Relation of porphyrin content to red cell age: analysis by fractional hemolysis, Proc. Soc. Exp. Biol. Med., 1972, 139, 191-197

• [27]

Richard P.A., Method of screening for sickle cell disease by detection of porphyrins and porphyrin metabolites in human dentition, 1980, Patent No. 4236526

• [28]

Nakano K., Vousden K.H., PUMA, a novel proapoptotic gene, is induced by p53, Mol. Cell, 2001, 7, 683-694

• [29]

Sun K., Xia Y., New insights into sickle cell disease: a disease of hypoxia, Curr Opin Hematol, 2013, 20, 215-221

• [30]

Yu J., Zhang L., No PUMA, no death: implications for p53-dependent apoptosis, Cancer Cell, 2003, 4, 248-249

• [31]

Li Y., Liu X., Rong F., PUMA mediates the apoptotic signal of hypoxia/reoxygenation in cardiomyocytes through mitochondrial pathway, Shock, 2011, 35, 579-584

• [32]

Forster L., McCooke J., Bellgard M., Joske D., Finlayson J., Ghassemifar R., Differential gene expression analysis in early and late erythroid progenitor cells in beta-thalassaemia, Br J Haematol, 2015, 170, 257-267

• [33]

Greenwald I., Notch and the awesome power of genetics, Genetics, 2012, 191, 655-669

• [34]

Ohishi K., Katayama N., Shiku H., Varnum-Finney B., Bernstein I., Notch signalling in hematopoiesis, Semin. Cell Dev. Biol., 2003, 14, 143-150

• [35]

Burns C.E., Traver D., Mayhall E., Shepard J.L., Zon L.I., Hematopoietic stem cell fate is established by the Notch–Runx pathway, Gene Dev., 2005, 19, 2331-2342

Accepted: 2017-10-30


Conflict of interest: Authors state no conflict of interest.

Citation Information: Open Life Sciences, Volume 12, Issue 1, Pages 418–424, ISSN (Online) 2391-5412,
