P-selectin is an adhesion molecule that plays an important role in the development of inflammation. It is encoded by the SELP gene located on chromosome 1q21-q24. Several single nucleotide polymorphisms (SNPs) of SELP have been reported to be associated with inflammatory disease conditions. The genetics behind these diseases could be better understood by knowing the structural and functional impact of the genetic determinants of SELP. To the best of our knowledge, this is the first comprehensive and systematic in silico analysis of SNPs in SELP. A total of 2780 SNPs of SELP were retrieved from NCBI dbSNP. Only conserved and validated SNPs with minor allele frequency (MAF) ≥ 0.05 were subjected to further analysis. Based on these criteria, we selected 4 non-synonymous SNPs (nsSNPs) and 119 non-coding SNPs (ncSNPs). The nsSNPs were analyzed for deleterious effects using the SIFT, PolyPhen-2, nsSNPAnalyzer, SNPs&GO, SNPs3D, MutPred and I-Mutant web tools. The template prediction for variant structure modeling was performed using MUSTER and SWISS-MODEL. The functional impact of ncSNPs was analyzed by SNPinfo and RegulomeDB. The in silico analysis predicted 3 nsSNPs and 21 ncSNPs as potential candidates for future case-control association studies and functional analysis of SELP.
Selectins are carbohydrate-binding proteins that belong to the family of C-type lectins. They consist of an N-terminal lectin domain, an epidermal growth factor (EGF) domain, short consensus repeats, a transmembrane domain and a cytoplasmic tail. There are three types of selectins, i.e. P-, E- and L-selectin. P-selectin is stored in α-granules of platelets and in Weibel-Palade bodies of endothelial cells. It is translocated to the cell surface of activated endothelial cells and platelets [2-4]. E-selectin is expressed on cytokine-activated endothelial cells, while L-selectin is expressed on all granulocytes and monocytes and on most lymphocytes. Genetic polymorphisms of all three selectins have been reported to be associated with various inflammatory diseases including coronary artery disease, diabetes mellitus, ischemic stroke and myocardial infarction [5-10].
During the development of the inflammatory response, P-selectin is translocated to the cell surface and is also released in a soluble form (sP-selectin) into the blood stream. It binds to P-selectin glycoprotein ligand-1 (PSGL-1), a dimeric molecule rich in N- and O-glycans that is expressed on almost all leukocytes. This interaction is involved in the initial steps of leukocyte recruitment during inflammation, i.e. tethering and rolling, as well as in thrombus formation and stabilization [13-15]. P-selectin has been reported to be involved in the pathogenesis of various inflammatory disease conditions including atherosclerosis, a chronic inflammation of the arteries. Atherosclerosis is characterized by the formation of atherosclerotic plaques containing necrotic cores, calcified regions, accumulated modified lipids, inflamed smooth muscle cells, endothelial cells, leukocytes and foam cells. Various large-scale comprehensive studies have indicated the association of genetic polymorphisms of several molecules, including platelet-activating factor acetylhydrolase (PLA2G7), proprotein convertase subtilisin/kexin type 9 (PCSK9), 7-dehydrocholesterol reductase (DHCR7) and von Willebrand factor (VWF), with atherosclerotic plaque formation [17-21]. In addition, studies have indicated that P-selectin is an important risk factor in atherosclerotic plaque development in both early and advanced stages [22, 23]. In light of these findings, P-selectin has been suggested to be one of the important markers for atherosclerosis.
P-selectin is encoded by the SELP gene, located on chromosome 1q21-q24, which spans >50 kb and contains 17 exons encoding structurally distinct domains. Inactivation of SELP in atherosclerosis-prone ApoE-/- mice decreased monocyte recruitment to sites of neointima formation after carotid artery injury and markedly reduced atherosclerotic plaque formation. Various SNPs of SELP are linked to susceptibility to inflammatory diseases [26-28]. A genome wide association study (GWAS) implicated two regions of the P-selectin gene in systemic lupus erythematosus (SLE) in UK and USA families. Another GWAS indicated an association of rs6136 of SELP with soluble P-selectin levels in a European population. Furthermore, Exome Variant Server data showed various probably damaging SNPs of SELP which are also part of Illumina human genome chips (http://evs.gs.washington.edu/EVS/).
SNPs constitute more than 90% of all human genetic variation and are located in both coding and non-coding regions of genes. An nsSNP is a single base change in a coding region that alters the protein sequence by changing an amino acid residue. Such changes may affect protein function by destabilizing the structure or by altering various physico-chemical properties. The ncSNPs are also important, as they may alter gene expression by affecting transcription factor binding, splicing regulation, miRNA binding etc. To better understand their clinical relevance, the variants must first be prioritized on the basis of their functional impact. However, it would be difficult, time consuming and very expensive to determine experimentally the functional impact of all the variants in case-control association studies. Thus, the present study was designed to identify the functionally important SNPs in SELP using various computational tools. To the best of our knowledge, this is the first comprehensive and systematic in silico analysis of SELP.
2 Materials and Methods
We retrieved the information regarding gene sequence, protein sequence and rs ID of various SNPs from NCBI dbSNP database (http://www.ncbi.nlm.nih.gov/SNP/). The SNPs validated by genotype data, multiple independent submissions, HapMap project, 1000 Genomes Project and with minor allele frequency (MAF) ≥0.05 were subjected to further analysis.
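The selection rule described above (validated status and MAF ≥ 0.05) can be sketched as a simple filter. This is only an illustration and not the pipeline used in the study: the `SnpRecord` type, the field names and the two `rsEXAMPLE` records are hypothetical, while the two real rs IDs and their MAF values are taken from Table 1.

```python
from dataclasses import dataclass

MAF_CUTOFF = 0.05  # threshold stated in the Methods


@dataclass(frozen=True)
class SnpRecord:
    """Hypothetical container for one dbSNP entry (not a dbSNP data format)."""
    rs_id: str
    maf: float       # minor allele frequency
    validated: bool  # True if any validation evidence is listed


def passes_filter(snp: SnpRecord, maf_cutoff: float = MAF_CUTOFF) -> bool:
    """Keep a SNP only if it is validated and its MAF is at or above the cutoff."""
    return snp.validated and snp.maf >= maf_cutoff


def select_snps(records, maf_cutoff: float = MAF_CUTOFF):
    """Return the records that satisfy both selection criteria, in input order."""
    return [snp for snp in records if passes_filter(snp, maf_cutoff)]


records = [
    SnpRecord("rs1018828", 0.0919, True),    # MAF as listed in Table 1
    SnpRecord("rs3917651", 0.0543, True),    # MAF as listed in Table 1
    SnpRecord("rsEXAMPLE1", 0.0100, True),   # hypothetical: MAF too low
    SnpRecord("rsEXAMPLE2", 0.2000, False),  # hypothetical: not validated
]

selected = select_snps(records)
print([snp.rs_id for snp in selected])
```

The conservation criterion is not modelled here because it depends on the genomic alignment described in the next subsection.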
2.1 Prediction of variants in the conserved regions of the gene
The variants in the conserved regions of the gene were identified by Ensembl genome browser (http://www.ensembl.org/). Sixteen eutherian mammals: Cat (Felis catus), Chimpanzee (Pan troglodytes), Cow (Bos taurus), Dog (Canis lupus familiaris), Gorilla (Gorilla gorilla), Horse (Equus caballus), Macaque (Macaca mulatta), Marmot (Marmota), Mouse (Mus musculus), Olive baboon (Papio anubis), Orangutan (Pongo pygmaeus), Pig (Sus scrofa domesticus), Rat (Rattus norvegicus), Rabbit (Oryctolagus cuniculus), Sheep (Ovis aries) and Vervet-AGM (Chlorocebus pygerythrus) were selected for the genomic alignment of human SELP gene. A pairwise alignment of all the species was performed in UCSC browser.
2.2 Analysis of functional impact of coding nsSNPs
SIFT (Sorting Intolerant from Tolerant) predicts whether an amino acid substitution (AAS) changes protein function (http://sift.jcvi.org/). This web tool evaluates the deleterious or tolerated effect of SNPs using a sequence homology approach. It uses a multistep process that first searches for sequences that have similar function or are closely related, then builds multiple alignments of the selected sequences, and finally calculates the normalized probability for each AAS. SNPs with a SIFT score ≤ 0.05 are predicted to be deleterious and those with a score > 0.05 are predicted to be tolerated. We submitted the protein FASTA sequence and dbSNP IDs of the selected nsSNPs as the input file.
PolyPhen-2 (version 2.0.9) predicts the structural and functional consequences of an AAS (http://genetics.bwh.harvard.edu/pph2/). It uses specific empirical rules to determine the effect of nsSNPs on protein structure and function. It uses the Basic Local Alignment Search Tool (BLAST) to identify homologues of the input sequence and calculates position-specific independent count (PSIC) scores for each variant. The PSIC score ranges from 0 to 1, where 0 indicates a benign effect of the AAS, 1 indicates a probably damaging effect (more assertive prediction), and a score in between indicates a possibly damaging effect (less assertive prediction). SNP IDs were submitted as input queries for this tool.
SNPs&GO is a support vector machine (SVM) based method that predicts disease associated nsSNPs using protein functional annotation (http://snps.biofold.org/snps-and-go/). It generates the probability that each variant is associated with human disease. A probability score >0.5 predicts the nsSNP to be disease associated and a score <0.5 predicts a neutral effect. We submitted the protein FASTA sequence and variant IDs as input files.
nsSNPAnalyzer identifies disease associated nsSNPs using multiple sequence alignment and protein structure analysis (http://snpanalyzer.uthsc.edu/). Information regarding secondary structure, solvent accessibility and environmental polarity is also provided by this software . Protein sequence in FASTA format and a substitution file denoting the SNP identities were used as input data.
SNPs3D is a web database that provides the information regarding disease-gene relationship at the molecular level (http://www.SNPs3D.org). It provides three primary modules. The first module analyses the impact of nsSNPs on protein structure. The second module provides information regarding relationships between various candidate genes and the third module predicts the candidate genes that are involved in particular diseases. In the present study, the first module was selected, which uses SVM to designate each SNP as deleterious or non-deleterious to protein function. A negative SVM score represents a deleterious variant while a positive score indicates the non-deleterious variant . We submitted SELP gene ID as the input file.
MutPred is a web tool that classifies an AAS as deleterious/disease associated or neutral (http://mutpred.mutdb.org/about.html). In addition, it predicts the molecular basis of a deleterious/disease associated AAS by identifying changes in functional sites and structural features between the native and variant sequences. These changes are reported as probabilities of gain or loss of function or structure and can help explain the probable basis of the disease state. MutPred gives a deleterious mutation probability score for each AAS: a score <0.5 indicates a neutral effect and a score >0.5 indicates a deleterious effect. MutPred is also reported to provide improved classification accuracy over other human mutation prediction tools. The protein FASTA sequence and AAS were submitted as the input file.
I-Mutant (version 2.0) is a neural network based programme used to predict the change in protein stability caused by a single point mutation (http://folding.biofold.org/i-mutant/i-mutant2.0.html). The programme also calculates the free energy change (DDG). A negative DDG indicates a decrease in the stability of the protein structure and a positive DDG indicates an increase. The DDG value classifies the results into three categories, i.e. largely stable (DDG > 0.5 kcal/mol), largely unstable (DDG < -0.5 kcal/mol), or neutral (-0.5 ≤ DDG ≤ 0.5 kcal/mol). We submitted the protein FASTA sequence, the position of the AAS and the variant residue as the input.
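The score cut-offs quoted for the individual tools can be collected in one place, as in the minimal sketch below. It assumes the thresholds stated in this section (SIFT ≤ 0.05 deleterious; SNPs&GO and MutPred > 0.5 disease associated; negative SNPs3D SVM score deleterious) and a ternary I-Mutant classification with cut-offs at -0.5 and +0.5 kcal/mol. The function names are hypothetical, the handling of boundary values (a probability of exactly 0.5, an SVM score of exactly 0) is an assumption because the text does not define them, and PolyPhen-2 is left out because only the two end points of its scale are described.

```python
def classify_sift(score: float) -> str:
    """SIFT: score <= 0.05 is deleterious, otherwise tolerated."""
    return "deleterious" if score <= 0.05 else "tolerated"


def classify_probability(probability: float) -> str:
    """SNPs&GO or MutPred: probability > 0.5 is disease associated.

    Assumption: a value of exactly 0.5 is treated as neutral.
    """
    return "disease associated" if probability > 0.5 else "neutral"


def classify_snps3d(svm_score: float) -> str:
    """SNPs3D: negative SVM score is deleterious.

    Assumption: a score of exactly 0 is treated as non-deleterious.
    """
    return "deleterious" if svm_score < 0 else "non-deleterious"


def classify_ddg(ddg: float) -> str:
    """I-Mutant DDG in kcal/mol, three-class scheme."""
    if ddg < -0.5:
        return "largely unstable"
    if ddg > 0.5:
        return "largely stable"
    return "neutral"


def flagged_by(calls: dict) -> list:
    """Names of the tools whose call is deleterious or disease associated."""
    harmful = {"deleterious", "disease associated"}
    return sorted(tool for tool, call in calls.items() if call in harmful)


# Scores reported for rs6125 in the Results (first value where two are listed).
rs6125_calls = {
    "SIFT": classify_sift(0.38),
    "SNPs&GO": classify_probability(0.249),
    "MutPred": classify_probability(0.721),
    "SNPs3D": classify_snps3d(-0.46),
}
print(flagged_by(rs6125_calls))
```

A variant is carried forward in this sketch if `flagged_by` returns a non-empty list, which mirrors the "one or more tools" rule applied in the Results.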
2.3 Modeling the effect of deleterious variations on protein structure
The SWISS-MODEL template library was used for protein structure homology modeling (http://swissmodel.expasy.org/). SWISS-MODEL identifies closely related templates from which a complete structure of the protein can be built. It allows users to search for the template with the highest identity, to evaluate the sequence similarity among different templates and to cluster them in order to choose the best one for modeling. The P-selectin protein structure was searched in the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB). Furthermore, the FASTA sequence of the desired domain was submitted as the input file for SWISS-MODEL.
Protein BLAST is a protein sequence alignment tool. On submission of query sequence, it provides various protein sequences with different identities. It also allows users to align the query sequence with other manually submitted sequences. The templates with identities ranging from 30-80% were submitted along with the query sequence to select the best template for modeling of variant structure.
2.4 Analysis of functional impact of ncSNPs
SNPinfo is a set of web based tools used for the selection of functional SNPs (http://snpinfo.niehs.nih.gov/snpinfo/snpfunc.html). It analyzes SNPs on the basis of genome wide association studies (GWAS), linkage disequilibrium (LD) and functional characterization. It consists of three main pipelines, namely the genome pipeline, the linkage pipeline and functional SNP prediction. The FuncPred tool of the SNPinfo server is designed for functional SNP prediction. It predicts the effect of SNPs on specific functions including splicing regulation, transcription factor binding sites, miRNA binding sites etc. SNP IDs of the selected intronic SNPs in the Asian population were submitted as the input file.
RegulomeDB is a novel approach and database which guides interpretation of regulatory variants in the human genome (http://regulomedb.org). To identify regulatory potential of ncSNPs, RegulomeDB includes high-throughput, experimental data sets from Encyclopedia of DNA Elements (ENCODE) and other sources, as well as computational predictions and manual annotations. These data sources are combined into a powerful tool that scores variants to help separate functional variants from a large pool. Thus, it provides a small set of putative sites with testable hypotheses . RegulomeDB assigns a range of scores to variants to specify their potential (Supplementary table 1). Gene ID i.e. SELP was submitted as the input file for RegulomeDB.
The complete strategy adopted for prediction of functionally important SNPs is depicted in Figure 1.
|S. No.||SNP ID||Position||Region||MAF||Validation||cSNP|
|1.||rs1018828||169588496||3’ near gene||0.0919||✓||Y|
|121.||rs3917651||169630824||5’ near gene||0.0543||✓||Y|
|122.||rs3753306||169630994||5’ near gene||0.2592||✓||Y|
|123.||rs1800808||169630994||5’ near gene||0.1042||✓||Y|
|124.||rs1800805||169632043||5’ near gene||0.2296||✓||Y|
|125.||rs1800807||169632197||5’ near gene||0.4800||✓||Y|
|126.||rs3917647||169632102||5’ near gene||0.4635||✓||Y|
MAF- minor allele frequency; ✓-validated SNPs; cSNP - conserved SNPs; Y: lie in conserved region; I: Intron; E: Exon
There were 2780 active entries of SNPs for SELP available in the NCBI dbSNP database (as accessed on September 9, 2015). dbSNP was selected because it is the most extensive and reliable SNP database and provides the MAF of SNPs of SELP. Based on the criteria of validation status and MAF, 4 non-synonymous SNPs, one 3’ near gene SNP, six 5’ near gene SNPs and 112 intronic SNPs were selected and subjected to further analysis (Table 1). The synonymous SNPs were excluded from the present study, as these do not result in an amino acid change and are therefore not expected to affect protein function, solubility or stability. Furthermore, all the 123 selected SNPs were found to reside in the conserved regions of the SELP gene (Figure 2). The SNPs having MAF <0.05, with non-validated status or not located in conserved regions were excluded from the present analysis as well.
3.1 Identification of functionally important nsSNPs and their effect on protein stability
Three nsSNPs, i.e. rs6125, rs6127 and rs6133, were predicted to be deleterious or disease associated variants by one or more of the computational tools used. However, rs6131 was predicted to have a neutral/benign effect on protein structure and function by all the tools used. rs6125 was predicted to be possibly damaging, disease associated and deleterious with scores of 0.766 (PolyPhen-2), 0.721 (MutPred) and -0.46 (SNPs3D, data not shown), respectively. rs6127 was predicted to be a disease associated variant with a probability score of 0.515 (MutPred) (Table 2). Furthermore, the nsSNP rs6133 was found to be disease associated only by nsSNPAnalyzer (data not shown). These variants (rs6125, rs6127 and rs6133), predicted to be deleterious or disease associated by at least one tool, were subjected to I-Mutant analysis. All three variants were predicted to decrease the stability of the protein structure, as the predicted DDG was less than -0.5 kcal/mol (Table 3).
|SNP ID||Amino acid change||SIFT score||SIFT prediction||PolyPhen-2 score||PolyPhen-2 prediction||SNPs&GO score||SNPs&GO prediction||MutPred score||MutPred prediction|
|rs6125||V209M||0.38/0.12||Tolerated||0.766||Possibly damaging||0.249/0.074||neutral||0.721||Disease associated/ deleterious|
|rs6127||D603N||0.82/1.00||Tolerated||0.000||Benign||0.224/0.083||neutral||0.515||Disease associated/ deleterious|
*SIFT score: ≤0.05 predicted deleterious effect, >0.05 predicted tolerated effect; PSIC score (0-1): 0 indicated benign effect, 1 indicated probably damaging effect, values between 0 and 1 indicated possibly damaging effect; SNPs&GO probability score: >0.5 predicted disease association, <0.5 predicted neutral effect; MutPred probability score: >0.5 indicated disease associated/deleterious effect, <0.5 indicated neutral effect. *rs6125 was predicted deleterious/disease associated by PolyPhen-2, MutPred and SNPs3D (data not shown); rs6127 was predicted deleterious/disease associated by MutPred; rs6133 was predicted disease associated by nsSNPAnalyzer (data not shown).
|rs number||Position of AA change||Wild type residue||New residue||Stability||DDG(kcal/mol)|
*AA stands for amino acid; DDG (delta delta G) represents free energy change
The complete structure of the 830 amino acid P-selectin protein was not available at the time this manuscript was drafted. Only a 195 residue structure comprising the lectin/EGF domains was available, with PDB ID 1G1Q. However, most of the identified disease associated genetic variants of SELP were found to be located in the nine short consensus repeats (CRs). The SNP rs6125 (V209M) is located in the 1st CR (amino acids 195-259), rs6127 (D603N) in the 7th CR (570-631) and rs6133 (V640L) in the 8th CR (640-701). An effort was made to predict the complete structure of P-selectin using SWISS-MODEL. The protein sequences of all three CRs (query sequences) were submitted to SWISS-MODEL to identify the best templates for variant structure modeling. SWISS-MODEL provided three best matched templates with PDB IDs 4c16.1.A, 4c16.2.A and 4csy.2.A (E-selectin), with a sequence identity of 44.26% for all three query sequences (Supplementary table 2). To check the sequence identity near the positions of interest, the templates were aligned with the query sequences using Protein BLAST. In spite of the 44.26% sequence identity, none of the templates aligned with the regions of SELP covering the above mentioned functionally important variants. Hence, the predicted deleterious variations were not modeled in the protein structure.
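The mapping of variant positions to consensus repeats described above can be expressed as a small interval lookup. The sketch below uses only the three residue ranges quoted in this paragraph; the dictionary and function names are hypothetical, and ranges for the other six CRs are not given in the text, so positions outside the listed ranges simply return `None`.

```python
# Residue ranges quoted in the text for three of the nine consensus repeats.
CR_RANGES = {
    "CR1": (195, 259),
    "CR7": (570, 631),
    "CR8": (640, 701),
}

# Amino acid substitutions of the three prioritized nsSNPs: (rs ID, position).
VARIANTS = [("rs6125", 209), ("rs6127", 603), ("rs6133", 640)]


def locate_residue(position: int):
    """Return the name of the listed CR containing the residue, or None."""
    for name, (start, end) in CR_RANGES.items():
        if start <= position <= end:  # ranges are inclusive at both ends
            return name
    return None


for rs_id, position in VARIANTS:
    print(rs_id, position, locate_residue(position))
```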
3.2 Functional impact of intronic region, upstream regulatory region and downstream regulatory region SNPs
Out of 112 intronic SNPs, SNPinfo predicted 10 SNPs to affect transcription factor binding site (TFBS) activity and 2 SNPs to affect splicing. All seven 3’ and 5’ near gene SNPs were also found to affect TFBS activity (Table 4). However, miRNA binding site activity was not affected by these SNPs. On analysis of the 119 SNPs of the non-coding region by RegulomeDB, 79 SNPs were found to have a regulatory effect, with scores ranging from 1f to 6 (Table 5). Out of these 79 variants, rs2205895 was found most likely to affect transcription factor binding (TFB), with a RegulomeDB score of 1f. Moreover, four variants were found less likely to affect TFB, with a score of 3a. The other variations showed minimal evidence of an effect on TFBS. The five SNPs with the highest ranks, i.e. rs2205895, rs3917811, rs2235302, rs3917779 and rs3917739, were more likely to have a regulatory effect and were thus prioritized for further analysis.
|SNP ID||Allele||MAF||SNPinfo prediction: affecting TFBS activity||SNPinfo prediction: splicing (ESE or ESS)|
*MAF- minor allele frequency; TFBS- transcription factor binding site; ESE- exonic splicing enhancer; ESS- exonic splicing silencer
|SNP ID||Category||Description||RegulomeDB Score|
|rs2205895||Likely to affect binding and linked to expression of a gene target||eQTL + TF binding + DNase peak||1f|
|rs3917779||Less likely to affect binding||TF binding + any motif + DNase peak||3a|
|rs3917840||Minimal binding evidence||TF binding + DNase peak||4|
|rs3917819||Minimal binding evidence||TF binding or DNase peak||5|
|rs3917705||Minimal binding evidence||TF binding or DNase peak||5|
|rs3917832||Minimal binding evidence||Motif hit||6|
*eQTL- expression quantitative trait loci; TF-transcription factor
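The prioritization by RegulomeDB score can be illustrated by sorting the Table 5 entries so that lower scores (stronger regulatory evidence) come first. This is a minimal sketch: `regulome_sort_key` is a hypothetical helper, not part of RegulomeDB, and it assumes a score is a single digit optionally followed by one lowercase letter, as in the values shown in Table 5.

```python
import re


def regulome_sort_key(score: str):
    """Split a score such as '1f' or '4' into (number, letter) for sorting.

    Assumption: scores are one digit optionally followed by one lowercase letter.
    """
    match = re.fullmatch(r"(\d)([a-z]?)", score)
    if match is None:
        raise ValueError(f"unexpected RegulomeDB score: {score!r}")
    return int(match.group(1)), match.group(2)


# SNP ID -> RegulomeDB score, as listed in Table 5.
table5_scores = {
    "rs3917832": "6",
    "rs3917819": "5",
    "rs2205895": "1f",
    "rs3917840": "4",
    "rs3917705": "5",
    "rs3917779": "3a",
}

# Python's sort is stable, so SNPs with equal scores keep their input order.
ranked = sorted(table5_scores, key=lambda rs: regulome_sort_key(table5_scores[rs]))
print(ranked)
```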
All the functionally important SNPs in SELP including 3 nsSNPs and 24 regulatory SNPs located in distinct regions of SELP are shown in Figure 3.
The present study was designed to identify the functionally important genetic variants in the human SELP gene using various computational tools. Different computational tools use different algorithms to predict the results. In terms of sensitivity and specificity, the results vary considerably between algorithms, with different methods achieving high scores for each, so it is difficult to select one method as the best. There are several reports comparing various in silico tools. According to Thusberg and Vihinen (2009), SIFT and PolyPhen performed better than the other tools tested in identifying deleterious nsSNPs. The accuracy of SIFT and PolyPhen 2.0 was further validated by Hicks et al. (2011), which makes these tools more applicable for prediction. On the other hand, Simon Williams (2012) reported that MutPred, SNPs&GO and one other algorithm produce the optimum predictions. Furthermore, I-Mutant is ranked as one of the most reliable predictors based on the work of Khan and Vihinen (2010). Therefore, the present SNP selection was based on the overall analysis of these SNPs by various tools. Similarly, on the basis of the available literature, we selected SNPinfo and RegulomeDB for the analysis of ncSNPs [43, 44, 50].
In the present approach, we used stringent criteria to select genetic variants in order to avoid errors in the outcome. The first criterion was validation status, which provides evidence that a variant actually exists in the gene. The second criterion was MAF, to ensure that a variant is frequent in a given population. In large scale projects like HapMap, MAF ≥0.05 was set as the standard for the selection of genetic variants. Accordingly, MAF ≥0.05 was used to prioritize the validated SNPs. The third criterion was location in an evolutionarily conserved region, as such variants can have crucial effects on protein structure and function.
A total of 27 SNPs (3 non-synonymous and 24 non-coding) were found to have putative functional importance. Out of these, 11 SNPs have been previously studied in association with various diseases and P-selectin levels. The nsSNP rs6125 (V209M) is located in exon 5 of SELP, near the epidermal growth factor-like domain, which is important for specificity and ligand binding. The prevalence of its variant 209M is reported to be high in ventricular fibrillation patients. Furthermore, the NHLBI GO Exome Sequencing Project (ESP) predicted rs6125 to be probably damaging for protein structure as well as function, which is in consonance with the findings of the present study (http://evs.gs.washington.edu/EVS/). Another nsSNP, rs6127 (D603N), is reported to be associated with recurrent spontaneous abortions, myocardial infarction and increased risk of albuminuria in multiple studies [9, 53]. It is located within the CR domain (exon 11) of the P-selectin protein, which is shown to be important for the binding of P-selectin to its ligand. It may therefore be assumed that variation at position 603 of P-selectin results in a protein that can affect the recruitment of leukocytes to the endothelium. This hypothesis would support a contribution of the leukocyte/endothelium interaction mechanism to coronary heart disease. Furthermore, rs6127 was found to be associated with increased thrombosis risk in antiphospholipid syndrome patients. The third functionally important nsSNP, rs6133 (V640L), is located in a region near the transmembrane domain (exon 12) of P-selectin that may have a functional role in P-selectin/leukocyte interaction, and the amino acid substitution at position 640 is predicted to be functionally deleterious. In a previous study, this variation was found to be associated with low soluble P-selectin levels in European-Americans and African-Americans.
A genome-wide linkage study showed the association of two distinct regions of SELP with systemic lupus erythematosus (SLE) in UK and USA populations. In the UK SLE families, three haplotype blocks were defined in SELP. The first haplotype block comprised 21.4 kb covering the promoter region, the EGF and lectin-like domains and CR regions 1-3. Block 2 covered a region of 14.6 kb that includes the rest of the CR region. The third block, 11.2 kb in length, covered the transmembrane domain and the 3’UTR region. A risk haplotype, tagged by the C allele of rs3753306, was located in the first haplotype block. The second block had a protective haplotype, tagged by the T allele of rs6133. The association study was then carried out in US SLE families to replicate the UK SELP associations. These results revealed the association of rs3917657 and rs6131 in the US samples. The over-transmitted alleles of these two SNPs, i.e. C and G, were also carried on the UK risk haplotype. The combined UK-US data set showed a stronger association of rs3917657 and rs3753306 than the individual data sets of the two groups of SLE families. Furthermore, the promoter variant rs3753306 was found to affect transcription factor binding activity. The C allele of rs3753306 was shown to disrupt a trans-activating transcription factor binding site, limiting the function of P-selectin. It is proposed that this allele may reduce the recruitment of pro-inflammatory leukocytes by decreasing the production of P-selectin.
Other studies showed the association of the variants rs1800805 (-1969G/A), rs1800807 (-2123C/G) and rs1800808 (-1817T/C) with increased risk of coronary heart disease (CHD) and MI among Asians and Caucasians; in contrast, this pattern was reversed in Africans. This finding further asserts the important role of ethnicity in susceptibility to various diseases [56, 57]. Furthermore, the variant rs1800807, located within a putative transcription factor binding site for c-Ets-1, is reported to be associated with higher soluble P-selectin levels. The intronic variant rs2235302 is reported to be associated with higher P-selectin levels and increased thickness of the carotid intima media. Another genomic study also revealed a significant association of rs2235302 with higher sP-selectin levels. The variant rs732314 is reported to be associated with a propensity to low HDL cholesterol and coronary heart disease. Another variant, rs3917779, located in the 10th intron at the binding site of CCCTC-binding factor (CTCF), is reported to be associated with proliferative diabetic retinopathy in Iran. CTCF is involved in transcription regulation in various ways, including activation or repression of selective promoters, blocking of enhancers, hormone responsive silencing, alternative silencing and genomic imprinting [60-61]. The study proposed that the TT genotype of rs3917779 may affect transcription by abolishing the CTCF binding site. Thus, these studies serve as strong evidence for the contribution of P-selectin variants to the risk of various disease conditions.
The in silico analysis revealed 27 functionally important SNPs in the human SELP gene. Out of these, 11 SNPs have been reported to be associated with various inflammatory diseases in previous studies, which supports the findings of the present analysis. However, the remaining 16 SNPs (rs1018828, rs3917651, rs3917647, rs3917655, rs3917802, rs3917803, rs3917824, rs3917840, rs3917843, rs3917848, rs3917853, rs3917854, rs3917855, rs2205895, rs3917811 and rs3917739) have not yet been studied and need to be thoroughly investigated. This will help researchers to focus on experimental validation of these SNPs in various inflammatory disease conditions.
The work was supported by financial assistance under INSPIRE fellowship programme (IF-130841) by Department of Science and Technology, New Delhi.
Conflict of interest: Authors state no conflict of interest.
List of Abbreviations
- SNP: Single nucleotide polymorphism
- EGF: Epidermal growth factor
- PSGL-1: P-selectin glycoprotein ligand-1
- MAF: Minor allele frequency
- SIFT: Sorting Intolerant from Tolerant
- PSIC: Position Specific Independent Count
- SVM: Support vector machine
- AAS: Amino acid substitutions
- RCSB PDB: Research Collaboratory for Structural Bioinformatics Protein Data Bank
- GWAS: Genome Wide Association Studies
- ENCODE: Encyclopedia of DNA Elements
- TFBS: Transcription factor binding site
- CHD: Coronary heart disease
- SLE: Systemic lupus erythematosus
 Stenberg P.E., McEver R.P., Shuman M.A., Jacques Y.V., and Bainton D.F., A platelet alpha-granule membrane protein (GMP-140) is expressed on the plasma membrane after activation, J. Cell Biol., 1985, 101, 880-88610.1083/jcb.101.3.880Search in Google Scholar
 McEver R.P., Beckstead J.H., Moore K.L., Marshall-Carlson L., and Bainton D.F., GMP-140, a platelet alpha-granule membrane protein, is also synthesized by vascular endothelial cells and is localized in Weibel–Palade bodies, J. Clin. Invest., 1989, 84, 92-9910.1172/JCI114175Search in Google Scholar
 Collins T., Williams A., Johnston G.I., Kim J., Eddy R., Shows T., et al., Structure and chromosomal location of the gene for endothelial-leukocyte adhesion molecule, J. Biol. Chem., 1991, 266, 2466-247310.1016/S0021-9258(18)52267-5Search in Google Scholar
 Tregouet D.A., Barbaux S., Escolano S., Tahri N., Golmard J.L., Tiret L., et al., Specific haplotypes of the P-selectin gene are associated with myocardial infarction, Hum. Mol. Gen., 2002, 11, 2015-202310.1093/hmg/11.17.2015Search in Google Scholar
 Tregouet, D.A., Barbaux S., Poirier O., Blankenberg S., Bickel C., Escolano S., et al., SELPLG Gene Polymorphisms in Relation to Plasma SELPLG Levels and Coronary Artery Disease, Ann Hum Genet, 2003, 67, 504-51110.1046/j.1529-8817.2003.00053.xSearch in Google Scholar
 Aref S., Sakrana M., Hafez A.A. and Hamdy M., Soluble P-selectin levels in diabetes mellitus patients with coronary artery disease, Hematology, 2005, 10, 183-18710.1080/10245330500072405Search in Google Scholar
 LiuY., Burdon K.P., Langefeld C.D., Beck S.R., Wagenknecht L.E., Rich S.S., et al., P-selectin gene haplotype associations with albuminuria in the Diabetes Heart Study, Kidney Int, 2005, 68, 741-74610.1111/j.1523-1755.2005.00452.xSearch in Google Scholar
 Berardi C., Larson N.B., Decker P.A., Wassel C.L., Kirsch P.S., Pankow J.S., et al., Multiethnic analysis reveals soluble L-selectin may be post-transcriptionally regulated by 3’UTR polymorphism: the MultiEthnic Study of Atherosclerosis (MESA), Hum Genet., 2015, 134, 393-40310.1007/s00439-014-1527-0Search in Google Scholar
 Raman K., Chong M., Akhtar D.G.G., D’Mello M., Hasso R., Ross S., et al., Genetic markers of inflammation and their role in cardiovascular disease, Can J Cardiol., 2013, 29, 67-7410.1016/j.cjca.2012.06.025Search in Google Scholar
 Miner J.J., Xia L., Yago T., Kappelmayer J., Liu Z., Klopocki, A.G, et al., Separable requirements for cytoplasmic domain of PSGL-1 in leukocyte rolling and signaling under flow, Blood, 2008, 112, 2035-204510.1182/blood-2008-04-149468Search in Google Scholar
 Jira P.E., Wanders R.J., Smeitink J.A., Jong D.J., Wevers R.A., Oostheim W., et al., Novel mutations in the 7-dehydrocho-lesterol reductase gene of 13 patients with Smith--Lemli--Opitz syndrome, Ann Hum Genet., 2001, 65, 229-23610.1046/j.1469-1809.2001.6530229.xSearch in Google Scholar
 Cohen J., Pertsemlidis A., Kotowski I.K., Graham R., Garcia C.K., and Hobbs H.H., Low LDL cholesterol in individuals of African descent resulting from frequent nonsense mutations in PCSK9, Nat. Genet., 2005, 37, 161-16510.1038/ng1509Search in Google Scholar
 Lek M., Karczewski K.J., Minikel E.V., Samocha K.E., Banks E., Fennell T., et al., Analysis of protein coding genetic variation in 60,706 humans, Nature, 2016, 536, 285-291. DOI: 10.1038/nature19057
 Chen R., Shi L., Hakenberg J., Naughton B., Sklar P., Zhang J., et al., Analysis of 589,306 genomes identifies individuals resilient to severe Mendelian childhood diseases, Nat. Biotechnol., (in press), DOI: 10.1038/nbt.3514
 Narasimhan V.M., Hunt K.A., Mason D., Baker C.L., Karczewski K.J., Barnes M.R., et al., Health and population effects of rare gene knockouts in adult humans with related parents, Science, 2016, 352, 474-477. DOI: 10.1126/science.aac8624
 Elmas E., Bugert P., Popp T., Lang S., Weiss C., Behnes M., et al., The P-Selectin Gene Polymorphism Val168Met: A Novel Risk Marker for the Occurrence of Primary Ventricular Fibrillation During Acute Myocardial Infarction, J. Cardiovasc. Electrophysiol., 2010, 21, 1260-1265. DOI: 10.1111/j.1540-8167.2010.01833.x
 Sivapalaratnam S., Motazacker M.M., Maiwald S., Hovingh G.K., Kastelein J.J., Levi M., et al., Genome-wide association studies in atherosclerosis, Curr Atheroscler Rep, 2011, 13, 225-232. DOI: 10.1007/s11883-011-0173-4
 Barbaux S.C., Blankenberg S., Rupprecht H.J., Francomme C., Bickel C., Hafner G., et al., Association between P-selectin gene polymorphisms and soluble P-selectin levels and their relation to coronary artery disease, Arterioscler Thromb Vasc Biol, 2001, 21, 1668-1673. DOI: 10.1161/hq1001.097022
 Manka D., Collins R.G., Ley K., Beaudet A.L. and Sarembock I.J., Absence of P-selectin, but not intercellular adhesion molecule-1, attenuates neointimal growth after arterial injury in apolipoprotein E-deficient mice, Circulation, 2001, 103, 1000-1005. DOI: 10.1161/01.CIR.103.7.1000
 Volcik K.A., Ballantyne C.M., Coresh J., Folsom A.R., and Boerwinkle E., Specific P-Selectin and P-Selectin Glycoprotein Ligand-1 Genotypes/Haplotypes are Associated with Risk of Incident CHD and Ischemic Stroke: The Atherosclerosis Risk in Communities (ARIC) Study, Atherosclerosis, 2007, 195, e76-e82. DOI: 10.1016/j.atherosclerosis.2007.03.007
 Jacobin V.M.J., Deramchia K., Mornet S., Hagemeyer C.E., Bonetto S., Robert R., et al., MRI of inducible P-selectin expression in human activated platelets involved in the early stages of atherosclerosis, NMR Biomed., 2011, 24, 413-424. DOI: 10.1002/nbm.1606
 Burkhardt J., Blume M., Teixeira E.P., Teixeira H.V., Steiner A., Quente E., et al., Cellular Adhesion Gene SELP Is Associated with Rheumatoid Arthritis and Displays Differential Allelic Expression, PLoS One, 2014, 9, e103872. DOI: 10.1371/journal.pone.0103872
 Morris D.L., Graham R.R., Erwig L.P., Gaffney P.M., Moser K.L., Behrens T.W., et al., Variation in the upstream region of P-Selectin (SELP) is a risk factor for SLE, Genes Immun, 2009, 10, 404-413. DOI: 10.1038/gene.2009.17
 Barbalic M., Dupuis J., Dehghan A., Bis J.C., Hoogeveen R.C., Schnabel R.B., et al., Large-scale genomic studies reveal central role of ABO in sP-selectin and sICAM-1 levels, Hum Mol Genet, 2010, 19, 1863-1872. DOI: 10.1093/hmg/ddq061
 Krawczak M., Ball E.V., Fenton I., Stenson P.D., Abeysinghe S., Thomas N., et al., Human gene mutation database - a biomedical information and research resource, Hum Mutat, 2000, 15, 45-51. DOI: 10.1002/(SICI)1098-1004(200001)15:1<45::AID-HUMU10>3.0.CO;2-T
 Prokunina L., and Riquelme M.E.A., Regulatory SNPs in Complex Diseases: Their Identification and Functional Validation, Expert Rev. Mol. Med., 2004, 6, 1-15. DOI: 10.1017/S1462399404007690
 Ng P.C. and Henikoff S., Predicting the effects of amino acid substitutions on protein function, Annu. Rev. Genomics Hum Genet, 2006, 7, 61-80. DOI: 10.1146/annurev.genom.7.080505.115630
 Capriotti E., Calabrese R., Fariselli P., Martelli P.L., Altman R.B., and Casadio R., WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation, BMC Genomics, 2012, 3, 1471-2164. DOI: 10.1186/1471-2164-14-S3-S6
 Bao L., Zhou M. and Cui Y., nsSNPAnalyzer: identifying disease-associated nonsynonymous single nucleotide polymorphisms, Nucleic Acids Res., 2005, 33, W480-W482. DOI: 10.1093/nar/gki372
 Mort M., Sterne-Weiler T., Li B., Ball E.V., Cooper D.N., Radivojac P., et al., MutPred Splice: machine learning-based prediction of exonic variants that disrupt splicing, Genome Biol., 2014, 15, R19. DOI: 10.1186/gb-2014-15-1-r19
 Li B., Krishnan V.G., Mort M.E., Xin F., Kamati K.K., Cooper D.N., et al., Automated inference of molecular mechanisms of disease from amino acid substitutions, Bioinformatics, 2009, 25, 2744-2750. DOI: 10.1093/bioinformatics/btp528
 Capriotti E., Fariselli P., and Casadio R., I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure, Nucleic Acids Res., 2005, 33, W306-W310. DOI: 10.1093/nar/gki375
 Xu Z., and Taylor J.A., SNPinfo: integrating GWAS and candidate gene information into functional SNP selection for genetic association studies, Nucleic Acids Res., 2009, 37, W600-W605. DOI: 10.1093/nar/gkp290
 Boyle A.P., Hong E.L., Hariharan M., Cheng Y., Schaub M.A., Kasowski M., et al., Annotation of functional variation in personal genomes using RegulomeDB, Genome Res, 2012, 22, 1790-1797. DOI: 10.1101/gr.137323.112
 Thusberg J., Vihinen M., Pathogenic or not? And if so, then how? Studying the effects of missense mutations using bioinformatics methods, Hum. Mutat., 2009, 30, 703-714. DOI: 10.1002/humu.20938
 Hicks S., et al., Prediction of missense mutation functionality depends on both the algorithm and sequence alignment employed, Hum. Mutat., 2011, 32, 661-668. DOI: 10.1002/humu.21490
 Williams S., Analysis of in silico tools for evaluating missense variants, summary report, 2012, National Genetic Reference Laboratory, Manchester
 Musemeci L., Arthur J.W., Cheung F.S.G., Hoque A., Lippman S., and Reichardt J.K., Single Nucleotide Differences (SNDs) in the dbSNP Database May Lead to Errors in Genotyping and Haplotyping Studies, Hum. Mutat., 2010, 31, 67-73. DOI: 10.1002/humu.21137
 Nelson M.R., Marnellos G., Kammerer S., Hoyal C.R., Shi M.M., Cantor C.R., et al., Large-Scale Validation of Single Nucleotide Polymorphisms in Gene Regions, Genome Res., 2004, 14, 1664-1668. DOI: 10.1101/gr.2421604
 Dendana M., Hizem S., Magddoud K., Messaoudi S., Zammiti W., Nouira M., et al., Common polymorphisms in the P-selectin gene in women with recurrent spontaneous abortions, Gene, 2012, 495, 72-75. DOI: 10.1016/j.gene.2011.11.034
 Kanitez N.A., Hancer V.S., Erer B., Kamali S., Inanç M., and Küçükkaya R.D., The association between P-selectin polymorphisms and thrombosis in antiphospholipid syndrome: a pilot study, Ann Rheum Dis, 2013, 72, 789. DOI: 10.1136/annrheumdis-2013-eular.2334
 Reiner A.P., Carlson C.S., Thyagarajan B., Rieder M.J., Polak J.F., Siscovick D.S., et al., Soluble P-Selectin, SELP Polymorphisms, and Atherosclerotic Risk in European-American and African-American Young Adults, Arterioscler Thromb Vasc Biol., 2008, 28, 1549-1555. DOI: 10.1161/ATVBAHA.108.169532
 Zhou D.H., Wang Y., Hu W.N., Wang L.J., Wang Q., Chi M., et al., SELP genetic polymorphisms may contribute to the pathogenesis of coronary heart disease and myocardial infarction: a meta-analysis, Mol Biol Rep, 2014, 41, 3369-3380. DOI: 10.1007/s11033-014-3199-1
 Peloso G.M., Demissie S., Collins D., Mirel D.B., Gabriel S.B., Cupples L.A., et al., Common genetic variation in multiple metabolic pathways influences susceptibility to low HDL-cholesterol and coronary heart disease, J Lipid Res., 2010, 51, 3524-3532. DOI: 10.1194/jlr.P008268
 Kolahdouz P., Farashahi Y.E., Tajamolian M., Manaviat M.R., and Sheikhha M.H., The rs3917779 polymorphism of P-selectin’s significant association with proliferative diabetic retinopathy in Yazd, Iran, Graefes Arch Clin Exp Ophthalmol., 2015, 253, 1967-1972. DOI: 10.1007/s00417-015-3141-9
 Shukla S., Kavak E., Gregory M., Imashimizu M., Shutinoski B., Kashlev M., et al., CTCF-promoted RNA polymerase II pausing links DNA methylation to splicing, Nature, 2011, 479, 74-79. DOI: 10.1038/nature10442
© 2017 Raminderjit Kaur et al.
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.