Detection rate of IGF-1 variants and their implication to protein binding: study of over 240,000 patients

Objectives: To determine the detection rate of IGF-1 variants in a clinical population and assess their implications. Methods: IGF-1 variants were detected based on their predicted mass-to-charge ratios. Most variants were distinguished by their isotopic distribution and relative retention times. A67T and A70T were distinguished with MS/MS. Patient specimens with a detected variant were de-identified for DNA sequencing to confirm the polymorphism. Results: Of the 243,808 patients screened, 1,099 patients carrying IGF-1 variants were identified (0.45 %, or 4,508 occurrences per million). Seven patients were identified as homozygous or double heterozygous. The majority of variants (98 %) had amino acid substitutions located at the C-terminus (A62T, P66A, A67S, A67V, A67T, A70T). Isobaric variants A38V and A67V were detected more frequently in children than in adults. Six previously unreported variants were identified: Y31H, S33P, T41I, R50Q, R56K, and A62T. Compared with the overall population, the z-score distribution of patients with IGF-1 variants was shifted toward negative levels (median z-score −1.4); however, it resembled the overall population when corrected for heterozygosity. The chromatographic peak area of some variants differed from that of the WT IGF-1 present in the same patient. Conclusions: In the IGF-1 test reports by LC-MS, the concentrations only account for half the total IGF-1 for patients with heterozygous IGF-1 variants. An IGF-1 variant may change the binding to its receptor and/or its binding proteins, affecting its activity and half-life in circulation. Variants located in or close to the C-domain may be pathogenic. Cross-species sequence comparison indicates that A38V and A70T may have some degree of pathogenicity.


Introduction
Growth hormone (GH) and insulin-like growth factor 1 (IGF-1) are vital for normal physical growth in children and stimulate growth, cell reproduction, and cell regeneration in tissues and organs throughout life. The effect of IGF-1 levels on aging is not fully understood and requires further study [1]; however, some reports suggest that low IGF-1 levels are associated with prolonged life expectancy [2,3]. IGF-1 levels in blood are more stable than levels of GH. IGF-1 mediates many growth-promoting effects and some metabolic effects of GH [4][5][6]. Consequently, measurement of IGF-1 is a useful screening tool to assist in the diagnosis of growth hormone deficiency (GHD) or excess when coupled with provocative testing. The use of low IGF-1 alone to identify children and young adults with GHD has limited sensitivity and specificity. In children 0-10 years old the sensitivity was 53.3 % and the specificity was 97.9 %. In individuals 10-20 years old the sensitivity was 73.9 % and the specificity was 67.0 % [7]. Measurement of IGF-1 is also useful to monitor the effectiveness of GH treatment [8][9][10][11]. IGF-1 immunoassays have been used for decades but are prone to interference from IGF binding proteins (IGFBPs) and lack standardization across immunoassay platforms [12][13][14][15]. In contrast, the measurement of IGF-1 using liquid chromatography-mass spectrometry (LC-MS) [16,17] has demonstrated consistency and adherence to key consensus recommendations [18,19].
Our laboratory developed a high-resolution, accurate-mass, top-down LC-HRMS assay for IGF-1 quantitation in 2011 and has provided clinical testing with this assay since 2012 [16]. The method combines molecular specificity, quantitative performance, recombinant reference material for IGF-1 (traceable to WHO 02/254), and detailed age- and sex-specific reference intervals [17]. Because the method measures the intact IGF-1 protein, it can also reveal the presence of an amino acid substitution in it.
Several studies have reported detection of IGF-1 variants with MS-based assays. Hines et al. reported a polymorphic IGF-1 variant using 1,720 samples [20]. Maus et al. identified IGF-1 variants using a center-of-mass (COM) calculation with verification by tandem mass spectrometry (MS/MS) and DNA sequencing, resulting in an IGF-1 variant positive rate of 0.57 % based on 146,620 patient samples [21]. In the quantitation study by Oran et al., approximately 1 % of 1,054 samples were identified as carrying an A67T variant, an assignment made primarily because of its high detection rate in the single nucleotide polymorphism (SNP) database. The findings were not confirmed by DNA sequencing [22].
In our previous work [28,29], isotopic peak index (IPi), relative retention time (rRT), and MS/MS were used to identify and characterize the variants, and DNA sequencing was used to verify the known variant assignments and determine novel variants. In addition to the most common variants, A67T and A70T, we were able to identify four additional variants represented in the Exome Aggregation Consortium (ExAC) database (P66A, A67S, S34N, A38V) and two previously reported variants (V44M and A67V), and to discover six previously unreported variants (Y31H, S33P, R50Q, R56K, T41I, and A62T).
In this report, we utilized our validated approach [29] to screen a large set of patient samples submitted to our laboratory for IGF-1 measurement. The purpose of this study is to determine the detection rate and identities of IGF-1 variants and provide the context for further studies regarding their clinical significance. We also reviewed relevant literature related to the binding of IGF-1 protein to its receptor and the potential clinical implications of the presence of these variants.

Subjects and samples
Patient specimens were submitted to Quest Diagnostics for quantitation of intact IGF-1 by LC-MS (test code 16293). The IRB sponsor protocol number is BR13-002 and the IRB protocol number is 20121940. The analysis of discarded patient specimens was judged exempt by the Western Institutional Review Board. In total, 243,808 unique patients were screened for the presence of IGF-1 variants. In subsequent sections of this manuscript, testing numbers and variant detection rates are calculated from the numbers of unique individuals tested within the period.

The workflow
The list of IGF-1 variants to monitor was built based on the ExAC database of polymorphisms (accessed on 2/6/2019), clinical reports, and new IGF-1 variants identified by our laboratory [29]. Although the S35C variant was not originally in the list of monitored variants, our method would be able to detect it if present in the patient specimen [27].
The sample preparation method was previously described by Bystrom et al. [16] and modified in our prior work [29]. Intact, unmodified IGF-1 was extracted with acidified ethanol, and the neutralized supernatant was subjected to HPLC separation and HRMS detection. LC-MS data were acquired, analyzed, and reported by Thermo TraceFinder 5.0 Clinical with an in-house developed method [29]. WT IGF-1 and variant levels were calculated using their area under the curve (AUC) from LC-HRMS data and the calibration curve of the WT IGF-1 [30].

Statistical analysis
Tests of statistical significance were performed, and boxplots and graphs were generated, using R (v4.2.1; R Core Team 2021).

Detection rate of IGF-1 variants
In total, 1,099 patients were identified as having a single amino acid substitution. In these patients, LC-MS data show two chromatographic peaks, one for the WT protein and one for a variant. Of patients suspected to have either A67T or A70T variants, 728 had enough residual volume to perform MS/MS analysis, which yielded 396 patients (54 %) with A67T and 332 (46 %) with A70T variants. We extrapolated these percentages when estimating and apportioning the relative distribution of A67T and A70T in the overall population and age analysis (Tables 1 and 2). Seven patients were identified as homozygous or double heterozygous in this cohort. In both cases, the WT IGF-1 peak was below the detection limit.
In homozygous variant patients, the only observed peak was the variant, and in double heterozygous patients two peaks were observed, each for a different variant protein (an example of a double heterozygous patient and a typical patient with an IGF-1 variant are presented in Supplementary Material, Figures S1 and S2, respectively).
The overall detection rate identified in this study is 0.45 % (or 4,508 OPM, occurrences per million). Of the variants, 95 % were identified as having substitutions in C-terminal amino acids, which is the same value found in the ExAC database.
Age data were available for 243,343 of the patients tested, including 1,096 who had an IGF-1 variant. A histogram of IGF-1 variant occurrences vs. patient age (Figure 1A) shows two peaks at age ranges 10-15 years old and 45-50 years old. This age distribution closely resembles that of the overall patient population submitted for the IGF-1 test, where two peaks at ages 10-15 and 55-60 were also present (Supplementary Figure S3). When occurrences are converted to detection rate (by dividing by the total number of patients of that age group tested), the detection rate is similar across age groups, peaking at ages 15-20 and staying relatively unchanged towards the older ages (Figure 1B).
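The arithmetic behind the reported rates can be reproduced in a few lines. The following is a minimal sketch that uses only the counts stated above (1,099 of 243,808 patients, and the MS/MS-resolved subset of 728 patients); the function and variable names are illustrative and are not part of the laboratory's software.

```python
def occurrences_per_million(n_identified: int, n_analyzed: int) -> float:
    """Detection rate expressed as occurrences per million (OPM)."""
    return n_identified * 1_000_000 / n_analyzed


# Counts reported in this study.
n_screened = 243_808
n_variant = 1_099

rate_percent = 100 * n_variant / n_screened
rate_opm = occurrences_per_million(n_variant, n_screened)

# MS/MS-resolved subset used to apportion A67T vs. A70T.
n_msms = 728
n_a67t = 396
n_a70t = 332
frac_a67t = n_a67t / n_msms
frac_a70t = n_a70t / n_msms

print(f"Detection rate: {rate_percent:.2f} % ({rate_opm:.0f} OPM)")
print(f"A67T: {frac_a67t:.0%}, A70T: {frac_a70t:.0%}")
```

The two fractions are what would be applied to patients whose specimens lacked sufficient volume for MS/MS when apportioning them between A67T and A70T.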
Isobaric A38V (7 patients) and A67V (21 patients) variants were distinguished by their rRT: 0.00 ± 0.01 min for A38V and 0.04 ± 0.01 min for A67V. Both variants occurred more frequently in children than in adults (Table 2). We also performed an age analysis for A67T (394 patients) and A70T (332 patients) variants that were distinguished by their MS/MS spectra. The detection rate of A70T was 1.3-fold higher in adults than in pediatric patients (Table 2). Other variants detected in this study are presented in Table 1.
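As an illustration of how two isobaric variants might be separated by relative retention time, the sketch below assigns a measured rRT to the nearest expected value. The expected rRT values come from the text; the acceptance tolerance of 0.015 min and the function name are hypothetical choices made for this example and do not describe the validated method in [29].

```python
from typing import Optional

# Expected relative retention times (min) reported for the two isobaric variants.
EXPECTED_RRT_MIN = {"A38V": 0.00, "A67V": 0.04}

# Hypothetical acceptance window (min); chosen so the two windows do not overlap.
TOLERANCE_MIN = 0.015


def assign_isobaric_variant(rrt_min: float) -> Optional[str]:
    """Return the variant whose expected rRT is closest, or None if out of tolerance."""
    name, expected = min(
        EXPECTED_RRT_MIN.items(), key=lambda item: abs(rrt_min - item[1])
    )
    if abs(rrt_min - expected) <= TOLERANCE_MIN:
        return name
    return None  # ambiguous or unexpected retention time: needs manual review


for measured in (0.005, 0.035, 0.10):
    print(measured, assign_isobaric_variant(measured))
```

A measurement that falls between the two windows, or far from both, is deliberately left unassigned rather than forced into a class.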

AUC of variants compared to the WT
In general, the ratio of area under the curve (AUC) of a variant to that of the WT varies between patients (Figure 2). In some patients the AUC of a variant was up to 30 % higher (V44M) or lower (R50Q) than that of the WT. The ratio of a variant's AUC to that of the WT was tested with a one-sample t-test under the null hypothesis that it equals one. The null hypothesis was rejected for three variants: P66A (ratio=1.22, p-value=0.032), A67T (ratio=0.94, p-value=0.038), and A70T (ratio=1.06, p-value=0.005). Interestingly, the difference between the ratios for A67T and A70T is also statistically significant (two-sample t-test, mean ratio for A67T=0.94, mean ratio for A70T=1.06, p-value=0.0008).
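The one-sample test described above can be sketched as follows. The ratio values in this example are hypothetical and serve only to show the calculation; the study itself used R. The sketch computes the t statistic and degrees of freedom with the Python standard library; the p-value would then be read from Student's t distribution (for example with `scipy.stats.ttest_1samp`, if SciPy is available).

```python
import math
import statistics
from typing import Sequence, Tuple


def one_sample_t(values: Sequence[float], mu0: float = 1.0) -> Tuple[float, int]:
    """Return (t statistic, degrees of freedom) for H0: mean(values) == mu0."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    t_stat = (mean - mu0) / (sd / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical variant/WT AUC ratios for one variant (not study data).
ratios = [1.10, 0.98, 1.05, 1.12, 1.01, 1.08]
t_stat, dof = one_sample_t(ratios, mu0=1.0)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

A ratio of one corresponds to equal peak areas for the variant and the WT protein in the same specimen, which is why it serves as the null value.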

Z-score bias
The distribution of the z-scores for the overall tested clinical population, including patients with variants (both age and gender data available for n=239,630), and the population with a detected heterozygous IGF-1 variant (both age and gender data available for n=1,086) is presented in Figure 3. The median z-score calculated for the population with a detected IGF-1 variant (Figure 3, left boxplot) was significantly lower (two-sample t-test with equal variance, p-value<2.2e-16) than that of the overall tested patient distribution (Figure 3, middle boxplot). If the total IGF-1 concentration of the heterozygous patient is calculated as the WT concentration multiplied by two (a rough estimate for both WT and a variant form), the z-score distribution (Figure 3, right boxplot) would be close to the overall tested patient distribution. The distribution of z-scores in males and females in the overall clinical population is similar (Supplementary Figure S4); however, there is a small, statistically significant difference between the two (males z-score mean −0.03, females z-score mean −0.11, p-value<2.2e-16). No difference (p-value=0.60) between z-scores of males and females was observed in patients with IGF-1 variants (Supplementary Figure S5).

IGF-1 variant rate of detection
The IGF-1 variant detection rate in the ExAC database of the general population is 0.21 % (2,079 OPM), compared to 0.45 % identified in this study. A possible explanation for this difference is that a patient for whom IGF-1 testing is requested is more likely to have presented to a physician with a growth or endocrine disorder. As a complete inventory of IGF-1 variants is likely yet to be assembled, we expect that our targeted approach remains an underestimate of the true variant detection rate in the clinical population.
The observed distribution of variants with age (Figure 1) suggests that the first peak in testing may be associated with studies initiated by physicians to assess growth in children and young adults, whereas the second may be related to investigations of the aging process and/or clinical investigation of GH deficiency or excess in adults.

IGF-1 protein complexes and receptor binding
In order for IGF-1 to bind to its receptor (IGF-1R), it must be released into the circulation or target tissue from its protein complex. IGF-1 forms two protein complexes in the blood: a binary complex with one of IGFBP-1, -2, -4, or -6 (10-15 %), or a ternary complex (80-90 %) with either IGFBP-3 or -5 and acid-labile subunit (ALS) [31][32][33]. The fraction of free IGF-1 in the blood of healthy patients is reported in the range of 0.13-2.77 % [34,35], and this free IGF-1 has a half-life of <10 min [36]. A binary complex extends that value to 30-90 min, whereas a ternary complex prolongs it to 16-24 h [36]. Protein complexes and IGF-1R compete for binding to IGF-1, and the potential consequences of amino acid substitutions on these interactions are discussed below.
IGF-1 C-domain (positions 30-41) and, to a lesser extent, D-domain (positions 63-70) amino acids are critical for recognizing IGF-1R, with key roles played by charged residues R36, R37, K65, and K68 [37]. Interestingly, C-domain residues Y31, R36, and R37 are also needed for IGF-1 to bind to ALS in the ternary complex [38] with IGFBP-3 or -5 [39]. The binding of IGF-1 to IGFBP-3 is mediated by hydrophobic residues [38]. Thus, in the blood, IGF-1 can either be protected in a binary/ternary complex or be released to interact with its receptor.
Predicting the outcome of IGF-1 amino acid substitutions on IGFBPs, ALS, and/or the IGF-1R binding interactions is difficult due to their complexity. In the case of stronger binding in the binary/ternary complex, a smaller free IGF-1 fraction will circulate in the bloodstream, increasing its half-life due to slower clearance. In contrast, weaker binding in the binary/ternary complex will result in a larger free fraction of the circulating protein, thus decreasing its half-life. Changes in specific amino acids may also affect the binding of IGF-1 to its receptor. Improper binding, or lack of binding, may inhibit receptor activation and the downstream signaling, resulting in a loss of function. Consequently, the ability to predict IGF-1 protein conformation is important to estimate the risk to patients. This theory has been recently confirmed by Giacomozzi et al. [27], who described a homozygous patient in whom the S35C IGF-1 variant reduced the protein's ability to induce IGF1R phosphorylation.

Analyzing and predicting the outcome of the polymorphism
In the present method, the IGF-1 protein is detected in a nonreduced form with all disulfide bonds intact. During chromatographic separation, some variants do not co-elute with the WT IGF-1, indicating that amino acid substitutions in those variants affect their interactions with the chromatographic stationary phase, resulting in different retention times compared to the WT IGF-1. Whether or not this is an indication of changes in protein conformation in vivo requires further study.
The following case studies indicate that the location and nature of the amino acid change determine the severity of the variant's clinical outcome. Based on them, we attempted to assess the biochemical and clinical effects of newly discovered or clinically unreported variants.
In vitro analysis using recombinant variants demonstrated that both the V44M and R36Q amino acid substitutions reduced the affinity of IGF-1 for its receptor: 90-fold for V44M [23] and 3.9-fold for R36Q [24]. Valine-44 is located in the alpha helix close to the C-domain binding region (positions 30-41), and arginine-36 is located in a loop inside the C-domain binding region. Thus, amino acid substitutions within the C-domain region (R36Q) and proximal to it (V44M) can affect protein binding, suggesting that certain amino acid substitutions in the 30-44 region may interfere with binding of IGF-1 to its receptor and/or IGFBPs.
We also assessed the significance of the location of polymorphisms by analyzing the conservation of amino acids in the protein sequence (Table 3). The protein sequence, including the protein-receptor interface region 30-41, is conserved among mammals, with only a few exceptions. Therefore, our hypothesis is that polymorphisms in the most conserved locations are more likely to cause negative patient outcomes than changes in the least conserved regions.

A38V and A67V pathogenicity
Both A38V and A67V variants occurred 2.67 times more frequently in pediatric patients than in adults: 49 OPM vs. 19 OPM for A38V, and 148 OPM vs. 56 OPM for A67V (Table 2). Assuming that a benign variant should have the same detection rate in pediatric and adult patients, the higher frequencies observed in the younger group suggest that these variants may have a prevalent role in childhood growth.
A38V has previously been suggested to be a pathogenic IGF-1 variant [21]. The amino acid substitution occurs at position 38, which is in the conserved protein-receptor binding region (Table 3) and may result in a pathogenic change in function. Because alanine and valine are both hydrophobic, uncharged amino acids, the consequences of this substitution may not be clinically as severe as others in the binding region with more dissimilar amino acids (e.g., R36Q or Y60H). This hypothesis needs to be tested by in vitro study or by clinical correlation.
In contrast, A67V was the third most frequently identified variant (86 OPM). Despite its increased detection rate in the younger population, the amino acid substitution is located at a less conserved position in the sequence (Table 3), suggesting that it may not be pathogenic; however, its absence from the DNA database could argue for it being biologically important.

A67T and A70T pathogenicity
A67T and A70T account for the majority of variant cases. Detection and confirmation of the two variants using tandem mass spectrometry was validated with DNA sequencing and showed an unequivocal match between the two techniques [29]. The C-terminus of IGF-1 does not directly participate in protein-receptor binding. Compared to the general population ExAC database (Table 1), OPM for A67T was 1.8-fold higher, and OPM for A70T was 3.2-fold higher, indicating a higher detection rate of A70T in the clinical population. The higher detection rate of A70T in adults than in pediatric patients may indicate that, in contrast to A38V and A67V variants, any unfavorable consequences of the presence of the A70T variant may present later in life.
From the cross-species IGF-1 protein sequence comparison (Table 3), A70 is conserved across all species listed whereas A67 is not. Consequently, A70T pathogenicity may warrant further investigation.

Other variants
Among reported pathogenic variants (R36Q, V44M, R50W, and Y60H), only V44M was observed (five patients in the study cohort). Variants R50Q and R56K are not in the DNA database but were observed five and three times, respectively, in this study. Their pathogenicity is unknown and requires further study.

Z-score bias
A z-score expresses how many standard deviations a measured value lies from the population mean. For the IGF-1 assay, the z-score is calculated based on the quantitative value of IGF-1 measured in the patient's serum and the patient's sex and age, and normal values are in the range of −2 to +2. In our study, the median z-score for patients with IGF-1 variants was lower than that of the overall tested patient distribution but equivalent when corrected for heterozygosity (Figure 3). Levels in the low part of the normal range in a poorly growing child with a non-pathogenic IGF-1 variant may lead to unnecessary testing for growth hormone deficiency, while this testing may be beneficial if the variant is known to be pathogenic.
These z-score results indicate that routine HRMS detection of WT IGF-1 has an apparent limitation, i.e., for heterozygous individuals only half of the total IGF-1 is quantified. In total, 280 (25.8 %) patients would have been miscategorized as outside the IGF-1 normal reference range by not accounting for the level of IGF-1 variants (Supplementary Material, Adj.Z.Score.xlsx). Doubling the WT IGF-1 concentration can move a patient's results from an abnormally low level to the normal reference range, or from the normal reference range to an abnormally high level. For patients with homozygous or double heterozygous variants, the z-score may be calculated based on the estimated level of the respective variant(s).
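The effect of the heterozygosity correction on categorization can be illustrated with a simple calculation. The sketch below is a deliberate simplification: it assumes a normally distributed reference population with a single mean and standard deviation, whereas the assay's actual age- and sex-specific reference intervals [17] are not reproduced here. All numeric values and names are hypothetical.

```python
def z_score(value: float, ref_mean: float, ref_sd: float) -> float:
    """Number of standard deviations between a value and the reference mean."""
    return (value - ref_mean) / ref_sd


def within_reference_range(z: float, low: float = -2.0, high: float = 2.0) -> bool:
    """Normal z-scores are taken to lie between -2 and +2."""
    return low <= z <= high


# Hypothetical reference values for one age/sex group (ng/mL); not assay data.
REF_MEAN = 200.0
REF_SD = 50.0

# Hypothetical heterozygous patient: only the WT peak is quantified.
wt_concentration = 80.0
z_reported = z_score(wt_concentration, REF_MEAN, REF_SD)

# Rough estimate of total IGF-1: WT concentration multiplied by two.
estimated_total = 2 * wt_concentration
z_adjusted = z_score(estimated_total, REF_MEAN, REF_SD)

print(f"Reported z = {z_reported:.1f}, in range: {within_reference_range(z_reported)}")
print(f"Adjusted z = {z_adjusted:.1f}, in range: {within_reference_range(z_adjusted)}")
```

In this hypothetical case the reported z-score falls below −2 while the adjusted value lies within the reference range, which is the kind of recategorization described above.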

Conclusions
The detection rate of IGF-1 variants in a clinical population of 243,808 individuals was 0.45 % (4,508 OPM), which is higher than in the general population.
The rate of detection of IGF-1 variants was similar in both sexes and all ages (when normalized to population age). However, for A38V and A67V it was higher in pediatric patients. Additional study is needed to understand why the detection rate of variants differs between children and adults. Position 38 is in a conserved protein-receptor binding region, so A38V may be more pathogenic than A67V. Based on over-representation of A70T in the clinical population compared to the overall population, and the conservation of the alanine at position 70 among species, A70T may be more pathogenic than A67T. Some variants showed a statistically significant difference between the AUC of the WT protein and that of a variant, including P66A and A70T (higher than WT) and A67T (lower than WT). This difference may, among other factors, indicate changes in the half-life of circulating proteins, which in turn could be caused by changes in the binding of the IGF-1 (and a variant) to its protein complex and/or receptor. This is a new finding that requires further study.
For patients with a heterozygous variant, there is a bias in the calculation of the z-score because the reported IGF-1 level represents only half of the circulating protein. However, if both WT and variant IGF-1 levels are accounted for, the z-score would better represent the patient's IGF-1 status.
Knowledge of the presence of a variant in a patient sample is clinically relevant for diagnosis and should provide valuable information to a physician. Currently, on Quest Diagnostics IGF-1 reports, a message is added if a heterozygous variant is present, pointing out that only the WT IGF-1 concentration is reported and that the z-score is calculated based on that. In the case of homozygous or double heterozygous variants, an estimation of total IGF-1 concentration is also given, providing important information to guide the care of individuals carrying one or two IGF-1 variant alleles. Beyond IGF-1 concentrations, additional characterization of IGF-1 variants is necessary to determine the biological impact of individual variants. The field would benefit from in vitro characterization of IGF-1 variants' affinity to their binding partners and an accurate estimation of the total level of functional bioavailable IGF-1 in patient specimens.

Figure 1: (A) The occurrence of IGF-1 variants as related to the patient age. The histogram depicts the number of variants detected as a function of age. The bars along the X-axis represent age in 5-year intervals. (B) Detection rate of IGF-1 variants calculated in age groups. The histogram depicts the occurrence of detected variants per million patients tested (number identified × 1,000,000/number analyzed) as a function of age. The bars along the X-axis represent age in 5-year intervals.

Figure 2: Distribution of ratios of the AUC of a variant to that of WT IGF-1 present in the same patient specimen. This ratio was calculated by dividing the HPLC-MS response peak area of a variant by that of the WT for each patient. The n is the number of patients with a variant identified. Ratios statistically different from 1.0 are labeled with an asterisk.

Figure 3: The z-score distributions of patients, comparing IGF-1 variant-containing patients of this study (left boxplot) with the overall tested clinical population for the same period (middle boxplot) and patients with IGF-1 variants adjusted for the IGF-1 levels (right boxplot). In the latter, the WT IGF-1 levels were doubled, and the z-scores recalculated.

Table 1: Comparison of the ExAC database with current study.
a Occurrence per million patients. b Pathogenic variants reported from single case studies. c Extrapolated from the MS/MS identification.

Table 2: Pediatric (- years old) and adult distribution for A38V, A67V, A67T, and A70T variants.
a The number of A67T and A70T pediatric and adult patients was extrapolated from a subset of patients to the total number.
Motorykin et al.: IGF-1 variants detection rate

Table 3: The sequence comparison for the positions - among various species. Conservation of an amino acid among species may indicate its importance for proper protein folding and binding. The dotted line separates mammals (on top). The table indicates that Alanine 38 and Alanine 70 are conserved among mammals. Mutations in these amino acids may lead to growth consequences. On the other hand, Alanine 67 is not conserved in all species, which may indicate the relatively benign nature of variants at that position.