Development of a certified reference material for anti-β2-glycoprotein I IgG – commutability studies

Objectives: In this paper, we describe the steps followed for the development of a certified reference material for immunoglobulin G antibodies against β2-glycoprotein I (also known as apolipoprotein H). These steps include processing of the material, commutability, the impact of dilution, the appropriate reconstitution conditions, homogeneity and stability during transport and storage. Methods: We analysed 69 clinical samples from patients suffering from antiphospholipid syndrome with several commercial enzyme-linked immunosorbent assays (ELISA) purchased from in vitro diagnostic manufacturers. Results: Analysis of the results indicated that the candidate reference material can be safely freeze-dried, and that the user should carefully follow the reconstitution instructions, as small changes in e.g. temperature may have unwanted effects. The statistical analysis of the commutability studies indicated that the analytical response of the reference material upon dilution is similar to that of clinical samples, and that the correlation between results may differ from assay to assay. Last but not least, the candidate reference material developed here is commutable for most assays tested, homogeneous and stable. Conclusions: Immunoglobulin G antibodies against β2-glycoprotein I are associated with a higher risk of thrombosis and pregnancy complications. Their measurement is essential for the diagnosis and monitoring of antiphospholipid syndrome. These antibodies are detected by specific immunoassays that are routinely used in clinical diagnostics, but several of these methods show enormous variability, in part due to the lack of a reference material.


Introduction
Anti-β2GPI and disease
Antibodies against β2-glycoprotein I (anti-β2GPI) are recognised as one of the criteria for the diagnosis of antiphospholipid syndrome (APS). APS is a disorder causing vascular thrombosis and/or obstetric complications [1]. It has a wide range of potential manifestations, typically including recurrent early miscarriages, preeclampsia and late foetal loss [2]. Diagnosis relies on clinical manifestations in combination with the measurement of antiphospholipid antibodies. Antiphospholipid antibodies is a general name given to several different analytes, but only three are agreed in the current consensus guidelines to be important in the diagnosis of APS [3]. These are the lupus anticoagulant (LA), anticardiolipin antibodies (aCL) and anti-β2GPI antibodies, whose antigen is a predominant target in the pathogenesis of APS. In addition, test results indicating the presence of anti-β2GPI antibodies should persist for at least 12 weeks, and no more than five years should pass between two positive tests [3]. When β2GPI is used in assay preparation and development in combination with CL, it serves as a target antigen for aCL assays, whereas when β2GPI is used alone, it serves as a target antigen for anti-β2GPI assays [4]. Additionally, anti-β2GPI antibodies are responsible for the response in most of the LA assays, and β2GPI-dependent LA are the ones predictive of thrombosis and foetal loss [5].

Anti-β2GPI measurements
Measurement of anti-β2GPI IgG antibodies is possible via a number of commercial or in-house prepared enzyme-linked immunosorbent assays (ELISAs). Each assay uses its own buffers, sample dilution schemes, antigens, calibrants and arbitrary units. Although each assay may work perfectly well in itself, this diversity makes it difficult to compare results for the same patient analysed with different kits. In the absence of an internationally accepted standard, it is impossible to obtain accurate, quantitative results that are comparable between methods and within an individual patient over time.

Comparability of results from different assays
It is widely known that results vary between assays due to a number of factors that are independent of their individual performance. These factors have been summarised elsewhere [6]. With the exception of the sample-specific variation and, inevitably, the different scales employed in the immunoassays used, all other factors are considered in our studies.
In this paper, we present the results of the correlation studies for anti-β2GPI IgG assays, report the results in analytical terms, and describe the effect of freeze-drying on the material, the behaviour of a candidate reference material when diluted and the best conditions for reconstituting the candidate reference material to retain its characteristics. Additionally, we report the between- and within-vial homogeneity of the produced material and its short- and long-term stability.

Anti-β2GPI CRM and need
The Committee on Harmonization of Autoimmune Testing (C-HAT) of the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC), in collaboration with the Joint Research Centre of the European Commission (EC-JRC), has studied the possibility of developing a certified reference material (CRM) with an assigned property value (anti-β2GPI IgG antibody concentration in a matrix material). It could serve as a quality control for anti-β2GPI IgG measurements or for the calibration of immunoassay-based in vitro diagnostic devices.
For a material to be selected as a candidate CRM, it has to be commutable. In simple terms, it needs to resemble a random and typical clinical sample [7]. If such a material is found and used as a candidate CRM, then calibration bias can be minimised and the material can be of value [8][9][10][11]. We have monitored the commutability of several candidate materials since the beginning of this project. We have analysed the candidate CRM samples freeze-dried or in liquid form together with a representative number of clinical samples across various assays. At the end of our studies and trials, we did find a material that was commutable with a representative number of commercial immunoassays.

Serum samples
Clinical samples were selected based on the amounts available and their β2GPI IgG concentration. Separate commutability assessments were performed, and in total 69 serum samples (30 in the first study and 39 in the second) were used. Serum samples were from patients undergoing β2GPI IgG testing and were provided by the University of Texas Medical Branch (Texas, USA) for the first study, and by the Protein Reference Unit and Immunopathology Department, St George's Hospital (London, UK) and Istituto Auxologico Italiano (Milan, IT) for the second. Each sample was anonymised following national ethical laws.

Candidate reference materials processing
Starting plasmapheresis material (approximately 4 L), collected from two patients diagnosed with APS, was provided by Silvia Pierangeli from the University of Texas Medical Branch (Texas, USA). It was converted into serum by precipitation of clotting factors with protamine sulphate, delipidated and supplemented with preservatives (sodium azide, aprotinin and benzamidine hydrochloride). The material was filtered through a 0.22 µm filter prior to filling into vials and freeze-drying. Filling precision is a critical parameter because of the subsequent freeze-drying. Therefore, 23 vials were weighed throughout the filling process and the first 2,860 filled vials were taken to the subsequent lyophilisation step. Another 454 vials were set aside and kept in liquid frozen form at −70°C because of a slight increase in the filling volume. The relative standard deviation of the masses of the filling solution of the main batch was 0.15%, well below the target level (<1%). Water content was estimated to be 0.81 ± 0.02%, which is safely below the target level (<2%).

Commutability and dilution studies
Two commutability studies were performed. The first one included four commercial immunoassays, 30 clinical samples and four dilutions of freeze-dried and liquid frozen candidate material. The purpose of that study was primarily to monitor the effect of freeze-drying on the material. The second one included seven immunoassays, the chosen material (analysed at eight dilutions) and 39 clinical samples (CS). For the commutability studies, all ELISA measurements were performed on two plates per assay. Dilutions of the freeze-dried candidate RM and the CSs were analysed at least in duplicate on each of the two ELISA plates per assay. Various dilutions were included in the analysis to ensure that the RM concentrations were within the measurement interval of the assays. All reconstitution and dilution volumes were gravimetrically controlled, and the dilution levels were calculated from the masses and the densities rather than from the intended volumes. All measurements were performed at the JRC (Geel, BE). A dilution study was also performed to monitor in detail the behaviour of the candidate material at different concentrations. Nine commercial immunoassays were used and eight gravimetrically prepared dilutions of the material were analysed. In all studies, the instructions and reagents of each manufacturer were used.
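As an illustration of how a dilution level can be derived from masses and densities rather than from nominal volumes, the following minimal sketch may help. The function name, the example masses and densities, and the assumption that volumes are additive on mixing are ours and are not taken from the study protocol.

```python
def relative_concentration(m_sample_g, rho_sample_g_per_ml,
                           m_diluent_g, rho_diluent_g_per_ml):
    """Relative concentration of a gravimetrically prepared dilution.

    Volumes are obtained as mass / density and are assumed to be additive
    (an illustrative simplification, not a statement about the protocol).
    """
    v_sample = m_sample_g / rho_sample_g_per_ml
    v_diluent = m_diluent_g / rho_diluent_g_per_ml
    return v_sample / (v_sample + v_diluent)


# Hypothetical weighing results: 1.02 g of serum (density 1.02 g/mL)
# mixed with 9.00 g of buffer (density 1.00 g/mL).
level = relative_concentration(1.02, 1.02, 9.00, 1.00)
print(f"relative concentration: {level:.4f}")
```

Working from the recorded masses means that small pipetting deviations are captured in the calculated dilution level instead of being hidden behind the intended volume.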

Statistical analysis
We compared the average results (per plate) obtained from all the immunoassays with Analyse-it (Analyse-it software, Leeds, UK). We calculated the Pearson correlation coefficient for the clinical samples for all pairs of assays. In the cases where the correlation coefficient was higher than 0.75, a linear regression was applied and the 95% prediction interval was calculated. If the results for the candidate material were within the 95% prediction interval of this regression, the material was considered commutable.
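The decision rule described above can be sketched in a few lines of Python. This is an illustrative reimplementation using ordinary least squares, not the Analyse-it procedure itself; the function names are hypothetical, and the critical t value (two-sided 95%, n − 2 degrees of freedom) is passed in by the caller, for example from a statistical table.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def prediction_interval(x, y, x0, t_crit):
    """Prediction interval at x0 from an ordinary least-squares fit of y on x."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    s = math.sqrt(rss / (n - 2))  # residual standard deviation
    half = t_crit * s * math.sqrt(1 + 1 / n + (x0 - mx) ** 2 / sxx)
    y0 = intercept + slope * x0
    return y0 - half, y0 + half


def is_commutable(cs_x, cs_y, rm_x, rm_y, t_crit, r_min=0.75):
    """Return None if the assay pair is not evaluated (r <= r_min),
    otherwise True/False depending on whether the RM result lies inside
    the prediction interval of the clinical samples."""
    if pearson_r(cs_x, cs_y) <= r_min:
        return None
    low, high = prediction_interval(cs_x, cs_y, rm_x, t_crit)
    return low <= rm_y <= high
```

Here `cs_x` and `cs_y` would hold the clinical-sample results from two assays and `(rm_x, rm_y)` the results for one dilution of the candidate material measured with the same two assays.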

Homogeneity and stability of candidate RM
Fourteen randomly selected vials along the whole batch were analysed for the homogeneity study. The number of selected vials corresponds to approximately the cubic root of the total number of produced vials. Each vial was gravimetrically reconstituted and three independent samples were analysed by ELISA (Quanta Lite anti-β2GPI IgG, INOVA Diagnostics, San Diego, CA, USA). The measurements were performed randomly, under repeatability conditions to separate a potential analytical drift from a filling sequence trend.
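The cube-root rule and a selection that covers the whole filling sequence can be illustrated as follows. The sketch assumes that the 2,860 freeze-dried vials constitute the batch and uses a stratified random draw (one vial per equally sized block of the filling sequence); the exact sampling scheme used for the homogeneity study is not detailed here, so this is only one plausible reading.

```python
import random


def select_vials(n_total, seed=None):
    """Pick about cube-root(n_total) vials, one at random from each of
    equally sized consecutive blocks of the filling sequence (1-based)."""
    n_select = round(n_total ** (1 / 3))
    rng = random.Random(seed)
    bounds = [round(i * n_total / n_select) for i in range(n_select + 1)]
    return [rng.randrange(bounds[i], bounds[i + 1]) + 1
            for i in range(n_select)]


vials = select_vials(2860, seed=1)
print(len(vials), "vials selected:", vials)
```

For 2,860 vials the cube root is approximately 14.2, which rounds to the fourteen vials mentioned above.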
The same set of vials was simultaneously used for the long-term stability study (up to 1 year). Vials were stored at −20°C and −70°C for 0, 4, 8 and 12 months (at each temperature). The reference temperature was set to that of liquid nitrogen (−150°C). Two vials per storage time and temperature were selected using a random stratified sampling scheme. For the short-term stability study, 52 vials were stored at different temperatures for 1, 2 and 4 weeks. The temperatures studied were −70°C (reference temperature), −20°C, 4°C, 18°C and 60°C, with two vials for each condition and time point.

Results
Impact of processing and treatment of the CRM
Freeze-drying does not negatively affect the candidate reference material
A usual characteristic of freeze-dried materials is their increased long-term stability, which is desirable as long as their relevant properties remain unchanged. A small batch of the candidate reference material (RM) was freeze-dried to compare it with the RM in liquid frozen format (stored at −70°C). Four anti-β2GPI IgG ELISA methods were used to measure several dilutions of both materials. Figure 1 shows that there are no significant differences (p>0.05) between the concentration values of the two materials, so freeze-drying does not alter the measured RM IgG concentration.
To characterise the freeze-dried RM, we carried out a small commutability study.
The commutability study included the measurement of dilutions of the freeze-dried RM alongside a number of clinical samples using the same anti-β2GPI IgG immunoassays as above. Figure 2 shows pairwise comparisons of measurement results for the RM and the clinical samples. Depending on the methods compared, the values for the clinical samples follow a linear (Figure 2A, C) or nonlinear (Figure 2B) relationship, and the depicted prediction intervals illustrate the region where 95% of clinical sample measurement results are expected to lie. Even though the scatter of the values complicates the evaluation, the RM dilutions were within the prediction interval for all six comparisons (three of them are shown in Figure 2), which indicates the commutability of the material for the methods used.

Reconstitution of the freeze-dried material is a sensitive process
Reconstitution variables such as water temperature, agitation and elapsed days until the measurement were studied with regard to inter- and intra-vial homogeneity. Temperature had little impact, but strong agitation had a negative effect, increasing the inter-vial variability of the measurement results (Figure 3). The optimal protocol involved reconstitution at 25°C with three to five manual inversions of the vial content.
Subsequently, within-vial variability was assessed on six groups of vials that were reconstituted in the same way but measured on different days after reconstitution (0, 1, 2, 3, 4 or 7 days). Figure 4 shows that from the second day onwards, the mean variability of within-vial measurement results is significantly reduced. It is therefore advised to let the material stand at 4°C for two days before performing any measurement. The data also show no significant evidence of material instability over the studied timeframe.

Commutability
Assay performance
The first step in the assessment of commutability is to verify the performance of the assays used. Specifically, we monitored the equivalence of results from the different assays, the presence or absence of systematic bias and the influence of sample-specific effects. In addition, we monitored whether the candidate reference material was equivalent to the routine clinical samples. Assay performance (i.e. repeatability and intermediate precision) was evaluated through an experimental design that allowed for a sufficient number of replicates (minimum 3). We concluded that the scatter observed was due to sample-specific effects and not to assay repeatability and performance. We calculated the coefficient of variation (CV) of averages both within and between plates for the absorbance signal (optical density, OD) produced by the assays at the wavelength suggested by the manufacturers and for the concentrations calculated from these absorbances. The results, whether as ODs or converted into concentrations, showed a relatively low intraplate variation for all ELISA assays (below 15%). On average, the variations were lower for the ODs than for the converted concentration values.
There were some cases where the CVs were higher than 15%, but these occurred mainly when the concentrations of the CSs were at the extreme ends (lower or higher) of the respective calibration curves, where small changes in signal have a large effect on the concentration. Some assays use fewer than five calibrators, which are not evenly distributed across the measuring range; this may distort the calibration curve and introduce further variability into the assays.
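A within-plate and between-plate CV for one sample can be computed as in the following minimal sketch. The function names and the replicate readings are invented for illustration; only the 15% limit is taken from the text above.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation (sample standard deviation / mean) in percent."""
    return 100 * stdev(values) / mean(values)


def plate_cvs(plates):
    """plates: one list of replicate readings per plate, all for the same sample.
    Returns (list of within-plate CVs, between-plate CV of the plate means)."""
    within = [cv_percent(p) for p in plates]
    between = cv_percent([mean(p) for p in plates])
    return within, between


# Hypothetical OD readings of one sample, three replicates on each of two plates.
within, between = plate_cvs([[1.00, 1.10, 0.90], [1.20, 1.25, 1.15]])
flagged = [cv for cv in within if cv > 15]  # replicates exceeding the 15% limit
print(within, between, flagged)
```

The same functions can be applied to ODs and to the concentrations derived from them, which makes it straightforward to compare the two as described above.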

The RM analytical response upon dilution is similar to that of clinical samples
Routine anti-β2GPI IgG immunoassays require clinical samples to be diluted in an appropriate buffer so that their IgG concentration value lies within the analytical measuring range. Likewise, a reference material needs to be diluted to generate the calibration curve. It is therefore essential that the relationship between the analytical response (OD) and the concentration be sufficiently similar for clinical samples and for the reference material used for calibration. This property is frequently referred to as parallelism. Not only does it have obvious implications for the accuracy of the concentration estimates, but the lack of it can also contribute to poor correlation between different methods, as seen before in Figure 2.
To investigate this property, dilution profiles were obtained for several clinical samples and the candidate RM using three different methods. Dilution effects may influence the assay response (OD values) to the concentration of the analyte (expressed as relative dilutions). Figure 5A shows that the assay response to varying concentrations of analyte in the CS and the candidate RM is equivalent (Eurodiagnostica, anti-β2-Glycoprotein I IgG ELISA test). In other words, there is parallelism between the RM and CS dilution curves. Therefore, the RM could be diluted to produce a calibration curve that can be used to accurately calculate the analyte concentration in the clinical samples.
In an equivalent manner, when CS values are normalised to the RM concentration, dilution curves overlap and most values fall within the prediction interval determined by the RM dilution curve (Figure 5B). The other two methods employed produced similar results (not shown), so no evidence for lack of parallelism was found.
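One simple way to put a number on parallelism is to compare the slopes of the response against the logarithm of the relative dilution for the RM and for a clinical sample. The sketch below does this with a straight-line fit and is only valid within the approximately linear part of the dilution curve. It is a simplified stand-in for the prediction-interval comparison described above, and the function names and OD values are invented for illustration.

```python
import math
from statistics import mean


def log_slope(rel_dilution, od):
    """Least-squares slope of OD against log10(relative dilution)."""
    x = [math.log10(d) for d in rel_dilution]
    mx, my = mean(x), mean(od)
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, od)) / sxx


def slope_ratio(rel_dilution, od_sample, od_rm):
    """Ratio of the sample slope to the RM slope; values close to 1
    are consistent with parallel dilution curves."""
    return log_slope(rel_dilution, od_sample) / log_slope(rel_dilution, od_rm)


# Hypothetical two-fold dilution series measured with one assay.
dilutions = [1.0, 0.5, 0.25, 0.125]
od_rm = [2.0, 1.5, 1.0, 0.5]
od_cs = [1.8, 1.3, 0.8, 0.3]
print(round(slope_ratio(dilutions, od_cs, od_rm), 3))
```

How far the ratio may deviate from 1 before parallelism is rejected is a judgement that depends on the assay precision and is not specified here.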

Correlation between results from different assays
The degree of correlation between results from different assays was evaluated by a pairwise comparison of results from all assays. The Pearson correlation coefficient (r) was calculated and the results were interpreted according to clinical biostatistics rules [12], under which an r value between 0.75 and 1 indicates a very good to excellent correlation. Tables 1 and 2 show the Pearson correlation coefficients for all assay comparisons. They varied from moderate to very high for most assays. For 15 assay pairs the value of r was above 0.75, indicating a good correlation. Despite the generally good picture, some clinical samples were outliers in some of the comparisons.

Commutability of candidate reference materials
The main purpose for performing the described commutability studies was to select the best material for the eventual development of a CRM i.e. a material that would behave in the same way as an average clinical sample. The first, small-scale commutability study served to evaluate the impact of freeze-drying on the commutability and protein stability of the material. In the second study, having confirmed that the material behaved as required, more assays and 39 clinical samples were tested alongside a series of dilutions for the candidate RM.
The results were evaluated for all assay combinations for which the Pearson correlation coefficient was above 0.75. For correlation coefficients below 0.75, the prediction interval is very broad. Results were analysed using Passing-Bablok and Deming regressions and were evaluated with respect to the 95% prediction interval.
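For readers unfamiliar with Deming regression, the following minimal sketch shows the closed-form estimate of slope and intercept when both assays carry measurement error. The error-variance ratio `delta` (variance of the errors in y divided by that in x) is an assumption the analyst has to supply; `delta = 1` corresponds to orthogonal regression. The sketch does not reproduce the prediction intervals or the Passing-Bablok procedure used in the study.

```python
import math
from statistics import mean


def deming(x, y, delta=1.0):
    """Deming regression of y on x.

    delta: ratio of the error variance in y to the error variance in x
    (assumed known; 1.0 gives orthogonal regression).
    Returns (slope, intercept).
    """
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx


# Hypothetical results for the same samples measured with two assays.
slope, intercept = deming([1, 2, 3, 4, 5], [3.1, 4.9, 7.2, 8.8, 11.0])
print(round(slope, 3), round(intercept, 3))
```

Unlike ordinary least squares, this estimate does not change its meaning when the two assays swap axes, which is why it is often preferred for method comparisons.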
Overall, the commutability of the m the same manner as the material ERM-DA470k/IFCC which is certified amongst others for total IgG, and for which the commutability studies were done in a similar way [13].

Homogeneity and stability of candidate RM
The material was found to be sufficiently homogeneous. Within-unit homogeneity determines the minimum size of an aliquot that is representative of the whole unit. Between-unit homogeneity is important for ensuring that any vial chosen randomly for analysis is similar to any other.
The number of selected vials corresponds to approximately the cubic root of the total number of the produced vials; they were randomly selected using a random sampling scheme covering the whole batch. Statistical analysis of the received data showed no outlying unit means or trends in the filling sequence.
The candidate reference material was also found to be stable under controlled temperature conditions at 4°C for up to 4 weeks during shipment (analysed with the Phadia EliA β2GPI IgG assay) and at −70°C for one year of storage (analysed with the Quanta Lite® β2GPI IgG ELISA, INOVA Diagnostics) (Figure 6A, B, respectively). The data were evaluated individually for each temperature and the results were screened for outliers using the single and double Grubbs tests. We plotted regression lines of concentration versus time and tested the slopes for statistical significance. The slopes of the regression lines were not significantly different from zero. Concerning the material's long-term stability, vials were stored at −20°C and −70°C for 0, 4, 8 and 12 months (at each temperature). The reference temperature was set to that of liquid nitrogen (−150°C). Analysis of two vials per temperature showed that the material could be safely stored at either of the tested temperatures. The candidate reference material for anti-β2GPI IgG has also been tested for its stability five years after its production and was found to be similarly stable (data not shown).
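The slope test mentioned above can be illustrated with a short sketch: a straight line is fitted to concentration versus storage time and the slope is compared with its standard error. The function name and the numbers are invented, and the critical t value (two-sided 95%, n − 2 degrees of freedom) is supplied by the caller.

```python
import math
from statistics import mean


def slope_is_significant(times, values, t_crit):
    """Fit values = a + b * time by least squares and test whether b
    differs significantly from zero. Returns (slope, significant)."""
    n = len(times)
    mt, mv = mean(times), mean(values)
    stt = sum((t - mt) ** 2 for t in times)
    slope = sum((t - mt) * (v - mv) for t, v in zip(times, values)) / stt
    intercept = mv - slope * mt
    rss = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, values))
    se = math.sqrt(rss / (n - 2) / stt)  # standard error of the slope
    if se == 0:
        return slope, slope != 0
    return slope, abs(slope / se) > t_crit


# Hypothetical data: two vials at each of 0, 4, 8 and 12 months.
months = [0, 0, 4, 4, 8, 8, 12, 12]
conc = [100, 102, 101, 99, 100, 102, 99, 101]
print(slope_is_significant(months, conc, t_crit=2.447))
```

A non-significant slope is taken as absence of evidence for degradation over the period studied; it does not by itself quantify the uncertainty contribution of stability.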

Development of the CRM
The impact of the lack of standardisation on laboratory results can be severe, as has already been shown in several EQAS reports [10,14]. To be of any value, a laboratory result needs to be accurate and conclusive. This is the only way to ensure that the requesting clinician will be assisted in delivering a correct diagnosis, prognosis and monitoring of a disease. Undoubtedly, most of the assays available on the market are individually able to deliver quality results. Nevertheless, these results may vary between assays, causing confusion. Hutu et al. have shown in a commutability study for one of the biomarkers for small vessel-associated vasculitis that only 10 out of the 30 clinical samples tested had the same classification amongst the assays used [15]. This variability decreased when the developed CRM was used for recalculation of the results [16].
The results of these studies, like those of many others in the fields of clinical chemistry, immunology and haematology, confirm the variability in anti-β2GPI IgG testing, even though all participating assays and laboratories generally perform well in terms of precision. From the studies we have completed, we have identified a material that is commutable when freeze-dried. We have performed studies confirming its short-term (during transport) and long-term (during storage) stability under various conditions of temperature and time. For the assignment of a concentration value to this material, we are in close collaboration with the National Institute for Biological Standards and Control (NIBSC). Our findings are to be submitted to the WHO in the near future with a request for the assignment of units in IU/vial to the material. If our efforts succeed, such a material could be used to reduce inter- and intra-assay variability, eventually improving [5] lot-to-lot variation [15,17].