Assessing the commutability of candidate reference materials for the harmonization of neurofilament light measurements in blood

Objectives: Neurofilament light chain (NfL) concentration in blood is a biomarker of neuro-axonal injury in the nervous system, and several assays are now sensitive enough to measure NfL in serum and plasma. There is a need for harmonization with the goal of creating a certified reference material (CRM) for NfL, and an early step in such an effort is to determine the best matrix for the CRM. This is done in a commutability study, and here the results of the first one for NfL in blood are presented. Methods: Forty paired individual serum and plasma samples were analyzed for NfL on four different analytical platforms. Neat and differently spiked serum and plasma were evaluated for their suitability as a CRM using the difference in bias approach. Results: The correlations between the different platforms with regard to measured NfL concentrations were very high (Spearman's ρ ≥ 0.96). Samples spiked with cerebrospinal fluid (CSF) showed higher commutability than samples spiked with recombinant human NfL protein, and serum seems to be a better choice than plasma as the matrix for a CRM. Conclusions: The results from this first commutability study on NfL in serum/plasma showed that it is feasible to create a CRM for NfL in blood and that spiking should be done using CSF rather than recombinant human NfL protein.


Introduction
Neurofilament light chain (NfL) concentration in cerebrospinal fluid (CSF) is considered a sensitive neuronal biomarker and is thought to reflect neuro-axonal injury in the nervous system [1]. NfL concentrations are increased in several conditions, e.g., amyotrophic lateral sclerosis [2], multiple sclerosis [3], stroke [4], Alzheimer's disease [5], and in the post-acute phase following traumatic brain injury [6]. NfL concentrations in CSF and plasma/serum are strongly correlated [7]. Considering that blood samples are more accessible and easier to obtain than CSF (which requires a lumbar puncture), NfL measurement in blood samples holds greater potential for use in a clinical practice setting. This potential has been acknowledged to the extent that several clinical chemistry laboratories in Europe now have this measurement in their portfolio.
In 2019, CSF NfL was added to The Alzheimer's Association QC program for CSF and blood biomarkers followed by plasma NfL in the following year (neurochem.gu.se/TheAlzAssQCProgram).
NfL measurements are performed in clinical settings using immunoassays developed by various manufacturers. To ensure consistency, measurement results should be accurate and, ideally, equivalent regardless of which analytical platform is used. A favored way to achieve this is to establish metrological traceability of results to reference methods or certified reference materials (CRMs) that the different manufacturers can use to calibrate their assays and harmonize NfL values across platforms. Ideally, this work is done in a collaborative manner including different stakeholders, e.g., assay manufacturers, end users, and metrology institutes. A CRM must behave similarly to clinical samples when measured with different methods. This closeness of agreement is a property called commutability, which is more stringently defined in the International Vocabulary of Metrology [8]. As with CRMs that are used in a calibration hierarchy, commutability of external quality assessment materials is essential for monitoring the harmonization of results across different platforms [9][10][11][12].
The International Federation of Clinical Chemistry and Laboratory Medicine has a working group that has published recommendations for assessing commutability [13,14]. Such initiatives are of great help in showing how results from commutability studies should be presented and evaluated. Assessing the commutability of candidate reference materials is an important step in the process of creating CRMs. Some comparisons between different methods for NfL in blood have already been published [15][16][17][18], but the current report is the first commutability study on blood NfL.

Samples
All 40 clinical samples, each with paired serum and plasma (EDTA K3 Monovette, Sarstedt), and the seven candidate reference materials were frozen research samples from single individuals (no pools) from University Hospital Basel, Switzerland (UHB). Samples were centrifuged at 2,000 g for 10 min at room temperature and frozen at −70 °C within 1 h. They were aliquoted (200 μL) and sent frozen on dry ice to the different laboratories. Materials were stored at −70 °C pending the analyses, which were performed in duplicate using six assays on four different platforms (Table 1). In addition to three different native serum and EDTA plasma samples (1-3 and 8-10 in Table 2), four other candidate reference materials (4-7 and 11-14) were prepared by spiking in 30 pg/mL or 100 pg/mL of NfL, either from a CSF sample at UHB with an NfL concentration measured on the Simoa platform or from a recombinant full-length human NfL protein expressed in E. coli (EnCor Biotechnology Inc., Gainesville, FL, Cat#: PROT-r-NF-L). The characteristics of the different candidate reference materials are shown in Table 2.
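The spiking step described above amounts to a simple mass balance. The sketch below is illustrative only: the function name and all numeric values (matrix concentration, CSF concentration, volumes) are hypothetical and are not taken from the study, and it works volumetrically rather than gravimetrically.

```python
def spike_volume_ml(target_increase_pg_ml, matrix_volume_ml,
                    matrix_conc_pg_ml, spike_conc_pg_ml):
    """Volume of spike solution (mL) needed to raise the NfL concentration
    of a serum/plasma matrix by `target_increase_pg_ml`, allowing for the
    dilution caused by the added volume.

    Mass balance:
        (c_m * V_m + c_s * V_s) / (V_m + V_s) = c_m + target_increase
    """
    target = matrix_conc_pg_ml + target_increase_pg_ml
    if spike_conc_pg_ml <= target:
        raise ValueError("spike solution must be more concentrated than the target")
    return matrix_volume_ml * target_increase_pg_ml / (spike_conc_pg_ml - target)


# Hypothetical example: 1 mL serum at 10 pg/mL, CSF spike at 1000 pg/mL,
# desired increase of 30 pg/mL.
v_spike = spike_volume_ml(30.0, 1.0, 10.0, 1000.0)
final_conc = (10.0 * 1.0 + 1000.0 * v_spike) / (1.0 + v_spike)
print(f"spike volume: {v_spike * 1000:.2f} uL, final concentration: {final_conc:.1f} pg/mL")
```

Because NfL is much more concentrated in CSF than in blood, the spike volume is small relative to the matrix volume, so the matrix is only slightly diluted.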

Analytical platforms
All four commercial assays were used according to the manufacturers' instructions. Three of the platforms use the same antibody pair, UD1 and UD2 from UmanDiagnostics, while company policies prevented the publication of the antibody identities for the Olink assay.
(a) Single molecule array (Simoa) assay: The Simoa technology is a type of ultra-sensitive digital immunoassay that has been around for about 10 years [19]. The blood test for NfL was first developed as an in-house assay at the University of Gothenburg [20] and later became commercially available from Quanterix (Catalogue#: 103186). The commercial assay was used at two locations, Quanterix and UHB, while the in-house variant was run only at UHB. The measurements on the in-house assay were performed as previously described [21].
(b) Olink: The Olink 96-plex technology was introduced to the market in 2013. It is based on the proximity extension assay and uses pairs of matched antibodies linked to unique DNA sequences, providing a dual-recognition assay with DNA readout. The signal can be read out using quantitative polymerase chain reaction [22] or next generation sequencing [23]. In this study the Target 96 Neuro Exploratory panel, which includes the NfL biomarker, was used. Target 96 panels measure 92 protein biomarkers in parallel for 88 customer samples per plate. Inter-plate normalization was performed by normalizing the data for each plate to the included Olink Control Sample, and the data were then inverse log transformed for analysis. Results are reported in the relative quantification unit Normalized Protein eXpression.
(c) Ella: Ella/Simple Plex (ProteinSimple) is an automated immunoassay platform that has been commercially available since 2014. Ella utilizes a microfluidic design in a cartridge format, where sandwich immunoassays take place within glass nanoreactors. Measurements were generated using the NfL assay on Ella [15,24] (Part# 002448) and analyzed using the platform-provided software.
(d) Atellica® Solution: The Atellica® Solution is an automation-ready platform featuring multiple modules including immunoassay and chemistry analyzers. The immunoassay module utilizes chemiluminescent technology and has been commercially available since 2017. The current version of the NfL assay is a fully automated, two-site sandwich immunoassay developed by Siemens Healthcare Laboratory on the Atellica IM to be offered as a testing service for clinical trial applications [25].
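The Olink pre-processing described under (b) can be sketched as follows. Normalized Protein eXpression (NPX) is a log2-scale unit, so the inverse log transform is 2 to the power of NPX. The additive plate shift shown here is an assumption about how normalization to the control sample was done; the function names and all values are hypothetical.

```python
def normalize_plate(npx_values, plate_control_npx, reference_control_npx):
    """Shift all NPX values on one plate so that the plate's control sample
    lines up with a common reference level (additive shift on the log2
    scale). The exact procedure used in the study is not specified; this
    is an assumed, simplified version."""
    shift = reference_control_npx - plate_control_npx
    return [value + shift for value in npx_values]


def npx_to_linear(npx):
    """Inverse log transform: NPX is log2-scaled, so the linear-scale
    relative quantity is 2 ** NPX (arbitrary units, not pg/mL)."""
    return 2.0 ** npx


# Hypothetical plate: the control reads 0.5 NPX higher than the reference.
plate = [5.0, 6.0, 7.5]
normalized = normalize_plate(plate, plate_control_npx=4.5, reference_control_npx=4.0)
linear = [npx_to_linear(v) for v in normalized]
print(normalized, linear)
```

The linear-scale values remain relative quantities, which is why the Olink results cannot be compared with the other assays in absolute concentration terms.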

Statistics
Passing-Bablok regression [26] and Spearman's ρ were used for an overview of the pairwise method comparisons. Commutability was evaluated using the difference in bias approach as described in part 2 of the IFCC recommendations for assessing commutability [14], for which all data were ln-transformed and the commutability acceptance criterion was set at 25%. For the method comparisons, a candidate reference material was judged commutable if its error bars were completely within the acceptance interval, non-commutable if they were completely outside it, and inconclusive otherwise. Serum and plasma comparisons were analyzed using Passing-Bablok regression and Spearman's ρ.
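The three-way decision rule can be written compactly. The following is a minimal sketch, not the IFCC procedure itself. It assumes that the 25% criterion maps to a symmetric interval of ±ln(1.25) on the ln scale and that the bias of the clinical samples is constant over the concentration range (which the Results show does not hold for every comparison). All function names are hypothetical.

```python
import math
from statistics import mean


def difference_in_bias(rm_x, rm_y, clinical_pairs):
    """Difference (ln scale) between the bias of a candidate reference
    material and the mean bias of the clinical samples for one pair of
    methods. Assumes a constant clinical-sample bias (simplification)."""
    rm_bias = math.log(rm_y) - math.log(rm_x)
    cs_bias = mean(math.log(y) - math.log(x) for x, y in clinical_pairs)
    return rm_bias - cs_bias


def classify_commutability(diff_ln, expanded_uncertainty_ln, criterion=0.25):
    """Classify by where the error bar (diff +/- expanded uncertainty) lies
    relative to the acceptance interval, assumed here to be +/- ln(1 + criterion)."""
    limit = math.log(1.0 + criterion)
    lower = diff_ln - expanded_uncertainty_ln
    upper = diff_ln + expanded_uncertainty_ln
    if lower >= -limit and upper <= limit:
        return "commutable"          # error bar entirely inside
    if lower > limit or upper < -limit:
        return "non-commutable"      # error bar entirely outside
    return "inconclusive"            # error bar crosses a limit


# Hypothetical values for one reference material and one method pair.
d = difference_in_bias(10.0, 12.0, [(10.0, 11.0), (20.0, 22.0)])
print(round(d, 4), classify_commutability(d, 0.10))
```

Under this rule a material whose mean difference lies inside the interval can still be inconclusive when its uncertainty is large, which is the situation reported for most inconclusive results in Figure 3.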

Results
For the clinical samples, all the different methods showed very strong correlations, with mean Spearman's ρ ≥ 0.96, for both plasma and serum. Method comparisons for all pairwise combinations of the six included assays are shown in Figures 1 and 2 for serum and plasma, respectively. The corresponding recommended plots for assessing commutability are shown in Supplementary Figures 1 and 2. The distribution between the classes commutable, inconclusive, and non-commutable for each candidate reference material is shown in Figure 3. In general, the native (unspiked) candidate reference materials (red triangles) showed the highest frequency of commutable events, and the candidate reference materials spiked with human CSF (yellow diamonds and circles) did not show a clear deviation from the clinical samples. This was not true for the samples spiked with recombinant human NfL (blue diamonds and circles), where the sample spiked with the highest concentration deviated the most from the clinical samples in both serum and plasma. In total, the fraction of commutable candidate reference materials was higher in serum (43%) than in plasma (22%), and the corresponding figures for non-commutable candidate reference materials were 5.7% and 14%, respectively. The sample that performed worst with regard to commutability was the one spiked with the highest concentration of recombinant NfL, for which the non-commutable fraction was 40% in both serum and plasma. For all of the candidate reference materials, the majority of the inconclusive events had a mean concentration within the acceptance range but an error bar that crossed it (Figure 3).
Excluding Olink, which reports relative concentrations, the mean absolute difference in bias for the clinical samples between the methods was 39% for serum and 42% for plasma, and in some pairwise comparisons the difference in bias was not constant over the concentration range (e.g., Supplementary Figure 1C).
Spearman's rank correlations between the concentrations in paired plasma and serum were very strong overall (r_s ≥ 0.95). The measured NfL concentration was higher in serum than in plasma for the Simoa and Olink platforms, while ProteinSimple and Siemens exhibited less difference between the two sample types (Figure 4).
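For readers unfamiliar with the two statistics used in the method and matrix comparisons, the sketch below gives a plain-Python version of the Passing-Bablok point estimates (shifted median of pairwise slopes) and of Spearman's rank correlation. It is illustrative only: it omits the confidence intervals shown in the figures, assumes positively correlated data, and is not the software used in the study.

```python
import math
from statistics import median


def passing_bablok(x, y):
    """Point estimates (slope, intercept) of Passing-Bablok regression."""
    slopes = []
    for i in range(len(x) - 1):
        for j in range(i + 1, len(x)):
            dx, dy = x[j] - x[i], y[j] - y[i]
            if dx == 0 and dy == 0:
                continue                      # identical points: slope undefined
            slope = math.copysign(math.inf, dy) if dx == 0 else dy / dx
            if slope != -1:                   # slopes of exactly -1 are excluded
                slopes.append(slope)
    slopes.sort()
    n = len(slopes)
    k = sum(1 for s in slopes if s < -1)      # offset that shifts the median
    if n % 2 == 1:
        b = slopes[(n - 1) // 2 + k]
    else:
        b = 0.5 * (slopes[n // 2 - 1 + k] + slopes[n // 2 + k])
    a = median(yi - b * xi for xi, yi in zip(x, y))
    return b, a


def average_ranks(values):
    """Ranks starting at 1, with ties given their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for idx in order[i:j + 1]:
            ranks[idx] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical paired measurements (e.g., plasma on x, serum on y).
plasma = [8.0, 12.0, 15.0, 22.0, 40.0]
serum = [9.1, 13.0, 17.2, 24.5, 45.0]
print(passing_bablok(plasma, serum), spearman_rho(plasma, serum))
```

A slope different from 1 together with a rank correlation close to 1 is the pattern described in this study: methods or matrices that rank samples almost identically but disagree in the reported concentration.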

Discussion
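Here, we report the first commutability study for plasma and serum NfL. Taken together, the results are promising and encourage further studies with the goal of creating CRMs for NfL in blood. Although the assays correlate well with each other, it is clear that CRMs are needed due to the large differences in measured concentrations between the different assays, for both serum and plasma. The strong correlations between the different methods indicate that the different assays have similar selectivity and that differences between assays are caused by a calibration bias that could be eliminated by standardizing calibration with common CRMs. In follow-up commutability studies, pools of plasma and serum (both neat and spiked with human CSF) should be evaluated, since a single donor is not a reasonable source for a CRM given the large volume needed to keep the material available over a prolonged period of time.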
Although the acceptance criterion was quite wide (25%), the objective of the present study was not to formally assess commutability of candidate reference materials. Instead, the study aimed at (i) comparing commutability levels of various candidate reference materials so as to identify the most suitable matrix for CRMs, and (ii) helping to design subsequent commutability studies. This goal was achieved by demonstrating that blood samples spiked with CSF have much better commutability levels than blood samples spiked with recombinant NfL, which seems to compromise the sample matrix, especially at high concentrations. There is also an indication that the statistical assessment should be split into different concentration ranges in the next study, since the bias, for some comparisons, was not constant over the concentration range. Further, the large rate of inconclusive results is due to large pooled estimated uncertainties associated with the difference in bias for the candidate reference materials, which can be due to assay imprecision and/or sample-specific effects. A majority of the inconclusive results have their mean value within the acceptance interval, which means that they might be found commutable if the uncertainty associated with imprecision were decreased. Data analysis shows that uncertainty due to imprecision is approximately 3-fold larger than uncertainty caused by sample-specific effects. This means that, to reduce the uncertainty associated with the bias, performing more replicates of the samples would be more beneficial than increasing the number of clinical specimens.
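The trade-off between replicates and clinical specimens can be illustrated with a deliberately simplified variance model. This model is an assumption made for illustration, not the uncertainty calculation of the IFCC recommendations: the reference material contributes an imprecision term that shrinks with the number of replicates, and the mean clinical-sample bias contributes a term that shrinks with the number of specimens. The standard deviations are in arbitrary units, with the 3:1 ratio taken from the text above.

```python
import math


def u_difference_in_bias(sd_imprecision, sd_sample_specific,
                         n_replicates, n_clinical):
    """Standard uncertainty of the difference in bias under a simplified
    two-component model (assumed, see lead-in)."""
    # Reference material: mean of n_replicates measurements.
    var_rm = sd_imprecision ** 2 / n_replicates
    # Mean bias of the clinical samples: each sample carries a
    # sample-specific effect plus replicate-averaged imprecision.
    var_cs = (sd_sample_specific ** 2
              + sd_imprecision ** 2 / n_replicates) / n_clinical
    return math.sqrt(var_rm + var_cs)


# Imprecision set 3-fold larger than sample-specific effects (arbitrary units).
baseline = u_difference_in_bias(3.0, 1.0, n_replicates=2, n_clinical=40)
more_replicates = u_difference_in_bias(3.0, 1.0, n_replicates=4, n_clinical=40)
more_specimens = u_difference_in_bias(3.0, 1.0, n_replicates=2, n_clinical=80)
print(baseline, more_replicates, more_specimens)
```

Under these assumptions, doubling the replicates reduces the uncertainty far more than doubling the number of clinical specimens, because the replicate-dependent term of the reference material dominates once 40 specimens are already included.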
Overall, obtaining commutable materials seems achievable, but once the right matrix for CRMs has been identified, their NfL concentrations have to be certified. The ideal option would be to develop a higher order reference [...] method available in a near future is unclear. However, a candidate reference measurement procedure is under development in CSF and seems achievable. As candidate reference materials consisting of plasma or serum spiked with CSF seem to present appropriate commutability levels (with NfL concentration being much higher in CSF than in blood [4,20]), an option could be to gravimetrically spike serum or plasma samples with a known amount of CSF, the NfL concentration of which has been certified with the reference measurement procedure. Should this strategy turn out to be unsuitable, an alternative would be to assign concentrations to the CRMs using a panel of immunoassays and recalibrate the immunoassays with a harmonization protocol as described in ISO 21151:2020 [30]. However, a first drawback is that this strategy would not ensure results with metrological traceability to the International System of Units. Another limitation is that some of the currently available immunoassays measuring blood NfL are still in an early stage of development, implying that they could be subject to changes in calibration and/or specificity if the assays are optimized. This may compromise the ability to consistently assign target values to reference panels over the long run, which could in turn cause shifts in calibration when a lot of the reference panel expires and/or is out of stock and needs to be replaced.
Another challenge in producing CRMs is selecting the most relevant matrix, serum or plasma. Since there are differences between some of the assays when it comes to measuring NfL concentration in these fluids, it will be challenging to produce CRMs for NfL that can be used for both serum and plasma unless changes are made to the methods that will result in the same measured concentration for both sample types. More likely, a choice has to be made to make CRMs for either one or the other. Our results indicate that obtaining commutable materials seems more easily achievable in serum than in plasma, but clinical practice must be considered before making a decision.
The high correlation between the different assays is likely a consequence of the fact that the same antibody pair is used in three out of four platforms. This could be viewed as a limitation of the study, but since NfL in blood is established as a biomarker for neuronal injury, it can be anticipated that other assays with other pairs of antibodies, or even antibody-free assays, are likely to be available in future commutability studies. A potential issue with this is that CRMs for which commutability proved to be appropriate for a set of assays may not be commutable for other assays that will be developed at a later stage.
In conclusion, this initial commutability study for NfL in blood is positive and encourages further studies with an improved design.Future studies should include pools of blood-based candidate reference materials that, in addition to being neat, have been spiked with human CSF to increase the NfL concentration.Our findings suggest that spiking serum or plasma with endogenous NfL from CSF is preferable to spiking with recombinant NfL.

Figure 1: Passing-Bablok regression and Spearman's rank correlation coefficient (r_s), using only the clinical samples, of the measured serum concentrations of NfL for pairwise comparisons between methods listed in Table 1. Solid black line, regression line; blue dotted lines, 95% confidence interval; red dashed line, unity line (x=y); grey circles, clinical samples; red triangles, neat serum; blue diamonds and circles, serum spiked with 30 and 100 pg/mL using recombinant human NfL, respectively; yellow diamonds and circles, serum spiked with 30 and 100 pg/mL NfL using human CSF, respectively.

Figure 3: Summary of the results of the candidate reference materials in serum (A) and plasma (B) as listed in Table 2. Commutable, green diagonal fields; inconclusive, yellow horizontal fields; non-commutable, red vertical fields. The numbers over each bar represent the percentage of the inconclusive samples with an average concentration within the 25% acceptance range.

Figure 4: Passing-Bablok regression and Spearman's rank correlation coefficient (r_s), using only the clinical samples, of the measured plasma and serum concentrations of NfL for the methods listed in Table 1. Solid black line, regression line; blue dotted lines, 95% confidence interval; red dashed line, unity line (x=y).

Table 1: Information about the assays.
a Names are used in the figures. Simoa, single molecule array; PEA, proximity extension assay; UHB, University Hospital Basel; CL, chemiluminescence.

Table 2: Description of the different candidate reference materials.
