Are hemoglobin A1c point-of-care analyzers fit for purpose? The story continues

Objectives: Point-of-care (POC) analyzers are playing an increasingly important role in diabetes management, but it is essential that we know the performance of these analyzers in order to make appropriate clinical decisions. Whilst there is a growing body of evidence around the more well-known analyzers, there are many ‘new kids on the block’ with new features, such as displaying the presence of potential Hb-variants, which do not yet have a proven track record. Methods: The study is a comprehensive analytical and usability study of six POC analyzers for HbA1c using Clinical and Laboratory Standards Institute (CLSI) protocols, international quality targets and certified International Federation of Clinical Chemistry and Laboratory Medicine (IFCC) and National Glycohemoglobin Standardization Program (NGSP) Secondary Reference Measurement Procedures (SRMP). The study includes precision (EP-5 and EP-15), trueness (EP-9), linearity (EP-6), sample commutability (fresh, frozen and lyophilized) and interference of Hb-variants (fresh and frozen samples). Results: Only two of the six analyzers performed to acceptable levels over the range of performance criteria. Hb-variant interference, imprecision or variability between lot numbers are still poor in four of the analyzers. Conclusions: This unique and comprehensive study shows that of the six POC analyzers studied only two (The Lab 001 and Cobas B101) met international quality criteria (IFCC and NGSP), two (A1Care and InnovaStar) were borderline and two (QuikReadgo and Allegro) were unacceptable. It is essential that the scientific and clinical community are equipped with this knowledge in order to make sound decisions on the use of these analyzers.


Introduction
Diabetes is a global health burden and a leading cause of morbidity and mortality worldwide. It is estimated that up to 50% of people with diabetes are currently undiagnosed, and there is an urgent need for rapid, accurate and timely diagnostic testing to identify those both with and at risk of the disease [1].
Point-of-care (POC) analyzers play an increasingly important role in a wide range of clinical settings, and there is an increasing desire from clinicians to have access to more POC tests and a wider range of tests [2]. There is a belief that POC enables faster clinical decision making, increased rapport with patients and reduced referrals to secondary care, with subsequent reductions in healthcare costs. Over 50% of primary care physicians surveyed by Horwick et al. (2014) wanted increased access to HbA1c POC testing, bespeaking a clear demand for HbA1c POC [3]. However, Jones et al. (2013) also highlighted an apparent nervousness amongst primary care physicians around the accuracy of POC testing [4]. Given the desire for increased HbA1c POC availability, coupled with these prudent concerns about quality, it is essential that we understand how well POC HbA1c analyzers perform.
Understanding the quality of POC testing has been a topic of key interest for over a decade, with the stark message in 2010 that six out of eight analyzers did not meet the accepted quality criteria [5]. Since this seminal study, there have been numerous evaluations of POC HbA1c performance, with a focus on the more common analyzers [6]. External quality assessment (EQA) provides a snapshot of 'real world' data on the performance of POC analyzers, although only a fraction of the analyzers in use are currently enrolled in EQA schemes [7,8].
Whilst there is a growing body of evidence around the more well-known analyzers, there are many 'new kids on the block' with new features, such as displaying the presence of potential Hb-variants, which do not have a proven track record. Clinicians and laboratory scientists need robust and rigorous evaluation data on performance, acceptability and usability of POC analyzers to support informed decision making around use of POC testing analyzers.
The evaluation of POC analyzers is not without issue. Whilst many HbA 1c POC analyzers are scaled down versions of laboratory analyzers they have their own unique differences which require adaptations in order to complete a comprehensive evaluation. One key issue is that several POC analyzers are not compatible with frozen or lyophilized blood samples, meaning conventional evaluation protocols cannot be directly applied.
Whilst there are numerous method comparisons published, it is important to note that these are often single comparisons to routine laboratory methods (which will have their own imprecision and bias to consider), which provide insight into local performance but do not provide a robust picture of performance against internationally accepted secondary reference measurement procedures (SRMPs) or international quality criteria [9,10].
This study aims to understand the performance of a range of POC HbA1c analyzers using rigorous evaluation protocols which examine issues such as interference from Hb-variants with fresh and frozen samples, sample compatibility (fresh, frozen and lyophilized) and system usability, whilst comparing analytical performance to International Federation of Clinical Chemistry and Laboratory Medicine (IFCC) and National Glycohemoglobin Standardization Program (NGSP) SRMPs.

Analyzers evaluated
The six POC analyzers included in this study and their key characteristics are summarized in Table 1. The choice of analyzer was both manufacturer led (Allegro, QuikReadgo and The Lab 001) and investigator led (Cobas B101, InnovaStar and A1Care). The manufacturers of the last three analyzers were approached by the authors as previous evaluations had highlighted some performance issues. An initial familiarization protocol was undertaken with each instrument and the results were shared with the manufacturers to enable them to decide if they wished to continue to a full evaluation. This supports a collaborative approach to working with manufacturers with the aim of improving quality. In some cases, when a product is new in development, feedback at an early stage will enable further development before a product is brought to full evaluation, saving time and resources [11]. Six analyzers were fully evaluated; two further new analyzers were not yet ready for a full evaluation.

Imprecision study (EP-5 and EP-15)
The Clinical and Laboratory Standards Institute (CLSI) EP-5 protocol was used to investigate assay imprecision. It is known that some POC methods have a bias with frozen material and, as it is not known whether frozen samples also have an impact on imprecision, EP-15 was additionally performed with two fresh patient samples (HbA1c values of 48 and 75 mmol/mol). Both samples were analyzed five-fold on each of five days. Imprecision was also calculated on the basis of the duplicates of the fresh patient samples in the EP-9 protocol.
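As an illustration of how imprecision can be summarized from a days-by-replicates design like the one above, the sketch below uses a simplified one-way ANOVA decomposition into within-run and total components. This is a sketch only, not the full CLSI EP-5/EP-15 calculation; the function name and data layout are hypothetical.

```python
import statistics

def ep15_cv(runs):
    """Estimate within-run and total CVs (%) from a replicate design:
    `runs` is a list of daily replicate lists (e.g. 5 days x 5 replicates).
    Assumes a balanced design (equal replicates per day)."""
    n = sum(len(r) for r in runs)
    grand_mean = sum(sum(r) for r in runs) / n
    # Pooled within-run variance: sum of squared deviations within each day,
    # divided by the within-day degrees of freedom
    s2_within = sum(statistics.variance(r) * (len(r) - 1) for r in runs) / (n - len(runs))
    # Between-day component estimated from the variance of daily means
    k = len(runs[0])
    s2_means = statistics.variance([statistics.mean(r) for r in runs])
    s2_between = max(0.0, s2_means - s2_within / k)
    s2_total = s2_within + s2_between
    return (100 * s2_within ** 0.5 / grand_mean,
            100 * s2_total ** 0.5 / grand_mean)
```

For example, five daily lists of five HbA1c results in mmol/mol would yield a within-run CV and a total CV directly comparable to the values reported in Table 2.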
Linearity (EP-6): Linearity was assessed using the CLSI EP-6 protocol. After adjustment for Hb concentration, patient samples with a low HbA1c value and a high HbA1c value were mixed in incremental amounts to generate a series of equally spaced samples over a broad HbA1c concentration range. Eleven samples were analyzed in duplicate in one day. The samples were made fresh and then frozen at −80°C until analysis. Whilst some analyzers display a bias with frozen samples, this is generally a consistent bias and therefore frozen samples can still be used to assess linearity. The differences between the fitted values of the best polynomial line and the regression line for the 11 samples were compared. CLSI states for EP-6 that goals for linearity should be derived from goals for bias, and should be less than or equal to these goals [14]. The IFCC Task Force on Implementation of HbA1c Standardization has set a total allowable error (TAE) of 10% at an HbA1c concentration of 50 mmol/mol [19]. Taking into account the whole clinically relevant range, we set a TAE of 6 mmol/mol with a nonlinearity budget of 50% (=3 mmol/mol). If the deviation exceeded the allowable nonlinearity (3 mmol/mol) the data were considered nonlinear.
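A minimal sketch of the spirit of this check is shown below: it fits a plain least-squares line and compares the maximum deviation of the measured points against the 3 mmol/mol nonlinearity budget. This simplification replaces the CLSI polynomial-vs-linear comparison with direct residuals from the straight line; the function name and sample values are illustrative.

```python
def ep6_linearity_check(assigned, measured, budget=3.0):
    """Simplified EP-6 style check: maximum absolute deviation of measured
    HbA1c values (mmol/mol) from the ordinary least-squares line through the
    dilution series, compared against a nonlinearity budget."""
    n = len(assigned)
    mx = sum(assigned) / n
    my = sum(measured) / n
    sxx = sum((x - mx) ** 2 for x in assigned)
    sxy = sum((x - mx) * (y - my) for x, y in zip(assigned, measured))
    slope = sxy / sxx
    intercept = my - slope * mx
    max_dev = max(abs(y - (slope * x + intercept))
                  for x, y in zip(assigned, measured))
    return max_dev, max_dev <= budget
```

A perfectly proportional dilution series returns a deviation of zero, whereas curvature at the top of the range (as seen with some analyzers near their detection limits) pushes the maximum deviation over the budget.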
Hemoglobin variants AS, AC, AE, AD, elevated A2 and HbF: Twenty patient samples of each heterozygous Hb-variant, from our frozen whole blood biobank, were measured on each of the different POC analyzers. Values were assigned using an IFCC calibrated boronate affinity HPLC (Premier Hb9210). For samples with increased HbF, HbA1c values were assigned using an IFCC calibrated cation-exchange HPLC (Menarini HA8180V, Diabetes Mode (frozen), and Tosoh G8 (fresh)). Percentage HbF (3.5-42.0%) was determined using the Sebia Capillarys 2 Flex Piercing Hemoglobin program.
In addition to the frozen samples, 16 HbAS, seven HbAC, five HbAD, nine HbAE and four HbF (9.1, 20.5, 20.8 and 27.5%) fresh Hb-variant samples were also analyzed on each analyzer as two of the analyzers (InnovaStar and QuikReadgo) showed a bias with frozen samples.
Any bias observed due to the presence of variants is a compound of both the bias in normal samples (identified by the EP-9 protocol) and the bias associated with the variant. In order to account for this, the results were adjusted for the bias found during EP-9 (Premier Hb9210 for fresh and frozen Hb-variant), thus any residual bias would be due to the Hb-variant. For the two analyzers that also display a bias with frozen samples, the bias correction was done using the 24 frozen EQA samples rather than the EP-9 data. Whilst this is not a perfect solution it avoids a two-step correction.
For an Hb-variant to be considered as not causing a clinically relevant interference, the results for the Hb-variant should fall within ±10% (SI units) of the regression line derived from the comparison of the test instrument and the Premier Hb9210 with the non-variant (HbAA) samples.
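The ±10% scatter criterion above can be sketched as follows. This is an illustrative simplification (a hypothetical function with an assumed slope/intercept pair from the HbAA comparison), not the study's actual analysis code.

```python
def variant_interference(regression, variant_pairs, limit=0.10):
    """Flag Hb-variant results falling outside +/-10% (SI units) of the
    HbAA regression line. `regression` is (slope, intercept) from the
    EP-9 style comparison; `variant_pairs` holds (reference, POC) values."""
    slope, intercept = regression
    outliers = []
    for ref, poc in variant_pairs:  # (Premier Hb9210 value, POC value)
        expected = slope * ref + intercept
        if abs(poc - expected) / expected > limit:
            outliers.append((ref, poc))
    return outliers
```

Any pairs returned would indicate a clinically relevant interference for that variant under this criterion.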
Schiff Base, icteric samples, different hemoglobin concentrations: In order to create labile samples (Schiff Base), 12, 16 and 20 mg/mL glucose was added to aliquots of high-, medium- and low-HbA1c EDTA samples. Icteric samples were generated by removing the plasma of a non-icteric sample and replacing it with plasma containing 219, 236 and 258 μmol/L bilirubin, again at three different HbA1c levels. Similarly, addition or removal of plasma was used to create a range of samples with varying hemoglobin levels. The samples were stored frozen at −80°C until analysis. A mean relative difference exceeding ±10% (in SI units) pre- and post-treatment of the samples was considered a significant interference.
EQA programs (assessing sample commutability): In order to assess sample commutability, samples from both the IFCC Certification Program for manufacturers [15] and the European Reference Laboratory for Glycohemoglobin (ERL) EQA Program [16] were used to provide data on frozen and lyophilized samples respectively.

System usability scale (SUS):
This study included an SUS score generated by the two technicians who performed the evaluation study. SUS is a simple technology diagnostic tool consisting of 10 questions which gives a global view of subjective assessment of the usability of the device tested [17].
An SUS score >81 can be considered excellent, between 71 and 80 good, between 52 and 70 okay, and <51 poor [18].
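The standard SUS scoring that produces these 0-100 bands can be sketched as below; the response list is illustrative.

```python
def sus_score(responses):
    """Standard SUS scoring: 10 items rated 1-5. Odd-numbered items
    (indices 0, 2, ...) contribute (rating - 1); even-numbered items
    contribute (5 - rating). The sum is scaled by 2.5 to give 0-100."""
    assert len(responses) == 10
    total = sum((r - 1) if i % 2 == 0 else (5 - r)
                for i, r in enumerate(responses))
    return total * 2.5
```

For example, a technician answering every positively worded item with 5 and every negatively worded item with 1 yields the maximum score of 100, while uniformly neutral answers (all 3s) yield 50.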
Defining the quality criteria: International Quality Standards: This study used the previously published global guidance on acceptable quality and performance criteria for HbA 1c testing from the IFCC Task Force on Implementation of HbA 1c Standardization [19].
NGSP Manufacturer Certification Criteria: Thirty-six out of 40 results must be within 5% (relative) of an individual NGSP SRMP to pass certification [20].
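The pass/fail logic of this criterion can be sketched as follows; the function name and tolerance handling are a simplified illustration of the rule stated above, not the official NGSP certification software.

```python
def ngsp_pass(srmp, poc, limit=0.05, required=36):
    """Simplified NGSP manufacturer certification style check: at least
    36 of 40 paired results must fall within 5% (relative) of the
    SRMP value. Returns (passed, number_within)."""
    within = sum(abs(p - s) / s <= limit for s, p in zip(srmp, poc))
    return within >= required, within
```

Applied to each analyzer lot against each of the four SRMPs, this is the comparison summarized in Table 3.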

Results
Imprecision (EP-5 and EP-15)
Table 2 displays the CVs derived from both EP-5 and EP-15 and the duplicates from the EP-9 protocol. From this table and Table 3 it can be seen that only the Cobas B101 passed the NGSP criteria with two lot numbers compared to all four individual SRMPs, and that the QuikReadgo and the Allegro failed the NGSP criteria with both lot numbers for all four individual SRMPs. Figure 1 shows the regression lines for each POC device vs. the mean of the SRMPs. All analyzers suffered some degree of bias. The Lab 001 and the Cobas B101 had the least bias across the lots and the HbA1c range; all other POC analyzers had a statistically significant difference either between the lot numbers or at different HbA1c levels, or both, and showed a large dispersion around the Deming regression line compared to the mean of the SRMPs.

Linearity (EP-6)
Supplemental Table 2 details the results of the linearity study. The maximum deviation is shown between the fitted values of the best polynomial line and the regression line for the 11 samples. If the deviation exceeded the allowable nonlinearity (3 mmol/mol) the data were considered nonlinear. Based on this criterion all POC analyzers were linear except for the Cobas B101 and the InnovaStar. However, the detection limit of the InnovaStar was >30 mmol/mol; excluding the lowest sample from the calculations showed that the InnovaStar was linear. The HbA1c result of the highest sample for the Allegro was above the detection limit (>130 mmol/mol), therefore the linearity was assessed with 10 samples instead of 11.
Hemoglobin variants AS, AC, AE, AD, elevated A2 and HbF
Table 4 shows the mean relative difference of the frozen and fresh Hb-variant samples, and Supplemental Figures 1-6 show the graphs of the interference of Hb-variants for frozen and fresh Hb-variants for the different POC analyzers. All methods, except for the A1Care, had an interference with one or more of the Hb-variants with frozen or fresh samples (mean relative difference was >10%). All Hb-variants were detected and correctly identified by The Lab 001 via an S-, C-, D-, E- or F-window.

Schiff Base, icteric samples, different hemoglobin concentrations
None of the POC analyzers showed an interference for Schiff Base, icteric samples or different hemoglobin concentrations. Supplemental Tables 3-5 show the data.

Sample commutability
To investigate the impact of sample type in relation to different clinical applications, fresh (EP-15 and EP-9), frozen (EP-5 and IFCC certification program samples) and lyophilized (ERL EQA scheme) samples were compared. Figure 2A, B shows the data from EP-15 and EP-9 (red circles) representing all fresh samples, IFCC certification samples (blue circles) representing all frozen samples and the ERL EQA (green circles) representing lyophilized samples. In addition, Figure 2B shows EP-5 and EP-9 data (purple circles) in order to assess the impact of using EP-5 vs. EP-15 for routine method evaluations. The EP-5 and EP-15 studies were both used to compare performance with fresh and frozen samples, and for the majority of analyzers the EP-15 evaluation provided sufficient data to assess performance. The data clearly demonstrate that lyophilized material was only commutable with The Lab 001; all other POC analyzers showed a large positive bias when analyzing lyophilized material. Frozen material was not commutable with the InnovaStar and the QuikReadgo. When using fresh patient samples all analyzers, except the Allegro, passed the IFCC criterion of having a sigma >2 at an HbA1c concentration of 50 mmol/mol. Conversely, the Allegro actually performed better when using frozen samples instead of fresh samples.

System usability scale (SUS)
Table 1 shows the SUS scores of the different POCT analyzers. The usability of all the POCT analyzers was good to excellent (mean SUS score >80) except for the QuikReadgo (mean SUS score was 60).
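The IFCC sigma criterion applied in the commutability comparison can be sketched as below; the sigma metric combines the total allowable error, bias and imprecision at a decision level (all in %, here at 50 mmol/mol; the numeric values in the usage note are illustrative, not the study's results).

```python
def sigma_metric(tae_pct, bias_pct, cv_pct):
    """Sigma metric as used in the IFCC quality model:
    sigma = (TAE - |bias|) / CV, with all terms expressed in %
    at the decision concentration (50 mmol/mol HbA1c)."""
    return (tae_pct - abs(bias_pct)) / cv_pct
```

For instance, with the IFCC TAE of 10%, a 2% bias and a 3% CV give sigma ≈ 2.7 (pass), whereas a 4% bias with a 4% CV gives sigma 1.5 (fail).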

Discussion
What progress has been made?
In the 2014 study the InnovaStar showed an interference with fresh patient samples, which was likely due to the instrument being calibrated using frozen samples [23]. The previous publication led the manufacturer to switch to fresh patient samples, which are available from the ERL, to calibrate their cartridges, resulting in lower bias in fresh samples [24]. The Lab 001 device is new to the market and the sigma graphs show excellent performance; however, there was a small bias at higher HbA1c levels. Paradoxically, had the imprecision of the instrument been higher, the bias would not have been detected, as the confidence intervals would have been wider. This device shows that the field of POCT has moved on and quality improvements are possible.
There are still significant issues with the performance of some analyzers
The key issues we still see are: a) lot-to-lot variability, b) high imprecision and c) significant interference from Hb-variants. Four out of the six evaluated analyzers still do not demonstrate acceptable and/or consistent performance.
The InnovaStar showed unacceptably poor performance between lot numbers. The new-to-market QuikReadgo also showed a statistically significant difference between the lot numbers, and failed to meet the NGSP criteria with either lot number when compared to any of the four SRMPs. The Allegro showed similarly poor performance. It should be noted that both the Allegro and A1Care were evaluated in a previous study (data not presented) in which the results were acceptable to good. As the data for certain elements of the current study were acceptable, it is likely that there is an inconsistency in the manufacturing chain that needs to be identified. The considerable variability in performance across the analyzers, albeit less than in earlier studies, shows that there is still work to be done [6].
A key issue with POC analyzers still appears to be interference from variant hemoglobins. A complicating factor when evaluating Hb-variants is the fact that some analyzers (QuikReadgo and InnovaStar) are not compatible with frozen samples. This study addresses this issue with the use of fresh Hb-variant samples. However, it was not possible to obtain as wide a range or number of fresh Hb-variant samples for investigation as would be desirable. Explaining the interferences seen in these analyzers, nearly all of which are immunoassays, is difficult: why would frozen and fresh samples perform so differently? Why is there an interference at all when the epitopes that the antibodies bind to do not contain the mutations that cause the Hb-variant? HbE, for example, arises from an inherited single-base mutation at codon 26 of the beta-globin gene, leading to substitution of lysine for glutamic acid (the 26th amino acid of the beta chain), which should not, theoretically, interfere with the antibodies used in the POC analyzers. A possible explanation could be that the mutation causes folding of the hemoglobin molecule in such a way that position 26 is then very close to the first four amino acids of the beta chain and interacts with the antibodies used in the immunoassay [25]. Alternatively this may be due to differences in the immobilization of the antibodies, which lead to differences in the surface chemistry and thus the binding of the antibody.
The Lab 001 suffers from interference with fresh HbAS samples, but less so with frozen samples, which are easier to obtain for method development. This potentially poses a problem for patient samples; however, it is mitigated by the fact that The Lab 001, as a capillary electrophoresis method, is capable of identifying the presence of a variant, unlike most other POC analyzers.
It is important to clarify that some of the manufacturers do clearly state that the presence of variants may alter the HbA1c results; however, these claims and the findings of this study do not always correlate.
It is not all about analytical performance
Whilst many evaluations focus on analytical performance, it is important to consider the wider context of the use of POC analyzers. The usability/user-friendliness of each of the analyzers was assessed and found to be variable.
A crucial factor in the practical, clinical usability of an instrument is how long it takes to generate a result, with the benefit of providing real-time results often touted as a key selling point for POC analyzers. The time from a 'cold start' (turning the power on and warming reagents if needed) to a result was assessed. The range of times was wide, from ~3.0 min for The Lab 001 to ~16.5 min for the InnovaStar. This is important information for users of the analyzers as they may have little advance warning that a test needs to be undertaken.

Key messages
This complex and detailed evaluation provides a comprehensive overview of six HbA 1c POC analyzers. Whilst there are areas of excellence in performance there are still significant areas for improvement with the performance of some being unacceptable. It is possible for an analyzer to meet certification criteria for the IFCC and/or NGSP and perform well in one evaluation and then perform very poorly in subsequent evaluations. From a clinical and scientific perspective this is alarming. It is essential that performance of an analyzer is stable, especially with increased use for both monitoring and diagnosis of people with diabetes. Four of the analyzers in this study showed highly variable performance which is not acceptable.
It is unclear why such discrepant results are seen when fresh or frozen samples are used, especially as this is not commonly seen with routine laboratory analyzers. Whilst many evaluations and method development often necessitate the use of frozen samples, it is rare that a POCT would be used with anything other than fresh samples. We have shown here that there can be marked differences in performance with each sample type.
One way to identify variability in performance is through the use of EQA schemes. The authors strongly advocate the use of EQA to identify ongoing performance issues; although POC analyzers are often exempt from the need to participate in EQA, it is a valuable and powerful tool for monitoring performance. A caveat is that POC analyzers may not be able to utilize the frozen or lyophilized samples often used in EQA schemes. None of the manufacturers claim in their information for users that lyophilized samples can be used, and this is supported by the data from the ERL-EQA samples (see Figure 2). EQA program leads need to be cognizant of this issue and work towards providing commutable samples for POC analyzers.
As discussed earlier, the interference from Hb-variants in a number of the analyzers is perplexing. The disparity in results seen between fresh and frozen samples is of concern as many manufacturers will likely develop their methods using frozen samples but in the 'real world' setting where fresh samples are used, the variants pose a potential unseen problem. Not all manufacturers are accurate in their claims for Hb-variant performance.
Competing interests: Authors state no conflict of interest.