Inadequate design of mutation detection panels prevents interpretation of variants of concern: results of an external quality assessment for SARS-CoV-2 variant detection.

OBJECTIVES
Mutation-specific PCR assays have quickly found their way into laboratory diagnostics due to their capacity to be a fast, easy to implement and high-throughput method for the detection of known SARS-CoV-2 variants of concern (VoCs). However, little is known about the performance of such assays in routine laboratory analysis.


METHODS
The results reported in a recent round of an external quality assessment (EQA) scheme for SARS-CoV-2 mutation-specific PCR were retrospectively analyzed. For the determination of individual variant-specific sequences as well as for the interpretation results for certain virus variants, correct, incorrect, and unreported results were evaluated, and their possible causes were investigated.


RESULTS
A total of 34 laboratories participated in this study. For five samples containing the VoC Alpha + E484K, Beta, Gamma, Delta, or B.1.1.318 (as a variant of interest), 848 results for SARS-CoV-2 mutation detection were reported, 824 (97.2%, range per sample 88-100%) of which were correct. Melting curve assays gave 99% correct results, real-time RT-qPCR 94%, microarray-based assays 100%, and MALDI-TOF MS 96%. A total of 122/167 (73%) reported results for SARS-CoV-2 variant determination were correct. Of the 45 inconclusive or incorrect results, 33 (73%) were due to inadequate selection of targets that did not allow identification of contemporary VoC, 11 (24%) were due to incorrect results, and one (3%) was due to incorrect interpretation of correct mutation-specific PCR results.


CONCLUSIONS
Careful and up-to-date selection of the targets used in mutation-specific PCR is essential for successful detection of current SARS-CoV-2 variants.


Introduction
Mutation-specific PCR assays for the detection of specific mutations in known variants of SARS-CoV-2 virus were introduced as a fast, easy to implement and high-capacity method to help clinicians and authorities perform infection control measures [1]. Independent evaluations provide important information about the properties, performance and limitations of such assays and are therefore essential for good laboratory operation by ensuring reliable results. Several such reports on assay evaluation and optimized panels for mutation-specific PCR have been published [2][3][4][5][6][7]; however, data are lacking on the performance of SARS-CoV-2 mutation and variant detection assays in routine use. External quality assessment (EQA) schemes provide valuable information about the analytical and diagnostic performance of laboratory tests in routine use. Schemes for SARS-CoV-2 PCR have repeatedly proven their potential in this regard [8][9][10][11][12][13][14].

Materials and methods
For this study, the results reported in a recent round of the EQA scheme "SARS-CoV-2 mutation-specific PCR" operated by the Austrian Association for Quality Assurance and Standardization (ÖQUASTA) and the Center for Virology of the Medical University of Vienna were retrospectively analyzed. This scheme was established in winter 2020/21 and corresponds to the general EQA scheme as summarized by the European Organization for External Quality Assurance Providers in Laboratory Medicine (EQALM) [15]. Participants expressing interest in participation were required to provide a brief methods summary prior to enrollment. The participants comprised both public and private laboratories in Austria, some of which were medical diagnostic laboratories, although this was not a requirement for enrollment.

Christoph Buchta and Jeremy V. Camp contributed equally to this work.

Sample selection, preparation, testing, dispatch
Samples were selected to monitor the analytical performance of the participant laboratories and survey the variant detection assays in use. Samples were selected, tested, characterized, and prepared at the Center for Virology, Medical University of Vienna, the Austrian national reference laboratory for respiratory viruses (an ISO 9001 certified laboratory). The aim was to provide samples representing current variants of concern (VoC) of SARS-CoV-2 as published by the European Centre for Disease Prevention and Control [16]. Testing, preparation, and dispatch of samples have already been described, as has the data collection and evaluation in this EQA scheme [14]. Briefly, the five samples were residual nasopharyngeal secretions obtained by swabs from patients in Austria and diluted in physiological saline. They were confirmed by the national reference laboratory for respiratory viruses by whole genome sequencing, and viral loads of samples were characterized by multiple repeated RT-qPCRs, as previously described [14]. Stability during shipping conditions was assessed prior to shipment as previously described.
We included an Alpha + E484K VoC (lineage B.1.1.7 + E484K, EPI_ISL_3144944, Ct = 21.9), Beta (B.1.351, EPI_ISL_1191134, Ct = 29.5), Gamma (P.1, EPI_ISL_3144947, Ct = 25.6), Delta (B.1.617.2, EPI_ISL_3144946, Ct = 23.8), and a B.1.1.318 sample (EPI_ISL_3144945, Ct = 23.0). B.1.1.318 is a variant of interest (as determined by Public Health England, VUI-21FEB-04) because it carries the E484K mutation but lacks N501Y, which is characteristic of the Alpha, Beta, and Gamma variants, as well as L452R, P681R, and T478K, which are characteristic of the Delta variant. The goal of including this sample was to test its interpretation, specifically whether laboratories would conclude that it should be further analyzed as a potential variant of interest. Samples for this round were shipped on June 28th, 2021, and the results were reported before July 12th, 2021. The report form required participants to select the results of genome detection assays (positive, negative, "n.a." = not evaluable), a list of seven specific codon positions (selected from a choice of specific substitutions at each site), and the interpretations (a list of possible VoCs, others, or "n.i." = not interpretable) for each sample (Supplementary Material). The seven specific codon positions for variant detection were set following current Austrian governmental recommendations [17] (Table 1), but laboratories could report the results from additional assays.

Analysis of results
The results from individual mutation-specific assays were analyzed for overall accuracy with respect to test format but without regard to specific commercial assays. This EQA was not designed to evaluate or compare specific commercial assays, as we expected a wide variety of test systems to be used (including various combinations of sample handling, nucleic acid extraction methods, specific assay chemistry, and equipment used to analyze the results), and these all may influence the results [11][12][13][14]. Laboratories were evaluated according to the reported interpretations with respect to the results from their respective mutation-specific assay panels. We noted whether correct/incorrect/"not interpretable" interpretations of the sample identity were due to either (i) adequate panel selection for the respective sample (

Results
A total of 34 laboratories participated in this round and reported results for at least one of the target positions offered in this scheme. A total of 15 laboratories were registered for all mutations minimally required to detect contemporary SARS-CoV-2 VoC (Tables 1 and 2). All except two laboratories screened the samples for SARS-CoV-2 viral RNA prior to performing mutation-specific assays. Three laboratories could not detect viral RNA in the Beta test sample (Ct = 29.5) and therefore reported no results for the mutation-specific assays for this sample. This resulted in a total of 167 reports for SARS-CoV-2 variant detection (i.e., one report for each sample from 34 laboratories, excluding the three laboratories that did not detect Beta). In general, mutation detection assays performed well, typically with 97-99% of reports correctly identifying the genotype at the respective positions. However, among the 20 laboratories using an assay to detect the genotype at codon position P681, only 71 (88%) of 81 total results matched the target (Figure 1). Otherwise, of the 34 participating laboratories, 33 tested for the N501Y mutation, resulting in 158 individual mutation detection results, of which 155 (98%) matched the target (N501 or N501Y) (Figure 1). For testing for the E484K mutation, 31 laboratories were registered and reported 151 results, of which 149 (99%) matched the target (E484 or E484K). For testing for mutation HV69/70del, 29 laboratories were registered and reported a total of 138 results, of which 135 (98%) matched the target. For the K417N mutation, 24 laboratories were registered, and 111 (97%) of 115 results matched the target. For the detection of L452R, 23 laboratories were registered and reported 106 (99%) correct out of 107 results. For the detection of V1176F, 16 laboratories were registered, and 75 (99%) of 76 results matched the target.
For the detection of the T478K mutation, three laboratories were registered, and each of the 13 reported results matched the target. For the detection of Y453F, two laboratories were registered, and each of the nine reported results matched the target.
In total, 848 individual results for mutation-specific PCR were reported, among which 24 (2.8%) were incorrect (Table 2, Figure 1). The incorrect results included three detections at position N501 (2%, one false negative, two false-positive), two for E484 (1%, two false negative), three false-positive for HV69/70 (2%), four for K417 (3.5%, three false negative, one report of K417T instead of K417N), one false negative for L452 (1%), 10 for P681 (12%, six false negative, two false-positive, two reports of P681R instead of P681H), one for V1176 (1%, false-positive), and none for Y453 and T478. There were a total of 11 false-positives (including two reports of P681R instead of P681H and one report of K417T instead of K417N) and 13 false negative mutations reported. The 24 incorrect results for mutations were reported by 10 laboratories: five each had one incorrect result out of 35, 30, 26, 24, or 20 reported results, respectively; four each had three incorrect results out of 39, 34, 30, or 20 reported results, respectively; and one had seven incorrect results out of nine reported results in total.
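As a consistency check, the per-target counts given above can be tallied in a few lines. The following is a minimal sketch in Python; the dictionary and function names are illustrative assumptions, and the numbers are copied from the text rather than from the underlying data set.

```python
# (results reported, incorrect results) per codon position, as stated in the text.
# Names are illustrative; this is not the evaluation software used in the scheme.
results_by_target = {
    "N501": (158, 3),
    "E484": (151, 2),
    "HV69/70": (138, 3),
    "K417": (115, 4),
    "L452": (107, 1),
    "P681": (81, 10),
    "V1176": (76, 1),
    "T478": (13, 0),
    "Y453": (9, 0),
}


def percent_correct(total, incorrect):
    """Share of results matching the target, in percent."""
    return 100 * (total - incorrect) / total


total_results = sum(total for total, _ in results_by_target.values())
total_incorrect = sum(incorrect for _, incorrect in results_by_target.values())

for target, (total, incorrect) in results_by_target.items():
    share = percent_correct(total, incorrect)
    print(f"{target}: {total - incorrect}/{total} correct ({share:.1f}%)")

overall = 100 * total_incorrect / total_results
print(f"overall: {total_incorrect}/{total_results} incorrect ({overall:.1f}%)")
```

The per-target counts add up to the 848 results and 24 incorrect results reported for the round.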
Assays used by participants in this EQA round were based on four different test formats. A total of 524 (63%) individual results were obtained by melting curve assays, 250 (30%) by real-time RT-qPCR, 35 (4%) by microarray-based assays, and 24 (3%) by MALDI-TOF MS (Table 2). The ratios of correct results were 517/524 (99%) for melting curve assays, 234/250 (94%) for real-time RT-qPCR assays, 35/35 (100%) for microarray-based assays, and 23/24 (96%) for MALDI-TOF MS. One participant reported the use of an in-house assay but did not provide any further details. The real-time RT-qPCR test formats comprised assays from six different manufacturers; the corresponding ratios of correct results ranged from 22-100% (Table 2). The available data did not allow us to discriminate between errors due to specific assays or due to the interpretation of the raw data; however, there was no indication of such systemic errors in general. Incorrect results (false negatives or false-positives) came either from test systems that were not used by another laboratory or from test systems that were used by several other laboratories and otherwise gave correct results.
For example, four laboratories incorrectly reported the genotype at codon position P681 in three different samples (Alpha + E484K = P681H, Gamma = P681, and B.1.1.318 = P681H) using three different assays designed to detect either P681H or P681R. The P681 mutation assay used by two of the laboratories was also used by seven other laboratories that reported the mutation (or wild type) at this position correctly, and these two laboratories correctly determined the genotype for the other two samples (Beta = P681 and Delta = P681R). Moreover, these two laboratories reported two false negatives and one false-positive for the P681H mutation in Alpha + E484K, B.1.1.318, and Gamma, respectively. The other two assays that provided incorrect (or inconclusive) results at the P681 site for these three samples were each used by only one laboratory; one (Allplex, Seegene Germany GmbH) yielded two false negatives for P681H, and the other (MutaPLEX, Immundiagnostik AG, Germany) yielded two false-positives for P681R instead of P681H.
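The per-format ratios reported above can be recomputed from the stated counts. The sketch below (Python, with illustrative names that are assumptions of this example) also shows that the four formats sum to 833 results, so the shares of 63%, 30%, 4%, and 3% appear to refer to this subtotal rather than to all 848 results; the text does not state how the remaining results were classified.

```python
# (correct results, results reported) per test format, as stated in the text.
results_by_format = {
    "melting curve": (517, 524),
    "real-time RT-qPCR": (234, 250),
    "microarray-based": (35, 35),
    "MALDI-TOF MS": (23, 24),
}

# Number of results attributed to one of the four formats.
format_subtotal = sum(reported for _, reported in results_by_format.values())

for name, (correct, reported) in results_by_format.items():
    share_of_results = round(100 * reported / format_subtotal)
    share_correct = round(100 * correct / reported)
    print(f"{name}: {reported} results ({share_of_results}%), {share_correct}% correct")

print(f"results attributed to a format: {format_subtotal}")
```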
For SARS-CoV-2 variant interpretation, the identities (VoC name or lineage name) of 122/167 (73%) test samples were correctly reported (Table 3); 97 of these were based on panels that tested for at least the minimum required mutations (Table 2). Thirteen correct interpretations were made from incomplete or inadequate test panels, and 12 were based on at least one incorrect or not interpretable mutation-specific assay. The sample identity was incorrectly interpreted only four times. All incorrect interpretations were due to incomplete assay panels, although one was also due to assay failure. Two laboratories that reported Alpha instead of Alpha + E484K did not include an E484K assay in their test panel (and one reported "not interpretable" for the N501Y and HV69/70del assays), and the two laboratories that reported Gamma incorrectly (as either Alpha + E484K or Beta) only included N501Y and E484K in their test panel and did not include assays to detect K417T and V1176F.
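The effect of an incomplete panel on interpretation can be illustrated with a small lookup. The following Python sketch is an assumption-laden illustration, not the evaluation logic of the scheme: the signature table is assembled from the mutations named in the text plus the commonly described spike substitutions of each lineage, it is restricted to six positions and to the five EQA samples, and all names are hypothetical.

```python
# Expected genotype per codon position; "wt" = wild type.
# Illustrative assumption: limited to six positions and the five EQA samples.
SIGNATURES = {
    "Alpha + E484K": {"N501": "Y", "E484": "K", "K417": "wt", "L452": "wt", "P681": "H", "V1176": "wt"},
    "Beta": {"N501": "Y", "E484": "K", "K417": "N", "L452": "wt", "P681": "wt", "V1176": "wt"},
    "Gamma": {"N501": "Y", "E484": "K", "K417": "T", "L452": "wt", "P681": "wt", "V1176": "F"},
    "Delta": {"N501": "wt", "E484": "wt", "K417": "wt", "L452": "R", "P681": "R", "V1176": "wt"},
    "B.1.1.318": {"N501": "wt", "E484": "K", "K417": "wt", "L452": "wt", "P681": "H", "V1176": "wt"},
}


def consistent_samples(observed):
    """Return the samples whose signature agrees at every tested position."""
    return [
        name
        for name, signature in SIGNATURES.items()
        if all(signature[pos] == allele for pos, allele in observed.items())
    ]


def interpret(observed):
    """Name the sample if exactly one signature fits, otherwise 'n.i.'."""
    candidates = consistent_samples(observed)
    return candidates[0] if len(candidates) == 1 else "n.i."


# A panel with only N501 and E484 cannot separate three of the samples.
print(consistent_samples({"N501": "Y", "E484": "K"}))
# Adding K417 and V1176 resolves the Gamma sample.
print(interpret({"N501": "Y", "E484": "K", "K417": "T", "V1176": "F"}))
```

Under these assumptions, a panel limited to N501 and E484 leaves Alpha + E484K, Beta, and Gamma indistinguishable, which matches the pattern of incorrect Gamma interpretations described above. In routine use, many more lineages than these five would have to be considered.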
The remaining samples (41/167) were reported as "not interpretable" or not reported at all on the submission form (Table 3). Eight of these were from the B.1.1.318 sample, for which laboratories used an adequate test panel (i.e., E484K) but determined that it was not one of the other VoCs. Six of the "not interpretable" or not reported results were due to assay failure, three of which were from "not interpretable" individual assays (although in each case the assay panel was adequate) and three that were due to incorrect false-positive/negative mutation-specific assays, which would have made the interpretation unclear. Finally, 27/41 samples reported as "not interpretable" seemed to be the result of an inadequate panel of mutation detection assays being used. For example, one laboratory tested only for K417N mutations, and although they were correct in each sample, they could not (and did not) interpret the identity based on this result.
VoC, variants of concern.
a Sample identity was reported correctly, incorrectly, or reported as "not interpretable". In some cases, no report was given, and this was included in the "not interpretable" category.
b Test panels included assays to detect the minimum number of mutations (Table ) required to diagnose contemporary VoC, and assay results were correct.
c At least one assay provided incorrect or uninterpretable results, but otherwise test panels were adequate to detect the minimum required mutations for each sample.
d Test panels were inadequate (did not include all of the minimum required mutations for VoC) but otherwise reported correct results for the specific assays in use.
e Three laboratories did not detect viral genomic RNA in one sample.

Buchta et al.: SARS-CoV-2 mutations and variants in EQA

We did not detect an overall sample-specific error in interpretations (Table 4). The correct interpretations (despite having either adequate/inadequate test panels or incorrect results) were provided 30/34 (88%) times for Alpha + E484K (the highest ratio of correct interpretations), 21/31 (68%) times for Beta (the lowest ratio of correct interpretations, not counting the three genome-negative results), 24/34 (70%) for Gamma, 23/34 (68%) for Delta, and 24/34 (70%) for the B.1.1.318 sample (whereby "other" was considered a correct result, Supplementary Material). Conversely, incorrect interpretations were noted for only two samples (Alpha + E484K and Gamma), as noted above. The ratios of "not interpretable" results were therefore similar to the ratios of correct results, with the lowest number of "not interpretable" results for Alpha + E484K (2/34) and the highest for Delta (11/34). Beta and B.1.1.318 both had 10 (of 31 and 34, respectively) "not interpretable" results, and for Gamma, 8/34 results were reported as "not interpretable".
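The per-sample interpretation counts above can be cross-checked against the overall totals. This is a minimal Python sketch with illustrative names; the number of incorrect interpretations per sample is derived here as the remainder, which is an assumption of the example rather than a figure taken from the tables.

```python
# Interpretation reports per sample, as stated in the text.
interpretations = {
    "Alpha + E484K": {"reports": 34, "correct": 30, "not_interpretable": 2},
    "Beta": {"reports": 31, "correct": 21, "not_interpretable": 10},
    "Gamma": {"reports": 34, "correct": 24, "not_interpretable": 8},
    "Delta": {"reports": 34, "correct": 23, "not_interpretable": 11},
    "B.1.1.318": {"reports": 34, "correct": 24, "not_interpretable": 10},
}

# Derive incorrect interpretations as whatever is neither correct nor "n.i.".
for counts in interpretations.values():
    counts["incorrect"] = (
        counts["reports"] - counts["correct"] - counts["not_interpretable"]
    )

totals = {
    key: sum(counts[key] for counts in interpretations.values())
    for key in ("reports", "correct", "not_interpretable", "incorrect")
}
print(totals)
```

The derived values give two incorrect interpretations each for Alpha + E484K and Gamma and none for the other samples, and the totals come to 167 reports, 122 correct, 41 "not interpretable", and four incorrect.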

Discussion
The principal finding of this study was that several laboratories did not employ an adequate assay panel to identify contemporary SARS-CoV-2 variants. Simply stated, 43/167 interpretations were based on incomplete information. Many of these (13) were "solvable" because laboratories included more mutation detection assays in addition to the minimum required. However, 27 were determined to be "not interpretable" based on incomplete and inadequate panel selection. As we stated, participating laboratories were not necessarily "official" or certified diagnostic laboratories in Austria. Nonetheless, this could translate into up to 16% of all samples being uninterpretable simply because the laboratory did not use the minimum required panel to detect contemporary VoC. At the time of the EQA, the prevalent variants in Austria were Alpha (approximately 15%, including Alpha + E484K) and Delta (approximately 85%), with very few Gamma reported, and we note that the prevalence of Alpha was decreasing, while Delta was sharply increasing. Accordingly, assays should have included at least examinations for mutations in amino acid positions N501, HV69/70, and E484, and at least one of L452, P681, or T478. Surprisingly, only 76% of the participating laboratories covered these targets, and therefore, no conclusive results could be reported by two laboratories for the sample with the variant Alpha + E484K and by six for the sample with the variant Delta. As Delta was not yet the predominant strain in Austria, many laboratories may not have yet included these assays. The determination of the Beta and Gamma variants may not have been so important at the time, which may be why fewer laboratories chose to include assays in their panels that corresponded to these VoCs. As a result, some laboratories were unable to correctly identify the samples with these variants (Figure 1 and Table 4).
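The minimum panel described in this paragraph amounts to a simple rule: all of N501, HV69/70, and E484, plus at least one of L452, P681, or T478. A hedged Python sketch of such a check follows; the function and constant names are hypothetical, and the rule reflects only the situation at the time of this EQA round.

```python
# Positions named in the text as the minimum at the time of this EQA round.
# Names are illustrative; the rule would need updating as variants change.
REQUIRED_POSITIONS = {"N501", "HV69/70", "E484"}
DELTA_POSITIONS = {"L452", "P681", "T478"}


def panel_is_adequate(panel):
    """True if the panel covers all required positions and one Delta marker."""
    panel = set(panel)
    covers_required = REQUIRED_POSITIONS <= panel
    covers_delta = bool(DELTA_POSITIONS & panel)
    return covers_required and covers_delta


print(panel_is_adequate(["N501", "HV69/70", "E484", "L452"]))  # True
print(panel_is_adequate(["N501", "E484"]))  # False
print(panel_is_adequate(["K417"]))  # False
```

Keeping the rule in a single, easily updated place mirrors the recommendation that target selection be reviewed as the circulating variants change.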
The second notable finding from this study is that the error rate of individual mutation assays is reasonably low, with only 24/848 (2.8%) incorrect results (Figure 2). We could not see a clear difference in the performance of the test formats, with all formats producing correct individual results in well over 90% of tests. The number of results reported for each format differed greatly, so a reliable statement can only be made about the performance of melting curve assays based on 524 results (99% correct) and real-time RT-qPCR assays based on 250 results (94% correct). However, although the majority of laboratories reported no or only one false-positive/negative mutation result, some reported three, and one reported seven incorrect results for partly the same and partly different targets. For example, in the case of position P681, two laboratories reporting incorrectly for three samples both used the same assay, whereas the other laboratories reporting false results at this position used assays that were not used by other laboratories. While this prevents analysis of assay-specific failures, it suggests that operational errors or misinterpretation of the raw data are more likely to be the cause of the incorrect results than poor performance of assays. Previously, we included a sample in a variant detection EQA with a specific (additional) mutation that may have produced divergent results from assays designed to detect HV69/70del [13]; however, sequencing did not reveal any similar substitutions that may have affected specific assay performance (e.g., detection of P681 genotypes).
Incorrect results in mutation-specific assays impeded correct interpretation of the sample with variant Alpha + E484K for two laboratories and of the sample with variant Delta for four laboratories. It is noticeable that three laboratories did not recognize the Beta variant sample with a Ct value of approximately 30 as positive. The reason for this remains unclear, as current assays definitely identify those and even samples with significantly higher Ct values as positive. We previously noted a higher ratio of "inconclusive" or incorrect results from samples with high Ct values [13], and we noted a higher number of "not interpretable" results from mutation-specific assays for this sample (eight total) than for other samples (1 to 5). This may indicate that sample degradation had occurred, although we did not detect significant degradation in any sample during our quality control procedure designed to mimic various shipping and storage conditions. Otherwise, we could not ascribe incorrect results/interpretations to the use of single-plex vs. multiplex assay formats, as the laboratory-specific implementation of various commercial kits was not always clear. It also remains unclear how some participants came to 11 correct interpretation results, although they did not identify the mutations relevant for certain variants or they received incorrect results for them.
Our results show that a careful selection of the targets according to the current variant activity is of primary importance. Sixteen laboratories correctly interpreted all five samples, and although nine tested all positions required by authorities, only six laboratories reported everything correctly (and again, we are unaware whether these are "certified" diagnostic laboratories). We recommend that governments or regulatory agencies that oversee clinical laboratory testing provide clearly defined and routinely updated guidance on the minimum number of targets needed for certified diagnostic laboratories to adequately identify contemporary SARS-CoV-2 variants. While the selection of specific assays may be important, we did not evaluate that here, and in fact, we detected very few assay-specific failures. Even if an appropriate panel is selected and updated according to contemporary variants, successful SARS-CoV-2 variant detection depends upon the interpretation of the results by competent staff, and we again note that the error rate was very low. Incorrect interpretation of correct mutation-specific PCR results is very likely a human error, and we noted at least one such error in our study. Moreover, although "not interpretable" is not necessarily a useful outcome, we support this form of reporting rather than incorrect interpretations based on incomplete information. We recommend that a risk analysis of the result interpretation procedure and the competence management system be routinely performed to ensure accurate interpretation and reporting of variants.

reviewed manuscript. UR: Analyzed data and provided technical EQA support, critically reviewed manuscript. BB: Provided public health advice, critically reviewed and edited the manuscript. EPS: Provided scientific advice, critically reviewed the manuscript. MMM: Provided scientific advice, critically reviewed the manuscript. AG: Provided scientific advice to the overall study and reviewed and edited the manuscript. SWA: Provided sample material, conceptualized, conducted and supervised this EQA study, provided scientific advice to the study, reviewed and edited the manuscript. IG: Conceptualized, conducted and analyzed this EQA study and wrote and edited the manuscript draft. All authors have accepted responsibility for the entire content of this manuscript and approved its submission.