Recommendations for validation testing of home pregnancy tests (HPTs) in Europe.

Home pregnancy tests (HPTs) available in Europe include accuracy and other performance claims listed on their packaging. Due to the lack of guidance on the standardisation of such products, it is often difficult to replicate these claims when tested on a clinical sample, whether in a laboratory setting or by lay users. The In Vitro Diagnostic Regulation is a set of requirements that mandate comprehensive validation data on human pregnancy tests and other in vitro devices. It is due to replace the current European Directive (98/79/EC) and to be fully implemented in Europe by 2022. In June 2019, a panel of seven experts convened to discuss the validation studies required to provide the information needed to meet the new regulation for HPTs in Europe and proposed 15 recommendations for best practice. Defining best practice at all stages of validation of these important tests may ensure that tests marketed in Europe are fit for purpose, enabling lay users to be confident of the high quality of the HPT results they obtain. The panelists believe that the recommendations proposed here for the validation of HPTs may constructively contribute to improved standardisation of validation procedures in Europe.


Introduction
Home pregnancy tests (HPTs) became widely available in 1976, enabling women to determine their pregnancy status in the privacy and comfort of their own homes [1][2][3]. Significant technical innovations have resulted in the development of the tests used today. These have simple, one-step direct sampling formats with either colourimetric or digital outputs that facilitate interpretation as positive, negative or equivocal [2,3]. The basis of all pregnancy tests is detection of the hormone human chorionic gonadotrophin (hCG) [2]. hCG is a glycoprotein that has a structure consisting of two non-covalently linked dissimilar α-(91 amino acids) and β-(145 amino acids) subunits [1,2]. Following implantation, the placenta begins to develop, gradually producing increasing amounts of hCG until weeks 10-12 of pregnancy, after which the hCG concentration decreases moderately and stabilises until late pregnancy, with a further decrease during the last month [2]. hCG is not present in measurable concentrations in non-pregnant, pre-menopausal women except in relatively rare malignancies [4]. Its measurement, therefore, has enabled development of both laboratory tests and HPTs for detection of pregnancy [2].
At least 65 million HPTs are sold in Europe each year [5]. These are available as strips, cassettes and mid-stream dipsticks. Mid-stream tests either display results as a line or cross for visual interpretation by the user (visual tests, also known as line tests) or display results digitally on an LCD screen (digital tests) [5]. The characteristics of typical devices are shown in Table 1. Many HPTs are sold over the counter for home use; others are intended for professional use only [6].
HPTs are designed for single use for qualitative detection of hCG in a urine specimen collected at any time of the day, with a positive result indicating possible pregnancy [7].
However, detection limits may vary considerably, with one study reporting detection limits ranging from 6.3 to 50 IU/L [8]. There is currently no universal standardisation guidance for medical devices such as HPTs; thus, their analytical and clinical performance inevitably vary considerably [9]. The US Food and Drug Administration (FDA) has published summaries of HPTs cleared for sale in the USA, thereby providing some indication of how validation is being conducted. Details of the tests performed and the type of hCG samples used are included in these summaries [10,11]. Table 2 illustrates the variability in testing of analytical precision for four HPTs recently cleared by the FDA, with the number of tests performed during evaluation ranging from 540 to 6,000 [10]. It is highly desirable that validation of HPTs includes assessment of their clinical accuracy when used by non-laboratory personnel, but how rigorously this is assessed also varies markedly as presented in Table 2 [11].
Currently, tests available in Europe include accuracy and other performance claims on their packaging. However, as illustrated above, the lack of standardised criteria against which performance can be assessed makes it difficult to compare results for analytical accuracy [2]. This also applies to claims for clinical accuracy, i.e. the correct determination of "pregnant" (positive) and "nonpregnant" (negative) results for specimens for which the true clinical status is known. Many HPTs claim to be more than 99% clinically accurate when used from the day of the missed menstrual period, but it is not always clear how this was determined, a problem highlighted in the USA almost two decades ago [8]. Different approaches can yield differing results depending on the types of specimens used for testing, sample size and other factors, as well as the selection of lay users contributing to validation studies.
A recent study reported that only three of seven HPTs available on the European market met accuracy and reliability claims of >99% of the products' claimed analytical sensitivity (i.e. the stated concentration of hCG at which the product is expected to return pregnant results), while the other four tests had analytical accuracies of <99% (81.6-95.9%) [12]. In this context, it is important to note that if such assessments are performed by trained laboratory technicians under ideal conditions [2,13] they may not reflect real-life usage. As the main aim of analytical validation is to demonstrate that the procedure is suitable for its intended purpose [14], for HPTs it is essential that there is clear evidence that the test can be successfully performed by lay users.
Manufacturers who market HPTs in the USA are now required by the FDA to provide evidence of product performance that includes data for testing analytical and [16,17]. The requirements include the necessity for comprehensive validation data (including data on customer usability) on HPTs and other in vitro devices [17,18]. The panel's recommendations for best practice are described below and key points are highlighted in Table 3. The panel encourages manufacturers to apply these recommendations during implementation of the new IVDR and notified bodies to use this guidance when examining the data underpinning HPT performance.

Requirements for validation of HPTs in Europe
Achieving consistency for the validation of these important tests requires agreement regarding the pre-analytical, analytical, clinical and user assessments that should be undertaken, as well as how this information should be presented to enable ready comparison of data from different sources. Panel recommendations are presented below.

Considerations relating to nomenclature
Description of device formats
HPTs are available in a number of formats (Table 1).
Reaching broad agreement about how these tests should be described and adoption of consistent nomenclature for the different types of tests would assist individual lay users to decide which type of test they would find most convenient to use. Recommendation: Agreement on the broad descriptions of different types of device formats is a priority.

Description of hCG isoforms recognised
Recognition of isoforms of hCG
Numerous isoforms of hCG exist in urine, and not all assays detect the various forms to the same extent. Intact hCG (in which the α- and β-subunits remain associated) is the predominant urinary form in early pregnancy [19]. In early pregnancy the free β-subunit of hCG (hCGβ) is present at a concentration below 10% of that of intact hCG and is not detected in many HPTs [20][21][22][23][24]. The β core fragment of hCG (hCGβcf), a metabolically degraded fragment of hCGβ, is undetectable in early-pregnancy urine, but becomes the predominant form by 10 weeks of gestation [21,25]. hCG and the nicked form are less stable than hCGβ [26]. In quantitative assays, these variations can lead to different "total" hCG results for clinical specimens, depending on the concentrations of the hCG isoforms present [26]. International standards available for most forms of hCG enable characterisation of quantitative assays in terms of their detection of the different forms [26][27][28][29][30][31]. The panel considered whether HPTs should undergo similar characterisation.
HPTs provide qualitative results only: they indicate whether the hCG concentration is above or below a certain claimed detection limit. The panel agreed that information relating to recognition of different hCG isoforms by HPTs is not relevant for lay users, as actual clinical performance in relation to pregnancy status is the key metric. However, it is important that manufacturers assess which isoforms of hCG are recognised by each HPT during early characterisation of the method (Section "Units of measurement"). This can be conveniently achieved using existing International Reference Reagents for these isoforms and expressing comparative results in substance concentrations (mol/L) assigned to these Reference Reagents [26].
Recommendation: Manufacturers should include data regarding the relative recognition of hCG isoforms as expressed in molar units (mol/L) in technical data sheets.

Units of measurement
Achieving agreement regarding the units in which hCG concentrations are expressed for HPTs would be desirable. In an FDA guidance document for OTC hCG tests, the units used throughout are mIU/mL [15]; these units are established by the World Health Organisation Expert Committee on Biological Standardization. Originally based on hCG bioactivity measurements, these somewhat arbitrary units were assigned by bioassay to each successive International Standard for hCG to maintain continuity of clinical hCG results (and pharmacological hCG preparations). Adoption of molar units (substance concentrations) enables direct comparison of recognition of different forms of hCG as described above and facilitates analytical standardisation of both qualitative and quantitative hCG methods. The International Federation of Clinical Chemistry and Laboratory Medicine recommends the use of substance concentrations (mol/L) wherever possible; however, it was accepted by the panel that as clinicians and lay users are familiar with hCG results expressed in mIU/mL, IU/L or U/L, changing to molar or mass units would cause considerable confusion. It would be desirable to express results in terms of U/L, the units for hCG used in most clinical laboratories. However, when comparing concentrations of hCGβ and hCGβcf with hCG, molar units must be used. Earlier established International Units (IUs) for hCGβ are not at all comparable with IUs of hCG and the standards for hCGβ and hCGβcf have not been assigned a value in IU.
Recommendation: hCG results should be expressed using the same units, ideally U/L.

Analytical requirements for validation of HPTs
Validation studies for HPTs differ from those for many diagnostic tests performed in clinical laboratories as they include studies conducted both by the diagnostic companies manufacturing the HPT and by lay users representative of those likely to use the product. The aims of these two types of validation differ, as the first focuses primarily on the analytical characteristics of the HPT, while the second assesses whether the HPT is fit for purpose in the hands of the lay user. The urgent need for international guidance on validation of HPTs is clear from the variation in practice shown in Table 2 and Supplementary Table 1.
During laboratory testing, technicians should handle the tests according to the instructions for use, for example, laying the test on a flat surface and replacing the cap if this is detailed in the instructions. If a different testing procedure is used, there should be statistically robust studies showing equivalence of methods.

Specimen requirements for validation studies
It is essential that HPT validation studies of both types are performed using urine. Validation studies for methods appropriate for use with blood or serum specimens must also be performed in these matrices. hCG standards used to assess detection limits should be prepared in the relevant matrices from pre-menopausal non-pregnant women. Samples should have urinary hCG <1 U/L, measured using a quantitative assay validated for use in urine samples. Alternatively, pregnancy status could be determined using serum hCG concentration taken on the same day as the urine sample. Given that it is relatively easy to source female urine, there should be no need to use male urine, which may not completely reflect the matrix in which tests are ordinarily performed. When preparing standards, specific gravity should be recorded and highly dilute urine should not be used.
Validation panels should include specimens from individual women, both pregnant and non-pregnant. This is in accordance with FDA guidance, which states that at least 100 "fresh" authentic human urine specimens should be used in validation testing of HPTs [15].
The conditions of collection for urine and other specimens for HPT validation studies should be agreed, as current definitions of what constitutes a "fresh" urine specimen are vague and inadequate. Whether specimens should be collected at a specified time of day must also be considered. The time delay between collection of urine specimens and their use in validation studies is particularly important for manufacturers as it is unlikely to be feasible to test specimens immediately after collection. Conditions of storage, including temperature and the maximum time of storage from when specimens are collected until they are used in validation studies, need to be specified. This is because studies have shown that storage conditions can affect the measured concentration of hCG in urine samples [19,25,32-34]. For example, significant loss of immunoreactivity has been reported for samples stored at −20°C, but not at +4°C or −80°C [35]. Inclusion of agents to retard bacterial growth (e.g. sodium azide) is advisable for refrigerated samples because bacterial enzymes have been reported to cleave the peptide chains of hCG. Manufacturers should therefore validate their storage and collection conditions if using banked samples, and clearly describe these in the relevant technical data sheets.
The same considerations apply to lay users who are testing specimens provided by the manufacturer, particularly if this is being performed at home (Table 4). This is likely to be less relevant when lay users are testing their own urine, assuming they will test at the time of urination. However, lay users participating in validation studies should be reminded of this requirement and also advised of the ideal time of day at which to test their urine if specific timing is recommended in the instructions with the product (e.g. sampling the first morning urine).
Recommendation: Pre-analytical requirements for urine collection and storage prior to validation studies should be clearly defined.

Assessing analytical accuracy, sensitivity and reproducibility of HPTs
It is very important to differentiate analytical and clinical accuracy and sensitivity of HPTs, an issue that the panel considered problematic, particularly in advertising claims made for some HPTs. These distinctions should be explained and the terms clearly defined in kit inserts and/or technical data sheets.

Analytical accuracy
Analytical accuracy must be assessed by the manufacturer, and reflected in kit inserts and/or technical data sheets in the claimed detection limit or analytical sensitivity of the method. These are most often 10 or 25 U/L of hCG but may be as high as 50 U/L or as low as 2 U/L. Analytical accuracy is assessed using urine specimens containing known amounts of added purified intact hCG, which has been calibrated against the International Standard for hCG. Manufacturers should document the version of International Standard used for calibration. The concentration ranges of these specimens should include specimens below and above the claimed detection limit of the method, i.e. concentrations that should give consistently negative (not pregnant) or positive (pregnant) results. Specimens of intermediate concentrations near the "transition" or "50:50" concentration (at which on half the testing occasions, the HPT will give a positive result and on the other half will give a negative result) should also be included (see further discussion below). Test performance at these concentrations is most likely to be affected by factors such as different lot number or operator, thus providing additional valuable information. Some panel members recommended a detection limit of 25 U/L, but there was no consensus.
Recommendation: The minimum number of urine specimens containing added hCG and the appropriate concentration range for validation of analytical accuracy should be defined.
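As a rough illustration of the "50:50" concentration described above, the following sketch estimates it by linear interpolation between the two adjacent tested concentrations whose positive rates bracket 50%. The function name, the interpolation approach and the replicate counts are all assumptions made for illustration; they are not taken from the panel's recommendations, and a formal study might instead fit a dose-response model.

```python
def estimate_transition_point(results):
    """Estimate the hCG concentration (U/L) giving ~50% positive results.

    results: dict mapping concentration (U/L) -> (n_positive, n_tested).
    Linear interpolation is used between the two adjacent tested
    concentrations whose positive rates bracket 0.5.
    Returns None if no such pair exists in the data.
    """
    # Sort by concentration and convert counts to positive rates.
    points = sorted((conc, pos / n) for conc, (pos, n) in results.items())
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo <= 0.5 <= r_hi and r_hi > r_lo:
            return c_lo + (0.5 - r_lo) * (c_hi - c_lo) / (r_hi - r_lo)
    return None


# Hypothetical replicate data: 20 devices tested per concentration.
example = {5.0: (0, 20), 10.0: (4, 20), 15.0: (16, 20), 25.0: (20, 20)}
transition = estimate_transition_point(example)
print(f"Estimated 50:50 concentration: {transition:.1f} U/L")
```

With these made-up counts the estimate falls midway between 10 and 15 U/L, because the positive rates at those two concentrations (20% and 80%) are symmetric about 50%.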

Analytical sensitivity
As HPTs are threshold-based tests rather than quantitative tests, standard definitions of Limit of Detection (LoD) and Limit of Quantification (LoQ) have not historically been applied. Rather, tests have been described in terms of "sensitivity", meaning analytical sensitivity. The definition of analytical sensitivity varies and in some data sheets is "the lowest concentration of hCG which it is possible to detect" rather than, as is appropriate, the lowest concentration which the HPT detects reliably and reproducibly. The panel agreed that the analytical sensitivity of HPTs should be defined as the lowest concentration of hCG that can be reliably detected more than 99% of the time. Although it would be preferable to use LoD, the term "sensitivity" is now being used by consumers and healthcare professionals to understand the performance of tests; thus, a new term in consumer labelling would be unhelpful. Details of how analytical sensitivity was assessed should be included in technical data sheets.
Recommendation: Analytical sensitivity should be expressed as the lowest concentration that the HPT detects ≥99% of the time.
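The definition above can be sketched in code as follows. This is a minimal illustration that works on point estimates only: it returns the lowest tested concentration at which, and at every higher tested concentration, at least 99% of devices read positive. The function name and the replicate counts are hypothetical, and the sketch makes no allowance for confidence intervals, which a real validation study would need to consider.

```python
def analytical_sensitivity(results, threshold=0.99):
    """Lowest tested concentration (U/L) detected >= threshold of the time.

    results: dict mapping concentration (U/L) -> (n_positive, n_tested).
    Walks down from the highest concentration and stops at the first
    concentration whose positive rate falls below the threshold, so the
    returned value is also reliably detected at all higher tested levels.
    Returns None if even the highest concentration fails the threshold.
    """
    sensitivity = None
    for conc in sorted(results, reverse=True):
        n_positive, n_tested = results[conc]
        if n_positive / n_tested >= threshold:
            sensitivity = conc
        else:
            break
    return sensitivity


# Hypothetical data: 200 devices tested per concentration.
example = {10: (150, 200), 20: (197, 200), 25: (199, 200), 50: (200, 200)}
print(analytical_sensitivity(example))
```

In this made-up data set, 20 U/L is detected 98.5% of the time and therefore fails the criterion, so the function reports 25 U/L.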

Analytical precision
The precision of HPTs, which reflects their repeatability and reproducibility [14], should be assessed during validation. Repeatability is a measurement of precision in which replicate measurements using the same procedure, operators, system, operating conditions and location (laboratory) are conducted on the same or similar specimens over a short period of time. Reproducibility aims to identify the parameters that contribute to variability of the assay by combining factors known to influence assay precision, such as operators, lots and batches, in a matrix to provide sufficient data points for statistical analysis. As presented in Table 2, considerable variation in how the precision of HPTs is currently assessed was reported. The panel agreed that in accordance with recommendations for other laboratory tests, repeatability should be assessed by obtaining 20 measurements from the same specimen (ideally a pooled urine standard) at different hCG concentrations representative of the working range of the assay. By including specimens of appropriately low concentration, the limit of detection can be determined similarly. Testing the same samples across five non-consecutive days by three different operators and for three different lot numbers provides additional evidence of precision and reproducibility. Guidance regarding experimental design for reproducibility is available from CLSI Evaluating Quantitative Measurement Precision-A3 [36].
Recommendation: Analytical precision should be assessed by: (1) repeating the test at least 20 times per condition, across standards which include hCG concentrations near the detection limit of the method and the "50:50" point; and (2) repeating the testing series on at least three separate days spaced across a minimum five-day time frame, and with a minimum of three different operators and three different lot numbers.
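To give a sense of the scale of such a study, the sketch below enumerates one possible fully crossed design. It assumes that a "condition" means one combination of concentration, day, operator and lot, which is only one reading of the recommendation; the concentrations, day numbers and labels are hypothetical.

```python
from itertools import product


def precision_design(concentrations, days, operators, lots, replicates=20):
    """List every individual test in a fully crossed precision study."""
    runs = []
    for conc, day, operator, lot in product(concentrations, days, operators, lots):
        for replicate in range(1, replicates + 1):
            runs.append({
                "conc_u_per_l": conc,
                "day": day,
                "operator": operator,
                "lot": lot,
                "replicate": replicate,
            })
    return runs


# Hypothetical minimum design: 3 days within a 5-day window,
# 3 operators, 3 lots, 4 hCG concentrations, 20 replicates each.
design = precision_design(
    concentrations=[0, 12.5, 25, 50],
    days=[1, 3, 5],
    operators=["A", "B", "C"],
    lots=["lot1", "lot2", "lot3"],
)
print(len(design))  # 4 * 3 * 3 * 3 * 20 = 2160 individual tests
```

Under these assumptions even the minimum design consumes over two thousand devices, which is within the range of 540 to 6,000 tests reported for recently cleared HPTs.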
Analytical robustness to potential interferences
HPT validation studies should include evaluation of the effect of potential clinically relevant cross-reactants, including luteinising hormone (LH) at concentrations that may be present in urine (e.g. 500 U/L). The FDA requests that manufacturers also test follicle stimulating hormone (e.g. 1,000 U/L) and thyroid stimulating hormone (e.g. 1 U/L) [10,11]. Dietary supplements such as biotin may suppress signal production [37] causing erroneous results. Vulnerability to such interference should be assessed during validation. The urine matrix used in spiking experiments should have an appropriate pH (5-7) and specific gravity (1.010-1.030).
High urinary concentrations of hCG in late pregnancy, and of hCGβcf at the end of the first trimester and through later pregnancy, may saturate antibody binding sites in the HPT, producing a false negative test result (i.e. the "hook effect") [2,21,38-41]. Validation of HPTs should include testing to demonstrate that the test still returns "Pregnant" results at high concentrations of hCG (e.g. 500,000 U/L). For hCGβcf, validation studies should demonstrate that "hooking" that results in false negative results does not occur in clinically possible scenarios. This has to be done in a matrix with high concentrations of intact hCG to mimic the actual clinical conditions under which this phenomenon has been observed to occur (e.g. hCGβcf diluted to achieve a final concentration of 1 μmol/L into late first trimester pooled urine) [2,20,21,41-44].
Results of such validation studies should be included in technical data sheets, which should also clearly state whether concentrations above which the method is affected are likely to be encountered in clinical specimens. hCGβcf is likely to be found at the highest concentration in urine at 10-12 weeks of gestation [43].
Recommendation: Cross-reactions with LH, hCGβcf and other potential clinically relevant interferences should be assessed in HPT validation studies and results documented in technical data sheets.
Robustness to potential errors when using the HPT
Although challenging to address, assessing the vulnerability of the HPT to small errors in specimen application (e.g. holding a mid-stream test in the urine stream for too long, or adding too many urine drops to a cassette test) or errors in timing prior to reading the tests should be included in validation studies. Mid-stream tests often provide two options for sample application: in-stream and dipping; equivalency of these methods should be demonstrated. If the HPT is insufficiently robust, it may be necessary to consider re-designing the method.

Clinical requirements for validation of HPTs
Assessment of clinical accuracy should provide users with information about how accurately the test will identify pregnancy status at the time the sample is taken, and should include robust data with a statistically appropriate sample size and clear definition of how pregnancy status is defined. Particular attention is required during validation studies to address the following issues.
Validation of claims for "early pregnancy testing"
Manufacturers making claims relating to "early pregnancy testing" (often described as "testing before the missed period") should be fully transparent regarding the proportion of pregnancies detected when testing so early and provide data supporting their claim. It is not clear whether many of the tests available in Europe have validated their early claim performance on clinical samples and if so, how this was achieved or how the specimens were collected. It is likely that rather than conducting studies examining early pregnancy detection rate, some may base the claim purely on the analytical sensitivity of the test.
The language surrounding early testing claims appears to have arrived at a common terminology, "testing before the missed period", and as consumers are familiar with seeing data presented using the missed period as reference, moving to alternative, more precise terms would be confusing. The timing of the missed period is notably inaccurate, especially for women with irregular periods, but unless the woman is monitoring ovulation using home ovulation tests or undergoing in vitro fertilisation, there is currently no better method for assessing when to time a test. Claims on currently marketed HPTs tend to be derived from calculating day of expected period as 15 days following ovulation, where ovulation day can be determined by a variety of methods (pre-conception day of LH surge [urine or serum], or use of urinary oestrone-3-glucuronide:pregnanediol-3-glucuronide ratio or ultrasound observed ovulation). Although this may not necessarily mirror the day a consumer would calculate, it does enable direct comparison between performance characteristics of different tests.
Claims such as "Results over 99% accurate from the day of your expected period" should be supported by referenced clinical studies.
Recommendation: Data supporting claims for early pregnancy testing should be provided and how studies were conducted described in full in the technical data sheet.
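The convention described above, in which the day of the expected period is taken as 15 days after ovulation, amounts to simple date arithmetic. The sketch below is illustrative only: the function names and dates are hypothetical, and the ovulation day is assumed to have been determined by one of the methods listed above.

```python
from datetime import date, timedelta

# From the text: expected period is calculated as ovulation day + 15 days.
EXPECTED_PERIOD_OFFSET_DAYS = 15


def expected_period_day(ovulation_day):
    """Day of expected period under the ovulation + 15 days convention."""
    return ovulation_day + timedelta(days=EXPECTED_PERIOD_OFFSET_DAYS)


def days_relative_to_expected_period(test_day, ovulation_day):
    """Negative values mean the test was taken before the expected period."""
    return (test_day - expected_period_day(ovulation_day)).days


# Hypothetical example dates.
ovulation = date(2024, 3, 1)
print(expected_period_day(ovulation))
print(days_relative_to_expected_period(date(2024, 3, 12), ovulation))
```

Expressing each test day relative to the expected period in this way is what allows early-detection rates from different studies to be tabulated on a common axis.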

Potential disadvantages of highly sensitive HPTs
HPT methods should be designed so that analytical sensitivity is not so high that it generates "apparently false" positive results in women experiencing early pregnancy loss, which depending on the circumstances can be very distressing. Approximately one in four pregnancies end in early pregnancy loss, many of which will cause a positive HPT result at the time of the first missed period [43,45,46]. In addition, hCG concentrations in non-pregnant peri- and post-menopausal women can be slightly raised relative to the pre-menopausal reference interval [47]; therefore, HPTs should be assessed for specificity when used by women in this age group. This issue is increasingly relevant as more women delay pregnancy until a later age when irregular periods, often associated with the perimenopause, may be mistaken for pregnancy, prompting a pregnancy test. It has been suggested that an upper reference limit of 14 U/L should be used when interpreting serum hCG results in women >55 years of age [47]. Manufacturers should quantify this risk for each HPT and consider including this information in the package insert if the risk is relatively high.
Recommendation: Potential drawbacks of highly sensitive HPTs should be assessed and relevant information included in technical data sheets and package inserts.

Misleading claims of 100% accuracy
The panel was particularly concerned that claims of "100% accuracy" made for some HPTs can be misleading to health professionals and lay users as it is often not clear whether "100%" relates to analytical or clinical accuracy. Lay users in particular are likely to assume such claims mean that an HPT is 100% clinically accurate for the detection of pregnancy. This is never the case due to potentially confounding factors, including specimens that are too dilute or the possibility of early pregnancy loss, as well as the nature of all immunoassays. Such claims could be exploited by manufacturers to imply that their test is clinically superior. In agreement with current FDA guidance, the panel agreed that such claims should not be permitted in Europe.
Recommendation: Statements that HPTs are 100% accurate should not be permitted in data sheets or other promotional material.

Requirements for validation of HPTs by lay users
HPTs are developed primarily for lay users who generally will have little practical experience of laboratory testing. It is important to ensure that HPTs include clear instructions for use, that the tests can be readily performed outside a laboratory environment, that they are robust and that interpretation of the results is straightforward. Incorporating lay user testing during validation of HPTs is therefore essential to assess ease of use and confirm whether intended users are readily able to follow the product instructions, as well as to ensure that test results are interpreted correctly. The lay user panel should include a reasonable number of representative women from a range of age groups and its composition should be described in the technical data sheet. Panels would ideally include women seeking to become pregnant and women who suspect pregnancy (whether wanted or unwanted) because their period is late. In practice, however, recruiting women in the latter group is likely to be more difficult.
The results obtained by lay users testing their own sample should be compared to their true clinical status, as determined by an independent method. This is especially important as lay users may be less able than laboratory staff to read fainter lines at the test sensitivity. The study should be sufficiently sized to enable clinical accuracy, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) to be calculated robustly, as required by the IVDR.
Recommendation: Sensitivity, PPV and NPV should be calculated.
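These measures follow directly from a 2x2 comparison of lay user results against true clinical status. The sketch below uses the standard definitions; the function name and the counts are hypothetical, and all four denominators are assumed to be non-zero.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Clinical performance measures from a 2x2 table.

    tp: test positive, truly pregnant      fp: test positive, not pregnant
    fn: test negative, truly pregnant      tn: test negative, not pregnant
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # pregnant women correctly identified
        "specificity": tn / (tn + fp),   # non-pregnant women correctly identified
        "ppv": tp / (tp + fp),           # positive results that are correct
        "npv": tn / (tn + fn),           # negative results that are correct
    }


# Hypothetical study: 100 pregnant and 100 non-pregnant volunteers.
metrics = diagnostic_metrics(tp=98, fp=1, fn=2, tn=99)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

PPV and NPV depend on the proportion of pregnant women in the study population, so they should be read alongside a description of how the lay user panel was recruited.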

Ease of use
Both practical and questionnaire assessments are likely to be required. The latter could take the form of Likert-style questionnaires, where participants provide responses to questions such as "How easy was it to read the result?" on a 7-point scale ranging from 1 (e.g. "Extremely easy") to 7 (e.g. "Extremely difficult"). Criteria for acceptability should be established before questionnaires are issued.
Recommendation: A study where the HPT is used by women representative of the lay user and the volunteer results are compared to clinical pregnancy status should be conducted. The study should also consider ease of use, with results presented in the technical data sheet.
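One way to apply an acceptability criterion fixed before the questionnaires are issued is sketched below. The criterion used here (at least 90% of participants scoring 1 or 2 on the 7-point scale) is purely hypothetical, as are the function name and the responses; the panel did not specify a threshold.

```python
def meets_ease_of_use_criterion(scores, max_easy_score=2, min_fraction=0.90):
    """Check 7-point Likert responses against a pre-specified criterion.

    scores: iterable of integers from 1 ("Extremely easy") to
    7 ("Extremely difficult"). Returns (fraction_easy, criterion_met).
    """
    scores = list(scores)
    if not scores:
        raise ValueError("no responses supplied")
    if any(s < 1 or s > 7 for s in scores):
        raise ValueError("scores must lie between 1 and 7")
    easy = sum(1 for s in scores if s <= max_easy_score)
    fraction = easy / len(scores)
    return fraction, fraction >= min_fraction


# Hypothetical responses from 100 lay users to
# "How easy was it to read the result?"
responses = [1] * 70 + [2] * 22 + [3] * 5 + [6] * 3
fraction_easy, acceptable = meets_ease_of_use_criterion(responses)
print(fraction_easy, acceptable)
```

Fixing `max_easy_score` and `min_fraction` in the study protocol before data collection avoids choosing a threshold after seeing the responses.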

Presentation of practical information on how to use the HPT
Clear instructions about specimen collection
The clarity of instructions regarding how to collect urine specimens and apply the specimen to the HPT should be assessed, as how the specimen is applied to the device may influence the results. For example, some tests require uniform flow of urine through the test strip. If this is perturbed because too much or too little urine is applied, erroneous results may be obtained. Validation studies should demonstrate that the results from lay user testing match those for similar urine specimens as tested by laboratory professionals [12].

Clear instructions about timing
Clear instructions about timing, both from the time of sample collection to the time of use of the HPT, and between application of urine to the device and the read time, should be included in the information leaflet. Additionally, the maximum time period for which the result remains valid from the time of running the test should be noted. This is because when lateral flow-based tests dry out, line intensity can be altered to no longer reflect the true result of the test, so there is usually a maximum reading time.
Recommendation: Information provided with the HPT should include clear and detailed instructions about urine collection and timing between steps in the testing process. The clarity of these instructions should be assessed in practice by an appropriately constituted panel of lay users during validation of the HPT.

Presentation of information on interpretation of HPT results
General format of information provided
Information should be clearly presented using wording readily understood by the general public, including young teenagers, and should avoid use of scientific terms wherever possible. The font size on all package labelling and information leaflets should be large enough to be readily legible.

Shelf life
HPTs have a defined shelf life (usually 2-3 years) as determined by stability studies, which should be conducted by the manufacturer. The "use by" date should be clearly printed on the test carton.

Error results
As with any immunoassay, HPTs can occasionally fail to function properly, whether due to over- or under-sampling, misuse by the lay user, or a faulty test (e.g. damaged in shipping). Therefore, it is important that a lay user is able to understand when a test result is an error, and not misinterpret the result as "Not pregnant" (e.g. no lines on the test) or "Pregnant" (e.g. just the test line visible). The instructions for use should clearly explain what error results look like.

Information about test interpretation
Information leaflets should include a statement that HPTs are not 100% accurate for confirmation of pregnancy and that negative tests should be repeated if there is a strong suspicion of pregnancy. Clear images of the different results should be included to aid interpretation.
Inappropriately negative results may be obtained in early pregnancy if hCG concentrations are not high enough for detection or if urine is very dilute (early morning urine is often requested to reduce this risk). Repeating the test 3-5 days later is generally desirable. Measurement of serum hCG when requested by the woman's healthcare provider is another option that should provide a definitive answer. This is essential if the woman is experiencing abdominal pain or bleeding as these may indicate an ectopic pregnancy, which can produce very low concentrations of hCG [41,48,49].
Recommendation: Information leaflets provided with the HPT should include clear and detailed information about the limitations of HPTs. The clarity of the information provided should be assessed through questionnaires provided to lay users participating in the analytical validation studies.

Conclusions
The authors trust that the recommendations proposed here for validation of HPTs may constructively contribute to improved standardisation of validation procedures for HPTs in Europe. Defining best practice at all stages of validation of these important tests should help to ensure that tests marketed in Europe are fit for purpose and that lay users can be confident of the high quality of the HPT results they obtain.
communication for providing medical writing assistance, which was funded by SPD Development Company Limited.
Research funding: The study was funded by SPD Development Company Limited (Bedford, UK), a wholly owned subsidiary of SPD Swiss Precision Diagnostics GmbH (Geneva, Switzerland).
Author contributions: All authors attended the panel meeting, provided expert recommendations, and were involved in manuscript preparation and review. All authors have accepted responsibility for the entire content of this manuscript and approved its submission.
Competing interests: Sarah Johnson - Employee of SPD Development Company Ltd., a fully owned subsidiary of SPD Swiss Precision Diagnostics GmbH, a manufacturer of home pregnancy and ovulation tests. Fiona Gould - Contracted to support Abbott Rapid Diagnostics, a manufacturer of home pregnancy and ovulation tests.