Abstract
Objectives
The averted infections ratio (AIR) is a novel measure for quantifying the preservation-of-effect in active-control noninferiority clinical trials with a time-to-event outcome. In the main formulation, the AIR requires an estimate of the counterfactual placebo incidence rate. We describe two approaches for calculating confidence limits for the AIR given a point estimate of this parameter: a closed-form solution based on a Taylor series expansion (delta method) and an iterative method based on the profile likelihood.
Methods
For each approach, exact coverage probabilities for the lower and upper confidence limits were computed over a grid of values of (1) the true value of the AIR, (2) the expected number of counterfactual events, and (3) the effectiveness of the active-control treatment.
Results
Focussing on the lower confidence limit, which determines whether noninferiority can be declared, the coverage achieved by the delta method is either less than or greater than the nominal coverage, depending on the true value of the AIR. In contrast, the coverage achieved by the profile-likelihood method is consistently accurate.
Conclusions
The profile-likelihood method is preferred because of better coverage properties, but the simpler delta method is valid when the experimental treatment is no less effective than the control treatment. A complementary Bayesian approach, which can be applied when the counterfactual incidence rate can be represented as a prior distribution, is also outlined.
Introduction
In a series of papers we have considered the analysis of active-control noninferiority trials with a time-to-event outcome in the context of HIV prevention trials (Dunn and Glidden 2019; Dunn et al. 2018; Glidden, Stirrup, and Dunn 2020). Our key conclusion is that the standard metric used in such trials, the rate ratio comparing experimental and control arms, is misleading. We further argued that clinically meaningful inference requires estimation or specification of one of two unobserved parameters: (a) the event rate that would have been observed in trial subjects if they had received no treatment (counterfactual placebo arm) or (b) the effectiveness of the control arm relative to the counterfactual placebo arm. With this information, in combination with the observed incidence rates in the control and experimental arms, we can estimate a measure called the averted infections ratio (AIR). The AIR is interpreted as the proportion of events that would be averted by use of the experimental treatment compared with the control treatment. In the context of noninferiority trials, it is a natural criterion for assessing the degree to which the experimental treatment preserves the effect of the control treatment relative to no treatment (“preservation-of-effect”) (Ghosh et al. 2011; Pigeot et al. 2003; Snapinn and Jiang 2008). Noninferiority trials using this approach typically aim to demonstrate at least 50% preservation-of-effect, although this value is context-specific and higher values may be warranted (Pigeot et al. 2003; Ghosh et al. 2011). In this paper, we consider the derivation of confidence limits for the AIR when it is estimated via the counterfactual placebo incidence.
Notation and statistical formulation
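Denote the hypothetical placebo, control, and experimental arms by the subscripts P, C, and E, respectively. We observe F _{C} person-years of follow-up in the control arm and F _{E} person-years of follow-up in the experimental arm. Let X _{C} and X _{E} be the random variables denoting the number of observed events, where we assume that

$$X_C \sim \text{Poisson}(\lambda_C F_C), \qquad X_E \sim \text{Poisson}(\lambda_E F_E).$$

The AIR, denoted Ψ, is defined in terms of the three incidence rates as

$$\Psi = \frac{\lambda_P - \lambda_E}{\lambda_P - \lambda_C}. \qquad (1)$$

Alternatively, Ψ can be expressed in terms of the counterfactual control arm effectiveness (θ _{C} = 1 − λ _{C}/λ _{P}) and the rate ratio λ _{E}/λ _{C}:

$$\Psi = 1 - \left(\frac{\lambda_E}{\lambda_C} - 1\right)\frac{1 - \theta_C}{\theta_C}.$$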
In this formulation, Ψ is a linear function of the rate ratio and confidence limits for Ψ can be obtained by direct transformation of confidence limits for the rate ratio. As the latter problem has been extensively studied (Graham, Mengersen, and Morton 2003; Li, Tang, and Wong 2014; Price and Bonett 2000; Sahai and Khurshid 1993) we focus on formulation (1).
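The equivalence of the two formulations can be checked numerically. The sketch below assumes that Eq. (1) is Ψ = (λ _{P} − λ _{E})/(λ _{P} − λ _{C}) and that θ _{C} = 1 − λ _{C}/λ _{P}; the function names and the rates are hypothetical, chosen only for illustration.

```python
def air_from_rates(lam_p, lam_c, lam_e):
    """AIR via formulation (1): (lam_P - lam_E) / (lam_P - lam_C)."""
    if lam_c >= lam_p:
        raise ValueError("the AIR is only interpretable when lam_C < lam_P")
    return (lam_p - lam_e) / (lam_p - lam_c)


def air_from_rate_ratio(rate_ratio, theta_c):
    """AIR as a linear function of the rate ratio lam_E / lam_C, given theta_C."""
    return 1.0 - (rate_ratio - 1.0) * (1.0 - theta_c) / theta_c


# Hypothetical rates per 100 person-years (not taken from any trial).
lam_p, lam_c, lam_e = 4.0, 1.0, 1.5
theta_c = 1.0 - lam_c / lam_p              # control effectiveness, 0.75
psi_rates = air_from_rates(lam_p, lam_c, lam_e)
psi_ratio = air_from_rate_ratio(lam_e / lam_c, theta_c)
print(psi_rates, psi_ratio)                # both equal 2.5 / 3
```

Because the second form is linear in the rate ratio, a confidence limit for the rate ratio maps directly to a limit for Ψ once θ _{C} is fixed.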
Inference conditional on counterfactual incidence
This section considers the derivation of confidence limits for the AIR when considering a single, prespecified value of λ _{P}. This allows exploration of how the confidence limits (and point estimates) vary over a range of plausible values of λ _{P}, which can be highly informative (Glidden, Stirrup, and Dunn 2020).
Delta method
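We first apply a log transformation to the AIR, a natural procedure for any statistic that is a ratio of two variables. From Eq. (1),

$$\log\hat{\Psi} = \log(\lambda_P - \hat{\lambda}_E) - \log(\lambda_P - \hat{\lambda}_C),$$

where $\hat{\lambda}_E = X_E/F_E$ and $\hat{\lambda}_C = X_C/F_C$. Based on a first-order Taylor series expansion (Oehlert 1992)

$$\text{Var}\{\log(\lambda_P - \hat{\lambda}_E)\} \approx \frac{\text{Var}(\hat{\lambda}_E)}{(\lambda_P - \lambda_E)^2}, \qquad \text{Var}\{\log(\lambda_P - \hat{\lambda}_C)\} \approx \frac{\text{Var}(\hat{\lambda}_C)}{(\lambda_P - \lambda_C)^2},$$

since λ _{P} is regarded as fixed. Thus, using the independence of the two arms and the Poisson variance estimates $X_E/F_E^2$ and $X_C/F_C^2$,

$$\widehat{\text{Var}}(\log\hat{\Psi}) = \frac{X_E/F_E^2}{(\lambda_P - \hat{\lambda}_E)^2} + \frac{X_C/F_C^2}{(\lambda_P - \hat{\lambda}_C)^2}. \qquad (2)$$

A (1 − α) confidence interval for Ψ is obtained from

$$\exp\left\{\log\hat{\Psi} \pm z_{1-\alpha/2}\sqrt{\widehat{\text{Var}}(\log\hat{\Psi})}\right\}.$$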
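As an illustration of the delta method, the sketch below computes a two-sided interval on the log scale. It assumes the variance of log Ψ̂ is the sum of the Poisson variance estimates X/F² for each arm divided by the squared distance of that arm's rate from λ _{P}, and that both observed rates lie below λ _{P}. The function name and the input counts are hypothetical.

```python
import math
from statistics import NormalDist


def air_delta_ci(x_c, f_c, x_e, f_e, lam_p, alpha=0.05):
    """Point estimate and two-sided (1 - alpha) delta-method limits for the AIR."""
    lam_c = x_c / f_c
    lam_e = x_e / f_e
    if lam_c >= lam_p or lam_e >= lam_p:
        raise ValueError("log(AIR) is undefined unless both rates are below lam_p")
    psi_hat = (lam_p - lam_e) / (lam_p - lam_c)
    # First-order variance of log(psi_hat), treating lam_p as fixed.
    var_log = (x_e / f_e**2) / (lam_p - lam_e) ** 2 \
        + (x_c / f_c**2) / (lam_p - lam_c) ** 2
    half_width = NormalDist().inv_cdf(1.0 - alpha / 2.0) * math.sqrt(var_log)
    return psi_hat, psi_hat * math.exp(-half_width), psi_hat * math.exp(half_width)


# Hypothetical data: 20 and 24 events in 1,000 person-years per arm,
# with an assumed counterfactual incidence of 5 per 100 person-years.
psi_hat, lower, upper = air_delta_ci(20, 1000.0, 24, 1000.0, lam_p=0.05)
print(round(psi_hat, 3), round(lower, 3), round(upper, 3))
```

The limits are symmetric about the point estimate on the log scale, so their product equals the square of the point estimate.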
Profilelikelihood method
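The log-likelihood under a Poisson model is, up to an additive constant,

$$\ell(\lambda_C, \lambda_E) = X_C\log\lambda_C - \lambda_C F_C + X_E\log\lambda_E - \lambda_E F_E. \qquad (3)$$

We can express (3) in terms of Ψ via Eq. (1), noting that a nuisance parameter (either λ _{C}, or λ _{E}, or a function of λ _{C} and λ _{E}) is also involved. Denoting this arbitrary nuisance parameter by ζ, the profile-likelihood confidence region for Ψ is defined by the set of values (Cole, Chu, and Greenland 2014)

$$\left\{\Psi : 2\left[\hat{\ell} - \max_{\zeta}\,\ell(\Psi, \zeta)\right] \le \chi^2_{1,1-\alpha}\right\}, \qquad (4)$$

where $\hat{\ell} = \ell(\hat{\lambda}_C, \hat{\lambda}_E)$ is the unconstrained maximised log-likelihood.

An alternative approach is to parameterise the problem in terms of λ _{C} and λ _{E} rather than Ψ and ζ. We therefore maximise (3) subject to the constraint implied by Eq. (1) for a specified value Ψ*. Rearranging,

$$\lambda_E - \Psi^*\lambda_C - (1 - \Psi^*)\lambda_P = 0.$$

Introducing a Lagrange multiplier (β), the function to be maximised is

$$\ell^*(\lambda_C, \lambda_E, \beta) = X_C\log\lambda_C - \lambda_C F_C + X_E\log\lambda_E - \lambda_E F_E + \beta\left\{\lambda_E - \Psi^*\lambda_C - (1 - \Psi^*)\lambda_P\right\}. \qquad (5)$$

Differentiating (5) with respect to β, λ _{E}, and λ _{C} results in a set of three nonlinear equations:

$$\lambda_E - \Psi^*\lambda_C - (1 - \Psi^*)\lambda_P = 0, \qquad \frac{X_E}{\lambda_E} - F_E + \beta = 0, \qquad \frac{X_C}{\lambda_C} - F_C - \beta\Psi^* = 0,$$

noting that Ψ* and λ _{P} are constants. Using the method of elimination, the constrained estimate of λ _{C} is the root of the quadratic

$$a\lambda_C^2 + b\lambda_C + c = 0$$

for which λ _{C} > 0 and λ _{E} = Ψ*λ _{C} + (1 − Ψ*)λ _{P} > 0, where

$$a = \Psi^*(F_C + \Psi^* F_E), \qquad b = (1 - \Psi^*)\lambda_P(F_C + \Psi^* F_E) - \Psi^*(X_C + X_E), \qquad c = -(1 - \Psi^*)\lambda_P X_C.$$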
The roots of the function implied by (4) were found using the uniroot function in R (version 4.0.2), which implements Brent's root-finding algorithm (code in Appendix).
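The procedure can be sketched in Python as follows. This is not the authors' R code: plain bisection is swapped in for uniroot, the constrained maximum is obtained from a quadratic in λ _{C} derived by eliminating the Lagrange multiplier (an assumption about the exact algebraic form), and all function names and data are hypothetical. Both event counts are assumed to be positive and the observed control rate is assumed to lie below λ _{P}.

```python
import math
from statistics import NormalDist


def loglik(lam_c, lam_e, x_c, f_c, x_e, f_e):
    """Poisson log-likelihood for the two arms, dropping additive constants."""
    return x_c * math.log(lam_c) - lam_c * f_c + x_e * math.log(lam_e) - lam_e * f_e


def constrained_loglik(psi, x_c, f_c, x_e, f_e, lam_p):
    """Maximised log-likelihood subject to (lam_p - lam_e) / (lam_p - lam_c) = psi."""
    k = (1.0 - psi) * lam_p                  # constraint: lam_e = psi * lam_c + k
    a = psi * (f_c + psi * f_e)
    b = k * (f_c + psi * f_e) - psi * (x_c + x_e)
    c = -k * x_c
    if a == 0.0:
        roots = [] if b == 0.0 else [-c / b]
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            return -math.inf
        # Numerically stable form of the quadratic formula.
        q = -0.5 * (b + math.copysign(math.sqrt(disc), b))
        roots = [q / a] + ([c / q] if q != 0.0 else [])
    best = -math.inf
    for lam_c in roots:
        lam_e = psi * lam_c + k
        if lam_c > 0.0 and lam_e > 0.0:      # keep only the feasible root
            best = max(best, loglik(lam_c, lam_e, x_c, f_c, x_e, f_e))
    return best


def profile_ci(x_c, f_c, x_e, f_e, lam_p, alpha=0.05):
    """Point estimate and two-sided (1 - alpha) profile-likelihood limits."""
    lam_c_hat, lam_e_hat = x_c / f_c, x_e / f_e
    psi_hat = (lam_p - lam_e_hat) / (lam_p - lam_c_hat)
    l_max = loglik(lam_c_hat, lam_e_hat, x_c, f_c, x_e, f_e)
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2   # chi-square(1) quantile

    def excess(psi):
        return 2.0 * (l_max - constrained_loglik(psi, x_c, f_c, x_e, f_e, lam_p)) - crit

    def limit(direction):
        # Expand a bracket away from psi_hat, then bisect on the deviance.
        step = 0.5 * max(abs(psi_hat), 1.0)
        outer = psi_hat + direction * step
        while excess(outer) < 0.0:
            step *= 2.0
            if step > 1e6:
                return direction * math.inf  # deviance never reaches the cut-off
            outer = psi_hat + direction * step
        inner = psi_hat
        for _ in range(200):
            mid = 0.5 * (inner + outer)
            if excess(mid) < 0.0:
                inner = mid
            else:
                outer = mid
        return 0.5 * (inner + outer)

    return psi_hat, limit(-1.0), limit(1.0)


# Hypothetical data, as for a trial with 1,000 person-years per arm.
psi_hat, lower, upper = profile_ci(20, 1000.0, 24, 1000.0, lam_p=0.05)
print(round(psi_hat, 3), round(lower, 3), round(upper, 3))
```

The bracket expansion returns an infinite limit when the profile deviance never reaches the critical value, which can happen when the observed control rate is close to λ _{P}.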
Unconditional inference
In addition to exploring how the AIR varies over a range of values of the counterfactual incidence, we may wish to integrate over this parameter to obtain the unconditional distribution of the AIR. Bayesian inference provides a natural framework for this problem. Here we consider the case where trial investigators are able to specify a simple prior distribution for the counterfactual incidence, although more sophisticated approaches which incorporate external information are also possible (Glidden, Stirrup, and Dunn 2020).
Assume that the prior for λ _{P} can be specified as a Gamma distribution based on background knowledge. For λ _{E} and λ _{C}, we use weakly informative Gamma(0.5, 0.001) priors (shape 0.5, rate 0.001); this approximates Jeffreys' prior (Gelman et al. 1995), and also corresponds to adding 0.5 to the observed number of events as discussed in Section 6. As the Gamma distribution is the conjugate prior for the Poisson model, the posterior distributions for λ _{E} and λ _{C} are Gamma(X _{E} + 0.5, F _{E} + 0.001) and Gamma(X _{C} + 0.5, F _{C} + 0.001), respectively (Gelman et al. 1995). We generate samples from the distributions of λ _{P}, λ _{E}, and λ _{C} to derive the posterior distribution for the AIR using Eq. (1).
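The sampling scheme can be sketched with the Python standard library. The posterior parameters follow the conjugate update stated above; the prior for λ _{P} is passed as shape and scale, and the counts are those reported for the BRIEF-TB trial in the Example section. The handling of draws in which a sampled rate reaches or exceeds the sampled placebo rate (reject and redraw all three rates) is an assumption made for this sketch and is not necessarily one of the authors' strategies; the function name is hypothetical.

```python
import random


def posterior_air_draws(x_c, f_c, x_e, f_e, prior_shape, prior_scale,
                        n_draws=10000, seed=1):
    """Sorted posterior draws of the AIR and the fraction of rejected draws."""
    rng = random.Random(seed)
    draws, rejected = [], 0
    while len(draws) < n_draws:
        # random.gammavariate takes (shape, scale); scale = 1 / rate.
        lam_p = rng.gammavariate(prior_shape, prior_scale)
        lam_c = rng.gammavariate(x_c + 0.5, 1.0 / (f_c + 0.001))
        lam_e = rng.gammavariate(x_e + 0.5, 1.0 / (f_e + 0.001))
        if lam_c >= lam_p or lam_e >= lam_p:
            rejected += 1                    # assumption: redraw all three rates
            continue
        draws.append((lam_p - lam_e) / (lam_p - lam_c))
    draws.sort()
    return draws, rejected / (n_draws + rejected)


# Prior for lam_P with mean 10 * 0.002 = 0.02 events per person-year.
draws, frac_rejected = posterior_air_draws(33, 4896.0, 32, 4926.0,
                                           prior_shape=10.0, prior_scale=0.002)
median = draws[len(draws) // 2]
interval_90 = (draws[int(0.05 * len(draws))], draws[int(0.95 * len(draws))])
print(median, interval_90, frac_rejected)
```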
The main application of the AIR is in noninferiority trials, where it is reasonable to assume that λ _{C} < λ _{P} since the effectiveness of the control drug will already have been established. Further, the AIR is uninterpretable if λ _{C} > λ _{P} as this would imply there was no yardstick against which the experimental drug could be compared (nothing to preserve). In most realistic applications it is also reasonable to assume that λ _{E} < λ _{P} as the experimental drug will have been selected as having some biological activity. It is therefore problematic if the sampled values
There are three possible resampling strategies: (a) resample
Three arm trials with a placebo arm
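Trials are occasionally designed with a placebo arm in addition to the control and experimental arms, thereby providing a direct estimate of λ _{P} (Ghosh et al. 2011). Writing X _{P} for the number of events and F _{P} for the person-years of follow-up in the placebo arm, so that $\hat{\lambda}_P = X_P/F_P$, the Taylor series approximation (Eq. (2)) requires an additional term to reflect the uncertainty in the estimate of λ _{P}:

$$\widehat{\text{Var}}(\log\hat{\Psi}) = \frac{X_E/F_E^2}{(\hat{\lambda}_P - \hat{\lambda}_E)^2} + \frac{X_C/F_C^2}{(\hat{\lambda}_P - \hat{\lambda}_C)^2} + \frac{X_P}{F_P^2}\left[\frac{1}{\hat{\lambda}_P - \hat{\lambda}_E} - \frac{1}{\hat{\lambda}_P - \hat{\lambda}_C}\right]^2. \qquad (7)$$

The additional term is generally much smaller than the first two terms and, in expectation, (7) tends towards (2) when λ _{E} = λ _{C}. This leads to a paradoxical finding, namely that the sample size of the placebo group appears to be irrelevant when this equality is assumed (as is commonly the case when designing noninferiority trials). This paradox is explained by the fact that Ψ = 1 when λ _{E} = λ _{C} regardless of the value of λ _{P}. However, the placebo group needs to be large enough in order to ensure that the estimate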
Coverage probabilities
Methods
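Exact coverage probabilities for the lower and upper confidence limits (at nominal coverage probabilities of 1 − α, for α = 0.025, 0.05) were computed using the delta method and profile-likelihood method described in Section 3. For the purposes of exposition we assume F _{C} = F _{E} = 1, so that λ _{C} and λ _{E} can be considered as the expected number of events, and λ _{P} the expected number of counterfactual events, in each of the two trial arms. The following parameters were examined over a grid of values, written as start(step)end: Ψ = 0.5(0.1)1.0; λ _{P} = 40(20)100; θ _{C} = 0.6(0.1)0.9. Exact coverage probabilities were computed by

$$\Pr\{L(X_C, X_E) \le \Psi\} = \sum_{x_C=0}^{\infty}\sum_{x_E=0}^{\infty} I\{L(x_C, x_E) \le \Psi\}\,\frac{e^{-\lambda_C}\lambda_C^{x_C}}{x_C!}\,\frac{e^{-\lambda_E}\lambda_E^{x_E}}{x_E!},$$

where $L(x_C, x_E)$ is the lower confidence limit computed from the outcome $(x_C, x_E)$ and $I\{\cdot\}$ is the indicator function; coverage of the upper limit $U(x_C, x_E)$ is obtained analogously from the event $U(x_C, x_E) \ge \Psi$.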
The log-likelihood is undefined when either X _{C} = 0 or X _{E} = 0. However, in contrast with the rate ratio, this is a highly informative outcome in terms of the AIR (even X _{C} = 0, X _{E} = 0). To avoid this problem, X _{C} and X _{E} were replaced by X _{C} + 0.5 and X _{E} + 0.5 before applying the methods of Section 3.2. For consistency, this adjustment was also applied for confidence limits determined by the delta method. The addition of 0.5 resulted in improved coverage estimates under both approaches, as has previously been reported for the rate ratio (Price and Bonett 2000).
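An exact coverage calculation of this kind can be sketched for the delta-method lower limit as a double sum over Poisson outcomes. The sketch assumes F _{C} = F _{E} = 1, adds 0.5 to each count, truncates the sums far into the upper tail, and counts outcomes for which the limit is undefined (an adjusted count at or above λ _{P}) as non-coverage. The last two choices are assumptions of this sketch, as are the function names.

```python
import math
from statistics import NormalDist


def poisson_pmf(x, mu):
    """Poisson probability mass, computed on the log scale for stability."""
    return math.exp(x * math.log(mu) - mu - math.lgamma(x + 1))


def delta_lower_limit(x_c, x_e, lam_p, alpha):
    """One-sided lower delta-method limit with F_C = F_E = 1 and counts + 0.5."""
    lam_c, lam_e = x_c + 0.5, x_e + 0.5
    if lam_c >= lam_p or lam_e >= lam_p:
        return None                          # log(AIR) undefined for this outcome
    psi_hat = (lam_p - lam_e) / (lam_p - lam_c)
    var_log = lam_e / (lam_p - lam_e) ** 2 + lam_c / (lam_p - lam_c) ** 2
    z = NormalDist().inv_cdf(1.0 - alpha)
    return psi_hat * math.exp(-z * math.sqrt(var_log))


def exact_lower_coverage(psi, lam_p, theta_c, alpha=0.05):
    """Probability that the lower limit falls at or below the true AIR."""
    lam_c = lam_p * (1.0 - theta_c)
    lam_e = lam_p - psi * (lam_p - lam_c)
    max_c = int(lam_c + 12.0 * math.sqrt(lam_c) + 20.0)   # truncation points
    max_e = int(lam_e + 12.0 * math.sqrt(lam_e) + 20.0)
    pmf_e = [poisson_pmf(x, lam_e) for x in range(max_e + 1)]
    coverage = 0.0
    for x_c in range(max_c + 1):
        p_c = poisson_pmf(x_c, lam_c)
        for x_e in range(max_e + 1):
            limit = delta_lower_limit(x_c, x_e, lam_p, alpha)
            if limit is not None and limit <= psi:
                coverage += p_c * pmf_e[x_e]
    return coverage


# One cell of the grid: true AIR 0.8, 40 counterfactual events, theta_C = 0.8.
coverage = exact_lower_coverage(psi=0.8, lam_p=40.0, theta_c=0.8)
print(coverage)
```

Replacing `delta_lower_limit` with a profile-likelihood limit gives the corresponding calculation for the second method.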
Results
The complete set of coverage probabilities for the lower and upper confidence limits is given in the Appendix. However, the lower confidence limit is of primary interest since this is the comparator for the noninferiority margin. Also, the upper limit of Ψ may be severely constrained for large values of θ _{C}. Ψ can be expressed as θ _{E}/θ _{C}, where θ _{E} = 1 − λ _{E}/λ _{P} cannot exceed 1, so that Ψ ≤ 1/θ _{C}; for example, Ψ ≤ 1.25 if θ _{C} = 0.8 and Ψ ≤ 1.11 if θ _{C} = 0.9.
Figure 1 shows coverage probabilities using the delta method for the lower one-tailed α = 0.05 confidence limit (similar patterns were observed for α = 0.025). Coverage is generally too low for Ψ = 0.5–0.8, is reasonably accurate for Ψ = 0.9, and is too high for Ψ = 1.0. This pattern is explained by a negative correlation between the empirical AIR and its estimated standard error, conditional on the true AIR (Ψ). Conditional on Ψ, coverage is higher the larger the value of the control arm effectiveness (θ _{C}), except for Ψ = 1.0 when differences are minor. As expected, actual and nominal coverage are closer the larger the value of λ _{P}, although convergence is slow with material discrepancies even for λ _{P} = 100. Coverage probabilities for the upper confidence limit were consistently and substantially too high (Appendix), particularly for lower values of Ψ.
Table 1 shows coverage probabilities for the profile-likelihood-based lower confidence limit for λ _{P} = 40 and α = 0.05. Coverage was close to the nominal value of 0.95 (range 0.9468–0.9615) for all combinations of Ψ and θ _{C}; as expected, correspondence was even closer at higher values of λ _{P} (not shown). Coverage for the profile-likelihood-based upper confidence limit was also highly accurate, in contrast to the delta method (Appendix). The results of these analyses support the routine use of profile-likelihood-based confidence limits, although the delta method is valid in a conservative sense (i.e. actual coverage exceeds nominal coverage) if the true AIR is approximately 0.9 or greater. This is reflected in larger values for the lower confidence limit using the delta method (Appendix).
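Table 1: Coverage probabilities for the profile-likelihood-based lower confidence limit (λ _{P} = 40, α = 0.05), by effectiveness of control treatment (θ _{C}) and AIR (Ψ). Nominal coverage is 0.95.

| θ _{C} | Ψ = 0.5 | Ψ = 0.6 | Ψ = 0.7 | Ψ = 0.8 | Ψ = 0.9 | Ψ = 1.0 |
|---|---|---|---|---|---|---|
| 0.6 | 0.9468 | 0.9521 | 0.9518 | 0.9522 | 0.9517 | 0.9502 |
| 0.7 | 0.9510 | 0.9539 | 0.9511 | 0.9522 | 0.9519 | 0.9511 |
| 0.8 | 0.9523 | 0.9522 | 0.9553 | 0.9517 | 0.9532 | 0.9518 |
| 0.9 | 0.9539 | 0.9538 | 0.9579 | 0.9489 | 0.9568 | 0.9615 |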
Example
The BRIEF TB/A5279 study was a randomised, noninferiority trial that compared two regimens for the prevention of active tuberculosis in HIV-infected patients who were living in areas of high tuberculosis prevalence or who had evidence of latent tuberculosis infection (Swindells et al. 2019). The reference regimen was 9 months of daily isoniazid alone (9-month arm) and the experimental regimen was 1 month of daily rifapentine plus isoniazid (1-month arm). The incidence of the primary endpoint (diagnosis of tuberculosis, or death from tuberculosis or unknown cause) was similar in the 1-month arm (32 endpoints, 4,926 person-years of follow-up (PYFU), incidence rate 0.65 per 100 PYFU) and the 9-month arm (33 endpoints, 4,896 PYFU, incidence rate 0.67 per 100 PYFU). The primary metric was the rate difference rather than the rate ratio that is generally used in HIV prevention research. Noninferiority was declared by the investigators because the upper 97.5% confidence limit of 0.30 per 100 PYFU was less than the prespecified margin of 1.25 per 100 PYFU. However, this conclusion is questionable as the authors did not take the counterfactual placebo incidence into account. Notably, the observed incidence in the 9-month arm was markedly lower than the incidence rate assumed for the purposes of sample size calculation (2 per 100 PYFU).
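The reported rates and the upper limit for the rate difference can be reproduced approximately from the counts above. The sketch assumes a simple Wald interval with Poisson variance estimates, which may differ from the method the trial investigators actually used; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def rate_difference_upper_limit(x_e, f_e, x_c, f_c, level=0.975):
    """Incidence rates and a one-sided upper Wald limit for rate_E - rate_C."""
    rate_e, rate_c = x_e / f_e, x_c / f_c
    # Poisson variance estimate of each rate is x / f^2.
    se = math.sqrt(x_e / f_e**2 + x_c / f_c**2)
    upper = (rate_e - rate_c) + NormalDist().inv_cdf(level) * se
    return rate_e, rate_c, upper


rate_e, rate_c, upper = rate_difference_upper_limit(32, 4926.0, 33, 4896.0)
# Express per 100 person-years of follow-up.
print(round(100 * rate_e, 2), round(100 * rate_c, 2), round(100 * upper, 2))
```

Under this assumption the upper limit is about 0.30 per 100 PYFU, well inside the 1.25 margin, which illustrates why a margin on the rate difference scale alone says little about preservation-of-effect when the observed incidence is low.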
Figure 2 shows the lower 5% and upper 95% confidence limits for the AIR as a function of the counterfactual incidence, computed using the delta and profile-likelihood methods. Consistent with the results of Section 6.2, the delta method yields narrower confidence intervals. The figure also reveals the sensitive relationship between the lower confidence limit and the assumed counterfactual incidence, underlining the importance of obtaining as much information as possible about this parameter.
Figure 3 shows the results of a Bayesian analysis (10,000 simulations) under two different priors for the counterfactual incidence: Gamma(10, 0.001) and Gamma(10, 0.002), where the second parameter is here a scale in events per person-year, corresponding to mean incidence rates of 1 and 2 per 100 PYFU, respectively. We emphasise that, without expert knowledge, this is an illustrative rather than a definitive analysis. The lower incidence rate is broadly consistent with the overall ∼30% efficacy of tuberculosis prophylaxis in HIV-infected patients (Ross et al. 2021); the higher value is the rate that the investigators postulated for the control regimen (post hoc, a substantial overestimate).
For the low incidence scenario, 22.2% of initial simulations had to be resampled because of violation of Eq. (6). The posterior median (90% credibility interval) AIR was 1.038 (0.347, 3.627) under resampling strategy (a), 1.033 (0.373, 3.228) under strategy (b), and 1.031 (0.357, 3.281) under strategy (c). For the high incidence scenario, only 0.6% of initial simulations had to be resampled. The posterior median (90% credibility interval) AIR under resampling strategy (a) was 1.009 (0.760, 1.370). The values under the other resampling strategies were almost identical (all within ±0.002). In general, our preference is to resample
Summary
We have described two approaches for calculating confidence limits for the AIR given a prespecified value of the counterfactual incidence: a closed-form solution based on a Taylor series expansion (delta method), and an iterative method based on the profile likelihood, for which R code is provided. The profile-likelihood method is preferred because of better coverage properties, but the delta method is valid when the experimental treatment is no less effective than the control treatment. The difference between the two methods is minimal when the counterfactual incidence is much larger than the observed incidence in both the control and experimental arms. We also describe a simple Bayesian approach when the counterfactual incidence rate can be represented as a simple prior distribution. However, more precise inference can be achieved by harnessing other data which inform the prior distribution (Glidden, Stirrup, and Dunn 2020).
Acknowledgments
We thank Doug Taylor for discussions on three-arm trials and Trinh Duong for her comments on the paper.

Research funding: David Glidden was supported by US National Institutes of Health grant R01AI143357.

Author contribution: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.

Competing interests: Authors state no conflict of interest.

Informed consent: Not applicable.

Ethical approval: Not applicable.
References
Cole, S. R., H. Chu, and S. Greenland. 2014. “Maximum Likelihood, Profile Likelihood, and Penalized Likelihood: A Primer.” American Journal of Epidemiology 179 (2): 252–60. https://doi.org/10.1093/aje/kwt245.
Dunn, D. T., and D. V. Glidden. 2019. “The Connection between the Averted Infections Ratio and the Rate Ratio in Active-Control Trials of Pre-exposure Prophylaxis Agents.” Statistical Communications in Infectious Diseases 11 (1): 20190006. https://doi.org/10.1515/scid-2019-0006.
Dunn, D. T., D. V. Glidden, O. T. Stirrup, and S. McCormack. 2018. “The Averted Infections Ratio: A Novel Measure of Effectiveness of Experimental HIV Pre-exposure Prophylaxis Agents.” Lancet HIV 5 (6): e329–34. https://doi.org/10.1016/s2352-3018(18)30045-6.
Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin. 1995. Bayesian Data Analysis. New York: Chapman & Hall. https://doi.org/10.1201/9780429258411.
Ghosh, P., F. Nathoo, M. Gonen, and R. C. Tiwari. 2011. “Assessing Noninferiority in a Three-Arm Trial Using the Bayesian Approach.” Statistics in Medicine 30 (15): 1795–808. https://doi.org/10.1002/sim.4244.
Glidden, D. V., O. T. Stirrup, and D. T. Dunn. 2020. “A Bayesian Averted Infection Framework for PrEP Trials with Low Numbers of HIV Infections: Application to the Results of the DISCOVER Trial.” Lancet HIV 7 (11): e791–6. https://doi.org/10.1016/s2352-3018(20)30192-2.
Graham, P. L., K. Mengersen, and A. P. Morton. 2003. “Confidence Limits for the Ratio of Two Rates Based on Likelihood Scores: Noniterative Method.” Statistics in Medicine 22: 2071–83. https://doi.org/10.1002/sim.1405.
Li, H.-Q., M.-L. Tang, and W.-K. Wong. 2014. “Confidence Intervals for Ratio of Two Poisson Rates Using the Method of Variance Estimates Recovery.” Computational Statistics 29: 869–89. https://doi.org/10.1007/s00180-013-0467-9.
Oehlert, G. W. 1992. “A Note on the Delta Method.” The American Statistician 46: 27–9. https://doi.org/10.1080/00031305.1992.10475842.
Pigeot, I., J. Schafer, J. Rohmel, and D. Hauschke. 2003. “Assessing Noninferiority of a New Treatment in a Three-Arm Clinical Trial Including a Placebo.” Statistics in Medicine 22 (6): 883–99. https://doi.org/10.1002/sim.1450.
Price, R. M., and D. G. Bonett. 2000. “Estimating the Ratio of Two Poisson Rates.” Computational Statistics & Data Analysis 34: 345–56. https://doi.org/10.1016/s0167-9473(99)00100-0.
Ross, J. M., A. Badje, M. X. Rangaka, A. S. Walker, A. E. Shapiro, K. K. Thomas, X. Anglaret, S. Eholie, D. Gabillard, A. Boulle, G. Maartens, R. J. Wilkinson, N. Ford, J. E. Golub, B. G. Williams, and R. V. Barnabas. 2021. “Isoniazid Preventive Therapy Plus Antiretroviral Therapy for the Prevention of Tuberculosis: A Systematic Review and Meta-Analysis of Individual Participant Data.” Lancet HIV 8 (1): e8–15. https://doi.org/10.1016/s2352-3018(20)30299-x.
Sahai, H., and A. Khurshid. 1993. “Confidence Intervals for the Ratio of Two Poisson Means.” The Mathematical Scientist 18: 43–50.
Snapinn, S., and Q. Jiang. 2008. “Preservation of Effect and the Regulatory Approval of New Treatments on the Basis of Noninferiority Trials.” Statistics in Medicine 27 (3): 382–91. https://doi.org/10.1002/sim.3073.
Swindells, S., R. Ramchandani, A. Gupta, C. A. Benson, J. Leon-Cruz, N. Mwelase, J. Juste, J. Lama, A. Valenica, A. Omoz-Oarhe, K. Supparatpinyo, G. Masheto, L. Mohapi, R. O. da Silva Escada, S. Mawlana, P. Banda, P. Severe, J. Hakim, C. Kanyama, D. Langat, L. Moran, J. Andersen, C. V. Fletcher, E. Nuermberger, and R. E. Chaisson, BRIEF TB/A5279 Study Team. 2019. “One Month of Rifapentine Plus Isoniazid to Prevent HIV-Related Tuberculosis.” New England Journal of Medicine 380 (11): 1001–11. https://doi.org/10.1056/nejmoa1806808.
Supplementary Material
The online version of this article offers supplementary material (https://doi.org/10.1515/scid-2021-0002).
© 2021 David T. Dunn et al., published by De Gruyter, Berlin/Boston
This work is licensed under the Creative Commons Attribution 4.0 International License.