Abstract
Objectives
The use of correlates of protection (CoPs) in vaccination trials offers significant advantages, as they are useful substitutes for clinical endpoints. Vaccines with very high vaccine efficacy (VE, 95% or above) are documented in the literature. Callegaro and Tibaldi (2019) showed that the rare infections observed in the vaccinated groups of these trials pose challenges when applying conventionally used statistical methods for CoP assessment, such as the Prentice criteria and meta-analysis. The objective of this work is to investigate the impact of this problem on another statistical method for the assessment of CoPs, called Principal Stratification.
Methods
We perform simulation experiments to investigate the effect of high vaccine efficacy on the performance of the Principal Stratification approach.
Results
Similarly to the Prentice framework, simulation results show that the power of the Principal Stratification approach decreases as the VE grows.
Conclusions
It can be challenging to validate principal surrogates (and statistical surrogates) for vaccines with very high vaccine efficacy.
Introduction
An important factor influencing the duration and complexity of clinical trials is the choice of the endpoint used to assess vaccine efficacy. It would be extremely convenient to replace a late, costly or rare true endpoint by an immunological surrogate, which is measured earlier, more cheaply, or more frequently. However, from a regulatory perspective, a surrogate endpoint (sometimes called a Surrogate of Protection, or Correlate of Protection) is not considered acceptable for the determination of efficacy unless it has been validated, i.e. shown to predict clinical benefit. Prentice (1989) introduced a formal definition of surrogacy based on the concept of mediation in a single-trial setting. Although appealing, Prentice's definition and criteria received criticism, such as (i) the assumption that the surrogate explains 100% of the VE is too restrictive; (ii) the approach can be susceptible to post-randomization selection bias; (iii) the immune response cannot be constant in the control group; etc. (Burzykowski, Molenberghs, and Buyse 2005; Frangakis and Rubin 2002). In subsequent decades, many statistical methods have been proposed for the evaluation of surrogate endpoints, most of them framed within the causal inference (Follmann 2006; Frangakis and Rubin 2002; Gilbert, Qin, and Self 2008) and meta-analytic paradigms (Buyse et al. 2000; Daniels and Hughes 1997; Gail et al. 2000).
Although not common, vaccines with very high efficacy are documented in the literature (Black et al. 2000; Lin et al. 2001; Mitra et al. 2016; Phua et al. 2012; Prymula et al. 2014; Wei et al. 2016). These include the Salmonella Typhi Vi conjugate vaccine (Mitra et al. 2016) and the combined measles-mumps-rubella-varicella immunisation (Prymula et al. 2014). Assessing Correlates of Protection (CoPs) in the context of high Vaccine Efficacy (VE) using classical statistical methods is problematic: the very small number of cases/infections in the vaccinated groups can trigger considerable issues for such statistical models. There is therefore a need to evaluate the statistical methods for CoP assessment in the context of high-efficacy vaccines. Callegaro and Tibaldi (2019) showed that the validation of a surrogate endpoint using the Prentice criteria and meta-analytic frameworks (by randomized subgroups in a single-trial setting) can be problematic in case of high VE because of the rare events available in the vaccine group. The aim of this paper is to evaluate the performance of the causal framework, specifically the Principal Surrogate approach (Follmann 2006; Gilbert, Qin, and Self 2008), in case of high VE.
Methods
The Prentice criteria
The following notation will be used throughout the manuscript: Y_j and S_j are random variables denoting the observed clinical (binary) endpoint and the surrogate endpoint for subject j = 1, …, n, and Z_j is a binary treatment indicator (Z=1 for the treatment and Z=0 for the control group).
Key concepts, including the hypothesis-testing approach to the validation of surrogate endpoints using randomised clinical trial data, were introduced by Prentice (1989). Prentice's four criteria for the validation of a surrogate endpoint can be evaluated using the following four models:
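A standard formulation of these four models is sketched below, following Prentice (1989); here g denotes a link function (the logit for a binary endpoint), and the exact parametrization is an assumption:

```latex
\begin{aligned}
\text{Model 1:}\quad & g\{E(Y_j \mid Z_j)\} = \mu_1 + \gamma\, Z_j \\
\text{Model 2:}\quad & E(S_j \mid Z_j) = \mu_2 + \beta\, Z_j \\
\text{Model 3:}\quad & g\{E(Y_j \mid S_j)\} = \mu_3 + \gamma_S\, S_j \\
\text{Model 4:}\quad & g\{E(Y_j \mid S_j, Z_j)\} = \mu_4 + \gamma_S\, S_j + \gamma_Z\, Z_j
\end{aligned}
```

Criteria 1–3 require significant effects of Z on Y, of Z on S, and of S on Y, respectively; criterion 4 concerns the coefficients of model 4, where the surrogate should remain predictive of the clinical endpoint after adjustment for treatment.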
In this paper we will mainly focus on criterion 4. This criterion is met if the null hypothesis H01: γ_S = 0 is rejected in model 4 (p-value(s) < α), while the treatment effect γ_Z is no longer significant.
Principal surrogate framework
Many causal inference approaches/methods have been published in the literature. In what follows, we describe the Vaccine Efficacy framework of Follmann and Gilbert (Follmann 2006; Gilbert, Qin, and Self 2008). Since S_i can be affected by treatment, there are two naturally occurring counterfactual values of S_i: S_i(1) under treatment, and S_i(0) under control. The observed clinical endpoint (binary) is denoted by Y_i, with counterfactual values Y_i(1) under treatment and Y_i(0) under control. Criteria for S to be a good surrogate are based on risk estimands that condition on the potential surrogate responses (Gilbert, Qin, and Self 2008).
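In the notation above, these risk estimands can be sketched as follows (following Gilbert, Qin, and Self 2008; the exact display is a reconstruction):

```latex
\mathrm{risk}_{(z)}(s_1, s_0) = P\{Y(z) = 1 \mid S(1) = s_1,\; S(0) = s_0\}, \qquad z = 0, 1.
```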
A contrast between risk_(1)(s(1), s(0)) and risk_(0)(s(1), s(0)) is a causal effect on the clinical endpoint. A classical contrast used in vaccines is the Vaccine Efficacy (VE).
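The VE contrast can be written in terms of the risk estimands above (a sketch, reconstructed from the surrounding definitions):

```latex
\mathrm{VE}(s_1, s_0) = 1 - \frac{\mathrm{risk}_{(1)}(s_1, s_0)}{\mathrm{risk}_{(0)}(s_1, s_0)} .
```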
A Principal Surrogate (PS) is a biomarker satisfying two conditions: causal necessity, meaning that equal potential surrogate values (s(1) = s(0)) imply no causal effect on the clinical endpoint, and Wide Effect Modification (WEM), meaning that VE(s(1), s(0)) varies widely as a function of s(1).
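In terms of the risk estimands, these two conditions are commonly stated as follows (a sketch based on Gilbert, Qin, and Self 2008 and Wolfson and Gilbert 2010):

```latex
\text{Causal necessity:}\quad \mathrm{risk}_{(1)}(s_1, s_0) = \mathrm{risk}_{(0)}(s_1, s_0) \ \text{ whenever } s_1 = s_0 ; \qquad
\text{WEM:}\quad \mathrm{VE}(s_1, s_0) \ \text{varies widely in } s_1 .
```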
WEM is similar in spirit to the Individual Causal Association (ICA) (Alonso et al. 2015), which is the correlation between the individual causal effects on the clinical endpoint and on the surrogate.
In this paper we only focus on WEM. In fact, previous works (Gabriel and Follmann 2016; Gabriel and Gilbert 2014; Gilbert, Qin, and Self 2008; Huang and Gilbert 2011; Wolfson and Gilbert 2010) suggest that the WEM criterion is of primary importance for a biomarker to be a PS. Furthermore, Alonso et al. (2015) showed that the average causal necessity definition may be extremely restrictive.
Estimating VE
Assumptions A1–A3 (A1: Stable unit treatment value assumption; A2: Ignorable treatment assignments; A3: Equal individual clinical risk up to the time of surrogate measurement) imply that risk_(z)(s(1), s(0)) would be identified if we knew the potential outcomes S_i(Z) of subjects assigned to the opposite treatment 1 − Z (Wolfson and Gilbert 2010).
It follows that it is necessary to impute (or integrate out) the missing potential biomarkers, after which the risk can be modeled using a logistic model. The model can be simplified in case of a Constant Biomarker (S_i(0) = c), in which case the risk depends on s(1) and z alone and the results are summarized through the VE curve.
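Under the constant-biomarker simplification, the fitted risk model (model (1), whose s(1)z interaction is tested later) and the VE curve can be sketched as follows; the parametrization is an assumption, chosen to match the pseval risk model Y ~ S.1 * Z used in the Appendix:

```latex
\operatorname{logit}\{\mathrm{risk}_{(z)}(s_1)\} = \beta_0 + \beta_1 s_1 + \beta_2 z + \beta_3 s_1 z , \qquad
\mathrm{VE}(s_1) = 1 - \frac{\mathrm{risk}_{(1)}(s_1)}{\mathrm{risk}_{(0)}(s_1)} .
```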
The constant biomarker assumption is reasonable when subjects have been selected to have no meaningful exposure to the pathogen, so that S(0) = 0. Examples include HIV (Follmann 2006) or varicella vaccine trials (Chan et al. 2002). This assumption is also reasonable for populations exposed to the pathogen when the biomarker S_i is the log10 Fold-Increase from baseline (FI_i), which is the difference between the log10 post-vaccination (A_i) and log10 baseline (B_i) values (FI_i = A_i − B_i).
Missing values imputation/integration
The key challenge in estimating these risk estimands is conditioning on counterfactual values that are not observable. This involves integrating out (or imputing) missing values based on some models, and under some set of assumptions and/or trial augmentations. Gilbert, Qin, and Self (2008) and Follmann (2006) proposed to use estimated maximum likelihood followed by bootstrap. Huang, Gilbert, and Wolfson (2013) suggested a pseudo-score estimation procedure that has a closed-form variance estimator. Miao et al. (2013) used a multiple imputation approach. In this paper we fit model (1) using the method implemented in the R package pseval (Sachs and Gabriel 2016): a Baseline Immunogenicity Predictor (BIP) is used, parameters are estimated by estimated maximum likelihood (the missing information is integrated out), and the variance is estimated by bootstrap. R code is provided in the Appendix. This approach is similar in spirit to the method used in Follmann (2006).
Results
Simulations of Callegaro and Tibaldi (2019)
To evaluate the impact of high vaccine efficacy on PS validation, we repeated the simulations of Callegaro and Tibaldi (2019). The Dunning regression model (Dunning 2006) was used to simulate the data in an ideal CoP setting, where the treatment effect is fully explained by the post-vaccination values (A_i), as follows:
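The Dunning generating model can be sketched as follows (a reconstruction: the exact parametrization is assumed, chosen to be consistent with the parameter values reported below):

```latex
P(Y_i = 1 \mid A_i) = \frac{\pi}{1 + \exp\{-(\mu + \gamma A_i)\}} .
```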
Here, π can be interpreted as the probability of being exposed to the disease. This model corresponds to the classical logistic model when all subjects are exposed (π=1).
Simulations were run using the following parameter assumptions: total sample size n=5,000, 1:1 randomization, π=0.1, μ=8.3, γ=log(1−0.95); the immune response post-vaccination is normally distributed, A|Z=0 ∼ N(3, 0.2) in the placebo group and A|Z=1 ∼ N(3 + Δ, 0.2) in the vaccine group, where Δ=0.33, 0.75, 1, 1.5 (0.2 is the variance of the normal distribution). The immune response at baseline is generated as B ∼ N(3, 0.2), with correlation between A and B of 0.90 in the placebo group and 0.50 in the vaccine group. For each scenario, 1,000 clinical trials were simulated.
We fit Prentice model 4 on the simulated data with the Fold-Increase (S_i = FI_i = A_i − B_i) as surrogate, adjusting for the baseline (B_i), using logit regression
and the scaled logistic model of Dunning (2006).
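The two inferential models can be sketched as follows (the exact parametrization is assumed; the scaled logistic model additionally estimates the exposure probability π, reported on the logit scale in Table 9):

```latex
\text{logit regression:}\quad \operatorname{logit}\{P(Y_i = 1)\} = \mu + \gamma_S\,\mathrm{FI}_i + \gamma_Z Z_i + \gamma_B B_i , \qquad
\text{scaled logistic:}\quad P(Y_i = 1) = \frac{\pi}{1 + \exp\{-(\mu + \gamma_S\,\mathrm{FI}_i + \gamma_Z Z_i + \gamma_B B_i)\}} .
```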
Note that this model is consistent with the model used to generate the data (Eq. (2)), with a slightly different parametrization. The power to meet Prentice criterion 4 (PC4) was measured as the proportion of simulated trials with p-value(s) < α (α=0.05).
Furthermore, we applied the Principal Surrogate approach to the vaccine-induced fold-increase (S(1)_i = FI(1)_i), where the missing information is integrated out using the baseline surrogate measurement (B_i). The power of the WEM approach was measured as the proportion of simulated trials with a significant Wald statistic for the s(1)z coefficient of model (1) (p-value(s(1)z) < α, α=0.05). The R code used to apply the PS approach is provided in the Appendix.
Table 1 shows that the power of both PC4 and WEM decreases when the VE increases. This is due to the fact that there is less information (fewer events) as the VE increases. Note that the power of the Prentice approach is higher than in Callegaro and Tibaldi (2019) because of the inclusion of the baseline surrogate as a covariate. Simulation results suggest similar power for the PC4 and WEM approaches.
Table 1:
Δ     VE    PC4 logistic  PC4 scaled logistic  WEM
0.33  0.41  0.94          0.96                 0.92
0.75  0.75  0.93          0.96                 0.92
1.00  0.87  0.89          0.95                 0.90
1.50  0.96  0.80          0.88                 0.73
Power (α=0.05) to assess Prentice criterion 4 (PC4) using the logistic and scaled logistic models, and power to assess the Wide Effect Modification (WEM) of a Principal Surrogate using the logistic model (p-value of the interaction s(1)z).
The performance of the two approaches depends on the correlation between A and B: the larger the correlation, the more informative the covariate B. To assess the role of the correlation, we replicated Table 1 with a smaller correlation between A and B (Cor(A,B)=0.5 in both the placebo and the vaccine group). Simulation results are shown in Table 2. When the correlation is smaller (i.e. when the covariate B is less informative) there is a greater loss of power at high VE for both approaches, especially for the PS approach. These results are aligned with the simulation results of Callegaro and Tibaldi (2019), which showed a similar loss of power for the Prentice method without covariates.
Table 2:
Δ     VE    PC4 logistic  PC4 scaled logistic  WEM
0.33  0.41  0.94          0.96                 0.98
0.75  0.75  0.89          0.96                 0.97
1.00  0.86  0.79          0.95                 0.92
1.50  0.96  0.69          0.95                 0.62
Power (α=0.05) to assess Prentice criterion 4 (PC4) using the logistic and scaled logistic models, and power to assess the Wide Effect Modification (WEM) of a Principal Surrogate using the logistic model (p-value of the interaction s(1)z).
Simulations with constant biomarker under placebo
In the previous simulations the Fold-Increase was not constant in placebo (it was normally distributed). To evaluate the performance of the Prentice and PS approaches in the case of a constant biomarker under placebo, which mimics vaccine trials in a naive population, we simulated data using the model described above. However, in the inferential models we replaced FI by FI*, which is constant in placebo. FI* is defined as
where c is the 99% quantile of the distribution of FI in Placebo.
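One plausible form of this definition, assuming a truncation at c that is zero for essentially all placebo subjects (an assumption, not necessarily the authors' exact definition), is:

```latex
\mathrm{FI}^{*}_{i} = \max(\mathrm{FI}_{i} - c,\; 0) .
```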
Table 3 shows some loss of power of the PS approach when the VE increases. Even if the use of the Prentice framework is not justified in this context, Table 3 also shows the results for Prentice criterion 4 (PC4, logistic model). Results for the PC4 scaled logistic model are not shown because the model did not converge. We observe a dramatic loss of power of Prentice criterion 4 when the VE is high.
Table 3:
Δ     VE    PC4 logistic  WEM
0.33  0.41  0.80          0.67
0.75  0.75  0.45          0.85
1.00  0.87  0.49          0.84
1.50  0.96  0.37          0.69
Power (α=0.05) to assess Prentice criterion 4 (PC4) using the logistic model, and power to assess the Wide Effect Modification (WEM) of a Principal Surrogate using the logistic model (p-value of the interaction s(1)z).
Note that Table 3 shows simulation results where the inferential models do not agree with the data-generating mechanism, so it represents a situation of model misspecification.
To disentangle the problem of model misspecification from the constant-biomarker problem, we generated additional constant-biomarker data using a model consistent with the inferential model used to fit the data. We simulated data using the following Dunning regression model:
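A sketch of this generating model (the parametrization is assumed, chosen to be consistent with the reported parameters μ, γ and γ_B and with the scaled logistic form used earlier):

```latex
P(Y_i = 1 \mid \mathrm{FI}^{*}_i, B_i) = \frac{\pi}{1 + \exp\{-(\mu + \gamma\,\mathrm{FI}^{*}_i + \gamma_B B_i)\}} .
```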
Here, π = 0.1 and the other parameters are chosen to mimic the Table 1 data: Δ=(0.33, 0.75, 1, 1.47), μ=(8.66, 9.45, 9.82, 9.41), γ=(−5.39, −5.15, −4.8, −4.45) and γ_B=(−2.31, −2.63, −2.79, −2.66).
Table 4 shows that the loss of power of the Prentice approach observed in Table 3 was mainly due to model misspecification: the power of PC4 logistic in Table 4 is substantially higher than in Table 3 when the VE is large.
Table 4:
Δ     VE    PC4 logistic  WEM
0.33  0.29  0.96          0.73
0.75  0.62  0.94          0.85
1.00  0.78  0.96          0.84
1.47  0.94  0.79          0.80
Power (α=0.05) to assess Prentice criterion 4 (PC4) using the logistic model, and power to assess the Wide Effect Modification (WEM) of a Principal Surrogate using the logistic model (p-value of the interaction s(1)z).
Simulations with low/moderate VE
For comparison, we considered simulations with low VE. We simulated data as described above with μ₁ = E(A|Z=1) = 3, 3.075, 3.15, 3.23, corresponding to estimated VE of about 0%, 10%, 20% and 30%, respectively. Note that Prentice criterion 1 will not be met in this situation. For simplicity, we focused only on Prentice criterion 4. Table 5 shows that both approaches (PC4 and WEM) are powerful in the case of low/moderate VE. Prentice criterion 4 seems to be slightly more powerful than PS.
Table 5:
Δ      VE     PC4 logistic  PC4 scaled logistic  WEM
0.000  −0.01  0.95          0.96                 0.92
0.075   0.09  0.95          0.96                 0.91
0.150   0.20  0.95          0.97                 0.91
0.250   0.31  0.95          0.97                 0.93
Power (α=0.05) to assess Prentice criterion 4 (PC4) using the logistic and scaled logistic models, and power to assess the Wide Effect Modification (WEM) of a Principal Surrogate using the logistic model (p-value of the interaction s(1)z).
Simulations using random intercept logistic (correlated potential outcomes)
Finally, we generated data in a way more closely aligned with the causal inference setting (potential outcomes). We generated correlated post-vaccination values (A(0), A(1)) from a bivariate normal distribution
with Δ=(0.33, 0.75, 1.1, 1.6). The mean and the variance of the baseline are the same as those of the post-dose surrogate in placebo. The correlation between baseline and post is 90% in the placebo group and 50% in the vaccinated group. We generated the correlated clinical outcomes using a logistic model with an individual random intercept (b_i).
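These two generating steps can be sketched as follows; the correlation ρ between the potential outcomes A(0) and A(1) and the exact parametrization are assumptions, with means and variances following the values stated above:

```latex
\begin{pmatrix} A_i(0) \\ A_i(1) \end{pmatrix} \sim
N\!\left( \begin{pmatrix} 3 \\ 3 + \Delta \end{pmatrix},\;
0.2 \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right),
\qquad
\operatorname{logit}\{P(Y_i(z) = 1 \mid A_i(z), b_i)\} = \mu + \gamma\,A_i(z) + b_i .
```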
The variables Y(0), Y(1) are conditionally independent given b but unconditionally (averaged over b) correlated. The extent of correlation depends on the variance of the random effect (var(b)).
We generated a bridge-distributed random intercept (using the R package bridgedist; Swihart 2016) such that the resulting marginal distribution follows a logistic regression model (Wang and Louis 2003). In fact, the marginal logistic regression model is logit(P(Y(z)=1 | A(z))) = μ/c + A(z)γ/c for z=0, 1, with c = √(1 + 3 var(b)/π²) (here π denotes the mathematical constant).
Table 6 shows that Prentice criterion 4 is more powerful than WEM.
Table 6:
Δ     VE    PC4 logistic  WEM
0.33  0.46  0.95          0.85
0.75  0.75  0.94          0.84
1.10  0.87  0.91          0.77
1.60  0.95  0.89          0.74
Power (α=0.05) to assess Prentice criterion 4 (PC4) using the logistic model, and power to assess the Wide Effect Modification (WEM) of a Principal Surrogate using the logistic model (p-value of the interaction s(1)z).
The Prentice framework is more powerful than PS for several reasons: (i) PS tests for an interaction, which is less powerful than a test for a main effect; (ii) the covariate S (the observed surrogate in vaccinated and placebo subjects) has a greater range in Prentice model 4 than the covariate S(1) in the PS model, and it is easier to estimate a slope for a covariate with a wider range. Figure 1 illustrates these differences.
Figure 1:
Case study: analysis of a simulated dataset with large VE
In this section we analyze one simulated dataset from the scenario with the largest VE of Table 1. The sample size is n=5,000, with 1:1 randomization. The numbers of events observed in the two groups are 3 and 90, with an estimated VE of 96% (95% CI: 89–98%). Figure 2 shows that the vaccine and placebo groups had similar log10 titer distributions at baseline, while there is a small overlap in distributions post-vaccination. Antibody responses clearly increased from baseline to post-dose in vaccine recipients but not in placebo recipients.
Figure 2:
Figure 3 shows the Spearman correlation between baseline and post-vaccination values (left panel) and between baseline and Fold-Increase (right panel).
Figure 3:
Prentice framework
First we examine the interaction between the surrogate and the treatment. Table 7 shows that there is no significant interaction (p-value=0.49).
Table 7:
Variable     Estimate  Std. error  z value  p-value
(Intercept)   0.833    0.717        1.161   0.245
Z            −0.576    1.696       −0.339   0.734
FI           −1.060    0.555       −1.909   0.056
B            −1.434    0.257       −5.583   0.000
group:FI     −0.966    1.401       −0.690   0.490
Secondly, we assess the four Prentice criteria. Table 8 shows that all criteria are met. In particular, the last four rows show the results related to criterion 4: the effect of the surrogate is significant (p-value(s)=0.019), while the treatment effect is not significant, though close to 5% (p-value(z)=0.078).
Table 8:
Criterion  Variable     Estimate  Std. error  z value  p-value
1          (Intercept)   0.433    0.698        0.620   0.535
1          Z            −3.448    0.588       −5.865   0.000
1          B            −1.293    0.249       −5.196   0.000
2          (Intercept)   0.956    0.031       30.734   0.000
2          Z             1.472    0.009      163.653   0.000
2          B            −0.317    0.010      −31.166   0.000
3          (Intercept)   1.092    0.702        1.556   0.120
3          FI           −1.983    0.285       −6.970   0.000
3          B            −1.545    0.250       −6.176   0.000
4          (Intercept)   0.825    0.717        1.150   0.250
4          Z            −1.644    0.933       −1.763   0.078
4          FI           −1.205    0.514       −2.345   0.019
4          B            −1.432    0.257       −5.574   0.000
Slightly better results are obtained if the Dunning model is used (see Table 9).
Table 9:
Variable     Estimate  Std. error  z value  p-value
(Intercept)   8.528    3.442        2.478   0.013
FI           −2.662    1.131       −2.353   0.019
Z            −0.620    1.305       −0.475   0.635
B            −2.978    0.963       −3.092   0.002
logit(pi)    −2.386    0.370       −6.440   0.000
In summary, there is suggestive though not strong evidence that the Fold-Increase is a Statistical Surrogate.
Principal surrogate framework
Table 10 shows the results from the R package pseval with 50 bootstrap samples (R code is provided in the Appendix). The interaction between the treatment group and FI(1) (the test for wide effect modification) is borderline significant (p-value=0.053).
Table 10:
Variable     Estimate  Boot SE  Lower CL 2.5%  Upper CL 97.5%  p-value
(Intercept)  −7.81     1.146    −10.35         −6.147          9.13×10^-12
FI(1)         2.64     0.573      1.79          3.891          4.18×10^-6
Z             2.90     2.846     −3.66          6.593          3.08×10^-1
FI(1):Z      −3.98     2.053     −8.11         −0.157          5.28×10^-2
Figure 4 shows the estimated VE curve for the Fold-Increase. The estimated VE curve is an increasing function of FI(1); however, there is large variability for small values of FI(1), and negative VEs for vaccine recipients with no rise.
Figure 4:
In summary, there is suggestive though not strong evidence that the Fold-Increase is a Principal Surrogate.
Discussion
Although not common, vaccines with very high efficacy (95% or above) are documented in the literature (Black et al. 2000; Lin et al. 2001; Mitra et al. 2016; Phua et al. 2012; Prymula et al. 2014; Wei et al. 2016). These trials raise the problem of assessing CoPs in a context where only a small number of cases/infections is available in the vaccinated groups.
Callegaro and Tibaldi (2019) showed that the validation of a surrogate endpoint using the Prentice criteria and metaanalytic frameworks (by randomized subgroups in single trial setting) can be problematic in case of high VE. In this paper, we evaluate the performance of the causal framework, specifically the Principal Surrogate (PS) approach (Follmann 2006; Gilbert, Qin, and Self 2008) in case of high VE.
First, we replicated the simulation study of Callegaro and Tibaldi (2019), where the clinical outcome was simulated using Prentice model 3 (assuming full mediation) and the Dunning model (Dunning 2006). These simulation results show that (i) adjusting for important covariates (such as the baseline surrogate) considerably improves the power of the Prentice approach in case of high VE, even if the model is misspecified; and (ii) the Prentice and PS frameworks have similar power. The power of both approaches decreases as the VE grows.
Second, we slightly modified the Callegaro and Tibaldi scenario to consider the case of a constant biomarker under placebo and the case of small/moderate VE. Simulation results show that (i) PS is more powerful than Prentice in the constant-biomarker case when the inferential model is misspecified, otherwise Prentice is more powerful; and (ii) both the Prentice criterion 4 and PS frameworks are powerful when the VE is small (see Table 5). However, in this case Prentice criterion 1 is not met, so the two approaches give different conclusions.
Finally, we simulated correlated potential outcome data using a bivariate (random intercept) logistic regression. In this case the Prentice framework is more powerful than the PS approach. This can be due to the following reasons: (i) Prentice model 4 corresponds to the model used to generate the data, so there is no lack of fit in the Prentice framework; (ii) PS tests for an interaction, which is less powerful than a test for a main effect; (iii) the covariate S (the observed surrogate in vaccinated and placebo subjects) has a greater range in Prentice model 4 than the covariate S(1) in the PS model, and it is easier to estimate a slope for a covariate with a wider range (see Figure 1); (iv) Principal Stratification has to impute S(1) for placebo participants, which increases the variability of the estimates relative to knowing S(1); in contrast, S is known for all subjects in the Prentice framework.
For computational reasons, we performed a relatively small number of iterations (1,000). A larger number of iterations could be considered in the future using multiple processors. The computationally intensive part is the bootstrap of the PS approach: as an example, 200 resamples on the case study required 14 min. To mitigate the computational load, it may be useful in the future to derive asymptotic formulas approximating the bootstrap approach.
It is important to highlight that the power comparison between the two approaches should be interpreted with care. In fact, the two approaches measure two different things: the Prentice framework evaluates whether the surrogate is a "statistical surrogate", while PS evaluates whether the surrogate is a "principal surrogate" (see Gilbert et al. (2015) for more details).
For illustration, we analyzed one dataset simulated with full mediation (Dunning model) and with high VE (estimated VE of 96%, 95% CI: 89–98%).
In conclusion, we evaluated by simulation the impact of high VE on the PS approach. Similarly to the Prentice framework, we showed that the power decreases when the VE grows. It follows that it can be challenging to validate a principal surrogate (and a statistical surrogate) when rare infections are observed in the vaccinated groups.
Funding source: GSK Vaccines

Research funding: GlaxoSmithKline Biologicals SA was the funding source for all costs associated with the development and the publishing of the present manuscript.

Conflict of Interest: AC and FT are employees of the GlaxoSmithKline group of companies. AC and FT own stock options in the GlaxoSmithKline group of companies. Prof DF declares that he has no conflict of interest.
Appendix: R Code used to apply the PS approach
library("pseval")

binary.est <- psdesign(data, Z = group, Y = y, S = FI, BIP = B) +
  integrate_parametric(S.1 ~ BIP) +
  risk_binary(model = Y ~ S.1 * Z, D = 50, risk = risk.logit) +
  ps_estimate(method = "BFGS")

binary.boot <- binary.est +
  ps_bootstrap(n.boots = 200, progress.bar = FALSE,
               start = binary.est$estimates$par, method = "BFGS")
References
Alonso, A., W. Van der Elst, G. Molenberghs, M. Buyse, and T. Burzykowski. 2015. "On the Relationship between the Causal-Inference and Meta-Analytic Paradigms for the Validation of Surrogate Endpoints." Biometrics 71: 15–24. https://doi.org/10.1111/biom.12245.
Black, S., H. Shinefield, B. Fireman, E. Lewis, P. Ray, J. R. Hansen, L. Elvin, K. M. Ensor, J. Hackell, G. Siber, F. Malinoski, D. Madore, I. Chang, R. Kohberger, W. Watson, R. Austrian, and K. Edwards. 2000. "Efficacy, Safety and Immunogenicity of Heptavalent Pneumococcal Conjugate Vaccine in Children." The Pediatric Infectious Disease Journal 19 (3): 187–95. https://doi.org/10.1097/00006454-200003000-00003.
Burzykowski, T., G. Molenberghs, and M. Buyse. 2005. The Evaluation of Surrogate Endpoints. New York: Springer. https://doi.org/10.1007/b138566.
Buyse, M., G. Molenberghs, T. Burzykowski, D. Renard, and H. Geys. 2000. "The Validation of Surrogate Endpoints in Meta-Analyses of Randomized Experiments." Biostatistics 1: 49–67. https://doi.org/10.1093/biostatistics/1.1.49.
Callegaro, A., and F. Tibaldi. 2019. "Assessing Correlates of Protection in Vaccine Trials: Statistical Solutions in the Context of High Vaccine Efficacy." BMC Medical Research Methodology 19: 47. https://doi.org/10.1186/s12874-019-0687-y.
Chan, I. S. F., S. Li, H. Matthews, C. Chan, R. Vessey, J. Sadoff, and J. Heyse. 2002. "Use of Statistical Models for Evaluating Antibody Response as a Correlate of Protection against Varicella." Statistics in Medicine 21 (22): 3411–30. https://doi.org/10.1002/sim.1268.
Daniels, M. J., and M. D. Hughes. 1997. "Meta-analysis for the Evaluation of Potential Surrogate Markers." Statistics in Medicine 16: 1965–82. https://doi.org/10.1002/(SICI)1097-0258(19970915)16:17<1965::AID-SIM630>3.0.CO;2-M.
Dunning, A. J. 2006. "A Model for Immunological Correlates of Protection." Statistics in Medicine 25 (9): 1485–97. https://doi.org/10.1002/sim.2282.
Follmann, D. 2006. "Augmented Designs to Assess Immune Response in Vaccine Trials." Biometrics 62 (4): 1161–9. https://doi.org/10.1111/j.1541-0420.2006.00569.x.
Frangakis, C. E., and D. B. Rubin. 2002. "Principal Stratification in Causal Inference." Biometrics 58 (1): 21–9. https://doi.org/10.1111/j.0006-341x.2002.00021.x.
Gabriel, E., and D. Follmann. 2016. "Augmented Trial Designs for Evaluation of Principal Surrogates." Biostatistics 17 (3): 453–67. https://doi.org/10.1093/biostatistics/kxv055.
Gabriel, E., and P. Gilbert. 2014. "Evaluating Principal Surrogate Endpoints with Time-to-Event Data Accounting for Time-Varying Treatment Efficacy." Biostatistics 15 (2): 251–65. https://doi.org/10.1093/biostatistics/kxt055.
Gail, M. H., R. Pfeiffer, H. C. V. Houwelingen, and R. Carroll. 2000. "On Meta-Analytic Assessment of Surrogate Outcomes." Biostatistics 1: 231–46. https://doi.org/10.1093/biostatistics/1.3.231.
Gilbert, P. B., E. E. Gabriel, Y. Huang, and I. S. F. Chan. 2015. "Surrogate Endpoint Evaluation: Principal Stratification Criteria and the Prentice Definition." Journal of Causal Inference 3: 157–75. https://doi.org/10.1515/jci-2014-0007.
Gilbert, P. B., L. Qin, and S. G. Self. 2008. "Evaluating a Surrogate Endpoint at Three Levels, with Application to Vaccine Development." Statistics in Medicine 27 (23): 4758–78. https://doi.org/10.1002/sim.3122.
Huang, Y., and P. Gilbert. 2011. "Comparing Biomarkers as Principal Surrogate Endpoints." Biometrics 67 (4): 1442–51. https://doi.org/10.1111/j.1541-0420.2011.01603.x.
Huang, Y., P. B. Gilbert, and J. Wolfson. 2013. "Design and Estimation for Evaluating Principal Surrogate Markers in Vaccine Trials." Biometrics 69: 301–9. https://doi.org/10.1111/biom.12014.
Lin, F. Y. C., V. A. Ho, H. B. Khiem, D. D. Trach, P. V. Bay, T. C. Thanh, Z. Kossaczka, D. A. Bryla, J. Shiloach, J. B. Robbins, R. Schneerson, S. C. Szu, M. N. Lanh, S. Hunt, L. Trinh, and J. B. Kaufman. 2001. "The Efficacy of a Salmonella Typhi Vi Conjugate Vaccine in Two-to-Five-Year-Old Children." New England Journal of Medicine 344 (17): 1263–9. https://doi.org/10.1056/nejm200104263441701.
Miao, C., X. Li, P. Gilbert, and I. Chan. 2013. "A Multiple Imputation Approach for Surrogate Marker Evaluation in the Principal Stratification Causal Inference Framework." In Risk Assessment and Evaluation of Predictions. New York: Springer. https://doi.org/10.1007/978-1-4614-8981-8_18.
Mitra, M., N. Shah, A. Ghosh, S. Chatterjee, I. Kaur, N. Bhattacharya, and S. Basu. 2016. "Efficacy and Safety of Vi-Tetanus Toxoid Conjugated Typhoid Vaccine (Pedatyph) in Indian Children: School Based Cluster Randomized Study." Human Vaccines & Immunotherapeutics 12 (4): 939–45. https://doi.org/10.1080/21645515.2015.1117715.
Phua, K. B., F. S. Lim, Y. L. Lau, E. A. S. Nelson, L. M. Huang, S. H. Quak, B. W. Lee, L. J. van Doorn, Y. L. Teoh, H. Tang, P. V. Suryakiran, I. V. Smolenov, H. L. Bock, and H. H. Han. 2012. "Rotavirus Vaccine RIX4414 Efficacy Sustained during the Third Year of Life: A Randomized Clinical Trial in an Asian Population." Vaccine 30 (30): 4552–7. https://doi.org/10.1016/j.vaccine.2012.03.030.
Prentice, R. L. 1989. "Surrogate Endpoints in Clinical Trials: Definition and Operational Criteria." Statistics in Medicine 8 (4): 431–40. https://doi.org/10.1002/sim.4780080407.
Prymula, R., M. R. Bergsaker, S. Esposito, L. Gothefors, S. Man, N. Snegova, M. Štefkovičova, V. Usonis, J. Wysocki, M. Douha, V. Vassilev, O. Nicholson, B. L. Innis, and P. Willems. 2014. "Protection against Varicella with Two Doses of Combined Measles-Mumps-Rubella-Varicella Vaccine versus One Dose of Monovalent Varicella Vaccine: A Multicentre, Observer-Blind, Randomised, Controlled Trial." The Lancet 383 (9925): 1313–24. https://doi.org/10.1016/s0140-6736(12)61461-5.
Sachs, M. C., and E. E. Gabriel. 2016. "An Introduction to Principal Surrogate Evaluation with the Pseval Package." The R Journal 8: 277–92. https://doi.org/10.32614/rj-2016-046.
Swihart, B. 2016. Bridgedist: An Implementation of the Bridge Distribution with Logit-Link as in Wang and Louis (2003). R package version 0.1.0. https://CRAN.R-project.org/package=bridgedist.
Wang, Z., and T. A. Louis. 2003. "Matching Conditional and Marginal Shapes in Binary Random Intercept Models Using a Bridge Distribution Function." Biometrika 90: 765–75. https://doi.org/10.1093/biomet/90.4.765.
Wei, M., F. Meng, S. Wang, J. Li, Y. Zhang, Q. Mao, Y. Hu, P. Liu, N. Shi, H. Tao, K. Chu, Y. Wang, Z. Liang, X. Li, and F. Zhu. 2016. "Two-Year Efficacy, Immunogenicity, and Safety of Vigoo Enterovirus 71 Vaccine in Healthy Chinese Children: A Randomised Open-Label Study." The Journal of Infectious Diseases 215: 56–63. https://doi.org/10.1093/infdis/jiw502.
Wolfson, J., and P. Gilbert. 2010. "Statistical Identifiability and the Surrogate Endpoint Problem, with Application to Vaccine Trials." Biometrics 66 (4): 1153–61. https://doi.org/10.1111/j.1541-0420.2009.01380.x.
© 2021 Andrea Callegaro et al., published by De Gruyter, Berlin/Boston
This work is licensed under the Creative Commons Attribution 4.0 International License.