## Abstract

While the HVTN 505 trial showed no overall efficacy of the tested vaccine to prevent HIV infection over placebo, markers measuring immune response to vaccination were strongly correlated with infection. This finding generated the hypothesis that some marker-defined vaccinated subgroups were partially protected whereas others had their risk increased. This hypothesis can be assessed using the principal stratification framework (Frangakis and Rubin, 2002) for studying treatment effect modification by an intermediate response variable, using methods in the sub-field of principal surrogate (PS) analysis that studies multiple principal strata. Unfortunately, available methods for PS analysis require an augmented study design not available in HVTN 505, and make untestable structural risk assumptions, motivating a need for more robust PS methods. Fortunately, another sub-field of principal stratification, survivor average causal effect (SACE) analysis (Rubin, 2006) – which studies effects in a single principal stratum – provides many methods not requiring an augmented design and making fewer assumptions. We show how, for a binary intermediate response variable, methods developed for SACE analysis can be adapted to PS analysis, providing new and more robust PS methods. Application to HVTN 505 supports that the vaccine partially protected individuals with vaccine-induced T-cells expressing certain combinations of functions.

## 1 Introduction

A vaccine that prevents HIV-1 infection is critically needed for ending the global HIV-1 pandemic. A recent efficacy trial of a candidate HIV-1 vaccine – HVTN 505 – randomized 2,496 HIV-1 negative volunteers in 1:1 allocation to the DNA/rAd5 vaccine regimen or placebo [1] administered at months 0, 1, 3, 6. The primary objective compared the rate of HIV-1 infection from 6.5 to 24 months between the randomized treatment arms, with estimated cumulative incidence 4.62% (3.15%) in the vaccine (placebo) group, and cumulative incidence ratio 1.46 (95% CI 0.82 to 2.63, Wald test *p* = 0.20). Through a 2-phase sampling design, Janes et al. [2] studied HIV-1 Envelope-specific CD8+ T cell responses measured %by intracellular cytokine staining 2 weeks post last vaccination (Month 6.5) as a correlate of HIV-1 infection between 6.5 and 24 months, and found in vaccine recipients a strong inverse correlation between CD8+ T cell polyfunctionality score (PFS) and HIV-1 infection (*p* < 0.001). The surprising strength of the correlate [e.g., estimated cumulative risk 0.160 and 0.034 for vaccinated subgroups with PFS below and above the median response, compared to 0.070 for placebo recipients (Figure 4 of [2]), suggests the possibility that a qualitative interaction occurred, where some marker-defined vaccinated subgroups received partial protection from vaccination whereas others had their risk increased.

Janes et al.’s observation of intermediate risk of the placebo group between that of the biomarker response-defined vaccinated subgroups does not imply causal vaccine vs. placebo effect modification across these subgroups, because post-randomization selection bias could occur (e.g., due to an unmeasured genetic factor predictive of both the PFS and HIV infection). A direct assessment of a causal vaccine effect (eliminating possible selection bias) would compare risk for each vaccinated subgroup defined by biomarker response value to that of the placebo recipient subgroup who would have had the same biomarker response value if assigned vaccination, which essentially repeats the primary analysis of vaccine efficacy (which is valid based on the randomization) across the marker-defined subgroups. This “principal surrogate (PS) analysis” [3] is a sub-field of the general framework of principal stratification established by Frangakis and Rubin [3].

However, available methods for PS analysis make strong structural risk assumptions that do not hold in HVTN 505, and require study design augmentations such as close-out placebo vaccination to measure counterfactual biomarker response [4] that were not utilized in HVTN 505. Given the high threshold of evidence needed to convince scientists of a qualitative interaction, application of these methods would be inadequate – rather PS analysis with methods that rely on weaker assumptions are needed. Fortunately, many nonparametric and semiparametric principal stratification methods are available that do not require a design augmentation, many of which were developed for applications in another sub-field of principal stratification – “survivor average causal effect (SACE) analysis” (e.g., [5, 6])– which we define as principal stratification methods focused on a single principal stratum (e.g., always survivors). We describe how such methods designed for SACE analysis can be adapted to PS analysis for the special case of a binary intermediate outcome. This adaptation contributes novel PS methods, by (1) applying to clinical trials without design augmentations; (2) relaxing the strong assumption of no individual-level clinical treatment effects before the biomarker is measured; (3) avoiding strong structural placebo conditional risk assumptions; and (4) extending SACE methods to study designs that only measure the intermediate outcome in a subset of participants (e.g., two-phase/case-control studies), as in HVTN 505. The PFS marker studied by Janes et al. [2] is a summary measure of many different cellular expression patterns, which are binary (expressed vs. not expressed), motivating methods for a binary intermediate outcome.

## 2 Principal Surrogate (PS) Target Parameters and the HVTN 505 Application

A common objective of randomized clinical trials is to assess biomarker response endpoints as principal surrogate endpoints for the clinical endpoint of interest [3]. With *Z* denoting binary treatment assignment and *S ^{τ}* denoting a biomarker measured at a fixed time point

*τ*post-randomization and

*Y*denoting the binary clinical endpoint of interest measured after

*τ*, the PS estimand of interest is the “principal effects” or “causal effect predictiveness” (CEP) surface [7]:

*CEP*(

*s*

_{1},

*s*

_{0}) =

*h*(

*risk*

_{1}(

*s*

_{1},

*s*

_{0}),

*risk*

_{0}(

*s*

_{1},

*s*

_{0})), where

*risk*(

_{z}*s*

_{1},

*s*

_{0}) ≡

*P*(

*Y*(

*z*) = 1∣

*S*(1) =

^{τ}*s*

_{1},

*S*(0) =

^{τ}*s*

_{0},

*Y*(1) =

^{τ}*Y*(0) = 0) for

^{τ}*z*= 0, 1. Here

*Y*(

^{τ}*z*) is the indicator that the clinical endpoint occurs by time

*τ*, such that the population for inference is free of the endpoint under both treatment assignments when

*S*is measured. In HVTN 505,

^{τ}*S*measures an immune response to vaccination and

^{τ}*Y*= 1 is subsequent HIV-1 infection, and it would be problematic to include individuals already infected with HIV-1, because their

*S*values would be affected by the natural immune response to HIV-1 infection, making the results uninterpretable. Therefore,

^{τ}*CEP*(

*s*

_{1},

*s*

_{0}) contrasts the risk of infection between the vaccine and placebo arms among those who would be infection-free at time

*τ*under either treatment assignment and have biomarker measures

*s*

_{1}and

*s*

_{0}under assignment to vaccine and placebo, respectively. The objective of PS analysis is inference about the

*CEP*(

*s*

_{1},

*s*

_{0}) parameters and contrasts in these parameters, which has been addressed using a variety of methodological approaches including estimated maximum likelihood [4, 7, 8], pseudo-score estimating equations [9], principal score methods [10, 11, 12], and Bayesian methods [13, 14, 15].

We consider PS methods assuming either “No Early Effect” (**NEE**), “No Early Harm” (**NEH**), or “No Early Benefit” (**NEB**) of treatment on *Y* by time *τ*, as defined below as **A4**, **A4′**, and **A4″**. For the PS methods assuming **NEH** or **NEB**, we focus on the special case that *Z* = 0 is a control condition such as placebo, and there is no variability of *S ^{τ}* in subjects assigned to

*Z*= 0, i.e.,

*P*(

*S*(0) = 0∣

^{τ}*Y*(0) = 0) = 1. This “Constant Biomarker (CB)” case occurs in placebo-controlled preventive vaccine efficacy trials that only enroll subjects not previously infected with the pathogen under study and for which the intermediate response endpoint is a readout from a validated bioassay designed to only detect an immune response

^{τ}*specific*to the pathogen under study [7]; HVTN 505 is a typical example. For methods assuming

**NEE**, we consider both the special Case CB and the general case where

*S*(0) varies and a monotonicity assumption holds (

^{τ}**A5**below). These scenarios are chosen because they commonly occur in applications. Table 1 describes the interpretation of the

*CEP*(

*s*

_{1},

*s*

_{0}) parameters for HVTN 505. Because Case CB holds,

*CEP*(1,0) and

*CEP*(0, 0) are of interest. For a given binary immune response biomarker

*S*,

^{τ}*VE*(1) =

*CEP*(1, 0) is vaccine efficacy for the subgroup with positive response if assigned vaccine and negative response if assigned placebo, and

*VE*(0) =

*CEP*(0, 0) is vaccine efficacy for the subgroup with negative response under both treatment assignments. Our scientific goal is to study whether vaccine efficacy differs between these two subgroups, i.e.

*μ*≡

*VE*(1) −

*VE*(0) ≠ 0, and we also seek to assess evidence for a qualitative interaction, defined by

*VE*(1) and

*VE*(0) having opposite signs.

Variable | Description |
---|---|

Z | Randomized treatment assignment to vaccine (z = 1) or placebo (z = 0) |

Y | Indicator of diagnosis of HIV-1 infection after τ = 6.5 months post enrollment through 24 months (outcome of interest) |

Y^{τ} | Indicator of diagnosis of HIV-1 infection by τ = 6.5 months |

S^{τ} | Binary immune response biomarker measured at τ (eligible if Y = 0)^{τ} |

risk(_{z}s_{1}, s_{0}) | = P(Y(z) = 1∣ S(1) = ^{τ}s_{1}, S(0) = ^{τ}s_{0}, Y(1) = ^{τ}Y(0) = 0):^{τ} |

Probability of infection diagnosis if assigned Z = z for the subgroup uninfected at τ under both treatment assignments and with biomarker | |

measures s_{1} and s_{0} under assignment to vaccine and placebo, respectively. HVTN 505 studies binary biomarkers S described in Section 7.^{τ} | |

CEP(s_{1}, s_{0}) | = h(risk_{1}(s_{1}, s_{0}), risk_{0}(s_{1}, s_{0})) with h(x, y) a contrast function satisfying h(x, y) = 0 if and only if x = y and h(x, y) < 0 for x < y. |

Case CB | A study where S is constant (“Constant Biomarker”) in ^{τ}Z = 0 participants. |

VE(s_{1}) | = CEP(s_{1}, 0) for s_{1} ∈ {0, 1} with h(x, y) = 1−x/y. In HVTN 505, Case CB holds (P(S = 0) = 1). ^{τ}VE(1) and VE(0) are vaccine efficacy for the subgroups with biomarker response if assigned vaccine S(1) = 1 and ^{τ}S(1) = 0, respectively.^{τ} |

## 3 Assumptions and Identifiability of the PS Target Parameters *CEP*(*s*_{1}, *s*_{0})

### 3.1 Additional Notation and Identifiability Assumptions

Let *W* be a vector of baseline covariates measured in everyone, and *R* be the indicator of whether the binary endpoint *S ^{τ}* is measured at

*τ*. To fit the most common applications in clinical trials we assume

*Y*is binary; in the case of a time-to-event outcome,

*Y*=

*I*(

*T*≤

*t*), where

*T*is the time from randomization until the endpoint and

*t*>

*τ*is a fixed time point of interest. Note that

*S*is undefined if

^{τ}*T*≤

*τ*, which we denote as

*S*= *. Define

^{τ}*p*(

*s*

_{1},

*s*

_{0}) ≡

*P*(

*S*(1) =

^{τ}*s*

_{1},

*S*(0) =

^{τ}*s*

_{0}∣

*Y*(1) =

^{τ}*Y*(0) = 0) for (

^{τ}*s*

_{1},

*s*

_{0}) ∈ {(0, 0), (1, 0), (1, 1)}.

Throughout we make a baseline set of assumptions made in essentially all previous frequentist inference SACE papers. We assume the

*Y _{i}*(1),

*Y*(0)),

_{i}*i*= 1, …,

*n*, are iid [with observed data

*R*= 1] and assume

_{i}**A1**

*SUTVA (Stable Unit Treatment Value Assumption)*;

**A2**

*Ignorable Treatment Assignment*: Conditional on

*W*,

*Z*is independent of (

*Y*(1),

^{τ}*Y*(0),

^{τ}*S*(1),

^{τ}*S*(0),

^{τ}*Y*(1),

*Y*(0)); and

**A3**

*No Censoring or Random Censoring*: The binary endpoint

*Y*is observed for all participants, or, if

*Y*=

*I*(

*T*≤

*t*) with

*T*subject to right-censoring

*C*,

*C*(

*z*) is random conditional on

*W*(

*T*(

*z*) ⊥

*C*(

*z*)∣

*W*for

*z*= 0, 1). As introduced above, we develop the methods under each of three assumptions regarding early final endpoint events before the intermediate endpoint

*S*is measured at

^{τ}*τ*:

**A4***No Early Effect* (**NEE**): *P*(*Y ^{τ}*(1) =

*Y*(0)) = 1

^{τ}**A4′***No Early Harm* (**NEH**): *P*(*Y ^{τ}*(1) ≤

*Y*(0)) = 1

^{τ}**A4″***No Early Benefit* (**NEB**): *P*(*Y ^{τ}*(1) ≥

*Y*(0)) = 1

^{τ}We also assume *p*(1, 0) > 0, which holds trivially for any PS application of interest. **A4′** and **A4″** are standard SACE monotonicity assumptions for the analysis of {*risk*_{1}, *risk*_{0}}, which weakens the strong **NEE** assumption made in almost all papers on PS methods.

For the non-Case CB scenario where the biomarker *S ^{τ}*(0) varies we also reduce the number of non-identified terms via a biomarker monotonicity assumption:

**A5***Biomarker Monotonicity*: *P*(*S ^{τ}*(0) ≤

*S*(1)∣

^{τ}*Y*(1) =

^{τ}*Y*(0) = 0) = 1.

^{τ}**A5** states that among participants who will not experience the clinical outcome *Y* = 1 by *τ* regardless of treatment assignment, none have a negative treatment effect on the biomarker.

We develop the methods under four scenarios of assumption sets– **NEE-VB**: **A1–A5** hold and Variable Biomarker (**VB**) *S ^{τ}*(0);

**NEE-CB**:

**A1–A4**hold and Constant Biomarker (

**CB**);

**NEH-CB**:

**A1–A3**,

**A4′**hold and Constant Biomarker;

**NEB-CB**:

**A1–A3**,

**A4″**hold and Constant Biomarker.

### 3.2 Identifiability of *CEP*(*s*_{1}, *s*_{0})

There are 16 basic principal strata corresponding to all possible combinations of values for (*S ^{τ}*(1),

*S*(0),

^{τ}*Y*(1),

^{τ}*Y*(0)). Our assumptions, corresponding to the four scenarios stated above, imply that many of these basic principal strata are empty. Our interest lies in parameters defined for four principal strata: three defined by the basic principal strata

^{τ}*S*(1) =

^{τ}*s*

_{1},

*S*(0) =

^{τ}*s*

_{0},

*Y*(1) =

^{τ}*Y*(0) = 0 with (

^{τ}*s*

_{1},

*s*

_{0}) ∈ (0, 0), (1, 0), and (1, 1), and the fourth defined by

*Y*(1) =

^{τ}*Y*(0) = 0. Identification of these parameters requires additional assumptions that can be expressed using sensitivity parameters. Once sensitivity parameters are fixed, the parameters of interest become identifiable. Derivations for identification are given in Web Appendix A. For scenarios

^{τ}**NEE-VB**,

**NEE-CB**,

**NEH-CB**, and

**NEB-CB**, the numbers of required sensitivity parameters are 2, 1, 4, and 2.

## 4 SACE Methods and their Translation to PS Analysis

### 4.1 SACE Methods

In this section we use generic notation (*Z*, *S*, *Y*) to define the general SACE parameter [16], and in Section 4.2 show how it maps to the notation for the PS problem. A common objective is to assess the effect of a randomized binary treatment *Z* on an outcome *Y*, where *Y* is only defined or observable for participants with a post-randomization intermediate response variable *S* equal to a certain level (*S* = 1, say). The SACE is the average causal effect of *Z* on *Y* in the subgroup with *S* = 1 under both treatment assignments:

where *h*(*x*, *y*) is a contrast function meeting the requirements described in Table 1. Here *S*(*z*) and *Y*(*z*) are potential outcomes of *S* and *Y* if a subject is assigned to treatment *Z* = *z*.

The large SACE methods literature has focused on the additive-difference contrast *h*(*x*, *y*)

= *x* − *y* and includes techniques for nonparametric bounds [17, 18, 19, 20, 21] and techniques for sensitivity analysis that estimate the SACE under a spectrum of selection bias models [22, 23, 24, 25, 26, 27]. Several of these methods make the SACE monotonicity assumption [*P*(*S*(0) ≤ *S*(1)) = 1] and several relax this assumption. We use a general contrast function *h*(*x*, *y*) because vaccine efficacy is typically measured by a multiplicative reduction in risk rather than an additive difference.

### 4.2 SACE and PS Methods as Two Strains of Principal Stratification

For PS applications, the multiple SACE-defining intermediate outcomes *S* = 1 depend on both *Y ^{τ}* and

*S*(detailed in Table 2). Because

^{τ}*S*is typically categorical or continuous, PS applications makes inference across a spectrum of principal stratum subgroups, whereas SACE applications make inference for a single principal stratum. Focusing on a single stratum has facilitated development of nonparametric and semiparametric SACE methods that need only deal with one or a few terms that are not identified from the observed data, whereas an attempt to apply SACE methods for each principal stratum defined by a general categorical

^{τ}*S*would face a much larger number of non-identified terms that grows with the number of categories. However, there is an important special case for which SACE methods can be practically adapted– when

^{τ}*S*is binary– for which only a few principal strata are of interest. The motivating HVTN 505 application has binary markers of interest.

^{τ}Assumption Sets (Identifiability Assumptions) | ||||
---|---|---|---|---|

Target Parameter | NEE-VB | NEE-CB | NEH-CB | NEB-CB |

(A1–A5) | (A1–A4) | (A1–A3, A4′) | (A1–A3, A4″) | |

SACE_{AEU} = | Nonpar. | Nonpar. | Any SACE | Any SACE |

h(risk_{1}, risk_{0}) | ident. | ident. | method w/ | method w/ |

SACE with | mon. A4′^{[a]} | mon. A4″^{[b]} | ||

S ≡ 1 − Y^{τ} | (0 S.P.s) | (0 S.P.s) | (1 S.P.) | (1 S.P.) |

SACE_{AEUB0} = | Any SACE | Any SACE | Any SACE | Any SACE |

h(risk_{1}(0, 0), risk_{0}(0, 0)) | method w/ | method w/ | method w/o | method w/ |

SACE with | mon. A5^{[c]},^{[d]} | mon. A5^{[d]} | mon.^{[e]} | mon.^{[f]} |

S ≡ [1 − Y][1 − ^{τ}S]^{τ} | (1 S.P.) | (1 S.P.) | (3 S.P.s) | (1 S.P.) |

SACE_{AEUB1} = | Any SACE | N/A | N/A | N/A |

h(risk_{1}(1, 1), risk_{0}(1, 1)) | method w/ | N/A | N/A | N/A |

SACE with | mon. A5^{[g]} | N/A | N/A | N/A |

S ≡ [1 − Y]^{τ}S^{τ} | (1 S.P.) | N/A | N/A | N/A |

Under each of the four assumption sets defined above, we show how to apply SACE methods to the PS problem: inference about *CEP*(0, 0), *CEP*(1, 0), and *CEP*(1, 1). First, we write the overall conditional risks, *risk _{z}* ≡

*P*(

*Y*(

*z*) = 1∣

*Y*(1) =

^{τ}*Y*(0) = 0) for

^{τ}*z*= 0, 1, as weighted averages of the marker-subgroup conditional risks:

Equation (2) uses the fact that *p*(0, 1) = 0 under the assumptions we use.

The *risk _{z}* and

*risk*(

_{z}*s*

_{1},

*s*

_{0}) parameters measure risks in marker-defined subsets of the “always early uninfected” principal stratum defined by {

*Y*(1) =

^{τ}*Y*(0) = 0}. For scenario

^{τ}**NEH-CB**we will also use the following parameters measuring risks in subsets of the “early protected” (EP) principal stratum defined by {

*Y*(1) = 0,

^{τ}*Y*(0) = 1}:

^{τ}*s*

_{1},

*s*

_{0}) ≡

*P*(

*Y*(

*z*) = 1∣

*S*(1) =

^{τ}*s*

_{1},

*S*(0) =

^{τ}*s*

_{0},

*Y*(1) = 0,

^{τ}*Y*(0) = 1) for

^{τ}*z*,

*s*

_{1},

*s*

_{0}∈ {0, 1}. And, for

**NEB-CB**, the analogous parameters are needed for the “early harmed” principal stratum.

The observation that motivated this work is that a contrast in *risk*_{1} and *risk*_{0} is a SACE (defined by intermediate event *S* = 1 − *Y ^{τ}* = 1), a contrast in

*risk*

_{1}(0, 0) and

*risk*

_{0}(0, 0) is a SACE (defined by intermediate event

*S*= [1 −

*Y*][1 −

^{τ}*S*] = 1), a contrast in

^{τ}*risk*

_{1}(1, 1) and

*risk*

_{0}(1, 1) is a SACE (defined by intermediate event

*S*= [1 −

*Y*]

^{τ}*S*= 1), and, whereas a contrast in

^{τ}*risk*

_{1}(1, 0) and

*risk*

_{0}(1, 0) is not a SACE, these two parameters are identified from (2) and the other parameters. Thus any two existing SACE methods can be employed to estimate the two sets of means/probabilities {

*risk*

_{1},

*risk*

_{0}} and {

*risk*

_{1}(0, 0),

*risk*

_{0}(0, 0)}, and another for {

*risk*

_{1}(1, 1),

*risk*

_{0}(1, 1)} in scenario

**NEE-VB**, and then equation (2) is applied to yield estimates of the remaining two probabilities {

*risk*

_{1}(1, 0),

*risk*

_{0}(1, 0)} via

(where *risk _{z}*(1, 1) = 0 in the Case CB scenarios

**NEE-CB**,

**NEH-CB**, and

**NEB-CB**).

We interpret the SACE parameters in terms of our motivating application. The first is SACE_{AEU} = *h*(*risk*_{1}, *risk*_{0}), with the AEU subgroup being “always early uninfected” individuals with *S* = 1 indicating being uninfected with HIV-1 at time *τ*. The second is SACE_{AEUB0} = *h*(*risk*_{1}(0, 0), *risk*_{0}(0, 0)), with AEUB0 being “always early uninfected and biomarker (B) value *S ^{τ}* = 0” individuals. The third is SACE

_{AEUB1}=

*h*(

*risk*

_{1}(1, 1),

*risk*_{0}(1, 1)), for “always early uninfected and biomarker value *S ^{τ}* = 1” individuals.

The *CEP* parameters are linked to the three SACE parameters and risks through *CEP*(0, 0) = SACE_{AEUB0}, *CEP*(1, 1) = SACE_{AEUB1}, and *CEP*(1, 0) = *h*(*risk*_{1}(1, 0), *risk*_{0}(1, 0)) with *risk _{z}*(

*s*

_{1},

*s*

_{0}) as in (3). Thus our approach pieces together SACE estimates and estimates of the means

*p*(

*s*

_{1},

*s*

_{0}) to yield estimates of the

*CEP*parameters, where the estimators focusing on subgroups defined by

*S*must account for

^{τ}*S*only measured in a subset, e.g., through inverse probability weighting. For

^{τ}*h*(

*x*,

*y*) =

*x*−

*y*,

*h*(

*risk*

_{1}(1, 0),

*risk*

_{0}(1, 0)) = [SACE

_{AEU}−

*p*(0, 0) SACE

_{AEUB0}−

*p*(1, 1) SACE

_{AEUB1}]/

*p*(1, 0).

### 4.3 Mapping Existing SACE Methods to Estimate the PS Parameters *CEP*(*s*_{1}, *s*_{0})

Table 2 summarizes how SACE methods can be used to estimate the *CEP*(*s*_{1}, *s*_{0}) parameters for the four assumption-set scenarios. Under **NEE-VB**, SACE_{AEU} is nonparametrically identified by **A4**. *CEP*(0, 0) = SACE_{AEUB0} can be estimated using any SACE method making the monotonicity assumption *P*(*S*(1) ≤ *S*(0)) = 1, with *S* = [1 − *Y ^{τ}*] [1 −

*S*] = 1. Similarly,

^{τ}*CEP*(1, 1) = SACE

_{AEUB1}can be estimated with a SACE method using the same monotonicity assumption but now with

*S*= [1 −

*Y*]

^{τ}*S*= 1. The mixing parameters,

^{τ}*p*(

*s*

_{1},

*s*

_{0}), are all nonparametrically identified, such that

*CEP*(1, 0) =

*h*(

*risk*

_{1}(1, 0),

*risk*_{0}(1, 0)) can be estimated based on equation (2).

Estimation under **NEE-CB** is identical to that of scenario **NEE-VB** except that *risk*_{1}(1, 1) and *risk*_{0}(1, 1) vanish (because the principal stratum defined by [1 − *Y ^{τ}*(0)]

*S*(0) =

^{τ}[1 − *Y ^{τ}*(1)]

*S*(1) = 1 is empty). For scenario

^{τ}**NEH-CB**, SACE

_{AEU}can be estimated using an existing SACE method making the monotonicity assumption

*P*(

*S*(1) ≤

*S*(0)) = 1, with intermediate event

*S*= 1 −

*Y*= 1. Then,

^{τ}*CEP*(0, 0) = SACE

_{AEUB0}can be estimated using an arbitrary SACE method that does not assume monotonicity.

## 5 Illustration of the Approach with IPW Extensions of Particular SACE Methods

For each assumption set scenario, we show how the PS estimation and inference works using an extension of Shepherd, Gilbert, and Dupont’s [24] SACE method, which in scenarios **NEE-VB, NEE-CB** simplifies to the Gilbert, Bosch, and Hudgens [22] (GBH) SACE method. The extension incorporates inverse probability weighting (IPW) to handle missing *S ^{τ}*. We focus on semiparametric efficient estimators given the data (

*Z*,

*R*,

*Y*,

^{τ}*S*,

^{τ}*Y*) not including baseline covariates

*W*, which amount to sample means with or without IPW as needed. Moreover, we focus on the case that

*Y*is binary and observed for all participants; Web Appendix C summarizes how the methods translate to

*Y*=

*I*(

*T*≤

*t*) with

*T*subject to right-censoring before

*t*. We use general estimating function notation so that users preferring to use more efficient estimators leveraging information in

*W*(e.g., [28]) may substitute alternative estimating functions into the equations.

### 5.1 General IPW Estimation

The SACE estimators involve estimation of identified terms *E*[*Y*∣*S* = 1, *Z* = *z*] for subgroups *S* = 1 [with *S* = 1 − *Y ^{τ}*, (1 −

*Y*)

^{τ}*S*, or (1 −

^{τ}*Y*)(1 −

^{τ}*S*)] and of terms

^{τ}*E*[

*S*∣

^{τ}*Y*= 0,

^{τ}*Z*=

*z*], where

*S*is measured at time

^{τ}*τ*and may be missing. Define the probability of observing

*S*as

^{τ}*π*(

*O*

_{1}) ≡

*P*(

*R*= 1∣

*O*

_{1}), where

*O*

_{1}is data observed in everyone, i.e., (

*Z*,

*W*,

*Y*,

^{τ}*Y*). We assume

*S*is missing at random,

^{τ}*π*(

*O*

_{1}) =

*P*(

*R*= 1∣

*O*

_{1},

*S*), and that

^{τ}*π*(

*O*

_{1}) is bounded away from zero,

*π*(

*O*

_{1}) ≥

*ε*with probability 1 for some fixed

*ε*> 0.

Following standard IPW estimation, we specify a model *π*(*O*_{1}, *ψ*) for *π*(*O*_{1}) (e.g., logistic), and estimate the parameter *ψ* by maximum likelihood, yielding *π̂ _{i}* =

*π*(

*O*

_{1i},

*ψ̂*). Efficiency and robustness may be improved by calibrating the estimated weights

*π̂*accounting for

_{i}*W*(e.g., [29, 30]).

_{i}### 5.2 Dichotomous Outcome SACE Methods Under Scenario **NEE-VB**

For scenario **NEE-VB**, the first step is to estimate the terms that are nonparametrically identified– {*risk*_{1}, *risk*_{0}}, *p*(0, 0), *p*(1, 0), and *p*(1, 1). Each *risk _{z}* for

*z*= 0, 1 can be estimated by any preferred method for estimating a mean, most simply by solving

*O*;

_{i}*risk*) = 0 with estimating function

_{z}*I*(

*Z*= z)(

_{i}*Y*−

_{i}*risk*).

_{z}Given *p*(0, 0) = *P*(*S ^{τ}* = 0∣

*Z*= 1,

*Y*= 0), if full data were available, then a simple approach would estimate

^{τ}*p*(0, 0) by solving

*O*;

_{i}*p*(0, 0)) = 0 with

*U*

^{01}(

*O*;

_{i}*p*(0, 0)) = 0 if

*O*;

_{i}*p*(0, 0)) /

*π̂*= 0. The parameter

_{i}*p*(1, 1) may be estimated similarly by solving

*O*;

_{i}*p*(1, 1)) /

*π̂*= 0 with

_{i}*O*;

_{i}*p*(1, 1)) = 0 if

*p̂*(0, 0) and

*p̂*(1, 1), we then set

*p̂*(1, 0) = 1 −

*p̂*(0, 0) −

*p̂*(1, 1).

Lastly, we estimate each pair {*risk*_{1}(0, 0), *risk*_{0}(0, 0)} and {*risk*_{1}(1, 1), *risk*_{0}(1, 1)} with a SACE method that assumes monotonicity. We summarize how the GBH method, extended to incorporate IPW, accomplishes this task.

*GBH Semiparametric SACE Method with IPW (Notation as in GBH)*. Consider the odds ratio selection bias model with user-specified fixed sensitivity parameter *β*_{0}:

Under **A1–A3**, **B.0**, monotonicity [*P*(*S*(1) ≤ *S*(0)) = 1], and positivity [*P*(*S*(1) = 0, *S*(0) = 1) > 0, *P*(*S*(1) = 1, *S*(0) = 1) > 0], the two parameters of interest, *P*^{11}(*z*) ≡ *P*(*Y*(*z*) = 1∣*S*(1) = *S*(0) = 1) for *z* = 0, 1, are nonparametrically identified.

By monotonicity *P*^{11}(1) = *P*(*Y*(1) = 1∣*S*(1) = 1), such that *P*^{11}(1) is estimated by solving *R _{i}*

*U*

^{1}(

*O*;

_{i}*P*(

*Y*(1) = 1∣

*S*(1) = 1))/

*π̂*= 0 where

_{i}*U*

^{1}(

*O*;

_{i}*P*(

*Y*(1) = 1∣

*S*(1) = 1)) =

*Z*

_{i}*S*(

_{i}*Y*−

_{i}*P*(

*Y*(1) = 1∣

*S*(1) = 1)). Next,

*P*

^{11}(0) is estimated by first estimating

*α*

_{0}as the solution to

*R*

_{i}*U*

^{0}(

*O*;

_{i}*α*

_{0},

*β*

_{0})/

*π̂*= 0 where

_{i}where *P̂*(*Y*(0) = 1∣*S*(0) = 1) is obtained in the same way as *P̂*(*Y*(1) = 1∣*S*(1) = 1). Then with *α̂*_{0} from (4), *P̂*^{11}(0) = [*P̂*(*S*(0) = 1)/*P̂*(*S*(1) = 1)] *w*_{0}(1; *α̂*_{0}, *β*_{0}) *P̂*(*Y*(0) = 1∣ *S*(0) = 1). We implement this “IPW GBH SACE Method” verbatim multiple times below, with the meaning of *S* = 1 changing for estimating needed terms in *CEP*(*s*_{1}, *s*_{0}).

*Assumptions Needed for Valid Inference with the IPW GBH SACE Method*. We state additional assumptions, beyond the identifiability assumptions, that are needed for valid inference via the IPW GBH SACE Method. The method requires *P*(*S*(1) = 1) < *P*(*S*(0) = 1) in order that *P̂*^{11}(*z*) for each *z* = 0, 1 has an asymptotic normal distribution and thus to ensure that Wald confidence intervals for *P*^{11}(*z*) based on asymptotic or nonparametric bootstrap variance estimates have correct coverage probabilities [23]. Moreover, if *P*(*S*(1) = 1) < *P*(*S*(0) = 1) but the probabilities are close, then the Wald confidence intervals can have poor coverage. The general SACE assumption *P*(*S*(1) = 1) < *P*(*S*(0) = 1) translated for ensuring valid inference on SACE_{AEUB0} and SACE_{AEUB1} for **NEE-VB** (Table 2) is **A6** “Early Biomarker Effect”: *P*(*S ^{τ}*(0) = 1∣

*Y*= 0) <

^{τ}*P*(

*S*(1) = 1∣

^{τ}*Y*= 0).

^{τ}**A6**is also the assumption needed for valid inference on SACE

_{AEUB0}for

**NEE-CB**. For

**NEH-CB**, the general SACE assumption translated for valid inference on SACE

_{AEU}is

**A7**“Early Benefit”:

*P*(

*Y*(1) = 1) <

^{τ}*P*(

*Y*(0) = 1), while for

^{τ}**NEB-CB**it is

**A7′**“Early Harm”

*P*(

*Y*(1) = 1) >

^{τ}*P*(

*Y*(0) = 1). Lastly, for

^{τ}**NEB-CB**the general SACE assumption translated for inference on SACE

_{AEUB0}is

**A8**:

*P*(

*S*(1) = 0∣

^{τ}*Y*(1) = 0) <

^{τ}*P*(

*Y*(0) = 0)/

^{τ}*P*(

*Y*(1) = 0).

^{τ}**A6**almost always holds in applications and one of

**A7**or (

**A7′**,

**A8**) plausibly holds for many real binary PS applications, as discussed in Section 7. Moreover,

**A7**,

**A7′**,

**A8**are testable such that the conditions needed to assure valid inference can be checked.

*Implementing the IPW GBH SACE Method for**CEP*(*s*_{1}, *s*_{0}). The semiparametric MLEs *P̂*^{11}(1) and *P̂*^{11}(0), using *S* ≡ [1 − *Y ^{τ}*][1 −

*S*] with

^{τ}**B.0**and a fixed

*β*

_{0}. The semiparametric MLEs

*S*≡ [1 −

*Y*]

^{τ}*S*, with monotonicity assumption in the reverse direction [i.e.,

^{τ}*P*(

*S*(1) ≥

*S*(0)) = 1]. This means that we use the IPW GBH SACE Method reversing the roles of

*Z*= 1 and

*Z*= 0, leading to a selection bias model defined by:

The estimate *P̂*^{11}(0) is obtained based on *U*^{0}(*O _{i}*;

*P*(

*Y*(0) = 1∣

*S*(0) = 1)) = (1 −

*Z*)

_{i}*S*(

_{i}*Y*−

_{i}*P*(

*Y*(0) = 1∣

*S*(0) = 1))/

*Z*)

_{i}*S*and

_{i}*P̂*

^{11}(1) is obtained by setting

after estimating *α*_{1} from *U*^{1}(*O _{i}*;

*α*

_{1},

*β*

_{1}) =

By standard estimating equation theory, the above estimators are consistent and asymptotically normal for given fixed *β*_{0} and *β*_{1}. To obtain Wald confidence intervals for each *CEP*(*s*_{1}, *s*_{0}), consistent estimating-function based variance estimators may be used for the estimates *P̂*^{11}(*z*) not involving *α̂*_{0} or *α̂*_{1}; e.g., the estimated variance of *P̂*^{11}(1) for the IPW GBH SACE Method is given by *R _{i}* /

*π̂*) [

_{i}*U*

^{1}(

*O*;

_{i}*P̂*(

*Y*(1) = 1∣

*S*(1) = 1))]

^{2}. Influence-function based variance estimates are similarly obtained for the estimates

*P̂*

^{11}(

*z*) involving

*α̂*

_{0}or

*α̂*

_{1}, by using a vector estimating function and the delta method. For example, the four components of the estimating function in (6) are for (

*α̂*

_{1},

*P̂*(

*S*(1) = 1),

*P̂*(

*S*(0) = 1),

*P̂*(

*Y*(1) = 1∣

*S*(1) = 1))

^{T}, with delta method applied with

*g*(

*w*,

*x*,

*y*,

*z*) =

*w*

_{1}(1;

*w*,

*β*

_{1})

*z*. All variance estimation is performed with the R package geex [31].

To perform a sensitivity analysis, one approach specifies a plausible range [*l _{k}*,

*u*] (or maximum possible) for each sensitivity parameter

_{k}*β*,

_{k}*k*= 0, 1. An ignorance interval for

*CEP*(

*s*

_{1},

*s*

_{0}) may be estimated by the span of values between the minimum and maximum estimates, obtained by setting

*β*

_{0}and

*β*

_{1}to the boundary values. Using the method of Imbens and Manski [32] and Vansteelandt et al. [33], a Wald asymptotic (1 −

*α*)% estimated uncertainty interval (EUI) for

*CEP*(

*s*

_{1},

*s*

_{0}) may be calculated as in formulas (40) and (41) of Richardson et al. [34], using the variance estimates of the minimum and maximum

*CEP*(

*s*

_{1},

*s*

_{0}) estimates. In particular, let

*s*

_{1},

*s*

_{0}) and

*s*

_{1},

*s*

_{0}) be the estimates of

*CEP*(

*s*

_{1},

*s*

_{0}) fixing the sensitivity parameters at the values within a pre-specified plausible region

*Γ*= [

*l*

_{0},

*u*

_{0}] × [

*l*

_{1},

*u*

_{1}] of the sensitivity parameter(s) that minimize or maximize

*s*

_{1},

*s*

_{0}), respectively. With

*s*

_{1},

*s*

_{0}) and

*s*

_{1},

*s*

_{0}), respectively, a (1 −

*α*)% EUI is given by [

*s*

_{1},

*s*

_{0}) −

*c*

_{α}*σ̂*/

_{l}*c*satisfies

_{α}where *Φ*(⋅) denotes the cdf of a standard normal variate. The same approach can be used to construct Wald confidence intervals and EUIs for the other scenarios and SACE approaches described below. Theoretical justification of these EUIs relies on the assumption that the values *γ _{l}*,

*γ*∈

_{u}*Γ*that correspond to the ignorance interval for

*CEP*(

*s*

_{1},

*s*

_{0}) are the same for all possible observed data laws (condition (39) from [34]), which holds for

**NEE-VB**and

**NEE-CB**and may need validation for

**NEH-CB**and

**NEB-CB**applications.

### 5.3 Dichotomous Outcome SACE Methods Under Scenario **NEE-CB**

For scenario **NEE-CB**, *CEP*(*s*_{1},0) for *s*_{1} = 0, 1 can be estimated exactly as for **NEE-VB**, with one change that *SACE*(1, 1) vanishes because *p*(1, 1) = 0. In particular, first {*risk*_{1}, *risk*_{0}} and *p*(0, 0) are estimated as in scenario **NEE-VB**, and then *p*(1, 0) is estimated as *p̂*(1, 0) = 1 − *p̂*(0, 0). Secondly, {*risk*_{1}(0, 0), *risk*_{0}(0, 0)} are estimated as in scenario **NEE-VB**. Lastly, *risk _{z}*(1, 0) for each

*z*= 0, 1 is estimated via equation (3) plugging in estimates for each term. These steps amount to first estimating

*risk*

_{1}(0, 0) by the solution to

*risk*

_{0}(0, 0) and

*risk*

_{0}(1, 0) are estimated by the solutions to the equations

**B.0**and

*risk*

_{0}(0, 0)

*p̂*(0, 0) −

*risk*

_{0}(1, 0)

*p̂*(1, 0) = 0; our code for the simulation study and example are implemented in this manner.

### 5.4 Dichotomous Outcome SACE Methods Under Scenario **NEH-CB**

We implement an IPW extension of the Shepherd, Gilbert, and Dupont [24] SACE method that relaxes monotonicity by using the sensitivity parameter *β*_{0} in **B.0** plus three additional sensitivity parameters:

where *risk*_{1}(*s*_{1},*) ≡ *P*(*Y*(1) = 1∣ *S ^{τ}*(1) =

*s*

_{1},

*S*(0) = *,

^{τ}*Y*(1) = 0,

^{τ}*Y*(0) = 1) for

^{τ}*S*

_{1}= 0, 1 and

*P*(

*S*(1) = 1∣0, 1) ≡

^{τ}*P*(

*S*(1) = 1∣

^{τ}*Y*(1) = 0,

^{τ}*Y*(0) = 1). The estimation steps are similar to those taken for scenario

^{τ}**NEE-CB**: First, estimate

*p*(1, 0) and

*P*(

*S*(1) = 1∣

^{τ}*Y*(1) = 0,

^{τ}*Y*(0) = 1) as the solutions to the two equations

^{τ}**B.4**and

Then *risk*_{0}(0, 0) and *risk*_{0}(1, 0) are estimated as in scenario **NEE-CB**. Finally, *risk*_{1}(*s*_{1}, 0) and *risk*_{1}(*s*_{1}, *) are estimated as the solutions to *P̂*(*Y*(1) = 1∣ *Y ^{τ}*(1) = 0,

*S*(1) =

^{τ}*s*

_{1}) −

*risk*

_{1}(

*s*

_{1}, 0)

*P̂*(

*Y*(0) = 0,

^{τ}*S*(0) = 0∣

^{τ}*Y*(1) = 0,

^{τ}*S*(1) =

^{τ}*s*

_{1}) −

*risk*

_{1}(

*s*

_{1}, *){1 −

*P̂*(

*Y*(0) = 0,

^{τ}*S*(0) = 0∣

^{τ}*Y*(1) = 0,

^{τ}*S*(1) =

^{τ}*s*

_{1})} = 0 and either

**B.2**for

*s*

_{1}= 0 or

**B.3**for

*s*

_{1}= 1.

### 5.5 Dichotomous Outcome SACE Methods Under Scenario **NEB-CB**

A SACE method for this scenario uses the sensitivity parameter *β*_{0} in$\textbf{B.0} and one additional sensitivity parameter defined as **B.5**:

The parameters *CEP*(1, 0) and *CEP*(0, 0) can be estimated exactly as in **NEE-CB**, except that *risk*_{0} is not identifiable and thus its estimation leverages the additional assumption **B.5**. We estimate *risk*_{0} and *P*(*Y*(0) = 1 ∣ *Y ^{τ}*(1) = 1,

*Y*(0) = 0) as the solutions to

^{τ}**B.5**and

*P*(

*Y*(0) = 1 ∣

*Y*(0) = 0) =

^{τ}*P*(

*Y*(1) = 0 ∣

^{τ}*Y*(0) = 0)

^{τ}*risk*

_{0}+

*P*(

*Y*(1) = 1 ∣

^{τ}*Y*(0) = 0)

^{τ}*P*(

*Y*(0) = 1 ∣

*Y*(1) = 1,

^{τ}*Y*(0) = 0). Then, estimation of

^{τ}*risk*

_{1},

*p*(0, 0),

*risk*

_{1}(1, 0),

*risk*

_{1}(0, 0),

*risk*

_{0}(1, 0), and

*risk*

_{0}(0, 0) is identical to that described for

**NEE-CB**.

### 5.6 Effect Modification Analysis: Inference on Contrasts in *CEP*(*s*_{1}, *s*_{0})

For making inference on contrasts in the *CEP*(*s*_{1}, *s*_{0}), such as *μ* ≡ *CEP*(1, 0) − *CEP*(0, 0), our set-up constrains *μ* to values narrower than the maximum possible range −2 to 2. For example, for scenario **NEE-CB**, setting *β*_{0} = 0 implies *risk*_{0}(1, 0) = *risk*_{0}(0, 0), which leaves each of *CEP*(1, 0) and *CEP*(0, 0) free to vary over the maximum possible range as for any SACE method but constrains *μ* to −1 to 1. Thus making inference on contrasts of *CEP*(*s*_{1}, *s*_{0}) does not achieve just-nonparametric identifiability as does inference on the individual *CEP*(*s*_{1}, *s*_{0}) parameters. This should be borne in mind when inference is made on *CEP* contrasts as well as on the individual parameters.

## 6 Simulation Study

We first simulated data sets under the assumptions of **NEE-CB**. For each of *n* independent individuals, we generated potential outcomes and then observed outcomes. First, (*Y ^{τ}*(1),

*Y*(0)) was set to (0, 0) or (1, 1) with probabilities 0.8 and 0.2, such that

^{τ}**A4**(

**NEE**) holds. If

*Y*(1) = 1, then

^{τ}*Y*(0) and

*Y*(1) were set to 1. If

*Y*(1) = 0, then (

^{τ}*S*(1),

^{τ}*S*(0)) was set to (0, 0) or (1, 0) with probabilities 0.4 and 0.6, such that Case CB holds.

^{τ}To evaluate size and power of a test of *H*_{0}: *CEP*(1, 0) = *CEP*(0, 0) versus *H*_{1} : *CEP*(1, 0) ≠ *CEP*(0, 0), data were simulated under 13 different values for *CEP*(1, 0) − *CEP*(0, 0): −0.6, −0.5, …, 0.6. Specifically, if *Y ^{τ}*(1) = 0 and

*S*(1) =

^{τ}*S*(0) = 0, then

^{τ}*Y*(1) was generated as Bernoulli with mean

*a*(Bern(

*a*)) and

*Y*(0) as Bern(0.5). If

*Y*(1) = 0 and

^{τ}*S*(1) = 1,

^{τ}*S*(0) = 0, then

^{τ}*Y*(1) was Bern(

*b*) and

*Y*(0) was Bern(0.5). Thus

*CEP*(1, 0) −

*CEP*(0, 0) =

*b*−

*a*for contrast function

*h*(

*x*,

*y*) =

*x*−

*y*. The values of

*a*ranged from 0.7 to 0.1 by decrements of 0.05 and

*b*increased from 0.1 to 0.7 by increments of 0.05. Under this parameterization

*risk*

_{0}(0, 0) = 0.5 =

*risk*

_{0}(1, 0), implying

*β*

_{0}= 0.

To generate the observed data, *Z* was drawn from Bern(0.5), and the vector of observable random variables (*Z*, *Y ^{τ}*,

*S*,

^{τ}*Y*) = (

*Z*,

*Y*(

^{τ}*Z*),

*S*(

^{τ}*Z*),

*Y*(

*Z*)) was determined. Simulations were conducted with and without case-cohort sampling of

*S*. For the latter, membership in the random subcohort was determined by

^{τ}*R*drawn from Bern(

*ν*). For individuals with

*Y*= 0,

^{τ}*S*was observed for subcohort members (i.e.,

^{τ}*R*= 1) and cases (i.e.,

*Y*= 1).

Data sets were generated for all 156 combinations of: *n* ∈ {200, 400, 800, 1600}; *ν* ∈ {0.1, 0.25, 1}; and *CEP*(1, 0) − *CEP*(0, 0) ∈ {−0.6, −0.5, …, 0.6}, except the 13 scenarios with *n* = 200 and *ν* = 0.1 were not studied because only 8 vaccine recipient uninfected controls are expected to have *S ^{τ}* measured. All results are based on 2000 simulated data sets. Analyses used [

*l*

_{0},

*u*

_{0}] ∈ {[0, 0], [−1, 1], [−2.5, 2.5]}. The true values for

*CEP*(1, 0) and

*CEP*(0, 0) were selected to include zero effect modification and gradients in effect modification magnitudes in both directions, with values (

*a*,

*b*) ∈ {(0.1, 0.7), (0.15, 0.65), (0.2, 0.6), (0.25, 0.55), (0.55, 0.25), (0.6, 0.2), (0.65, 0.15), (0.7, 0.1)} expressing qualitative interactions. For each simulated data set, the data analysis was done using the methods for the scenario

**NEE-CB**assumption set. The null hypothesis

*H*

_{0}was rejected if and only if the 95% EUI for

*μ*=

*CEP*(1, 0) −

*CEP*(0, 0) calculated as described in Section 5 excluded 0. Power was estimated by the proportion of simulated data sets where

*H*

_{0}was rejected (Figure 1).

Empirical type I error was close to the nominal level *α* = 0.05. Power increased with sample size and decreased as the interval [*l*_{0}, *u*_{0}] became wider and as the subcohort size decreased. Web Figure 1 (Web Appendix D) shows the average widths of the 95% EUIs for *μ*. The EUIs cover at approximately the nominal rate when *l*_{0} = *u*_{0}, i.e., when *μ* is identifiable, and are conservative otherwise. The widths are relatively constant across values of *μ* and increase as the subcohort size decreases. Empirical coverage of the 95% EUIs are plotted in Web Figure 2. Web Figures 3 and 4 display bias of the *μ̂* estimates and ratios of the empirical standard errors (ESE) to the average of the sandwich variance estimated standard errors (ASE), showing unbiasedness of both the point and standard error estimators.

A second simulation study was conducted, with data simulated under scenario **NEH-CB** such that **A4** in scenario **NEE-CB** failed. Web Appendix D describes details and results, with data analysis by the methods under assumption-set **NEH-CB** (Web Figures 5–9).

## 7 Application: HVTN 505 HIV Vaccine Efficacy Trial

We apply the new PS methods to HVTN 505. The PFS biomarker described in the introduction [2] is a quantitative aggregate score derived from constituent qualitative biomarkers that are of interest for binary intermediate outcome PS analysis. The data from a vaccine recipient’s Month 6.5 blood sample are measurements of expression of 6 different functional markers (Granzyme-B, IL4, CD40L, TNFalpha, IL2, IFNgamma) in CD8+ T cells after stimulation with HIV-1 peptides. The Bayesian method COMPASS [35] provided, for each vaccine recipient and each of the 2^{6} = 64 possible cell subsets (i.e., combinations of functional markers), the probability the subset was expressed. A question of interest is whether the vaccine effect on HIV-1 infection varied by expression yes vs. no for any of the 64 subsets. Identification of specific cell subsets associated with vaccine protection or harm would give clues on cellular mechanisms of vaccine effect.

In this application *Y ^{τ}* is the indicator of HIV-1 infection diagnosis between enrollment and the Month

*τ*= 6.5 study visit, and

*Y*is the indicator of this event by the Month 24 final follow-up visit; moreover

*S*is a binary biomarker measured from a Month 6.5 visit blood sample. We defined a vaccine recipient as expressing a given T cell subset (

^{τ}*S*= 1) if the COMPASS posterior probability exceeded 0.9, and as not expressing the subset (

^{τ}*S*= 0) if the posterior probability was below 0.1. We restricted to cell subsets with at least 25% of vaccine recipients expressing and at least 25% not expressing the subset, to focus on subsets with ample data support. This yielded three cell subsets for analysis, defined by (GranzymeB, IL4, CD40L, TNFalpha, IL2, IFNgamma) expression pattern of (1) (−, −, −, −, +, +), (2) (+, −, −, −, +, +), and (3) (+, −, +, +, −, +), where + or - indicate that cells do or do not have the function. Among uninfected vaccine recipients (i.e., sampled controls with

^{τ}*Y*= 0) with

*S*data, 71 of 110 (64.5%), 70 of 124 (56.5%), and 43 of 125 (34.4%) have

^{τ}*S*= 1, respectively. Among infected vaccine recipients (i.e., cases with

^{τ}*Y*= 1) with

*S*data, 5 of 21 (23.8%), 3 of 25 (12.0%), and 1 of 25 (4.0%) have

^{τ}*S*= 1. Given Case CB holds, for each biomarker our goal is inference on

^{τ}*VE*(0) =

*CEP*(0, 0) and

*VE*(1) =

*CEP*(1, 0), using the vaccine efficacy contrast

*h*(

*x*,

*y*) = 1−

*x*/

*y*(Table 1). For reasons given below, we implement the methods using the

**NEE-CB**method and the

**NEB-CB**method. We consider the meaning and plausibility for HVTN 505 of the critical assumptions in question needed for these two approaches – (

**A4**,

**A6**) and (

**A4″**,

**A7′**,

**A8**), respectively, where

**A6**,

**A7′**, and

**A8**are needed for valid inference as described in Section 5.2.

**A4**(

**NEE**) states that the vaccine has no individual-level effects on infection between enrollment and Month 6.5, which is defensible given that

*P̂*(

*Y*(1) = 1) = 14/1251 = 0.011 and

^{τ}*P̂*(

*Y*(0) = 1) = 10/1245 = 0.0080 with Fisher’s exact test 2-sided p-value of

^{τ}*p*= 0.54 for a difference. Because 0.011 > 0.0080, we also consider the

**NEB-CB**method that relaxes

**A4**to

**A4″**(no individual-level beneficial vaccine effects through Month 6.5).

**A6** states that the biomarker has a higher frequency of response for at-risk vaccine than placebo recipients, which essentially always holds. Under Case CB as in HVTN 505 (with *P*(*S ^{τ}*(0) = 1∣

*Y*= 0) = 0), it obviously holds.

^{τ}**A7′**states that there is a harmful vaccine effect (in expectations) through Month 6.5, which is consistent with the point estimates 0.011 vs. 0.0080 but is not supported by the hypothesis testing with

*p*= 0.54. In rare event studies such as HVTN 505,

**A8**states that the fraction of vaccine recipients with positive biomarker response

*S*= 1 is not near zero, which holds. In sum, the method for scenario

^{τ}**NEE-CB**may be the most reasonable because it does not require Early Harm

**A7′**, and we focus on its results; we also apply the

**NEB-CB**method for comparison. The

**NEE-CB**and

**NEB-CB**methods are implemented as described in Sections 5.3 and 5.5, respectively.

Figure 2 shows the results in terms of ignorance intervals and 95% EUIs under each of the three ranges of sensitivity parameters specified in the simulations. To interpret the sensitivity analysis, note that the single sensitivity parameter *β*_{0} for the **NEE-CB** method has interpretation as the log odds ratio that the biomarker response to vaccination *S ^{τ}*(1) takes value 0 for a placebo recipient who is diagnosed with HIV-1 infection between Month 6.5 and 24 compared to a placebo recipient who is not diagnosed with HIV-1 infection through 24 months. Setting

*β*

_{0}to 0, 0.5, and 1 specifies no, intermediate, and highest degree of selection bias, corresponding to the odds ratio

*e*

^{β0}varying over [1,1], [0.61,1.65], and [0.37,2.72]. The

**NEB-CB**method uses the additional sensitivity parameter

*β*

_{5}(defined in Section 5.5), which is the log odds ratio of infection for a placebo recipient, comparing the always early uninfected (AEU) subgroup (numerator) to the early harmed subgroup (denominator); thus it measures the degree to which vaccine-caused infection by Month 6.5 is prognostic for infection by Month 24 if assigned placebo. It was varied over the same range as

*β*

_{0}.

On the results, first note that the **NEE-CB** and **NEB-CB** methods give very similar results (top panel), with EUIs for the latter method only very slightly wider (and thus the bottom panel only shows results from the **NEE-CB** method). Web Appendix E (especially Web Figure 10) studies why the methods under scenarios **NEE-CB** and **NEB-CB** are essentially equivalent for HVTN 505, with extra simulations suggesting that a key part of the explanation is that *P*(*Y ^{τ}*(1) = 1,

*Y*(0) = 0) is small. Second, the results are similar for the three markers, with evidence for effect modification across the expressed vs. not expressed subgroups based on the 95% EUIs for

^{τ}*μ*=

*VE*(1) −

*VE*(0) lying above zero even when allowing for the largest amount of potential selection bias (

*β*= 1, Figure 2 upper-right panel, with lower EUI limit 0.12, 0.19, 0.18 for marker

The bottom panel shows that *VE*(1) is estimated to be positive for each marker (ignorance intervals 0.70–0.85, 0.78–0.91, 0.85–0.96 and 95% EUIs 0.44–0.98, 0.54–1.0, 0.59–0.96, respectively), whereas *VE*(0) is estimated to be less than zero with wide estimated uncertainty intervals. Based on ignorance intervals these results support a qualitative interaction for each marker. However, because the upper 95% EUI limits for *VE*(0) extend well above 0, there is not compelling evidence for qualitative interactions.

Remarkably, the results support differential vaccine efficacy according to whether the vaccine induced CD8+ T cells with expression vs. not expression of specific cell subsets, suggesting that the vaccine may have conferred partial protection for individuals with certain identified expression signatures. This motivates follow-on basic science studies of cellular mechanisms of protection. The HIV vaccine field has “moved on” from the DNA/rAd5 type of HIV vaccine platform, no longer considering it. Thus these new results sound a note of caution to not prematurely abandon this platform, in suggesting that if a new version of the regimen could be invented that induces the specific cell subset responses in a much larger subgroup of vaccine recipients, then it could potentially confer high enough overall vaccine efficacy to confer worthwhile public health benefit.

## 8 Discussion

A sizable literature on nonparametric and semiparametric methods for inference on the survivor average causal effect (SACE) parameter has developed over the past 20 years. Motivated by the need for more robust principal surrogate (PS) analysis in HVTN 505, we described how these methods can be adapted to PS analysis for a binary intermediate variable. This provides new methods for PS analysis, with novel features summarized in the Introduction, all of which were needed for the HVTN 505 application, to appropriately account for the study design and available data and to relax questionable and untestable assumptions. The new methods have application to similar PS questions in other randomized clinical trials.

More robust PS assessment of a qualitative interaction in HVTN 505 was important given the appropriate skepticism that the DNA/rAd5 vaccine could have both beneficial and detrimental effects. The data analysis supported that DNA/rAd5 vaccinated subgroups defined by induction of CD8+ T cells with specific functions had beneficial vaccine efficacy, motivating further research to re-engineer the vaccine regimen to increase induction of these cells.

PS analysis can be improved by accounting for baseline covariates associated with principal strata and/or the final outcome [16, 36]. The specific PS methods described in Section 5 can account for baseline covariates by implementing any preferred covariate-adjusted estimator of the means *risk*_{1}, *risk*_{0}, *p*(0, 0), *p*(1, 0), *p*(1, 1). An appealing alternative approach that fully accounts for baseline covariates – principal score methods [10, 11, 12] –provide PS analysis under a different assumption, principal ignorability, and provide an alternative sensitivity analysis. In general, given that the key steps in estimating the *CEP*(*s*_{1}, *s*_{0}) parameters is estimation of the three SACES listed in Table 2, any SACE method that accounts for baseline covariates can be integrated into a new method of PS analysis.

## Supplementary Materials

**Title: Supportive results** (A) Identifiability results; (B) Application of Chiba and VanderWeele’s (2011) SACE method for evaluating a binary principal surrogate under Scenarios **NEE-VB** and **NEE-CB**; (C) Extending the methods for a time-to-event outcome with right-censoring; (D) Web Figures 1–9 simulation study results; (E) Additional Application results, including Web Figure 10; (F) R computer code.

## Acknowledgement

We thank the participants and investigators of HVTN 505. Research reported in this publication was supported by the National Institute Of Allergy And Infectious Diseases of the National Institutes of Health under Award Number R37AI054165 and the U.S. Public Health Service Grant AI068635. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

## References

[1] S. Hammer, M. Sobieszczyk, H. Janes, S. Karuna, M. J. Mulligan, D. Grove, B. Koblin, and et al., Efficacy trial of a DNA/rAd5 HIV-1 preventive vaccine, N. Engl. J. Med. 369 (2013), no. 22, 283—292.10.1056/NEJMoa1310566Search in Google Scholar PubMed PubMed Central

[2] H. E. Janes, K. W. Cohen, N. Frahm, S. C. De Rosa, B. Sanchez, J. Hural, C. A. Magaret, and et al., Higher T-cell responses induced by DNA/rAd5 HIV-1 preventive vaccine are associated with lower HIV-1 infection risk in an efficacy trial, J. Infect. Dis 215 (2017), no. 9, 1376—1385.10.1093/infdis/jix086Search in Google Scholar PubMed PubMed Central

[3] C. Frangakis and D. Rubin, Principal stratification in causal inference, Biometrics 58 (2002), no. 1, 21—29.10.1111/j.0006-341X.2002.00021.xSearch in Google Scholar PubMed PubMed Central

[4] D. Follmann, Augmented designs to assess immune response in vaccine trials, Biometrics 62 (2006), no. 4, 1161—1169.10.1111/j.1541-0420.2006.00569.xSearch in Google Scholar PubMed PubMed Central

[5] C. E. Frangakis and D. B. Rubin, Addressing complications of intention-to-treat analysis in the combined presence of all-or-none treatment-noncompliance and subsequent missing outcomes, Biometrika 86 (1999), no. 2, 365—379.10.1093/biomet/86.2.365Search in Google Scholar

[6] D. Rubin, Comment on “Causal inference without counterfactuals,” by A.P. Dawid, J. Am. Stat. Assoc. 95 (2000), no. 450, 435—437.Search in Google Scholar

[7] P. B. Gilbert and M. Hudgens, Evaluating candidate principal surrogate endpoints, Biometrics 64 (2008), no. 4, 1146-1154.10.1111/j.1541-0420.2008.01014.xSearch in Google Scholar PubMed PubMed Central

[8] E. E. Gabriel, M. C. Sachs, and P. B. Gilbert, Comparing and combining biomarkers as principal surrogates for time-to-event clinical endpoints, Stat. Med. 34 (2015), no. 3, 381—395.10.1002/sim.6349Search in Google Scholar PubMed PubMed Central

[9] Y. Huang, P. B. Gilbert, and J. Wolfson, Design and estimation for evaluating principal surrogate markers in vaccine trials, Biometrics 69 (2013), no. 2, 301—309.10.1111/biom.12014Search in Google Scholar PubMed PubMed Central

[10] B. Jo and E. A. Stuart, On the use of propensity scores in principal causal effect estimation, Stat. Med. 28 (2009), no. 23, 2857—2875.10.1002/sim.3669Search in Google Scholar PubMed PubMed Central

[11] E. A. Stuart and B. Jo, Assessing the sensitivity of methods for estimating principal causal effects, Stat. Meth. Med. Res. 24 (2015), no. 6, 657—674.10.1177/0962280211421840Search in Google Scholar PubMed PubMed Central

[12] P. Ding and J. Lu, Principal stratification analysis using principal scores, J. R. Stat. Soc. Ser. B Methodol. 79 (2017), no. 3, 757—777.10.1111/rssb.12191Search in Google Scholar

[13] Y. Wang and J. Taylor, A measure of the proportion of treatment effect explained by a surrogate marker, Biometrics 58 (2002), no. 4, 803—812.10.1111/j.0006-341X.2002.00803.xSearch in Google Scholar PubMed

[14] Y. Li, J. Taylor, and M. Elliott, A Bayesian approach to surrogacy assessment using principal stratification in clinical trials, Biometrics 66 (2010), no. 2, 523—531.10.1111/j.1541-0420.2009.01303.xSearch in Google Scholar PubMed PubMed Central

[15] C. Zigler and T. Belin, A Bayesian approach to improved estimation of causal effect predictiveness for a principal surrogate endpoint, Biometrics 68 (2012), no. 3, 922—932.10.1111/j.1541-0420.2011.01736.xSearch in Google Scholar PubMed PubMed Central

[16] D. B. Rubin, Causal inference through potential outcomes and principal stratification: application to studies with “censoring” due to death, Stat. Sci. 21 (2006), no. 3, 299—309.10.1214/088342306000000114Search in Google Scholar

[17] J. Zhang and D. Rubin, Estimation of causal effects via principal stratification when some outcomes are truncated by “death”, J. Educ. Behav. Stat. 28 (2003), no. 4, 353—368.10.3102/10769986028004353Search in Google Scholar

[18] M. Hudgens, A. Hoering, and S. Self, On the analysis of viral load endpoints in HIV vaccine trials, Stat. Med. 22 (2003), no. 14, 2281—2298.10.1002/sim.1394Search in Google Scholar PubMed

[19] K. Imai, Sharp bounds on the causal effects in randomized experiments with ‘truncation by death’, Stat. Probabil. Lett. 78 (2008), no. 2, 144—149.10.1016/j.spl.2007.05.015Search in Google Scholar

[20] Y. Chiba, The large sample bounds on the principal strata effect with application to a prostate cancer prevention trial, Int. J. Biostat. 8 (2012), no. 1, Article 12.10.1515/1557-4679.1365Search in Google Scholar PubMed

[21] D. M. Long and M. G. Hudgens, Comparing competing risk outcomes within principal strata, with application to studies of mother-to-child transmission of HIV, Stat. Med. 31 (2012), no. 27, 3406—3418.10.1002/sim.5583Search in Google Scholar PubMed PubMed Central

[22] P. B. Gilbert, R. Bosch, and M. Hudgens, Sensitivity analysis for the assessment of vaccine effects on viral load in HIV vaccine trials, Biometrics 59 (2003), no. 3, 531—541.10.1111/1541-0420.00063Search in Google Scholar PubMed

[23] Y. Jemiai, A. Rotnitzky, B. Shepherd, and P. B. Gilbert, Semiparametric estimation of treatment effects given base-line covariates on an outcome measured after a postrandomization event occurs, J. R. Stat. Soc. Ser. B Methodol. 69 (2007), no. 5, 879—902.10.1111/j.1467-9868.2007.00615.xSearch in Google Scholar PubMed PubMed Central

[24] B. Shepherd, P. B. Gilbert, and C. Dupont, Sensitivity analyses for comparing time-to- event outcomes only existing in a subset selected postrandomization and relaxing monotonicity, Biometrics 67 (2011), no. 3, 1100—1110.10.1111/j.1541-0420.2010.01508.xSearch in Google Scholar PubMed PubMed Central

[25] Y. Chiba and T. vanderWeele, A simple method for principal strata effects when the outcome has been truncated due to death, Am. J. Epidemiol. 173 (2011), no. 7, 745—751.10.1093/aje/kwq418Search in Google Scholar PubMed PubMed Central

[26] E. Tchetgen Tchetgen, Identification and estimation of survivor average causal effects, Stat. Med. 33 (2014), no. 21, 3601—3628.10.1002/sim.6181Search in Google Scholar PubMed PubMed Central

[27] L. Wang, X.-H. Zhou, and T. S. Richardson, Identification and estimation of causal effects with outcomes truncated by death, Biometrika 104 (2017), no. 3, 597—612.10.1093/biomet/asx034Search in Google Scholar PubMed PubMed Central

[28] M. Zhang, A. Tsiatis, and M. Davidian, Improving efficiency of inferences in randomized clinical trials using auxiliary covariates, Biometrics 64 (2008), no. 3, 707—715.10.1111/j.1541-0420.2007.00976.xSearch in Google Scholar PubMed PubMed Central

[29] J. Robins, A. Rotnitzky, and L. Zhao, Estimation of regression-coefficients when some regressors are not always observed, J. Am. Stat. Assoc. 89 (1994), no. 427, 846–866.10.1080/01621459.1994.10476818Search in Google Scholar

[30] S. Rose and M. J. van der Laan, A targeted maximum likelihood estimator for two-stage designs, Int. J. Biostat 7 (2011), no. 1, 1—21.Search in Google Scholar

[31] B. Saul and M. Hudgens, The calculus of M-estimation in R with geex, arXiv:1709.01413, (2017), [stat.AP] [Preprint] 5 Sep 2017. Last revised 5 Jan 2019. Cited 3 Sep 2019. Available from https://arxiv.org/abs/1709.01413.10.18637/jss.v092.i02Search in Google Scholar PubMed PubMed Central

[32] G. Imbens and C. Manski, Confidence intervals for partially identified parameters, Econometrica 72 (2004), no. 6, 1845—1857.10.1111/j.1468-0262.2004.00555.xSearch in Google Scholar

[33] S. Vansteelandt, E. Goetghebeur, M. Kenward, and G. Molenberghs, Ignorance and uncertainty regions as inferential tools in a sensitivity analysis, Stat. Sin. 16 (2006), 953—979.Search in Google Scholar

[34] A. Richardson, M. Hudgens, P. B. Gilbert, and J. Fine, Nonparametric bounds and sensitivity analysis of treatment effects, Stat. Sci. 29 (2014), no. 4, 596—618.10.1214/14-STS499Search in Google Scholar PubMed PubMed Central

[35] L. Lin, G. Finak, K. Ushey, C. Seshadri, T. R. Hawn, N. Frahm, T. J. Scriba, H. Mahomed, W. Hanekom, P.-A. Bart, et al., COMPASS identifies T-cell subsets correlated with clinical outcomes, Nat. Biotech. 33 (2015), no. 6, 610.10.1038/nbt.3187Search in Google Scholar PubMed PubMed Central

[36] D. M. Long and M. G. Hudgens, Sharpening bounds on principal effects with covariates, Biometrics 69 (2013), no. 4, 812—819.10.1111/biom.12103Search in Google Scholar PubMed PubMed Central

**Received:**2019-08-05

**Accepted:**2020-04-06

**Published Online:**2020-07-25

© 2020 P. B. Gilbert *et al*., published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.