Robins et al. (2004) introduced the extended g-formula to estimate from observational data the risk of failure under hypothetical interventions wherein a subject’s treatment at time k is assigned based on the natural value of treatment at k; that is, the value of treatment that would have been observed at k were the intervention discontinued right before k. Several authors (Robins et al. 2004; Taubman et al. 2009; Lajous et al. 2013; Danaei et al. 2013; García-Aymerich et al. 2014) have parametrically applied this approach to estimate the risk of failure in observational studies under hypothetical time-varying interventions of the following form: “If a subject’s natural value of treatment at k is below a particular threshold (or above in the case of a harmful exposure) then set treatment to this threshold value. Otherwise, do not intervene on this subject at k.”
Taubman et al. (2008) referred to this special case of an intervention that depends on the natural value of treatment as a threshold intervention. For example, Taubman et al. (2009) used the parametric extended g-formula to estimate the 20-year risk of coronary heart disease (CHD) in the Nurses’ Health Study (NHS) under the following hypothetical threshold intervention on daily minutes of exercise on all days of follow-up: “If a subject’s natural value of exercise by the end of day k is less than 30 minutes, set her exercise on day k to exactly 30 minutes. Otherwise, do not intervene on this subject on day k.” Threshold interventions guarantee that a continuous treatment is maintained within a pre-specified range (e.g. at least 30 minutes per day) throughout follow-up while minimizing the number of subjects requiring intervention at each time.
Non-parametrically, the extended g-formula differs from the (non-extended) g-formula of Robins (1986) in that it includes (i) a specific user-supplied intervention density that depends on the natural value of treatment at each k and (ii) the density of natural treatment itself at each k conditional on past measured confounders (Robins et al. 2004). Richardson and Robins (2013) recently defined a condition such that the extended g-formula non-parametrically identifies risk under an intervention that depends on the natural value of treatment associated with the user-supplied intervention density in (i), provided this expression is well-defined. In this paper, we complement this result by showing the algebraic equivalence between the extended g-formula associated with a user-supplied intervention density (i) and the (non-extended) g-formula associated with a particular random dynamic regime that does not depend on the natural value of treatment and may, at most, depend on the measured confounders.
Provided the identifying condition of Richardson and Robins (2013) holds, this algebraic equivalence gives (a) a sufficient positivity condition such that the extended g-formula is well-defined and thus non-parametrically identifies risk under an intervention that depends on the natural value of treatment in an observational study and (b) semi-parametric alternatives to the parametric extended g-formula for estimation.
Finally, there has been no consideration of the limits on physical implementation of interventions that depend on the natural value of treatment. For example, once we observe that a subject has exercised 20 minutes by the end of day k we cannot subsequently intervene and make her exercise any more (or any fewer) minutes by the end of that day. Therefore, given a hypothetical intervention that depends on the natural value of treatment, we define a plausible (implementable) approximation to this intervention. We also provide an untestable assumption that, when satisfied, would give exact equivalence.
The structure of the paper is as follows. In Section 2, we define the observational data structure of interest and give a classification of hypothetical interventions that do not depend on the natural value of treatment and may, at most, depend on the measured confounders, including random dynamic regimes. In Section 3, we review a set of conditions that non-parametrically identifies risk by the end of follow-up in the observational study under any hypothetical intervention within this classification by the (non-extended) g-formula. In Section 4, we show the algebraic equivalence between the extended g-formula associated with an intervention that depends on the natural value of treatment and the (non-extended) g-formula associated with a particular random dynamic regime. In Section 5, we review the parametric extended g-formula estimator and give a semi-parametric alternative that follows immediately from the results of Section 4 given previous semi-parametric results in the context of random dynamic regimes. In Section 6, we define a plausible approximation to an intervention that depends on the natural value of treatment and an assumption for exact equivalence.
2 A classification of interventions that do not depend on the natural value of treatment
Consider an observational study in which the following random variables are measured during each follow-up time $k = 0, \ldots, K+1$ (e.g. day) for each of $n$ subjects. We assume subjects are independent and identically distributed and thus suppress the $i$ subscript. Let $Y_k$ be an indicator of failure (e.g. CHD) by k, $L_k$ a vector of measured confounders at the start of k (e.g. smoking, body mass index [BMI] and diet), and $A_k$ the treatment observed during k (e.g. number of minutes of actual daily exercise). During any given time $k$, $L_k$ precedes $A_k$. We denote the history of a random variable using overbars. For example, $\bar{A}_k = (A_0, \ldots, A_k)$ is the observed treatment history through k. For notational convenience, we set $\bar{L}_{-1}$ and $\bar{A}_{-1}$ to be identically 0 and, by definition, $Y_0 = 0$. We use lower-case letters to denote possible realizations of a random variable, for example, $a_k$ is a possible realization of treatment $A_k$. For simplicity, we assume that no subjects are lost to follow-up or die from competing risks and that all variables are perfectly measured. If a subject has failed by $k$, i.e. $Y_k = 1$, then by convention, we will set .
Our goal is to estimate the risk of failure that would have been observed by the end of follow-up had all subjects in this study population followed a hypothetical intervention or treatment regime. In general, we define a treatment regime that does not depend on the natural value of treatment as a rule that assigns treatment at k as an independent draw from an intervention density $f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{Y}_k = 0)$ that may, at most, depend on $\bar{l}_k$ and $\bar{a}_{k-1}$ (Robins 1986).
Treatment regimes can be either deterministic or random. A regime is deterministic if may only equal zero or one for all and . Otherwise, it is random. In particular, we denote to be the deterministic regime associated with the intervention density defined by if and 0 otherwise, where is any component of and is recursively defined by the function of , .
Treatment regimes can further be classified as static or dynamic. A deterministic regime g is static if does not depend on any component of for all k. Otherwise g is dynamic. Analogously, a random regime may be classified as static if the intervention density does not depend on any component of for all k. Otherwise, this random regime can be classified as dynamic. As noted by Picciotto et al. (2012), and as made explicit in our notation, treatment assignment under any regime within the current classification depends on surviving to k (i.e. the event ).
To fix ideas, let us consider some examples of treatment regimes in the context of interventions on daily exercise:
Deterministic static regime: “Set daily exercise to 30 minutes on every day k for all subjects” or if and 0 otherwise for all . For this regime g, for any k and confounder history .
Deterministic dynamic regime: “If a subject’s BMI at the start of day k is , then set her exercise to exactly 30 minutes on that day. Otherwise, set her exercise to exactly 60 minutes” or, for the component of corresponding to the day k BMI measurement,
if , then if and 0 otherwise
if , then if and 0 otherwise . For this regime g, if and otherwise for all k.
Random static regime: “Randomly assign a subject’s exercise on day k such that the probability of receiving 30 minutes is 0.8 and the probability of receiving 60 minutes is 0.2” or if , 0.2 if , and 0 otherwise. The intervention density may take on values between 0 and 1 but its value does not depend on for any k.
Random dynamic regime: “If a subject’s BMI at the start of day k is , randomly assign her exercise on day k such that the probability of receiving 30 minutes is 0.8 and the probability of receiving 60 minutes is 0.2. Otherwise, set her exercise to 60 minutes on that day” or
if , then if , 0.2 if and 0 otherwise
if , then if and 0 otherwise . The intervention density may take on values between 0 and 1 and its value depends on for some k.
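The four regime types above can be sketched as intervention densities, each returning a mapping from a treatment value to its probability. This is only an illustration: the function names are hypothetical, and because the BMI threshold is not reproduced here, the `bmi_cutoff` value of 25 and the direction of the comparison are assumptions made so the code can run.

```python
# Each function returns the intervention density at time k as a dict
# {minutes of exercise: probability}. Names and bmi_cutoff are hypothetical.

def deterministic_static(bmi=None):
    # "Set daily exercise to 30 minutes on every day k for all subjects."
    return {30: 1.0}

def deterministic_dynamic(bmi, bmi_cutoff=25.0):
    # Assumed direction: BMI at or above the cutoff gets 30 minutes, otherwise 60.
    return {30: 1.0} if bmi >= bmi_cutoff else {60: 1.0}

def random_static(bmi=None):
    # 30 minutes with probability 0.8, 60 minutes with probability 0.2.
    return {30: 0.8, 60: 0.2}

def random_dynamic(bmi, bmi_cutoff=25.0):
    # Randomize only within one BMI stratum (assumed direction, as above).
    return {30: 0.8, 60: 0.2} if bmi >= bmi_cutoff else {60: 1.0}

def is_deterministic(density_fn, bmi_values):
    """Deterministic: the density only ever equals 0 or 1."""
    return all(p in (0.0, 1.0) for b in bmi_values for p in density_fn(b).values())

def is_static(density_fn, bmi_values):
    """Static: the density does not change with the confounder value."""
    densities = [density_fn(b) for b in bmi_values]
    return all(d == densities[0] for d in densities)
```

Checking `is_deterministic` and `is_static` over a few BMI values recovers the two-by-two classification (deterministic or random, static or dynamic) described in this section.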
3 Identifying risk under interventions that do not depend on the natural value of treatment
In observational studies, treatment is not under the control of the investigator but is assigned by some unknown treatment rule that generally differs from the hypothetical regime of interest . In this section, we will review a set of conditions under which data from an observational study can still be used to identify the risk had all subjects, contrary to fact, followed a treatment regime characterized by .
Let , and represent the counterfactual outcome, treatment and confounder histories, respectively, under a deterministic treatment regime g. We now define three g-specific identifying conditions for each :
Consistency: If , then and .
Exchangeability  encodes the assumption that the measured history is sufficient to control confounding for the effect of treatment at k on future outcomes. It is often referred to as the assumption of no unmeasured confounding and the vector the measured confounder history at k.
where denotes the observed treatment density, that is, the conditional density of treatment at k in the observational study evaluated at a particular .
Under the three g-specific identifying assumptions stated above for each deterministic regime , where is the set of all deterministic regimes, the risk by under an intervention characterized by any is equivalent to the g-formula (Robins 1986): where and are the observed joint density of the confounders at k and probability of the outcome by , respectively, conditional on past treatment, confounders, and survival to k, with the first components of , . A proof of this equivalence under the current data structure and notation is provided in the appendix of Young et al. (2011) following Lemma 4.2 of Robins (1986).
One-minus expression  is equivalent to survival by under a treatment regime characterized by , . This survival can be written as a weighted average of deterministic survival probabilities associated with the deterministic regimes with weights defined in terms of . Appendix A reviews this equivalence and provides a simplified numerical example in a low-dimensional setting. Note for a given choice of , the three identifying assumptions need only hold for the subset of deterministic regimes g that contribute a non-zero weight to the weighted average.
In settings with high-dimensional confounders and/or multiple follow-up times, it will often be quite cumbersome (if not impossible) to list every deterministic regime in the set with non-zero weights corresponding to a particular choice of . An exception is the case where is defined in terms of a single deterministic regime g. In this special case, all weight is given to this single deterministic regime and expression  reduces to: which may be more familiar to some readers.
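For reference, one standard way to write the g-formula for risk by the end of follow-up in a discrete-time failure setting is sketched below. The symbols are assumed notation rather than a verbatim reproduction of the original display: $Y_k$ is the failure indicator by k, $L_k$ the confounders, $A_k$ the treatment, overbars denote histories, $f^{int}$ is the intervention density, and the term for $j = 0$ uses $Y_0 = 0$ so that the first survival probability equals one. Sums are replaced by integrals for continuous variables.

```latex
% Assumed notation; a sketch of the g-formula for risk by K+1 under f^{int}.
\sum_{k=0}^{K} \sum_{\bar{a}_k} \sum_{\bar{l}_k}
  \Pr[Y_{k+1}=1 \mid \bar{L}_k=\bar{l}_k, \bar{A}_k=\bar{a}_k, \bar{Y}_k=0]
  \prod_{j=0}^{k} \Big\{
    \Pr[Y_j=0 \mid \bar{L}_{j-1}=\bar{l}_{j-1}, \bar{A}_{j-1}=\bar{a}_{j-1}, \bar{Y}_{j-1}=0]
    \, f(l_j \mid \bar{l}_{j-1}, \bar{a}_{j-1}, \bar{Y}_j=0)
    \, f^{int}(a_j \mid \bar{l}_j, \bar{a}_{j-1}, \bar{Y}_j=0)
  \Big\}
```

When the intervention density corresponds to a single deterministic regime g, each $f^{int}$ term is an indicator that $a_j$ equals the value assigned by g, so the sums over treatment collapse and only the sums over confounder histories remain.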
4 Identifying risk under interventions that depend on the natural value of treatment
Given an intervention, define the natural value of treatment at k as the value of treatment that would have been observed at time k were the intervention discontinued right before k. We denote the natural value of treatment at k as where, for notational simplicity, we suppress dependence on the associated intervention. Thus far, we have only considered interventions that may, at most, depend on the measured confounders as classified in Section 2. We now extend our consideration to interventions that may also depend on the natural value of treatment at k. We shall represent such a hypothetical intervention by its intervention density, . An example of is the threshold intervention of Taubman et al. (2009) on daily exercise stated in Section 1 such that Note that, in an observational study, the natural value of treatment at k is equivalent to the observed treatment as no intervention has been made.
Robins et al. (2004) defined the extended g-formula for risk by associated with an intervention density : where we stress that is the conditional density of in the observational study evaluated at given past treatment, confounders and survival to k, . To emphasize this fact, we sometimes write this density as .
Richardson and Robins (2013) defined a condition such that expression  identifies from observational data the risk by under a hypothetical intervention provided this expression is well-defined. We can informally understand this condition as the assumption that is not a confounder and has no effect on the outcome except through future treatment. We consider this condition more formally in the context of a simple example in Appendix B.
Consider one particular intervention that does not depend on within the classification of Section 2 specifically chosen as for any . We will say that this choice of is an implied treatment rule because it is a marginalization of the user-supplied density over the observational data density of . For this particular choice of , the extended g-formula  is equivalent to the (non-extended) g-formula . This equivalence follows by the absence of from the conditioning statement of the conditional probability of the outcome at any time in expression .
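The marginalization described above can be written out explicitly. In the sketch below, $a^*_k$ denotes a possible natural value of treatment at k and $f^{obs}$ the observed treatment density; both symbols, and the discrete sum, are assumed notation for illustration.

```latex
% Implied treatment rule: the user-supplied density averaged over the
% observed density of the natural value of treatment (assumed notation).
f^{int}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{Y}_k = 0)
  = \sum_{a^*_k}
      f^{int}(a_k \mid a^*_k, \bar{l}_k, \bar{a}_{k-1}, \bar{Y}_k = 0)\,
      f^{obs}(a^*_k \mid \bar{l}_k, \bar{a}_{k-1}, \bar{Y}_k = 0)
```

Because the outcome and confounder terms do not condition on $a^*_k$, the sum over $a^*_k$ in the extended g-formula acts only on this product, which is why the two formulas coincide.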
By this equivalence, it immediately follows that, with defined by eq. , the positivity condition  of Section 3 guarantees that both the (non-extended) g-formula  and the extended g-formula  are well-defined. Note, again, for this , this condition need only hold for the subset of deterministic regimes g that contribute a non-zero weight to the associated weighted average of deterministic regimes. Díaz Muñoz and van der Laan (2011, 2012) and Haneuse and Rotnitzky (2013) noted a similar result in the point treatment setting for random dynamic regimes that might be interpreted in terms of implied random dynamic regimes based on an explicit deterministic mechanism depending on the natural value of treatment. The regimes considered by these authors are discussed in Section 5.2.
The implied intervention density  is a function of the observed treatment density , which is generally unknown in high-dimensional observational data (although, it may be estimated). Therefore, the implied will also generally be unknown. For example, for as defined in , the marginalization  evaluates to The implied rule  is a random dynamic regime by the classification given in Section 2 as will generally be a nondegenerate density.
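As a concrete illustration of the implied rule for a threshold intervention, the sketch below marginalizes the rule "raise any natural value below 30 minutes to 30, otherwise leave it alone" over a discrete observed treatment density for a single history. The function name and the example density are hypothetical, and a discrete treatment is assumed for simplicity.

```python
def implied_threshold_density(obs_density, threshold=30):
    """Marginalize a threshold rule over an observed (discrete) treatment density.

    obs_density: dict {minutes: probability} for one treatment/confounder history.
    Returns the implied intervention density, which no longer depends on the
    natural value of treatment.
    """
    implied = {}
    for natural, prob in obs_density.items():
        assigned = max(natural, threshold)  # values below the threshold are raised
        implied[assigned] = implied.get(assigned, 0.0) + prob
    return implied

# Hypothetical observed density of daily exercise for one history.
obs = {0: 0.25, 15: 0.25, 30: 0.125, 45: 0.25, 60: 0.125}
implied = implied_threshold_density(obs)
print(implied)  # {30: 0.625, 45: 0.25, 60: 0.125}
```

All observed mass at or below the threshold is moved onto the threshold, while the density above it is unchanged. The result is nondegenerate whenever the observed density places mass on more than one value at or above the threshold, or on values both below and above it.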
Finally, while the extended g-formula  and the (non-extended) g-formula  associated with the random dynamic regime  require the same positivity condition by their equivalence, the conditions required for risk identification under an intervention and under  are not generally equivalent. In particular, the identifying condition defined by Richardson and Robins (2013) for an intervention is generally more stringent than that required for the random dynamic mechanism , the latter of which is equivalent to the exchangeability condition  of Section 3. An exception is under the null; here the two conditions are equivalent. For details, see Section 5.6 of Richardson and Robins (2013) and Appendix B.
5 Estimating an intervention risk using observational data
In low-dimensional settings, we can non-parametrically estimate expression  by first enumerating all possible treatment and confounder histories under a specified intervention , calculating each component proportion, and then taking the overall sum. When is implied by the sum  then we must additionally enumerate all possible natural treatment histories and calculate this implied rule. In high-dimensional settings, such that K is large and/or there are continuously measured covariates, such an approach is not feasible. In this case, parametric or semi-parametric approaches may be used.
5.1 Parametric estimation
Robins (1986) described a parametric estimator of the (non-extended) g-formula given in expression  which involves parametrically modelling each component density and using Monte Carlo simulation to approximate the sum over all possible histories under an intervention that does not depend on the natural value of treatment as in Section 2. Robins et al. (2004) and Taubman et al. (2009) generalized this algorithm to allow for an intervention that depends on the natural value of treatment as in Section 4. Briefly, this more general approach involves the following steps:
Parametrically estimate the joint density of natural treatment and confounders at each follow-up time (except baseline) given survival and past treatment and confounders.
Parametrically estimate the probability of failure at each follow-up time given survival and past measured treatment and confounders.
Recursively, for each
Set baseline confounders and natural treatment to the observed sample values. For , generate time k confounders and natural treatment based on the estimated model coefficients and previously generated treatment and confounders under intervention.
Assign time k treatment under intervention based on the rule of interest which may be an explicitly specified depending at most on the past measured confounders or an explicitly specified depending on the natural value of treatment at k.
Calculate the discrete failure hazard at given only past generated treatment and confounders under the intervention (ignoring the natural treatment value).
Calculate the cumulative probability of failure by using the specific failure hazards for each generated treatment and confounder history under intervention.
Calculate the average cumulative probability of failure by over all generated intervention histories.
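The steps above can be sketched as a small Monte Carlo simulation. This is a toy illustration under stated assumptions, not the published SAS macro: the three model functions stand in for fitted parametric models, their coefficients are invented, and there is a single binary confounder. The natural value of treatment is generated at every time but enters the hazard only through the treatment assigned under the intervention.

```python
import math
import random

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

# Stand-ins for fitted parametric models (toy coefficients, hypothetical).
def draw_confounder(rng, prev_l, prev_a):
    p = expit(-0.5 + 0.8 * prev_l - 0.01 * prev_a)
    return 1 if rng.random() < p else 0

def draw_natural_treatment(rng, l_k, prev_a):
    mean = 25.0 + 0.3 * prev_a - 10.0 * l_k
    return max(0.0, rng.gauss(mean, 10.0))

def hazard(l_k, a_k):
    # Depends on treatment under intervention, never on the natural value.
    return expit(-4.0 + 0.7 * l_k - 0.02 * a_k)

# Two example rules: a threshold intervention and no intervention.
def threshold_rule(natural, l_k):
    return max(natural, 30.0)

def natural_course(natural, l_k):
    return natural

def g_formula_risk(baseline, rule, n_times, seed=0):
    """Average cumulative failure probability over simulated histories."""
    rng = random.Random(seed)
    total = 0.0
    for l0, natural0 in baseline:
        # Baseline confounder and natural treatment are the observed values.
        l_k, natural, prev_a = l0, natural0, 0.0
        survival, risk = 1.0, 0.0
        for k in range(n_times):
            if k > 0:
                # Generate confounder and natural treatment given the
                # previously generated history under intervention.
                l_k = draw_confounder(rng, l_k, prev_a)
                natural = draw_natural_treatment(rng, l_k, prev_a)
            a_k = rule(natural, l_k)   # assign treatment under the rule
            h = hazard(l_k, a_k)       # discrete failure hazard
            risk += survival * h       # accumulate cumulative risk
            survival *= 1.0 - h
            prev_a = a_k
        total += risk
    return total / len(baseline)

# Hypothetical baseline sample of (confounder, natural minutes of exercise).
baseline_sample = [(0, 10.0), (1, 20.0), (0, 45.0), (1, 0.0)] * 50
risk_natural = g_formula_risk(baseline_sample, natural_course, n_times=20)
risk_threshold = g_formula_risk(baseline_sample, threshold_rule, n_times=20)
```

Comparing `risk_threshold` with `risk_natural` contrasts the threshold intervention with no intervention under the same toy models. Passing a rule that ignores its `natural` argument gives the special case of an intervention that depends at most on the measured confounders.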
Robins et al. (2004), Taubman et al. (2009), Lajous et al. (2013), Danaei et al. (2013) and García-Aymerich et al. (2014) have applied the above approach to estimate failure risk under time-varying threshold interventions on lifestyle factors that depend on the natural value of treatment in various observational studies including the NHS, the Offspring Framingham Heart Study and the Health Professionals Follow-up Study. A more technical description of this algorithm is given in Appendix C and may be implemented using a SAS macro publicly available at www.hsph.harvard.edu/causal/software.
This estimation algorithm effectively ignores that, for an intervention , the implied treatment rule depending only on the measured confounders is as defined by the marginalization . The natural value of treatment is generated at each k regardless of whether the explicit intervention of interest depends on it or not. If the intervention does not depend on it, then is generated but not used. Note that expression  can be rewritten as where is the joint density of , an arbitrarily ordered vector including and , conditional on survival and past treatment and confounder history. Here, may be selected as an explicitly specified that may at most depend on the measured confounder history as in the examples given in Section 2 or an explicitly specified . Under the latter choice, expression  is equivalent to expression  with defined by eq.  and, thus (by the arguments of Section 4), also equivalent to the extended g-formula .
5.2 Semi-parametric estimation
The parametric g-formula may be subject to bias due to model misspecification and to the g-null paradox (Robins and Wasserman 1997). As an alternative, several authors have described semi-parametric estimators of risk under explicitly specified random dynamic regimes that may, at most, depend on the measured confounder history (Murphy et al. 2001; Cain et al. 2010; Stitelman et al. 2010; Díaz Muñoz and van der Laan 2012). These approaches do not require specification of the likelihood and may be more robust to model misspecification. Here, we describe how an inverse-probability weighted (IPW) risk estimator can be extended to implied random dynamic regimes such as that defined by eq. .
Following Cain et al. (2010), consider the following IPW estimator of risk by under an explicitly specified . Let be the solution to the estimating equation with respect to where is a flexible function of k and the parameter vector and with the MLE of given the model for the observed treatment density as defined in eq.  with the true population value of .
If this treatment model is correctly specified and there exists such that then we have for all k and the estimator consistent for and asymptotically normal. Note that, under these assumptions, the g-formula  is equivalent to The IPW estimator of expression  is then given by the plug-in estimator where in expression  is replaced by the IPW estimate . Analogous to Cain et al. (2010), we might impose a Cox marginal structural model if few individuals are following to borrow information from individuals following other interventions (Robins 2000). Note that in the case where corresponds to a single deterministic regime g, in the numerator of the weight  becomes which renders an estimating equation more familiar to some readers.
To extend the IPW estimator described above (and related semi-parametric approaches) to explicitly specified interventions of the form , we must replace for in the weight  with the marginalization  for every possible treatment and confounder history observed in the data. Thus, in contrast to the parametric g-formula estimator described above, semi-parametric methods cannot “ignore” the fact that the explicit treatment rule of interest implies the marginalization .
In general, the computational complexity of this marginalization will, of course, depend on the form of and . For example, the implied rule  requires knowledge of which must be estimated to calculate the denominator of the weights. If this is based on a parametric model then one must also estimate based on that model, which will be used for the numerator of the weights for any subject with .
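A minimal sketch of how the weights might be computed for the implied threshold rule follows, assuming a discrete treatment and an already estimated observed treatment density for each subject-time. The function name and inputs are hypothetical, the weights are unstabilized, and this is not claimed to be the exact estimator of Cain et al. (2010).

```python
def threshold_ipw_weight(treatments, obs_densities, threshold=30):
    """Cumulative weight for one subject under the implied threshold rule.

    treatments: observed treatment values, one per follow-up time.
    obs_densities: estimated observed treatment densities, one dict
        {minutes: probability} per time, each conditional on that
        subject's own history.
    """
    weight = 1.0
    for a_k, obs in zip(treatments, obs_densities):
        if a_k < threshold:
            # Implied density is zero below the threshold.
            return 0.0
        if a_k == threshold:
            # Numerator: estimated probability of being at or below the threshold.
            numerator = sum(p for a, p in obs.items() if a <= threshold)
        else:
            # Above the threshold the implied and observed densities agree.
            numerator = obs[a_k]
        weight *= numerator / obs[a_k]
    return weight

# Hypothetical estimated observed density, reused at each time for brevity.
obs = {0: 0.25, 15: 0.25, 30: 0.125, 45: 0.25, 60: 0.125}
```

Subjects observed above the threshold at every time receive weight one, subjects observed exactly at the threshold are up-weighted to stand in for those below it, and subjects observed below the threshold at any time receive weight zero.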
Other authors have considered semi-parametric estimators of risk under random dynamic regimes that might be interpreted in terms of implied random dynamic regimes based on an explicit deterministic mechanism depending on the natural value of treatment (Díaz Muñoz and van der Laan 2011, 2012; Haneuse and Rotnitzky 2013). For example, Díaz Muñoz and van der Laan (2012) considered various semi-parametric estimators of risk under a random dynamic regime on a point treatment that somehow shifts the observed treatment density by a certain amount. They allowed this shift to, at most, depend on values of the measured confounders, considering interventions on physical activity as a particular example.
Specifically, extending to our more general time-varying setting, this shift could be achieved by the following mechanism: “On each day k, if a subject with treatment and confounder history has exercised minutes under no intervention by the end of the day then have her, instead, exercise on that day”. If we fix for all , then this intervention maintains exercise at or above 30 minutes per day for all subjects and corresponds to a particular choice of , such that if and 0 otherwise.
For this choice of , the marginalization  is conveniently equivalent to for all values of . As noted by Díaz Muñoz and van der Laan (2012), this choice of may also render practical violations of positivity less influential on the performance of the estimators. See Petersen et al. (2012) for a detailed discussion of the potential influence of practical positivity violations on various estimators.
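The convenience of the shift mechanism can be seen in a small sketch. A deterministic shift maps each natural value to exactly one assigned value, so the implied density at an assigned value is the observed density at that value minus the shift. The function name, the constant shift `delta`, and the example density are hypothetical, and a discrete treatment is assumed.

```python
def implied_shift_density(obs_density, delta):
    """Implied intervention density when every natural value is shifted by delta."""
    return {natural + delta: prob for natural, prob in obs_density.items()}

# Hypothetical observed density of daily exercise for one history.
obs = {0: 0.5, 15: 0.25, 45: 0.25}
implied = implied_shift_density(obs, delta=30)
```

No additional probability needs to be estimated for the numerator of the weights in this case, in contrast to the threshold rule, where the probability of falling at or below the threshold is also required.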
6 A plausible approximation to interventions that depend on the natural value of treatment
In the previous sections, we have considered hypothetical interventions at k that depend on the natural value of treatment also at k. Such interventions are generally not plausible in practice. For example, once an individual has exercised less than 30 minutes by the end of day k, she cannot, instead, have exercised 30 minutes by the end of that day. It follows that, even given “perfect” conditions (e.g. identifiability and no model misspecification) it is unclear how to use observational estimates associated with such interventions to inform real-world future policy or the design of future randomized experiments.
We might, however, approximate such interventions with a plausible (implementable) experiment. Let be a subject’s stated intention with respect to treatment on day k measured at the start of that day (e.g. intended daily minutes of exercise at the start of day k). Given an intervention , denote as a plausible approximation that assigns treatment according to the same rule as at each k but replacing with .
For example, given the threshold intervention on exercise of Taubman et al. (2009) characterized by , a plausible approximation is “If a subject’s intention at the start of day k is to exercise less than 30 minutes on that day then ensure she exercises exactly 30 minutes by the end of day k. Otherwise, ensure she exercises her intended amount” or 
Suppose treatment is assigned according to and the following assumption held:
Natural value of treatment assumption: Under any intervention, for all k, every subject’s intended minutes of exercise at the start of day k is equal to what her subsequent behavior would be on that day were the intervention based on intention discontinued right before k.
Under this assumption, the plausible rule is not an approximation but exactly equal to . Further, under the reasonable assumption that intention has no direct effect on the outcome except through future treatment, the risks by under these two rules will be equivalent. Thus, all identification and estimation results of Sections 4 and 5 apply.
In an actual experiment where treatment is assigned according to , it is impossible to empirically examine whether this assumption holds, even given is measured. However, in an observational study, this relationship can be examined given is measured. In particular, in an observational study (i.e. under no intervention), the natural value of treatment assumption implies that for each subject and all k Here, again, the natural value of treatment is equivalent to the measured treatment for all subjects as no intervention is made. Note that, while assumption  implies that the natural value of treatment assumption holds for the observational study, assumption  does not guarantee this assumption will hold under an intervention .
Finally, we point out that when assumption  does not hold, is simply an example of a deterministic dynamic regime g by the classification given in Section 2 with in Section 2 replaced with . This deterministic dynamic regime g is specifically defined such that if and otherwise. Further, by the arguments of Section 3, given the assumptions of that section for this choice of g, risk under this regime is identified by the deterministic regime g-formula , again replacing with . Note that, in this setting, any of the conditional densities in the g-formula  may depend on without restriction.
By contrast, if assumption  holds in the observational study, positivity as defined in condition  immediately fails for this g. Specifically, for because we must have for all k and under this definition of g. Therefore, given , whenever . As a consequence of this positivity violation, in expression  is undefined for all histories such that , .
7 Discussion
In this paper, we showed the equivalence between the extended g-formula associated with an intervention that depends on the natural value of treatment and the (non-extended) g-formula of Robins (1986) associated with a particular random dynamic regime that does not depend on this value. This equivalence immediately gives a sufficient positivity condition that guarantees the extended g-formula is well-defined. This positivity result, coupled with the results of Richardson and Robins (2013), now provides a formal causal framework for previously published applications of the parametric extended g-formula to estimate risk under threshold interventions in observational studies. It also immediately gives semi-parametric alternatives to the parametric extended g-formula. Finally, we considered limits on the practical implementation of threshold interventions along with possible real-world approximations.
The assumption of positivity is often informally described as the assumption that there are at least some subjects in the observational study who are observed to follow the hypothetical intervention of interest within every possible level of the “past”. By this understanding, it would appear that positivity must be violated for the threshold intervention on exercise considered by Taubman et al. (2009). Specifically, no subject who exercised less than 30 minutes on day k can be following the intervention at k. Our positivity result makes clear that, given appropriate identification conditions, it is not necessary to observe such patterns in the observational study. It is only necessary to observe some individuals following the implied random dynamic regime .
Appendix A: Representing the g-formula characterized by a random dynamic regime as a weighted average of deterministic regimes
Given let equal the intervention density evaluated at . In the following, we assume is discrete and we choose an ordering such that with the support of and its cardinality.
Let For , let Let .
By Lemma 4.2 of Robins (1986), given a particular defining as above for all , then one-minus expression  equals where “Pr” is equivalent to one-minus expression  and .
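The weighted-average representation can be checked numerically in a small setting. The sketch below uses two treatment times, one binary covariate measured between them, and no failures before the end of follow-up. All probabilities are hypothetical and are not the values in Figure 1. Each deterministic regime fixes the time 0 treatment and a time 1 treatment for each covariate value, and its weight is taken as the product of the intervention probabilities of those choices, which assumes treatment draws are independent across histories.

```python
from itertools import product

# Hypothetical inputs (not the values in Figure 1).
p_a0 = {0: 0.5, 1: 0.5}                                   # Pr(A0 = a0)
p_l1 = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.75, 1: 0.25}}       # Pr(L1 = l1 | a0)
p_a1 = {(0, 0): 0.0, (0, 1): 0.5, (1, 0): 0.25, (1, 1): 1.0}  # Pr(A1 = 1 | a0, l1)
surv = {(0, 0, 0): 0.5, (0, 0, 1): 0.625, (0, 1, 0): 0.25, (0, 1, 1): 0.5,
        (1, 0, 0): 0.75, (1, 0, 1): 0.875, (1, 1, 0): 0.375, (1, 1, 1): 0.75}

def a1_prob(a0, l1, a1):
    q = p_a1[(a0, l1)]
    return q if a1 == 1 else 1.0 - q

# Survival under the random dynamic regime, computed directly.
direct = sum(
    p_a0[a0] * p_l1[a0][l1] * a1_prob(a0, l1, a1) * surv[(a0, l1, a1)]
    for a0, l1, a1 in product((0, 1), repeat=3)
)

# Survival as a weighted average over deterministic regimes
# g = (a0, d) where d[l1] is the time 1 treatment when L1 = l1.
weights = {}
mixture = 0.0
for a0 in (0, 1):
    for d in product((0, 1), repeat=2):
        w = p_a0[a0] * a1_prob(a0, 0, d[0]) * a1_prob(a0, 1, d[1])
        if w == 0.0:
            continue  # regimes with zero weight do not contribute
        surv_g = sum(p_l1[a0][l1] * surv[(a0, l1, d[l1])] for l1 in (0, 1))
        weights[(a0, d)] = w
        mixture += w * surv_g
```

Here `direct` and `mixture` agree, the weights sum to one, and only the deterministic regimes compatible with the intervention density receive non-zero weight.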
Simplified numerical example
Figure 1 depicts a hypothetical sequentially randomized trial where treatment is assigned at each time based on a particular intervention density by a structural tree graph (Robins 1986). For numerical simplicity, we will consider a short follow-up with and all binary treatment and covariates. For additional simplicity, we will assume that no subject fails prior to the end of follow-up (i.e. for all subjects). We will also assume that all subjects have the same value of the baseline covariate . The intervention density is defined by the probability of receiving a given level of treatment given the past read directly off the graph. These probabilities imply that corresponds to a random dynamic regime. For example, following the top branch of the graph, the probability of receiving treatment at given is or 0.5.
The survival probability for the disease of interest in this hypothetical sequentially randomized trial is simply the overall proportion of those who did not get the disease at the end of follow-up out of the total number at risk at baseline. Specifically, 52 subjects at the end have out of the 100 subjects at risk at baseline; thus survival in this hypothetical trial characterized by is . We will now show is equivalent to a weighted average of the g-formula for survival over all deterministic regimes that it is possible to follow in this hypothetical sequentially randomized trial with weights defined as in the previous section.
First the set contains the following subset of deterministic regimes :
: ; the static regime “do not treat at time 0; treat at time 1”
: ; the static regime “treat at time 0; do not treat at time 1”
: ; the static regime “always treat”
: ; the dynamic regime “do not treat at time 0; if then do not treat at time 1; otherwise treat at time 1”
: ; the dynamic regime “treat at time 0; if then do not treat at time 1; otherwise treat at time 1”
: ; the dynamic regime “treat at time 0; if then do not treat at time 1; otherwise treat at time 1”
Using the definition of the previous section such that, again, , we define as follows for each g in the subset above: Specifically, we have by Figure 1:
Each “Pr” is defined by the g-formula where . Here, we evaluate this expression for all g in the subset:
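The equivalence that this numerical example illustrates can also be checked computationally. The following is a minimal sketch, not a reproduction of Figure 1: all probabilities below are hypothetical, and the names `p_a0`, `p_l1`, `p_a1` and `surv` are introduced only for illustration. It assumes that the weight of a deterministic regime is the product, over every covariate history, of the intervention-density probability of the treatment that the regime assigns.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical inputs (NOT the values of Figure 1).
p_a0 = {0: F(1, 2), 1: F(1, 2)}                     # intervention density at time 0
p_l1 = {0: {0: F(3, 5), 1: F(2, 5)},                # p_l1[a0][l1]: covariate density
        1: {0: F(1, 2), 1: F(1, 2)}}
p_a1 = {(0, 0): {0: F(0), 1: F(1)},                 # p_a1[(a0, l1)][a1]: intervention
        (0, 1): {0: F(1, 4), 1: F(3, 4)},           # density at time 1
        (1, 0): {0: F(1, 2), 1: F(1, 2)},
        (1, 1): {0: F(1, 3), 1: F(2, 3)}}
surv = {(0, 0, 0): F(1, 5), (0, 0, 1): F(2, 5),     # surv[(a0, l1, a1)]: probability of
        (0, 1, 0): F(3, 10), (0, 1, 1): F(1, 2),    # surviving to the end of follow-up
        (1, 0, 0): F(1, 2), (1, 0, 1): F(7, 10),
        (1, 1, 0): F(3, 5), (1, 1, 1): F(4, 5)}

def survival_random_regime():
    """Survival in the trial that randomizes treatment by the intervention density."""
    return sum(p_a0[a0] * p_l1[a0][l1] * p_a1[(a0, l1)][a1] * surv[(a0, l1, a1)]
               for a0, l1, a1 in product((0, 1), repeat=3))

def deterministic_regimes():
    """Every regime (a0, rule), where rule[l1] is the treatment assigned at time 1."""
    return [(a0, {0: a1_if_l0, 1: a1_if_l1})
            for a0, a1_if_l0, a1_if_l1 in product((0, 1), repeat=3)]

def weight(a0, rule):
    """Assumed weight: product of intervention-density probabilities over histories."""
    return p_a0[a0] * p_a1[(a0, 0)][rule[0]] * p_a1[(a0, 1)][rule[1]]

def g_formula_survival(a0, rule):
    """(Non-extended) g-formula for survival under one deterministic regime."""
    return sum(p_l1[a0][l1] * surv[(a0, l1, rule[l1])] for l1 in (0, 1))

regimes = deterministic_regimes()
direct = survival_random_regime()
mixture = sum(weight(a0, r) * g_formula_survival(a0, r) for a0, r in regimes)
total_weight = sum(weight(a0, r) for a0, r in regimes)
```

Exact fractions are used so that `direct` and `mixture` can be compared without rounding error. Regimes that cannot be followed under the hypothetical intervention density (here, any regime that withholds treatment at time 1 after no treatment at time 0 and a covariate value of 0) receive weight zero and drop out of the average.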
Richardson and Robins (2013) defined a graphical condition based on a d-separation relation (i.e. checking for the absence of “backdoor paths”) that gives general identification for any intervention considered in the classification of Section 2 or an intervention that depends on the history of the natural value of treatment using the (non-extended) g-formula and extended g-formula, respectively. They further show that, given an appropriate consistency assumption, this graphical condition for identification implies an exchangeability condition analogous to condition  given in Section 3. In the restricted case, where the intervention does not depend on the history of the natural value of treatment, then this condition is equivalent to condition . We refer the reader to Richardson and Robins (2013) for details of this more general exchangeability condition.
The d-separation condition of Richardson and Robins (2013) is applied to a transformation of a causal DAG (Spirtes et al. 1993; Pearl 2000) representing assumptions on the underlying data generating process that produced the data in the observational study. Richardson and Robins (2013) call this transformation a Single World Intervention Graph (SWIG). We now illustrate how to evaluate identification for different interventions on a time-varying treatment under a simple set of underlying observed data generating assumptions using SWIGs. The examples given here are similar to examples depicted in figures 19 and 21 in Richardson and Robins (2013).
Remark on notation: In describing how to construct a SWIG associated with any hypothetical intervention under an assumed observed data generating mechanism we will adopt, for this section of the appendix only, the notation of Richardson and Robins (2013). This will create two inconsistencies with notation used in the main text which we now describe, along with our motivation behind this choice. Specifically, in this appendix, we will denote any hypothetical dynamic intervention as g which may, or may not, depend on the natural value of treatment. In the main text, this notation was reserved only for deterministic regimes (dynamic or static) that do not depend on the natural value of treatment. Further, we will change the meaning of one instance of counterfactual notation used in the main text. In particular, was used in the main text to denote the counterfactual value of treatment assigned under an intervention g. Here, to be consistent with Richardson and Robins (2013), will be used to denote this counterfactual, and will, alternatively, be used to denote the counterfactual natural value of treatment under g.
We chose not to adopt this more complex notational convention of Richardson and Robins (2013) in the main text as the primary results regarding positivity and semi-parametric estimation of the main text do not require formalization of a counterfactual natural value of treatment. This allows simpler notation in the main text that is consistent with previous work on interventions that do not depend on the natural value of treatment. It also allows a notational bridge to the motivating work by Robins et al. (2004) and Taubman et al. (2009). While we could have used notation fully consistent with the main text in this section of the appendix, we chose to adopt that of Richardson and Robins (2013), the foundational paper on SWIGs, in order to avoid confusion within the newly emerging literature on this topic. We now proceed with our examples.
Consider the simple time-varying observational study depicted in the causal DAG of Figure 2(i) where, as in the simplified numerical example of Appendix A, we assume a short follow-up () and that no subject fails prior to the end of follow-up. In Figure 2(i), represents an unmeasured common cause of and and an unmeasured common cause of the covariate L and the outcome D.
The d-separation condition of Richardson and Robins (2013) is evaluated for a given dynamic intervention g based on the following sets of transformations applied to a causal DAG:
Split each treatment node at k into two nodes with one node containing the natural value of treatment at k and the other a constant value
Index all random variables after time 0 as counterfactuals under a static deterministic intervention , including the natural value of treatment.
All arrows out of the observed on the original DAG should now be out of and all arrows into the observed on the original DAG should now be into the counterfactual natural value of treatment at k (equivalent to the observed at baseline as no intervention has yet been made).
To assess identification for a dynamic intervention g, we apply the following additional transformations:
Index all counterfactuals on by g rather than by or a subvector thereof
Replace each constant with the counterfactual
Add dashed arrows from any variable temporally prior to into if treatment at k is assigned by this variable under the intervention g
Richardson and Robins (2013) prove that a dynamic intervention g is identified if, for each time k, and are d-separated conditional on in once we apply the additional k-specific transformation of removing all dashed arrows out of . This final transformation is only required to evaluate identification when g depends on the history of the natural value of treatment. They define this last k-specific transformation of the SWIG as a new SWIG associated with what they term a perturbed regime at k. They also note that the aforementioned d-separation holds if and only if there is no unblocked backdoor path between and conditional on the same set of variables.
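The node-splitting step and a d-separation check can be sketched in a few lines. The sketch below is a simplification under stated assumptions: the DAG `dag` is a hypothetical toy graph, not Figure 2(i); the function names are introduced here for illustration; splitting is represented by renaming the source of each outgoing treatment arrow to a lower-case fixed-value node; and the counterfactual relabelling and dashed arrows of a dynamic SWIG are omitted. The d-separation test uses the standard moralized ancestral graph criterion.

```python
from collections import defaultdict

def split_treatment_nodes(edges, treatments):
    """Split each treatment T: node T keeps incoming arrows (natural value),
    and a lower-case node t takes over the outgoing arrows (fixed value)."""
    return [(p.lower() if p in treatments else p, c) for p, c in edges]

def ancestors(edges, nodes):
    """Return the given nodes together with all of their ancestors."""
    parents = defaultdict(set)
    for p, c in edges:
        parents[c].add(p)
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def d_separated(edges, x, y, given):
    """True if x and y are d-separated given the set `given`."""
    keep = ancestors(edges, {x, y} | set(given))
    neighbours, parents = defaultdict(set), defaultdict(set)
    for p, c in edges:
        if p in keep and c in keep:
            neighbours[p].add(c)
            neighbours[c].add(p)
            parents[c].add(p)
    for shared in parents.values():          # moralize: link parents of a common child
        for a in shared:
            neighbours[a] |= shared - {a}
    seen, stack = {x}, [x]                   # search for a path that avoids `given`
    while stack:
        for nb in neighbours[stack.pop()]:
            if nb not in seen and nb not in given:
                seen.add(nb)
                stack.append(nb)
    return y not in seen

# Hypothetical toy DAG: U is a common cause of A0 and A1, W of L1 and D.
dag = [("U", "A0"), ("U", "A1"), ("A0", "L1"), ("L1", "A1"),
       ("W", "L1"), ("W", "D"), ("A0", "D"), ("A1", "D")]
swig = split_treatment_nodes(dag, {"A0", "A1"})
blocked_given_l1 = d_separated(swig, "A1", "D", {"L1"})
blocked_unconditionally = d_separated(swig, "A1", "D", set())
```

In this toy graph the natural value of treatment at time 1 is d-separated from the outcome once we condition on `L1`, but not unconditionally, because the backdoor path through `L1` and `W` is then open.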
Figure 3 depicts two dynamic SWIGs created from transformations of the non-dynamic SWIG of Figure 2(ii) which differ only by their dependence on the history of the natural value of treatment. The intervention under Figure 3(i) does not depend on any function of the history of the natural value of treatment by the absence of any dashed arrows from into either or and the absence of a dashed arrow from into . By contrast, the intervention under Figure 3(ii) depends on this history by the presence of dashed arrows from into and into .
By the d-separation condition of Richardson and Robins (2013), we can see that the intervention g in Figure 3(i), under which treatment assignment does not depend on the history of the natural value of treatment, is identified under our data generating assumptions. Specifically, there are no unblocked backdoor paths between and . Further, conditional on , and there is no unblocked backdoor path between and .
By contrast, we can see that the intervention g in Figure 3(ii), under which treatment assignment does depend on the history of the natural value of treatment, is not identified under our data generating assumptions. Again, applying the d-separation condition of Richardson and Robins (2013), following the transformation to the perturbed regime (i.e. removal of the dashed arrow from into ), we still have the unblocked backdoor path .
These examples illustrate that, even when we have identification for an intervention that does not depend on the history of the natural value of treatment – for example, the random dynamic intervention  – we are not guaranteed identification for an intervention that does depend on some function of this history – for example, the threshold interventions of Taubman et al. (2009) – under every underlying observed data generating mechanism. However, under additional restrictions on the original data generating assumptions depicted in Figure 2(i), we achieve identification for both of the dynamic regimes considered in Figure 3. For example, this would be the case under either of the following restrictions applied to our initial set of data generating assumptions in Figure 2(i):
The null is true (i.e. the arrows from and into D are removed).
The common cause of and is removed.
Let be an arbitrary permutation of the p components in , noting that
in expression  for any where are conditional densities based on the factorization implied by the user-selected permutation.
For user-chosen K and , we do the following:
Step I: parametric modelling of conditional densities
Using the n individuals in the data set, for each :
If , fit parametric models for the conditional densities , .
Fit a parametric model for the conditional probability of the outcome
Step II: Monte Carlo simulation under the user-chosen
For and :
If , set to the observed values of for subject v. Otherwise, if , recursively draw from the nested conditional densities estimated in Step I.1 based on previously drawn confounders through and assigned treatment under the user-chosen intervention.
Assign the treatment according to the user-chosen intervention. For example, for chosen as we set if and otherwise set .
Estimate the probability of failure by given survival to k for the v-th simulated treatment and confounder history based on the estimated coefficients from Step I.2.
Step III: computation of disease risk by under
Estimate expression , or equivalently expression , as where is obtained in Step II.3.
As discussed in Young et al. (2011), both Steps I and II may be modified to avoid reliance on parametric models for those histories for which a priori subject matter knowledge on the observed data structure is available.
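Steps I to III can be sketched as follows for a threshold intervention. This is a minimal illustration under loudly stated assumptions, not the implementation used in the applications cited above: the data are simulated from a hypothetical mechanism with two time points; the only time-varying covariate is the natural value of a treatment with three coarsened levels; and saturated models estimated by stratum-specific frequencies stand in for the parametric regression models of Step I. All function names are introduced here for illustration.

```python
import random

LEVELS = (0, 1, 2)  # hypothetical coarsened treatment levels

def draw(probs, rng):
    """Draw a level from a list of probabilities using one uniform draw."""
    u, cum = rng.random(), 0.0
    for level, p in enumerate(probs):
        cum += p
        if u < cum:
            return level
    return len(probs) - 1

def true_hazard(a):
    """Hypothetical interval-specific failure probability; higher treatment protects."""
    return 0.30 - 0.10 * a

def simulate_observed(n, rng):
    """Hypothetical observed data (a0, y1, a1, y2); a1 and y2 are None after failure."""
    rows = []
    for _ in range(n):
        a0 = draw([0.5, 0.3, 0.2], rng)
        y1 = int(rng.random() < true_hazard(a0))
        a1 = y2 = None
        if y1 == 0:
            a1 = draw([0.6 if a == a0 else 0.2 for a in LEVELS], rng)
            y2 = int(rng.random() < true_hazard(a1))
        rows.append((a0, y1, a1, y2))
    return rows

def fit_models(rows):
    """Step I: saturated models (frequencies), an assumed stand-in for regressions."""
    trans = {a: [0] * len(LEVELS) for a in LEVELS}   # counts of a1 within levels of a0
    events = {a: 0 for a in LEVELS}
    at_risk = {a: 0 for a in LEVELS}
    for a0, y1, a1, y2 in rows:
        at_risk[a0] += 1
        events[a0] += y1
        if y1 == 0:
            trans[a0][a1] += 1
            at_risk[a1] += 1
            events[a1] += y2
    treat_model = {a: [c / sum(cs) for c in cs] for a, cs in trans.items()}
    hazard_model = {a: events[a] / at_risk[a] for a in LEVELS}  # pooled over time
    return treat_model, hazard_model

def g_formula_risk(rows, treat_model, hazard_model, threshold, rng):
    """Steps II and III: simulate under the threshold intervention, average the risk."""
    total = 0.0
    for a0, _, _, _ in rows:
        assigned0 = max(a0, threshold)                # k = 0: natural value is observed
        h0 = hazard_model[assigned0]
        natural1 = draw(treat_model[assigned0], rng)  # k = 1: draw the natural value
        assigned1 = max(natural1, threshold)          # raise it to the threshold if below
        h1 = hazard_model[assigned1]
        total += h0 + (1 - h0) * h1                   # risk by the end of follow-up
    return total / len(rows)

rng = random.Random(2014)
observed = simulate_observed(5000, rng)
treat_model, hazard_model = fit_models(observed)
risk_natural = g_formula_risk(observed, treat_model, hazard_model, 0, rng)
risk_threshold = g_formula_risk(observed, treat_model, hazard_model, 1, rng)
observed_risk = sum(1 for _, y1, _, y2 in observed if y1 == 1 or y2 == 1) / len(observed)
```

Setting the threshold to the lowest treatment level leaves every subject untreated by the intervention, so `risk_natural` serves as a check against `observed_risk`; a large discrepancy would suggest model misspecification. Because treatment is protective in this hypothetical mechanism, `risk_threshold` should fall below `risk_natural`.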
Cain, L. E., Robins, J. M., Lanoy, E., Logan, R., Costagliola, D., and Hernán, M. A. (2010). When to start treatment? A systematic approach to the comparison of dynamic regimes using observational data. International Journal of Biostatistics, 6:Article 18.
Danaei, G., Pan, A., Hu, F. B., and Hernán, M. A. (2013). Hypothetical lifestyle interventions in middle-aged women and risk of type 2 diabetes: a 24-year prospective study. Epidemiology, 24:122–128.
Dawid, A. P. and Didelez, V. (2008). Identifying optimal sequential decisions. In: Proceedings of the Twenty-Fourth Annual Conference on Uncertainty in Artificial Intelligence (UAI-08), D. McAllester and A. Nicholson (Eds.), 113–120. Corvallis, OR: AUAI Press.
Díaz Muñoz, I. and van der Laan, M. J. (2011). Population intervention causal effects based on stochastic interventions. U.C. Berkeley Division of Biostatistics Working Paper Series. Paper 289. http://www.bepress.com/ucbbiostat/paper289
García-Aymerich, J., Varraso, R., Danaei, G., Camargo, C. A., and Hernán, M. A. (2014). Incidence of adult-onset asthma after hypothetical interventions on body mass index and physical activity: an application of the parametric g-formula. American Journal of Epidemiology, 179(1):20–26.
Hernán, M. A., Lanoy, E., Costagliola, D., and Robins, J. M. (2006). Comparison of dynamic treatment regimes via inverse probability weighting. Basic & Clinical Pharmacology & Toxicology, 98:237–242.
Lajous, M., Willett, W. C., Robins, J. M., Young, J. G., Rimm, E., Mozaffarian, D., and Hernán, M. A. (2013). Changes in fish consumption in midlife and the risk of coronary heart disease in men and women. American Journal of Epidemiology, 178(3):382–391.
Murphy, S. A., van der Laan, M. J., and Robins, J. M. (2001). Marginal mean models for dynamic regimes. Journal of the American Statistical Association, 96(456):1410–1423.
Orellana, L., Rotnitzky, A., and Robins, J. M. (2010a). Dynamic regime marginal structural mean models for estimation of optimal dynamic treatment regimes, part I: main content. International Journal of Biostatistics, 6:Article 7.
Orellana, L., Rotnitzky, A., and Robins, J. M. (2010b). Dynamic regime marginal structural mean models for estimation of optimal dynamic treatment regimes, part II: proofs and additional results. International Journal of Biostatistics, 6:Article 8.
Pearl, J. (2000). Causality. Cambridge, UK: Cambridge University Press.
Petersen, M. L., Porter, K. E., Gruber, S., Wang, Y., and van der Laan, M. J. (2012). Diagnosing and responding to violations in the positivity assumption. Statistical Methods in Medical Research, 21(1):31–54.
Picciotto, S., Hernán, M. A., Page, J. H., Young, J. G., and Robins, J. M. (2012). Structural nested cumulative failure time models to estimate the effects of interventions. Journal of the American Statistical Association, 107(499):886–900.
Richardson, T. S. and Robins, J. M. (2013). Single world intervention graphs (SWIGs): a unification of the counterfactual and graphical approaches to causality. Center for the Statistics and the Social Sciences, University of Washington Series. Working Paper Number 128. http://www.csss.washington.edu/Papers/
Robins, J. M. (1986). A new approach to causal inference in mortality studies with a sustained exposure period: application to the healthy worker survivor effect. Mathematical Modelling, 7:1393–1512. [Errata (1987) in Computers and Mathematics with Applications 14, 917–921. Addendum (1987) in Computers and Mathematics with Applications 14, 923–945. Errata (1987) to addendum in Computers and Mathematics with Applications 18, 477.]
Robins, J. M. (1997). Causal inference from complex longitudinal data. In: Latent Variable Modeling and Applications to Causality. Lecture Notes in Statistics 120, M. Berkane (Ed.), 69–117. New York: Springer.
Robins, J. M. (2000). Marginal structural models versus structural nested models as tools for causal inference. In: Statistical Models in Epidemiology, M. E. Halloran and D. Berry (Eds.), 95–133. New York: Springer.
Robins, J. M. and Hernán, M. A. (2009). Estimation of the causal effects of time-varying exposures. In: Advances in Longitudinal Data Analysis, G. Fitzmaurice, M. Davidian, G. Verbeke, and G. Molenberghs (Eds.), 553–599. Boca Raton, FL: Chapman and Hall/CRC Press.
Robins, J. M., Hernán, M. A., and Siebert, U. (2004). Effects of multiple interventions. In: Comparative Quantification of Health Risks: Global and Regional Burden of Disease Attributable to Selected Major Risk Factors, M. Ezzati, A. D. Lopez, A. Rodgers, and C. J. L. Murray (Eds.), 2191–2230. Geneva: World Health Organization.
Robins, J. M. and Wasserman, L. (1997). Estimation of effects of sequential treatments by reparameterizing directed acyclic graphs. In: Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence, D. Geiger and P. Shenoy (Eds.), 409–420. San Francisco, CA: Morgan Kaufmann.
Spirtes, P., Glymour, C., and Scheines, R. (1993). Causation, Prediction and Search. New York: Springer.
Stitelman, O. M., Hubbard, A. E., and Jewell, N. P. (2010). The impact of coarsening the explanatory variable of interest in making causal inferences: implicit assumptions behind dichotomizing variables. U.C. Berkeley Division of Biostatistics Working Paper Series. Paper 264. http://www.bepress.com/ucbbiostat/paper264
Taubman, S. L., Mittleman, M. A., Robins, J. M., and Hernán, M. A. (2008). Alternative approaches to estimating the effects of hypothetical interventions. In: JSM Proceedings, Health Policy Statistics Section, 4422–4426. Alexandria, VA: American Statistical Association.
Taubman, S. L., Robins, J. M., Mittleman, M. A., and Hernán, M. A. (2009). Intervening on risk factors for coronary heart disease: an application of the parametric g-formula. International Journal of Epidemiology, 38(6):1599–1611.
Tian, J. (2008). Identifying dynamic sequential plans. In: Twenty-Fourth Conference on Uncertainty in Artificial Intelligence, D. McAllester and P. Myllymaki (Eds.), 554–561. Corvallis, OR: AUAI Press.
van der Laan, M. J., Petersen, M. L., and Joffe, M. M. (2005). History-adjusted marginal structural models and statically-optimal dynamic treatment regimens. International Journal of Biostatistics, 1(1):Article 4.
Young, J. G., Cain, L. E., Robins, J. M., O’Reilly, E. J., and Hernán, M. A. (2011). Comparative effectiveness of dynamic treatment regimes: an application of the parametric g-formula. Statistics in Biosciences.
About the article
Published Online: 2014-03-11
Published in Print: 2014-12-01
Funding: This research was funded by NIH grants R01 HL080644 and R37 AI032475.