The causal inference literature has provided important advances in mediation analysis in clarifying the assumptions needed for a causal interpretation of direct and indirect effect estimates and in developing approaches to mediation analysis allowing for models with interactions and non-linearities (Robins and Greenland, 1992; Pearl, 2001; van der Laan and Petersen, 2008; VanderWeele and Vansteelandt, 2009, 2010; Imai et al., 2010; Tchetgen Tchetgen and Shpitser, 2012; Valeri and VanderWeele, 2013). With a few notable exceptions (Avin et al., 2005; Albert and Nelson, 2011; Imai and Yamamoto, 2012; Zheng and van der Laan, 2012), most of this literature has been restricted to the setting of a single mediator.
When multiple mediators are of interest one approach would be to consider the mediators one at a time. As described below, however, this will in general require that the mediators do not affect one another. In this article we develop an approach that allows an investigator to assess mediation with multiple mediators simultaneously, and which can also accommodate cases in which the mediators affect one another. We allow for potential exposure–mediator interactions as well as, to a certain extent, mediator–mediator interactions. When the ordering of the mediators is known we also discuss how the magnitude of certain path-specific effects can be inferred by applying the approach sequentially. We give two different statistical techniques – one based on regression and one based on weighting – to estimate direct and indirect effects in these cases. We show how our approach is robust to unmeasured common causes of two or more mediators, whereas handling the mediators one-by-one will fail in these cases. When mediators are handled one-by-one, the sum of the proportion mediated on the additive scale for the mediators can sometimes total more than 100%, even if the direction of mediation is the same for all mediators and pathways. We show that when this arises, it is because the mediators in fact affect one another or because of mediator–mediator interactions. The methods we propose in this article can handle these situations. In a related article (VanderWeele et al., 2013) we discuss effect decomposition in the presence of exposure-induced mediator outcome confounding in which there are multiple mediators but only one mediator is of principal interest. In this article we focus on assessing direct and indirect effects when multiple mediators are of interest simultaneously. Tchetgen Tchetgen and Shpitser (2012), Zheng and van der Laan (2012) and Vansteelandt et al. (2012) develop semi-parametric approaches that can also potentially be used for multiple mediators. 
Here we develop a simple parametric approach that is easy to implement extending the work of VanderWeele and Vansteelandt (2009, 2010) and Valeri and VanderWeele (2013) to handle multiple mediators.
2 Direct and indirect effects for a single mediator: a review
Let A denote the exposure for an individual, let Y denote some outcome, and let M denote the value of a single potential mediator that may be on the pathway from exposure to outcome. Let C denote some set of confounding variables that may affect the exposure, mediator and/or outcome. The relationships between A, M, Y and C are given in Figure 1.
There may be other mediators as well but when focusing on only one mediator these would be represented by the direct path from A to Y not through M.
Let $Y_a$ denote a subject’s potential or counterfactual outcome if exposure A were set, possibly contrary to fact, to a. Let $M_a$ denote a subject’s counterfactual value of the intermediate M if exposure A were set to the value a. Let $Y_{am}$ denote a subject’s counterfactual value for Y if A were set to a and M were set to m. For a single mediator, Robins and Greenland (1992) and Pearl (2001) provided the following definitions. The controlled direct effect of exposure A on outcome Y comparing $A = a$ with $A = a^*$ and setting M to m is defined by $Y_{am} - Y_{a^*m}$ and measures the effect of A on Y not mediated through M, i.e. the effect of A on Y after intervening to fix the mediator to some value m. The natural direct effect of exposure A on outcome Y comparing $A = a$ with $A = a^*$ intervening to set M to what it would have been if exposure had been $A = a^*$ is defined as $Y_{aM_{a^*}} - Y_{a^*M_{a^*}}$. Essentially, the natural direct effect assumes that the intermediate M is set to $M_{a^*}$, the level it would have been for each individual had exposure been $a^*$, and then compares the direct effect of exposure. The natural indirect effect is defined as $Y_{aM_a} - Y_{aM_{a^*}}$. The natural indirect effect assumes that exposure is set to some level $A = a$ and then compares what would have happened if the mediator were set to what it would have been if exposure had been a versus what would have happened if the mediator were set to what it would have been if exposure had been $a^*$. A total effect can be decomposed into a natural direct and indirect effect. The total effect can be written as $Y_a - Y_{a^*} = (Y_{aM_a} - Y_{aM_{a^*}}) + (Y_{aM_{a^*}} - Y_{a^*M_{a^*}})$, where the first expression in the sum is the indirect or mediated effect and the second expression is the natural direct effect. We can likewise define average controlled direct effect and natural direct and indirect effects conditional on covariates $C = c$ by $E[Y_{am} - Y_{a^*m} \mid c]$, $E[Y_{aM_{a^*}} - Y_{a^*M_{a^*}} \mid c]$, and $E[Y_{aM_a} - Y_{aM_{a^*}} \mid c]$ respectively.
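With a binary outcome, we can define direct and indirect effects on a risk ratio or odds ratio scale (VanderWeele and Vansteelandt, 2010). On the odds ratio scale, the total effect conditional on $C = c$ is given by $OR^{TE}_{a,a^*|c} = \frac{P(Y_a = 1 \mid c)/\{1 - P(Y_a = 1 \mid c)\}}{P(Y_{a^*} = 1 \mid c)/\{1 - P(Y_{a^*} = 1 \mid c)\}}$. The controlled direct effect on the odds ratio scale conditional on $C = c$ is given by $OR^{CDE}_{a,a^*|c}(m) = \frac{P(Y_{am} = 1 \mid c)/\{1 - P(Y_{am} = 1 \mid c)\}}{P(Y_{a^*m} = 1 \mid c)/\{1 - P(Y_{a^*m} = 1 \mid c)\}}$. The natural direct effect on the odds ratio scale conditional on $C = c$ is given by $OR^{NDE}_{a,a^*|c} = \frac{P(Y_{aM_{a^*}} = 1 \mid c)/\{1 - P(Y_{aM_{a^*}} = 1 \mid c)\}}{P(Y_{a^*M_{a^*}} = 1 \mid c)/\{1 - P(Y_{a^*M_{a^*}} = 1 \mid c)\}}$. The natural indirect effect on the odds ratio scale conditional on $C = c$ is given by $OR^{NIE}_{a,a^*|c} = \frac{P(Y_{aM_a} = 1 \mid c)/\{1 - P(Y_{aM_a} = 1 \mid c)\}}{P(Y_{aM_{a^*}} = 1 \mid c)/\{1 - P(Y_{aM_{a^*}} = 1 \mid c)\}}$. On a risk ratio scale conditional on $C = c$, the total effect is given by $RR^{TE}_{a,a^*|c} = P(Y_a = 1 \mid c)/P(Y_{a^*} = 1 \mid c)$, the controlled direct effect is given by $RR^{CDE}_{a,a^*|c}(m) = P(Y_{am} = 1 \mid c)/P(Y_{a^*m} = 1 \mid c)$, and the natural direct effect is given by $RR^{NDE}_{a,a^*|c} = P(Y_{aM_{a^*}} = 1 \mid c)/P(Y_{a^*M_{a^*}} = 1 \mid c)$. The natural indirect effect on the risk ratio scale conditional on $C = c$ is given by $RR^{NIE}_{a,a^*|c} = P(Y_{aM_a} = 1 \mid c)/P(Y_{aM_{a^*}} = 1 \mid c)$. The total effect then decomposes into the product of the natural direct and indirect effects on the odds ratio or risk ratio scale: $OR^{TE}_{a,a^*|c} = OR^{NIE}_{a,a^*|c} \times OR^{NDE}_{a,a^*|c}$ and $RR^{TE}_{a,a^*|c} = RR^{NIE}_{a,a^*|c} \times RR^{NDE}_{a,a^*|c}$.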
The identification of direct and indirect effects requires various no unmeasured confounding assumptions. We will use the notation $A \perp B \mid C$ to denote that A is independent of B conditional on C. Controlled direct effects are identified if control is made for a covariate set C that includes all confounders of not only the exposure–outcome relationship but also the mediator–outcome relationship. In counterfactual notation, we require that for all a and m,
$$Y_{am} \perp A \mid C \qquad (1)$$
$$Y_{am} \perp M \mid \{A, C\} \qquad (2)$$
Assumption (1) can be interpreted as: conditional on C, there is no unmeasured confounding for the exposure–outcome relationship. Assumption (2) can be interpreted as: there is no unmeasured confounding for the mediator–outcome relationship conditional on (A, C). Natural direct and indirect effects will be identified if four no unmeasured confounding assumptions hold: in addition to assumptions (1) and (2), we require that for all a, $a^*$, and m,
$$M_a \perp A \mid C \qquad (3)$$
$$Y_{am} \perp M_{a^*} \mid C \qquad (4)$$
Assumption (3) can be interpreted as: conditional on C, there is no unmeasured confounding of the exposure–mediator relationship. On a causal diagram interpreted as a set of non-parametric structural equations (Pearl, 2009), if assumption (2) holds, then assumption (4) will hold if there is no effect L of exposure A that itself affects both M and Y, i.e. no mediator–outcome confounder that is itself affected by the exposure A. If, however, there is an effect of the exposure that confounds the mediator–outcome relationship, then natural direct and indirect effects will not in general be identified, irrespective of whether data are available on that variable or not (Avin et al., 2005). Exceptions arise under strong assumptions about no interaction at the individual level (Robins, 2003).
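We will now present a simple regression-based approach for the estimation of direct and indirect effects, which we will extend below to allow for multiple mediators. Suppose assumptions (1)–(4) hold, that Y and M are continuous and that the following regression models for Y and M are correctly specified:
$$E[Y \mid a, m, c] = \theta_0 + \theta_1 a + \theta_2 m + \theta_3 a m + \theta_4' c$$
$$E[M \mid a, c] = \beta_0 + \beta_1 a + \beta_2' c$$
Then it can be shown (VanderWeele and Vansteelandt, 2009) that the average controlled direct effect and the average natural direct and indirect effects are given by
$$E[Y_{am} - Y_{a^*m} \mid c] = (\theta_1 + \theta_3 m)(a - a^*)$$
$$E[Y_{aM_{a^*}} - Y_{a^*M_{a^*}} \mid c] = (\theta_1 + \theta_3\beta_0 + \theta_3\beta_1 a^* + \theta_3\beta_2' c)(a - a^*)$$
$$E[Y_{aM_a} - Y_{aM_{a^*}} \mid c] = (\theta_2\beta_1 + \theta_3\beta_1 a)(a - a^*)$$
VanderWeele and Vansteelandt (2009) also derived standard errors for these effects using the delta method; alternatively, bootstrapping can be used. If there is no interaction between A and M so that $\theta_3 = 0$, then these expressions reduce to the expressions of Baron and Kenny (1986) employed in the psychology literature. The controlled direct effect and the natural direct effect are then both equal to $\theta_1(a - a^*)$ and the natural indirect effect is $\theta_2\beta_1(a - a^*)$.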
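As a minimal sketch of how these closed-form expressions might be evaluated once the two regressions have been fit, the function below takes the fitted coefficients as plain numbers. The function name, the argument names, and the convention of passing the covariate term $\beta_2' c$ as a single pre-computed number (`beta2c`) are assumptions made for illustration; they are not part of the original article.

```python
def single_mediator_effects(theta1, theta2, theta3,
                            beta0, beta1, beta2c,
                            a, a_star, m):
    """Evaluate CDE, NDE and NIE for one continuous mediator.

    Assumed models (linear, correctly specified):
      E[Y | a, m, c] = theta0 + theta1*a + theta2*m + theta3*a*m + theta4'c
      E[M | a, c]    = beta0 + beta1*a + beta2'c
    beta2c is the already-evaluated scalar beta2'c at the chosen covariate value.
    """
    delta = a - a_star
    # Controlled direct effect with the mediator fixed at m
    cde = (theta1 + theta3 * m) * delta
    # Natural direct effect: mediator held at its level under a_star
    nde = (theta1 + theta3 * (beta0 + beta1 * a_star + beta2c)) * delta
    # Natural indirect effect: exposure held at a, mediator shifts from a_star to a
    nie = (theta2 * beta1 + theta3 * beta1 * a) * delta
    return {"cde": cde, "nde": nde, "nie": nie, "total": nde + nie}


# Usage with made-up coefficients: no interaction (theta3 = 0), so the
# direct effects reduce to theta1 and the indirect effect to theta2*beta1.
effects = single_mediator_effects(theta1=0.5, theta2=0.3, theta3=0.0,
                                  beta0=1.0, beta1=2.0, beta2c=0.0,
                                  a=1, a_star=0, m=0)
```

Standard errors are not computed here; as the text notes, the delta method or bootstrapping would be needed for those.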
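Likewise for a binary outcome, suppose that the following models fit the observed data:
$$\mathrm{logit}\{P(Y = 1 \mid a, m, c)\} = \theta_0 + \theta_1 a + \theta_2 m + \theta_3 a m + \theta_4' c$$
$$E[M \mid a, c] = \beta_0 + \beta_1 a + \beta_2' c$$
and that the error term in the regression model for M is normally distributed with mean 0 and conditional variance $\sigma^2$. If the covariates C sufficed to control for confounding, satisfying assumptions (1)–(4) above, and the outcome were rare, then the conditional controlled direct effect and the average natural direct and indirect effects on the odds ratio scale would be given by (VanderWeele and Vansteelandt, 2010):
$$OR^{CDE}_{a,a^*|c}(m) = \exp\{(\theta_1 + \theta_3 m)(a - a^*)\}$$
$$OR^{NDE}_{a,a^*|c} \approx \exp\left[\{\theta_1 + \theta_3(\beta_0 + \beta_1 a^* + \beta_2' c + \theta_2\sigma^2)\}(a - a^*) + 0.5\,\theta_3^2\sigma^2(a^2 - a^{*2})\right]$$
$$OR^{NIE}_{a,a^*|c} \approx \exp\{(\theta_2\beta_1 + \theta_3\beta_1 a)(a - a^*)\}$$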
The approximations hold to the extent that the outcome is rare. These expressions would also hold exactly for a rare or common binary outcome if the logistic model was replaced by a log-linear model and natural direct and indirect effects on the risk ratio scale were used. Valeri and VanderWeele (2013) derived similar expressions for either binary or continuous outcomes when the mediator is binary. Similar expressions also hold if the binary outcome is replaced by a count outcome with a Poisson or negative binomial model and rate ratios are used (Valeri and VanderWeele, 2013).
3 Direct and indirect effects for multiple mediators
3.1 Notation, definitions and assumptions with multiple mediators
Suppose now that there are multiple mediators of interest, $M = (M^{(1)}, \ldots, M^{(K)})$, and that we are interested in the effects mediated through $M$ jointly and the effects independent of $M$. We can define controlled direct effects and natural direct and indirect effects in a similar way as before, simply replacing our single mediator with the entire vector of mediators $M$. Thus, let $M_a$ be the counterfactual value of $M$ if exposure A were set to the value a and let $Y_{am}$ denote the counterfactual value for Y if A were set to a and $M$ were set to $m$. The controlled direct effect is defined by $Y_{am} - Y_{a^*m}$; the natural direct effect is defined as $Y_{aM_{a^*}} - Y_{a^*M_{a^*}}$; the natural indirect effect is defined as $Y_{aM_a} - Y_{aM_{a^*}}$; and once again the total effect can be decomposed into a natural direct and indirect effect: $Y_a - Y_{a^*} = (Y_{aM_a} - Y_{aM_{a^*}}) + (Y_{aM_{a^*}} - Y_{a^*M_{a^*}})$.
Suppose again that the four assumptions about confounding hold but now with respect to the whole set of mediators $M$. In other words suppose we have (1) $Y_{am} \perp A \mid C$, (2) $Y_{am} \perp M \mid \{A, C\}$, (3) $M_a \perp A \mid C$, and (4) $Y_{am} \perp M_{a^*} \mid C$. We once again need to control for all exposure–outcome, mediator–outcome, and exposure–mediator confounders, but note that now for assumptions (2) and (3) control must be made for the mediator–outcome confounders for all of the mediators, not just one, and likewise control must be made for the exposure–mediator confounders for all of the mediators, not just one. Assumption (4) again requires that there be no effect of the exposure that confounds the mediator–outcome relationship for any of the mediators. If there were such a variable then to proceed it would have to be included in the mediator vector $M$ if assumption (4) were not to be violated.
3.2 Regression-based approach for multiple mediators with a continuous outcome
Under these assumptions the natural direct and indirect effects can once again be estimated using a parametric regression-based approach. We will use one regression for the outcome Y and a separate regression for each of the mediators. For simplicity and ease of implementation we will use parametric rather than non-parametric regression here. We will begin with the case of a continuous outcome with continuous mediators and no interactions and we will consider extensions allowing for exposure–mediator interactions, mediator–mediator interactions as well as binary mediators and outcomes below.
Suppose then that assumptions (1)–(4) held for the vector of mediators $M$ and that the following regressions are correctly specified and fit to the data:
$$E[Y \mid a, m, c] = \theta_0 + \theta_1 a + \theta_2^{(1)} m^{(1)} + \cdots + \theta_2^{(K)} m^{(K)} + \theta_4' c$$
$$E[M^{(k)} \mid a, c] = \beta_0^{(k)} + \beta_1^{(k)} a + \beta_2^{(k)\prime} c, \qquad k = 1, \ldots, K$$
We show in the Appendix that the controlled direct effect and natural direct and indirect effects are then given by
$$E[Y_{am} - Y_{a^*m} \mid c] = \theta_1 (a - a^*)$$
$$E[Y_{aM_{a^*}} - Y_{a^*M_{a^*}} \mid c] = \theta_1 (a - a^*)$$
$$E[Y_{aM_a} - Y_{aM_{a^*}} \mid c] = \sum_{k=1}^{K} \beta_1^{(k)} \theta_2^{(k)} (a - a^*)$$
The direct effects are perhaps exactly what one might expect, simply the coefficient for the exposure, $\theta_1$, in the model that contains all of the mediators. The natural indirect effect is equal to the sum over the various mediators, $k = 1, \ldots, K$, of the product of the coefficient for the exposure ($\beta_1^{(k)}$ for the kth mediator) in the model for the mediator and the coefficient for the mediator ($\theta_2^{(k)}$ for the kth mediator) in the model for the outcome that has all the mediators. The indirect effect has this fairly intuitive form. Note, however, that this is different from applying the approach to mediation for a single mediator described in Section 2 one mediator at a time and then summing up the indirect effects. This is because if the mediators were handled one at a time then a different regression for Y would be fit for each mediator and only one mediator would be included in each of these regressions. The approach described in this section fits only a single regression for Y which includes all of the mediators under consideration.
In the Appendix we show that these two approaches will coincide if the mediators do not affect one another (or more precisely, if the mediators are independent of one another conditional on A and C) but they will diverge otherwise. The two approaches will diverge if the mediators affect one another because certain pathways will be counted twice if the mediation analysis is done one at a time. For example, if there are two mediators where $M^{(1)}$ affects $M^{(2)}$ as in Figure 2 and if the analysis were done one mediator at a time, then the path $A \to M^{(1)} \to M^{(2)} \to Y$ would be included in the indirect effect both for the analysis for $M^{(1)}$ and for the analysis for $M^{(2)}$. If the two “indirect effects” were summed, the path would essentially be counted twice.
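The double counting can be made concrete with a small numerical sketch. It assumes a hypothetical linear structural model with two continuous mediators in which the first affects the second, no covariates, and mutually independent errors; the coefficient names (`b`, `g`, `d`, `t1`, `t21`, `t22`) and their values are invented for illustration. Rather than simulating data, the code uses the population regression coefficients implied by that model, so the comparison is exact under those assumptions.

```python
def joint_indirect_effect(beta1, theta2, a=1.0, a_star=0.0):
    """Sum over mediators of beta1[k] * theta2[k], times (a - a_star)."""
    return sum(b_k * t_k for b_k, t_k in zip(beta1, theta2)) * (a - a_star)


# Hypothetical structural model (M1 affects M2), independent errors:
#   M1 = b*A + e1
#   M2 = g*M1 + d*A + e2
#   Y  = t1*A + t21*M1 + t22*M2 + eY
b, g, d = 0.5, 0.8, 0.3
t1, t21, t22 = 1.0, 0.4, 0.6
var_e1, var_e2 = 1.0, 1.0

# Joint approach: one outcome regression containing both mediators,
# plus a regression of each mediator on A.
beta1 = [b, d + g * b]          # exposure coefficients for M1 and M2
theta2 = [t21, t22]             # mediator coefficients in the joint outcome model
nie_joint = joint_indirect_effect(beta1, theta2)

# One mediator at a time: each outcome regression omits the other mediator.
# With M2 omitted, the M1 coefficient absorbs the path through M2.
theta2_m1_only = t21 + t22 * g
# With M1 omitted, the M2 coefficient picks up M1's effect through the
# linear projection of M1 on M2 given A (slope lam).
lam = g * var_e1 / (g ** 2 * var_e1 + var_e2)
theta2_m2_only = t22 + t21 * lam
nie_summed = b * theta2_m1_only + (d + g * b) * theta2_m2_only

# The summed one-at-a-time effects exceed the joint effect.
excess = nie_summed - nie_joint
```

Under these assumptions `excess` equals `b*g*t22 + (d + g*b)*t21*lam`: the first term is the path from the exposure through both mediators counted a second time, and the second term arises because the first mediator acts as an omitted mediator–outcome confounder in the analysis of the second.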
The approach described in this section circumvents this difficulty by fitting only one regression for Y. As will be seen in the next subsection, however, the approach described here will also be able to be used even in the presence of interaction. Moreover, later in this article we will see that even if the mediators are in fact dependent on one another, the approach described here will be robust to unmeasured common causes of two or more mediators, whereas the approach considering the mediators one at a time will not be robust to such unmeasured variables.
When the mediators affect one another the approach of handling one mediator at a time also suffers from another difficulty. This other difficulty is that for the second (and potentially each subsequent) mediator, assumption (4) will not hold if the mediators are considered one at a time. This is because $M^{(1)}$ may affect both $M^{(2)}$ and Y and thus be a mediator–outcome confounder. Including $M^{(1)}$ in the covariate set C would not remedy this when $M^{(1)}$ is affected by A, for then assumption (4) is still violated. Assumption (4) may nonetheless hold with regard to a whole collection of mediators without holding for each mediator individually. When mediators are considered one mediator at a time, natural direct and indirect effects will thus often not be identified except under strong assumptions about the absence of interaction (cf. Robins, 2003). See VanderWeele et al. (in press) for methods relevant to such exposure-induced mediator–outcome confounding.
3.3 Exposure–mediator interactions, binary mediators, and mediator–mediator interactions
In this subsection we will discuss how the simple approach above can be easily adapted to allow for exposure–mediator interactions, binary mediators, and, to a certain extent, mediator–mediator interactions. We will describe how to go about estimating causal effects for each of these variations; justification for all statements is given in the Appendix. Although it would not be difficult to give analytic standard errors for each of the variations below using the delta method, there are, as will be seen, numerous possible variations and new formulae would have to be derived in each case. We therefore recommend bootstrapping in the estimation of standard errors for the purposes of simplicity.
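Suppose we wished to allow for an interaction between the exposure A and a mediator $M^{(k)}$ in the model for Y, for example, so that the outcome model became:
$$E[Y \mid a, m, c] = \theta_0 + \theta_1 a + \sum_{i=1}^{K} \theta_2^{(i)} m^{(i)} + \theta_3^{(k)} a\, m^{(k)} + \theta_4' c$$
Provided the models are correctly specified, the expressions for the controlled direct effect and natural direct and indirect effects given in Section 3.2 are then modified by adding $\theta_3^{(k)} m^{(k)} (a - a^*)$ to the controlled direct effect, $\theta_3^{(k)} (\beta_0^{(k)} + \beta_1^{(k)} a^* + \beta_2^{(k)\prime} c)(a - a^*)$ to the natural direct effect and $\theta_3^{(k)} \beta_1^{(k)} a (a - a^*)$ to the natural indirect effect, where $\beta_0^{(k)}$, $\beta_1^{(k)}$ and $\beta_2^{(k)}$ are the intercept, exposure and covariate coefficients in the regression model for the kth mediator, so that the effects become:
$$E[Y_{am} - Y_{a^*m} \mid c] = \{\theta_1 + \theta_3^{(k)} m^{(k)}\}(a - a^*)$$
$$E[Y_{aM_{a^*}} - Y_{a^*M_{a^*}} \mid c] = \{\theta_1 + \theta_3^{(k)} (\beta_0^{(k)} + \beta_1^{(k)} a^* + \beta_2^{(k)\prime} c)\}(a - a^*)$$
$$E[Y_{aM_a} - Y_{aM_{a^*}} \mid c] = \Big\{\sum_{i=1}^{K} \beta_1^{(i)} \theta_2^{(i)} + \theta_3^{(k)} \beta_1^{(k)} a\Big\}(a - a^*)$$
If a further interaction is thought to be present between the exposure and another of the mediators, for example mediator j say, then the same terms could once again be added to these expressions: $\theta_3^{(j)} m^{(j)} (a - a^*)$ to the controlled direct effect, $\theta_3^{(j)} (\beta_0^{(j)} + \beta_1^{(j)} a^* + \beta_2^{(j)\prime} c)(a - a^*)$ to the natural direct effect, and $\theta_3^{(j)} \beta_1^{(j)} a (a - a^*)$ to the natural indirect effect. The same applies to other exposure–mediator interactions; any number of exposure–mediator interactions could be accommodated in this manner.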
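Thus far we have assumed all mediators are continuous. Suppose that one or more of the mediators is binary, say mediator j, and that we fit a logistic regression model for $M^{(j)}$ (instead of a linear regression model):
$$\mathrm{logit}\{P(M^{(j)} = 1 \mid a, c)\} = \beta_0^{(j)} + \beta_1^{(j)} a + \beta_2^{(j)\prime} c$$
Provided the models are correctly specified, the expressions for the controlled direct effect and natural direct effect then remain the same, but the expression for the natural indirect effect is modified. Write $\mathrm{expit}(x) = e^x/(1 + e^x)$. Instead of the term $\beta_1^{(j)} \theta_2^{(j)} (a - a^*)$ for the jth mediator we would include in the natural indirect effect the term: $\theta_2^{(j)} \{\mathrm{expit}(\beta_0^{(j)} + \beta_1^{(j)} a + \beta_2^{(j)\prime} c) - \mathrm{expit}(\beta_0^{(j)} + \beta_1^{(j)} a^* + \beta_2^{(j)\prime} c)\}$; i.e. we would replace $\beta_1^{(j)} (a - a^*)$ in the expression for the natural indirect effect with $\mathrm{expit}(\beta_0^{(j)} + \beta_1^{(j)} a + \beta_2^{(j)\prime} c) - \mathrm{expit}(\beta_0^{(j)} + \beta_1^{(j)} a^* + \beta_2^{(j)\prime} c)$ and we would likewise do this for each mediator that was binary. If we also wanted to allow for exposure–mediator interaction with a mediator that was binary, with interaction coefficient $\theta_3^{(j)}$ in the outcome model, we would further add $\theta_3^{(j)} m^{(j)} (a - a^*)$ to the controlled direct effect, $\theta_3^{(j)}\, \mathrm{expit}(\beta_0^{(j)} + \beta_1^{(j)} a^* + \beta_2^{(j)\prime} c)(a - a^*)$ to the natural direct effect, and $\theta_3^{(j)} a \{\mathrm{expit}(\beta_0^{(j)} + \beta_1^{(j)} a + \beta_2^{(j)\prime} c) - \mathrm{expit}(\beta_0^{(j)} + \beta_1^{(j)} a^* + \beta_2^{(j)\prime} c)\}$ to the natural indirect effect. We could once again do this for each mediator that was binary and for which we wanted to include an exposure–mediator interaction.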
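The contribution of a binary mediator to the natural indirect effect can be sketched as follows, assuming a logistic model for the mediator and a linear outcome model without exposure–mediator interaction. The function and argument names are hypothetical, and the covariate term is again passed as a single pre-computed number.

```python
import math


def expit(x):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def binary_mediator_nie_term(theta2_j, beta0_j, beta1_j, beta2c_j, a, a_star):
    """Indirect-effect contribution of one binary mediator.

    theta2_j : coefficient of the mediator in the (linear) outcome model
    beta0_j, beta1_j : intercept and exposure coefficient in the logistic
                       model for the mediator
    beta2c_j : pre-computed covariate term of that logistic model
    """
    p_a = expit(beta0_j + beta1_j * a + beta2c_j)
    p_a_star = expit(beta0_j + beta1_j * a_star + beta2c_j)
    # Change in P(mediator = 1) between exposure levels, scaled by theta2_j
    return theta2_j * (p_a - p_a_star)


# If the exposure does not affect the mediator (beta1_j = 0) the term is zero.
no_effect = binary_mediator_nie_term(0.7, 0.0, 0.0, 0.0, a=1, a_star=0)
```

The terms for continuous mediators would still be computed as products of coefficients; only the binary mediators use this probability difference.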
Finally, suppose that a binary exposure variable A were randomized and that no covariates were needed for assumptions (1)–(4) to hold. Suppose that we wanted to allow for mediator–mediator interaction between two of the mediators by including their product as an additional term in the regression model for Y.
Provided the models are correctly specified, the controlled direct effect and natural direct effect will both be exactly the same as described above. However, the natural indirect effect needs to be modified further. In particular, we could fit a linear regression model for the product of the two mediators given the exposure.
We then would add to the natural indirect effect the coefficient of the mediator–mediator product in the outcome model multiplied by the exposure coefficient in the model for the product and by $(a - a^*)$. Unfortunately, as discussed in the Appendix, if covariates C are included in the model then this can lead to issues of model compatibility between the models for the two individual mediators and that for their product. In Section 4 we present an alternative weighting approach that circumvents this issue and is applicable to settings with mediator–mediator interactions.
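One way to write out the mediator–mediator interaction adjustment is sketched below. The labels $\theta_5$, $\gamma_0$, $\gamma_1$ and the mediator indices $i$ and $j$ are assumptions chosen for illustration and may differ from the article's own notation; the exposure is taken to be binary and randomized with no covariates, as in the text.

```latex
% Assumed labels: \theta_5 (coefficient of the mediator product in the outcome model),
% \gamma_0, \gamma_1 (intercept and exposure coefficient in the model for the product).
\begin{align*}
E[Y \mid a, m] &= \theta_0 + \theta_1 a + \sum_{k=1}^{K} \theta_2^{(k)} m^{(k)}
                  + \theta_5\, m^{(i)} m^{(j)}, \\
E[M^{(i)} M^{(j)} \mid a] &= \gamma_0 + \gamma_1 a, \\
E[Y_{aM_a} - Y_{aM_{a^*}}] &= \Big\{ \sum_{k=1}^{K} \beta_1^{(k)} \theta_2^{(k)}
                  + \theta_5 \gamma_1 \Big\} (a - a^*).
\end{align*}
```

The last line follows because the natural indirect effect only involves the change in the expected mediators, and in the expected product of the two interacting mediators, when the exposure moves from $a^*$ to $a$; with a binary exposure the linear model for the product is saturated.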
3.4 A regression-based approach for multiple mediators with a binary outcome
When the outcome is binary rather than continuous, an approach similar to that described above can also be employed, but it is subject to the restriction that it will only work when all of the mediators are continuous. In the next section, however, we will also describe a weighting-based approach which can be used if the mediators are binary (or if some are binary and some are continuous) and that can also accommodate potential mediator–mediator interactions.
Suppose then the outcome is binary and all mediators are continuous and the following two models are correctly specified and fit to the data:
$$\mathrm{logit}\{P(Y = 1 \mid a, m, c)\} = \theta_0 + \theta_1 a + \theta_2^{(1)} m^{(1)} + \cdots + \theta_2^{(K)} m^{(K)} + \theta_4' c$$
$$E[M^{(k)} \mid a, c] = \beta_0^{(k)} + \beta_1^{(k)} a + \beta_2^{(k)\prime} c, \qquad k = 1, \ldots, K$$
Suppose also now that the mediators follow a multivariate normal distribution conditional on A and C. We show in the Appendix that when the models are correctly specified and when assumptions (1)–(4) hold and when the outcome is rare (or the logistic regression model is replaced by a log-linear model for a common outcome with the effect measures interpreted as relative risks rather than odds ratios) then the log of the controlled direct effect and natural direct and indirect effect odds ratios are given by:
$$\log OR^{CDE} = \theta_1 (a - a^*)$$
$$\log OR^{NDE} \approx \theta_1 (a - a^*)$$
$$\log OR^{NIE} \approx \sum_{k=1}^{K} \beta_1^{(k)} \theta_2^{(k)} (a - a^*)$$
These are the same expressions we obtained in Section 3.2 for a continuous outcome. If we wish to allow for exposure–mediator interactions we can simply add additional terms. Suppose we wished to allow for an interaction between the exposure A and a mediator $M^{(k)}$ in the model for Y, with interaction coefficient $\theta_3^{(k)}$. The expressions for the controlled direct effect and natural indirect effect are then modified by adding $\theta_3^{(k)} m^{(k)} (a - a^*)$ to the controlled direct effect, and $\theta_3^{(k)} \beta_1^{(k)} a (a - a^*)$ to the natural indirect effect; however, the expression for the natural direct effect is more complicated as it involves the correlation between the mediators; it is given in the Appendix.
If the data come from a case-control study with a rare outcome, then this same approach and the same expressions can be used but the regression models for the mediators are then fit only among the controls; see VanderWeele and Vansteelandt (2010) for further details. The same expressions can likewise be used for the ratio of expectations if the outcome is a count outcome and the logistic regression model is replaced by a log-linear model.
The approach described here for a binary or count outcome will, as noted above, only apply if all of the mediators are continuous. If one or more of the mediators are binary then an alternative approach will need to be used. Moreover, as we have seen, even if all of the mediators are continuous, the expressions for the natural direct effect become more complicated if there are exposure–mediator interactions. All of this motivates the alternative weighting approach, described below, which can much more flexibly accommodate binary outcomes.
3.5 Assessing mediators sequentially
Before moving on, a few additional comments merit attention. First, the approach we have described so far does not necessarily require knowing the ordering of the mediators $M^{(1)}, \ldots, M^{(K)}$, though it does again require that there is no further variable that is affected by the exposure and that goes on to affect one of the mediators and also the outcome. If there is such a variable it needs to be included in the set $M$. If the ordering of $M^{(1)}, \ldots, M^{(K)}$ is known then some further progress can also be made. One could, for example, begin with the first mediator $M^{(1)}$ and use the approach described here to examine the portion of the effect mediated through $M^{(1)}$. One could then consider $M^{(1)}$ and $M^{(2)}$ jointly and use the approach described here to examine what proportion of the effect is mediated through both $M^{(1)}$ and $M^{(2)}$ together. Doing so would allow one to assess the additional contribution of $M^{(2)}$ beyond $M^{(1)}$ alone. Note that the difference between the two will potentially be different from simply the effect mediated through $M^{(2)}$ itself because, for example, $M^{(1)}$ and $M^{(2)}$ may share common pathways (if, for example, $M^{(1)}$ affected $M^{(2)}$ or if, as discussed later in this article, $M^{(1)}$ and $M^{(2)}$ interact in their effects). One could further then consider $M^{(3)}$ and examine the proportion mediated by all three jointly along with the additional contribution of $M^{(3)}$ beyond $(M^{(1)}, M^{(2)})$. One could carry on this process, adding sequentially one mediator at a time, until all K mediators are included.
Undertaking this sequential approach does, however, place additional restrictions on the models being used. This is because for each group of mediators a different model is being fit for Y. In particular, as is discussed at greater length in the Appendix, for the various models for Y to be compatible with one another when the exposure is binary, it is necessary that either (i) there are no exposure–mediator or mediator–mediator interactions or (ii) the models must be extended to allow for exposure–covariate interaction. See the appendix for further details.
4 A weighting approach
Because of the aforementioned concerns about model incompatibility that arise because of mediator–mediator interactions and because the addition of mediators increases the need for modeling, we also present a simple alternative approach based on inverse probability weighting. This alternative weighting approach does not require models for the mediators. Instead a model for the exposure is used and this then essentially overcomes the issue of model incompatibility. This weighting approach can be used for essentially any type of outcome, including non-rare binary outcomes, and it can be used regardless of whether there are exposure–mediator or mediator–mediator interactions. However, as with other weighting approaches, its performance is best when the exposure is binary or discrete with only a few levels.
The weighting approach estimates the marginal natural direct effect, $E[Y_{aM_{a^*}}] - E[Y_{a^*M_{a^*}}]$, and the marginal natural indirect effect, $E[Y_{aM_a}] - E[Y_{aM_{a^*}}]$. Doing so requires the estimation of three counterfactuals: $E[Y_{aM_a}]$, $E[Y_{a^*M_{a^*}}]$, and $E[Y_{aM_{a^*}}]$. For the counterfactual $E[Y_{aM_a}] = E[Y_a]$ we can obtain this by taking a weighted average of the subjects with $A = a$ where each subject i is given a weight
$$\frac{P(A = a)}{P(A = a \mid c_i)}$$
where $c_i$ denotes the actual covariate value for subject i. For a binary exposure with $a = 1$ and $a^* = 0$, the probabilities $P(A = a \mid c_i)$ could be fit, for example, using logistic regression and obtaining the predicted probabilities for $A = 1$ for each subject i. The approach requires that such models are correctly specified. Likewise we can obtain the counterfactual $E[Y_{a^*M_{a^*}}] = E[Y_{a^*}]$ by taking a weighted average of the subjects with $A = a^*$ where each subject i is given a weight
$$\frac{P(A = a^*)}{P(A = a^* \mid c_i)}$$
where again $c_i$ denotes the actual covariate value for subject i. And again for a binary exposure with $a = 1$ and $a^* = 0$, the probabilities $P(A = a^* \mid c_i)$ could be fit, for example, using logistic regression and obtaining the predicted probabilities for $A = 0$ for each subject i.
Finally, for the counterfactual $E[Y_{aM_{a^*}}]$, for each subject i with $A = a^*$, one uses an outcome model (which can include exposure–mediator or mediator–mediator interactions) to obtain a predicted estimate of the outcome if the individual had had exposure $a$ rather than $a^*$, but using the individual’s own values of the mediator, $M_i$, and covariates, $c_i$. The weighting approach requires that the outcome model is correctly specified. Once these predicted values are calculated one can obtain an estimate for the counterfactual $E[Y_{aM_{a^*}}]$ by taking a weighted average of these predicted values for subjects with $A = a^*$ where each subject i is given the weight
$$\frac{P(A = a^*)}{P(A = a^* \mid c_i)}$$
Once the various counterfactuals are obtained, we can estimate the natural direct effect by taking the difference $E[Y_{aM_{a^*}}] - E[Y_{a^*M_{a^*}}]$, and the natural indirect effect by taking the difference $E[Y_{aM_a}] - E[Y_{aM_{a^*}}]$. Alternatively, on a ratio scale we could obtain risk ratios for the natural direct effect by taking the ratio $E[Y_{aM_{a^*}}]/E[Y_{a^*M_{a^*}}]$, and for the natural indirect effect by taking the ratio $E[Y_{aM_a}]/E[Y_{aM_{a^*}}]$, and likewise for effects on an odds ratio scale. We recommend bootstrapping for confidence intervals. This approach essentially constitutes a straightforward generalization of the result in Albert (2012) to the case of a vector of mediators and is also closely related to the imputation approach of Vansteelandt et al. (2012). SAS code to implement this weighting approach is given in the Appendix. As with other weighting approaches, the approach here can be unstable if some of the probabilities in the denominator are very small so that some of the weights are very large; it is also good to check overlap in the distribution of the weights among the exposed and unexposed.
Both the weighting approach in this section and the regression approach in the previous section assume that the models are correctly specified. However, these different approaches make different modeling assumptions in that they require different models to be correctly specified. The regression approach in the previous section requires that the model for the outcome and the models for each of the mediators are correctly specified; no model for the exposure is needed in the regression approach. The regression approach will be biased if the models for either the outcome or the mediators are mis-specified. In contrast, the weighting approach requires that the model for the outcome and the model for the exposure are correctly specified; no models for the mediators are needed. The weighting approach will be biased if the model for either the outcome or the exposure is mis-specified.
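The three weighted averages of the weighting approach can be sketched in a few lines. This is an illustration under simplifying assumptions, not the article's SAS implementation: the exposure is binary, the covariate is discrete so that the exposure probabilities can be taken from stratum frequencies (standing in for a fitted logistic regression), weights of the form P(A = level) / P(A = level | C) are assumed, and the fitted outcome model is supplied by the caller as a hypothetical function `predict_outcome(a, m, c)`.

```python
from collections import Counter


def weighted_mean(values, weights):
    """Weighted average with weights normalized by their sum."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def natural_effects_by_weighting(rows, predict_outcome, a=1, a_star=0):
    """rows: list of (A, M, C, Y) tuples; M may be a tuple of mediators, C is discrete.
    predict_outcome(a, m, c): assumed fitted outcome model E[Y | a, m, c]."""
    n = len(rows)
    n_by_a = Counter(r[0] for r in rows)
    n_by_c = Counter(r[2] for r in rows)
    n_by_ac = Counter((r[0], r[2]) for r in rows)

    def weight(level, c):
        # P(A = level) / P(A = level | C = c), both from empirical frequencies
        return (n_by_a[level] / n) / (n_by_ac[(level, c)] / n_by_c[c])

    exposed = [r for r in rows if r[0] == a]
    unexposed = [r for r in rows if r[0] == a_star]

    # E[Y_a]: weighted mean of observed outcomes among subjects with A = a
    ey_a = weighted_mean([r[3] for r in exposed],
                         [weight(a, r[2]) for r in exposed])
    # E[Y_a*]: weighted mean of observed outcomes among subjects with A = a*
    ey_a_star = weighted_mean([r[3] for r in unexposed],
                              [weight(a_star, r[2]) for r in unexposed])
    # E[Y_{a, M_a*}]: predicted outcome under exposure a, using each
    # unexposed subject's own mediator and covariate values
    ey_cross = weighted_mean([predict_outcome(a, r[1], r[2]) for r in unexposed],
                             [weight(a_star, r[2]) for r in unexposed])

    return {"nde": ey_cross - ey_a_star,
            "nie": ey_a - ey_cross,
            "total": ey_a - ey_a_star}


# Toy data generated exactly from Y = 2*A + 3*M1 + M2 with a single covariate stratum.
toy_rows = [(1, (1, 1), 0, 6.0), (1, (1, 0), 0, 5.0),
            (0, (0, 0), 0, 0.0), (0, (0, 1), 0, 1.0)]
toy_effects = natural_effects_by_weighting(
    toy_rows, lambda a, m, c: 2 * a + 3 * m[0] + m[1])
```

In practice the exposure probabilities and the outcome model would both be estimated from the data, and confidence intervals would come from bootstrapping the whole procedure, as recommended in the text.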
To illustrate the weighting approach we will analyze 2003 US birth certificate data and will consider whether the effect of the exposure, A, of adequate or inadequate prenatal care (those with intermediate or superadequate care are excluded from the analysis for the purposes of this illustration) on preterm birth (Y) is mediated by maternal smoking and/or drinking ($M^{(1)}$) or pre-eclampsia ($M^{(2)}$). Adequacy of prenatal care categories are determined from data on the month prenatal care was initiated, on the number of visits, and on gestational age, according to the American College of Obstetricians and Gynecologists recommendation as encoded in a modification of the APNCU index (Kotelchuck, 1994; VanderWeele et al., 2009). In this analysis we will take age category (below 20 years, between 20 and 35 years, or above 35 years), ethnicity (black, Hispanic, native American, white), education and marital status as baseline confounders (C). Our analysis is certainly a simplification of a more complex reality as prenatal care and maternal smoking are both ultimately time-varying and pre-eclampsia and preterm birth could be conceived of as processes whereas we will treat them as dichotomous.
Inverse probability weights were constructed on the basis of logistic regression models for adequate care. In view of the large sample size and the resulting computational burden, standard errors and confidence intervals were constructed using the subsampling bootstrap (Politis and Romano, 1994). This is similar to the bootstrap, but involves repeating the analysis for 1,000 subsamples, each containing 0.5% of the total sample; on the basis of the empirical standard deviation of the 1,000 estimates, the standard error of the estimates that were obtained from the analysis of the full data set can be inferred (accounting for correlation resulting from the fact that some data points may be shared between subsamples).
In the sequential approach, we first considered maternal drinking and smoking as mediators. This showed that the direct effect of adequate care, through pathways other than maternal smoking or drinking ($M^{(1)}$), is a 5.6% (95% CI 5.5% to 5.7%) reduction in the risk of preterm birth and that the mediated effect via maternal smoking and/or drinking is a 0.09% (95% CI 0.08% to 0.10%) reduction in the risk of preterm birth. When pre-eclampsia was considered as an additional mediator ($M^{(2)}$), we found essentially the same direct and indirect effects of 5.6% (95% CI 5.5% to 5.7%) and 0.09% (95% CI 0.08% to 0.10%). The effect of adequate prenatal care on preterm birth by pathways through pre-eclampsia, but not through maternal smoking and drinking, thus seems minimal.
6 Some further properties: robustness to mediator confounding and joint versus summed proportion mediated
As noted above, when multiple mediators are of interest, the approach of considering mediators one at a time will only be appropriate if the mediators do not affect one another. If one of the mediators of interest affects another then assumption (4) will be violated for one or more mediators. The approach we have described in this article can, however, still be used. The approach described here has other advantages, even if the mediators do not affect one another. Suppose for example the mediators do not affect each other but there is an unmeasured common cause U of two or more mediators as in Figure 3.
In this case, the approach of considering the mediators $M^{(1)}$ and $M^{(2)}$ one-by-one will be biased because when $M^{(1)}$ alone is considered, U will be an unmeasured confounder for the effect of $M^{(1)}$ on Y (it affects Y through $M^{(2)}$) and when $M^{(2)}$ alone is considered, U will be an unmeasured confounder of the effect of $M^{(2)}$ on Y (it affects Y through $M^{(1)}$). However, when $M^{(1)}$ and $M^{(2)}$ are considered jointly as in this article, U no longer serves as a confounder for the joint effect of $(M^{(1)}, M^{(2)})$ on Y because U no longer affects Y, except through $(M^{(1)}, M^{(2)})$.
When the mediators affect one another then, as discussed above, we generally cannot estimate the natural direct and indirect effects for one or more mediators. Even if we could, the sum of the proportions mediated can be more than 100%, even if all pathways affect the outcome in the same direction. This is because certain paths may be counted twice. In Figure 2, if the analysis were done one mediator at a time then the path $A \to M_1 \to M_2 \to Y$ would be included in the indirect effect both for the analysis of $M_1$ and for the analysis of $M_2$. The approach described in this article circumvents this difficulty. However, we might then think that if the mediators do not affect one another (if, for example, the mediators are independent of one another conditional on A and C) then the sum of two mediated effects considered separately should equal the joint mediated effect when both mediators are considered together. In fact, even if the mediators are independent and do not affect each other this need not hold. The sum of two mediated effects considered separately may diverge from the joint mediated effect when there are interactions between the effects of the two mediators on the outcome. Note that such interaction can arise even if the mediators do not affect each other. We show in the Appendix that if there is no additive interaction in the effects of the two mediators at the individual counterfactual level, and if the two mediators do not affect each other, then the sum of two mediated effects on the additive scale, considered separately, will equal the joint mediated effect when both are considered together. If the two diverge then either the mediators must affect one another or there must be an additive interaction at the individual level between the effects of the two mediators.
In some applications, mediators are considered one at a time and the proportion mediated is calculated for each of these. Sometimes, when doing this, the sum of the proportion mediated can exceed 100%. One possible explanation for this is that there are other pathways (that operate through other mediators) that affect the outcomes in the opposite direction from those under consideration. The sum of the proportion mediated may exceed 100% if there are other mediators with a “negative” proportion mediated. If this is thought not to be the case, i.e. if all pathways are thought to operate on the outcome in the same direction, then the true proportion mediated for all mediators (known and unknown) considered jointly must be 100%. If the sum of the proportion mediated when each measured mediator is considered separately exceeds 100% then the sum and the joint proportion mediated would be different and thus it must be the case that either the mediators affect one another or that there are interactions between the effects of the mediators on the outcome. The approach described in this article could accommodate these complications by considering all mediators jointly in contrast to the approach of assessing mediators one at a time which cannot. In summary, if the sum of the proportion mediated exceeds 100% then one of the following must be true: (i) there are other mediators with a negative proportion mediated; (ii) the mediators affect one another; (iii) there are interactions between the effects of the mediators on the outcome.
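A small numerical illustration of case (iii) may help. The sketch below is an assumed toy example, not taken from the data analysis: a binary exposure, two binary mediators that are independent given the exposure and do not affect each other, no confounders, and hypothetical function names. The natural indirect effects are computed exactly by enumeration using the mediation formula.

```python
from itertools import product


def bern(p, m):
    """P(M = m) for a binary mediator with P(M = 1) = p."""
    return p if m == 1 else 1.0 - p


def joint_nie(mu, p1, p2):
    """Natural indirect effect through (M1, M2) jointly, comparing a = 1 with a = 0."""
    return sum(
        mu(1, m1, m2)
        * (bern(p1[1], m1) * bern(p2[1], m2) - bern(p1[0], m1) * bern(p2[0], m2))
        for m1, m2 in product((0, 1), repeat=2)
    )


def single_nie(mu, p_self, p_other, which):
    """Natural indirect effect through one mediator considered alone."""
    total = 0.0
    for m in (0, 1):
        # E[Y | A = 1, M_which = m]: average over the other mediator given A = 1.
        ey = sum(
            (mu(1, m, o) if which == 1 else mu(1, o, m)) * bern(p_other[1], o)
            for o in (0, 1)
        )
        total += ey * (bern(p_self[1], m) - bern(p_self[0], m))
    return total


# P(M_i = 1 | A = a), indexed by a; the same for both mediators.
p1 = {0: 0.2, 1: 0.8}
p2 = {0: 0.2, 1: 0.8}


def interaction_mu(a, m1, m2):
    """E[Y | a, m1, m2] with a pure mediator-mediator interaction."""
    return m1 * m2


def additive_mu(a, m1, m2):
    """E[Y | a, m1, m2] with no mediator-mediator interaction."""
    return m1 + m2


sum_interaction = single_nie(interaction_mu, p1, p2, 1) + single_nie(interaction_mu, p2, p1, 2)
joint_interaction = joint_nie(interaction_mu, p1, p2)
sum_additive = single_nie(additive_mu, p1, p2, 1) + single_nie(additive_mu, p2, p1, 2)
joint_additive = joint_nie(additive_mu, p1, p2)
```

With the interaction model, each mediator considered alone gives an indirect effect of 0.48 while the joint indirect effect (here also the total effect) is 0.60, so the separate proportions mediated sum to 160%. With the additive model the sum and the joint effect coincide at 1.2.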
Comparing the sum of the mediated effects to a joint mediated effect (or examining if the sum of the proportion mediated exceeds 100% if all mediators are thought to operate in the same direction) would thus constitute one strategy whereby an investigator could assess whether the approach of examining one mediator at a time might fail. An alternative approach might consist of examining the independence of the mediators more directly. For example, in the case of two mediators, $M_1$ and $M_2$, a regression of $M_2$ on A, C and $M_1$ should show $M_2$ to be independent of $M_1$. Statistical dependence between the two conditional on A and C would indicate that the approach of examining mediators one at a time cannot be used.
Statistical dependence between $M_1$ and $M_2$ conditional on A and C cannot distinguish between Figures 2 and 3 (or Figure 4, in which one mediator affects the other and they share an unmeasured common cause), but in each case the approach of examining the mediators one at a time fails because the assumptions required for such an approach are then violated.
In this article we have considered methods for assessing the effects of an exposure on an outcome through several mediators considered jointly. We have described a regression-based approach and a weighting-based approach, which are simple extensions of methods for a single mediator (VanderWeele and Vansteelandt, 2009, 2010; Valeri and VanderWeele, 2013; Albert, 2012). The weighting-based approach allows for somewhat more flexibility in settings with a binary outcome, in that it accommodates mediator–mediator interaction and mediators of different types (e.g., continuous and categorical). Otherwise, however, the regression-based approach is perhaps to be preferred on the grounds that it will in general yield more efficient estimates.
The approaches considered here are parametric and require correct model specification. Both approaches demand correct specification of an outcome model, but differ in that the regression-based approach additionally requires correct specification of models for each of the mediators, whereas the weighting approach additionally requires correct specification of a model for the exposure. When the number of mediators is large, the weighting approach is thus considerably less demanding in terms of the number of models it requires. It is moreover less demanding in settings where the exposure is highly predictive of (one of) the mediators, since correct specification of a model for the mediators is then more difficult. The weighting approach is particularly desirable when the exposure is randomly assigned so that the model for the exposure is known by design: it then merely requires correct specification of a model for the outcome.
Recent work has proposed multiply robust estimators that specify models for the exposure, mediator and outcome and yield valid inferences if at least two of these models are correctly specified (Tchetgen Tchetgen and Shpitser, 2012; Vansteelandt et al., 2012; Zheng and van der Laan, 2012). These semi-parametric approaches have been studied in settings with a single mediator, but can in principle deal with multiple mediators as well. They have greater robustness to model mis-specification but are more difficult to implement in practice. Robustness against model mis-specification can alternatively be improved using a slight modification of the weighting approach, which uses outcome predictions obtained via more general statistical learning methods rather than parametric methods. The practical performance of these different variations and approaches deserves further study, particularly in settings where the number of confounders is large and model mis-specification is thus more likely.
The assumptions required for estimation under the approaches proposed here, as in all work on mediation, are quite strong. Sensitivity analysis techniques for direct and indirect effects are now available for a single mediator (VanderWeele, 2010; Imai et al., 2010) and these approaches could perhaps be applied and extended to settings with multiple mediators. However, by being able to handle multiple mediators, at least one of the assumptions for the estimation of direct and indirect effects is in some sense made more plausible: namely the assumption that there is no mediator-outcome confounder that is itself affected by the exposure. The approach described in this article renders this assumption more plausible in that if there were such a variable then it could itself be included in the mediator vector and the methods could still be employed. The methods we have provided here are relatively general and fairly straightforward to use in practice, and we hope they will be of use in settings in which multiple mediators might be of interest.
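Consider the regression models:
$$E[Y \mid a, m, c] = \theta_0 + \theta_1 a + \theta_2' m + \theta_3' a m + \theta_4' c,$$
$$E[M_i \mid a, c] = \beta_{0i} + \beta_{1i} a + \beta_{2i}' c \ \ (M_i \text{ continuous}), \qquad \operatorname{logit} P(M_i = 1 \mid a, c) = \beta_{0i} + \beta_{1i} a + \beta_{2i}' c \ \ (M_i \text{ binary}).$$
Under the assumptions of no unmeasured exposure-outcome confounding and no unmeasured mediator-outcome confounding (conditional on C), we have for the controlled direct effect:
$$E[Y_{am} - Y_{a^* m} \mid c] = (\theta_1 + \theta_3' m)(a - a^*).$$
Under the full set of identification assumptions we have by Pearl’s mediation formula
$$E[Y_{a M_{a^*}} \mid c] = \theta_0 + \theta_1 a + (\theta_2 + \theta_3 a)' E[M \mid a^*, c] + \theta_4' c.$$
Thus the natural direct effect is given by:
$$E[Y_{a M_{a^*}} - Y_{a^* M_{a^*}} \mid c] = \{\theta_1 + \theta_3' E[M \mid a^*, c]\}(a - a^*).$$
If $M_i$ is continuous, $E[M_i \mid a^*, c] = \beta_{0i} + \beta_{1i} a^* + \beta_{2i}' c$. If $M_i$ is binary, $E[M_i \mid a^*, c] = \exp(\beta_{0i} + \beta_{1i} a^* + \beta_{2i}' c) / \{1 + \exp(\beta_{0i} + \beta_{1i} a^* + \beta_{2i}' c)\}$. The natural indirect effect is given by
$$E[Y_{a M_a} - Y_{a M_{a^*}} \mid c] = (\theta_2 + \theta_3 a)' \{E[M \mid a, c] - E[M \mid a^*, c]\}.$$
If $M_i$ is continuous, $E[M_i \mid a, c] - E[M_i \mid a^*, c] = \beta_{1i}(a - a^*)$. If $M_i$ is binary, the difference is that of the two corresponding logistic expressions evaluated at $a$ and at $a^*$.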
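As a worked illustration of how such closed-form expressions are evaluated, the sketch below assumes a linear outcome model with exposure–mediator interactions and linear models for K continuous mediators. The parameter names (`theta1`, `theta2`, `theta3`, `beta0`, `beta1`) and the argument `beta2c` (the covariate contribution to each mediator mean, already evaluated at a given c) are notation chosen here for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def mediation_effects(theta1, theta2, theta3, beta0, beta1, beta2c, a, a_star, m):
    """Controlled direct, natural direct and natural indirect effects.

    Assumed models (illustration only):
      E[Y | a, m, c]   = theta0 + theta1*a + theta2'm + a*theta3'm + theta4'c
      E[M_i | a, c]    = beta0[i] + beta1[i]*a + beta2c[i]
    theta2, theta3, beta0, beta1, beta2c and m are length-K sequences.
    """
    # Mediator means under the reference exposure level a_star.
    mean_m_astar = [b0 + b1 * a_star + bc for b0, b1, bc in zip(beta0, beta1, beta2c)]
    cde = (theta1 + dot(theta3, m)) * (a - a_star)
    nde = (theta1 + dot(theta3, mean_m_astar)) * (a - a_star)
    nie = (dot(theta2, beta1) + dot(theta3, beta1) * a) * (a - a_star)
    return cde, nde, nie


# Hypothetical coefficient values for two mediators.
cde, nde, nie = mediation_effects(
    theta1=0.5, theta2=[0.3, -0.2], theta3=[0.1, 0.4],
    beta0=[1.0, 2.0], beta1=[0.5, 1.0], beta2c=[0.2, 0.1],
    a=1, a_star=0, m=[1.0, 1.0],
)
```

Under these assumed models the natural direct and indirect effects add up to the total effect, which provides a convenient check on any implementation.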
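For the mediator–mediator interaction terms, we could consider the following models:
$$E[M_i M_j \mid a, c] = \gamma_0 + \gamma_1 a + \gamma_2' c \quad \text{for at least one of } M_i, M_j \text{ continuous},$$
$$\operatorname{logit} P(M_i M_j = 1 \mid a, c) = \gamma_0 + \gamma_1 a + \gamma_2' c \quad \text{for both binary}.$$
Under these models, if at least one of $M_i$ or $M_j$ is continuous, $E[M_i M_j \mid a, c] = \gamma_0 + \gamma_1 a + \gamma_2' c$. If both $M_i$ and $M_j$ are binary, $E[M_i M_j \mid a, c] = \exp(\gamma_0 + \gamma_1 a + \gamma_2' c) / \{1 + \exp(\gamma_0 + \gamma_1 a + \gamma_2' c)\}$. Note, however, that if covariates C are included in the model and a mediator–mediator interaction term $M_i M_j$ is also included, then this can lead to issues of model compatibility between the models for $M_i$ and $M_j$ and that for the product $M_i M_j$. For continuous mediators, a possible solution would be to assume a constant covariance matrix, in which case the average product follows from knowledge of the covariance and means. For instance, we would have that
$$E[M_i M_j \mid a, c] = E[M_i \mid a, c]\, E[M_j \mid a, c] + E[\varepsilon_i \varepsilon_j],$$
where $\varepsilon_i$ and $\varepsilon_j$ are the residuals in the models for both mediators. For dichotomous mediators, the Plackett copula could be used; that is, on top of the models for each mediator separately, one could postulate the model
$$\frac{P(M_i = 1, M_j = 1 \mid a, c)\, P(M_i = 0, M_j = 0 \mid a, c)}{P(M_i = 1, M_j = 0 \mid a, c)\, P(M_i = 0, M_j = 1 \mid a, c)} = \psi$$
with $\psi$ unknown; $\psi$ could be estimated using standard software for alternating logistic regression, which is, for instance, available via the option “logor = exch” in proc genmod. With $p_i = P(M_i = 1 \mid a, c)$ and $p_j = P(M_j = 1 \mid a, c)$, we then have that, for $\psi \neq 1$,
$$E[M_i M_j \mid a, c] = \frac{1 + (p_i + p_j)(\psi - 1) - \sqrt{\{1 + (p_i + p_j)(\psi - 1)\}^2 - 4 \psi (\psi - 1) p_i p_j}}{2(\psi - 1)},$$
and $E[M_i M_j \mid a, c] = p_i p_j$ for $\psi = 1$.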
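For two dichotomous mediators, the joint probability implied by a Plackett copula with a constant odds ratio can be computed in closed form. The sketch below is a minimal illustration; the function name `plackett_joint` and the symbol `psi` for the odds ratio are choices made here, and the marginal probabilities are taken as given rather than estimated.

```python
import math


def plackett_joint(p1, p2, psi):
    """P(M1 = 1, M2 = 1) for marginals p1, p2 and odds ratio psi > 0.

    This equals E[M1 * M2] for binary mediators. The value psi = 1
    corresponds to independence.
    """
    if math.isclose(psi, 1.0):
        return p1 * p2
    s = 1.0 + (p1 + p2) * (psi - 1.0)
    # Take the root of the quadratic that respects the Frechet bounds.
    root = math.sqrt(s * s - 4.0 * psi * (psi - 1.0) * p1 * p2)
    return (s - root) / (2.0 * (psi - 1.0))


# Hypothetical marginals and odds ratio.
p11 = plackett_joint(0.3, 0.6, 2.5)
```

One way to verify an implementation is to rebuild the full two-by-two table from the joint probability and the marginals and confirm that its odds ratio equals the value supplied.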
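Now suppose that the outcome is binary and rare and the following regression models are fit to the data:
$$\operatorname{logit} P(Y = 1 \mid a, m, c) = \theta_0 + \theta_1 a + \theta_2' m + \theta_3' a m + \theta_4' c,$$
$$E[M \mid a, c] = \beta_0 + \beta_1 a + \beta_2' c,$$
with the vector of mediators following a multivariate normal distribution conditional on A and C with conditional covariance matrix $\Sigma$. Under the identification assumptions, and because the outcome is rare, we then have
$$\operatorname{logit} P(Y_{a M_{a^*}} = 1 \mid c) \approx \log E\left[\exp\{\theta_0 + \theta_1 a + (\theta_2 + \theta_3 a)' M + \theta_4' c\} \mid a^*, c\right].$$
Note that conditional on $a^*$ and c, $(\theta_2 + \theta_3 a)' M$ follows a normal distribution with mean $(\theta_2 + \theta_3 a)'(\beta_0 + \beta_1 a^* + \beta_2' c)$ and variance $(\theta_2 + \theta_3 a)' \Sigma (\theta_2 + \theta_3 a)$, where $\beta_0 = (\beta_{01}, \ldots, \beta_{0K})'$ and $\beta_1 = (\beta_{11}, \ldots, \beta_{1K})'$. It thus follows that
$$\operatorname{logit} P(Y_{a M_{a^*}} = 1 \mid c) \approx \theta_0 + \theta_1 a + \theta_4' c + (\theta_2 + \theta_3 a)'(\beta_0 + \beta_1 a^* + \beta_2' c) + \tfrac{1}{2} (\theta_2 + \theta_3 a)' \Sigma (\theta_2 + \theta_3 a).$$
The log of the natural direct effect odds ratio is then given by:
$$\log OR^{NDE} \approx \{\theta_1 + \theta_3'(\beta_0 + \beta_1 a^* + \beta_2' c + \Sigma \theta_2)\}(a - a^*) + \tfrac{1}{2} \theta_3' \Sigma \theta_3 (a^2 - a^{*2}).$$
The log of the natural indirect effect odds ratio is then given by:
$$\log OR^{NIE} \approx (\theta_2' \beta_1 + \theta_3' \beta_1 a)(a - a^*).$$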
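The rare-outcome calculation can be checked numerically. The sketch below assumes a logistic outcome model with exposure–mediator interactions and multivariate normal mediators with linear means and constant covariance; all parameter names are notation chosen for illustration. It evaluates closed-form log odds ratios and, separately, the log of the normal moment generating function from which they follow.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def quad(u, s, v):
    """Bilinear form u' S v for a square matrix given as nested lists."""
    return sum(u[i] * s[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))


def log_odds_ratios(theta1, theta2, theta3, beta0, beta1, beta2c, sigma, a, a_star):
    """Approximate log odds ratios for the natural direct and indirect effects.

    Assumes a rare binary outcome; beta2c holds the covariate contribution to
    each mediator mean, already evaluated at the chosen covariate value.
    """
    mean_astar = [b0 + b1 * a_star + bc for b0, b1, bc in zip(beta0, beta1, beta2c)]
    log_or_nde = (
        (theta1 + dot(theta3, mean_astar) + quad(theta3, sigma, theta2)) * (a - a_star)
        + 0.5 * quad(theta3, sigma, theta3) * (a ** 2 - a_star ** 2)
    )
    log_or_nie = (dot(theta2, beta1) + dot(theta3, beta1) * a) * (a - a_star)
    return log_or_nde, log_or_nie


def log_nested_risk(theta1, theta2, theta3, beta0, beta1, beta2c, sigma, a_out, a_med):
    """log E[exp(theta1*a_out + (theta2 + theta3*a_out)'M)] with M drawn under a_med."""
    coef = [t2 + t3 * a_out for t2, t3 in zip(theta2, theta3)]
    mean = [b0 + b1 * a_med + bc for b0, b1, bc in zip(beta0, beta1, beta2c)]
    return theta1 * a_out + dot(coef, mean) + 0.5 * quad(coef, sigma, coef)


# Hypothetical parameter values for two mediators.
params = dict(
    theta1=0.4, theta2=[0.3, 0.1], theta3=[0.2, -0.1],
    beta0=[0.5, 1.0], beta1=[0.6, 0.2], beta2c=[0.1, 0.0],
    sigma=[[1.0, 0.3], [0.3, 2.0]],
)
log_or_nde, log_or_nie = log_odds_ratios(a=1, a_star=0, **params)
```

Differences of `log_nested_risk` across exposure levels should reproduce the two closed-form quantities, since the intercept and covariate terms of the outcome model cancel in each contrast.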
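Let $M^{(k)}$ be the subset of mediators $(M_1, \ldots, M_k)$. Consider the regression models:
$$E[Y \mid a, m^{(k)}, c] = \theta_0^{(k)} + \theta_1^{(k)} a + \theta_2^{(k)\prime} m^{(k)} + \theta_3^{(k)\prime} a m^{(k)} + \theta_4^{(k)\prime} c,$$
$$E[M_i \mid a, c] = \beta_{0i} + \beta_{1i} a + \beta_{2i}' c,$$
for $k = 1, \ldots, K$. Under the identification assumptions applied to $M^{(k)}$, we then have by Pearl’s mediation formula that the exposure effect that is mediated by the first k mediators equals:
$$\left(\theta_2^{(k)\prime} \beta_1^{(k)} + \theta_3^{(k)\prime} \beta_1^{(k)} a\right)(a - a^*), \qquad \beta_1^{(k)} = (\beta_{11}, \ldots, \beta_{1k})'.$$
While this approach is valid for each fixed k, a concern is that the models for $E[Y \mid a, m^{(k)}, c]$ and $E[Y \mid a, m^{(k-1)}, c]$ may not be compatible across k. To illustrate the difficulty of correct specification of these models across k, suppose that
$$E[M_k \mid a, m^{(k-1)}, c] = \alpha_{0k} + \alpha_{1k} a + \alpha_{2k}' m^{(k-1)} + \alpha_{3k}' c$$
for $k = 2, \ldots, K$, which is compatible with the aforementioned models for $M_i$. Then the model for $E[Y \mid a, m^{(k)}, c]$ implies that
$$E[Y \mid a, m^{(k-1)}, c] = \theta_0^{(k)} + \theta_1^{(k)} a + \tilde{\theta}_2' m^{(k-1)} + \tilde{\theta}_3' a m^{(k-1)} + \theta_4^{(k)\prime} c + \left(\theta_{2k}^{(k)} + \theta_{3k}^{(k)} a\right)\left(\alpha_{0k} + \alpha_{1k} a + \alpha_{2k}' m^{(k-1)} + \alpha_{3k}' c\right),$$
where $\tilde{\theta}_2$ and $\tilde{\theta}_3$ collect the first $k - 1$ components of $\theta_2^{(k)}$ and $\theta_3^{(k)}$, and $\theta_{2k}^{(k)}$ and $\theta_{3k}^{(k)}$ are their k-th components.
This model is no longer of the same form as it includes interactions between a and c, as well as squared terms in a that were not previously allowed for. This can be remedied by extending the outcome regression model to include such terms:
$$E[Y \mid a, m^{(k)}, c] = \theta_0^{(k)} + \theta_1^{(k)} a + \theta_2^{(k)\prime} m^{(k)} + \theta_3^{(k)\prime} a m^{(k)} + \theta_4^{(k)\prime} c + \theta_5^{(k)} a^2 + \theta_6^{(k)\prime} a c.$$
With this extension one still has
$$E[Y_{a M^{(k)}_a} - Y_{a M^{(k)}_{a^*}} \mid c] = \left(\theta_2^{(k)\prime} \beta_1^{(k)} + \theta_3^{(k)\prime} \beta_1^{(k)} a\right)(a - a^*).$$
If the exposure is binary (so that $a^2 = a$) and we allow for exposure–covariate interaction in the outcome model then this particular problem of correct model specification is thus remedied. Alternatively, if there are no exposure–mediator interactions in the outcome model then the models will remain compatible with each other.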
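The weighting approach is based on the following identity:
$$E[Y_{a M_{a^*}}] = E\left[\frac{I(A = a^*)}{P(A = a^* \mid C)}\, E[Y \mid A = a, M, C]\right].$$
For the sequential approach, one likewise obtains that
$$E[Y_{a M^{(k)}_{a^*}}] = E\left[\frac{I(A = a^*)}{P(A = a^* \mid C)}\, E[Y \mid A = a, M^{(k)}, C]\right],$$
where $M^{(k)}$ denotes the first k mediators.
Applying the sequential approach thus demands models for the conditional expectation $E[Y \mid A = a, M^{(k)}, C]$; a possible concern is then that these models for different k may not be compatible with each other.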
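The weighting estimator amounts to a weighted average of outcome predictions. The sketch below is a minimal illustration under stated assumptions: a binary exposure, an exposure model `prob_a1` and an outcome model `mu` supplied as already-fitted functions (both hypothetical names), and normalised weights, as a weighted intercept-only regression would use. The mediator entry of each row may be a single value or a tuple of several mediators.

```python
def ipw_nested_mean(rows, prob_a1, mu, a, a_star):
    """Weighted estimate of E[Y_{a M_{a_star}}] for a binary exposure.

    rows    : iterable of (a_i, m_i, c_i) observations
    prob_a1 : function c -> P(A = 1 | C = c), assumed already fitted
    mu      : function (a, m, c) -> E[Y | a, m, c], assumed already fitted
    """
    num = 0.0
    den = 0.0
    for a_i, m_i, c_i in rows:
        if a_i != a_star:
            continue  # only subjects with observed exposure a_star contribute
        p = prob_a1(c_i)
        w = 1.0 / p if a_star == 1 else 1.0 / (1.0 - p)
        # Predict the outcome at exposure level a, keeping the observed mediators.
        num += w * mu(a, m_i, c_i)
        den += w
    return num / den


# Hypothetical data laid out as exact counts: (exposure, mediator, confounder).
rows = (
    [(0, 0, 0)] * 60 + [(0, 1, 0)] * 20 + [(1, 0, 0)] * 10 + [(1, 1, 0)] * 10
    + [(0, 0, 1)] * 25 + [(0, 1, 1)] * 25 + [(1, 0, 1)] * 10 + [(1, 1, 1)] * 40
)


def prob_a1(c):
    """Assumed exposure model: P(A = 1 | C = c)."""
    return 0.2 if c == 0 else 0.5


def mu(a, m, c):
    """Assumed outcome model: E[Y | a, m, c]."""
    return 0.1 + 0.2 * a + 0.3 * m + 0.1 * a * m + 0.05 * c


estimate = ipw_nested_mean(rows, prob_a1, mu, a=1, a_star=0)
```

In this constructed example the weighted estimate coincides with the value obtained by applying the mediation formula directly to the same counts, namely 0.475.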
Sum of individual mediated effects versus joint mediated effects
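Consider two mediators, $M_1$ and $M_2$, and suppose that neither affects the other. For simplicity assume a binary exposure, and write $M_1(a)$ and $M_2(a)$ for the counterfactual values of the mediators under exposure level a. The natural indirect effect through $M_1$ is by definition $E[Y_{1 M_1(1)}] - E[Y_{1 M_1(0)}]$. The natural indirect effect through $M_2$ is by definition $E[Y_{1 M_2(1)}] - E[Y_{1 M_2(0)}]$. The natural indirect effect through $(M_1, M_2)$ is by definition $E[Y_{1 M_1(1) M_2(1)}] - E[Y_{1 M_1(0) M_2(0)}]$. If the mediators do not affect each other then the natural indirect effect through $M_1$ is equal to $E[Y_{1 M_1(1) M_2(1)}] - E[Y_{1 M_1(0) M_2(1)}]$ and the natural indirect effect through $M_2$ is equal to $E[Y_{1 M_1(1) M_2(1)}] - E[Y_{1 M_1(1) M_2(0)}]$. The sum of the two natural indirect effects for $M_1$ and $M_2$ considered separately is thus $2 E[Y_{1 M_1(1) M_2(1)}] - E[Y_{1 M_1(0) M_2(1)}] - E[Y_{1 M_1(1) M_2(0)}]$. The difference between the sum of the two natural indirect effects for $M_1$ and $M_2$ considered separately and the natural indirect effect through $(M_1, M_2)$ jointly is then given by:
$$E[Y_{1 M_1(1) M_2(1)}] - E[Y_{1 M_1(0) M_2(1)}] - E[Y_{1 M_1(1) M_2(0)}] + E[Y_{1 M_1(0) M_2(0)}].$$
This difference in some sense captures the effect mediated by the interaction between $M_1$ and $M_2$.
We now show that if the mediators do not affect each other and if there is no interaction between $M_1$ and $M_2$ at the individual counterfactual level then the sum of the two natural indirect effects for $M_1$ and $M_2$ considered separately and the natural indirect effect through $(M_1, M_2)$ jointly must be equal. We will say that there is no interaction between $M_1$ and $M_2$ at the individual counterfactual level if for any a and any two values $m_1, m_1^*$ of $M_1$, $Y_{a m_1 m_2} - Y_{a m_1^* m_2}$ is constant across $m_2$ or, equivalently, for any a and any two values $m_2, m_2^*$ of $M_2$, $Y_{a m_1 m_2} - Y_{a m_1 m_2^*}$ is constant across $m_1$. If this is the case then $Y_{1 M_1(1) m_2} - Y_{1 M_1(0) m_2}$ must be constant across $m_2$ and thus $Y_{1 M_1(1) M_2(1)} - Y_{1 M_1(0) M_2(1)}$ must be equal to $Y_{1 M_1(1) M_2(0)} - Y_{1 M_1(0) M_2(0)}$ and so the difference displayed above must be equal to 0.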
We now also show a similar result with linear models on the additive scale with interaction on the average level of population rather than the individual level. The approach that uses one mediator at a time, uses regression models for , which are of the form
Under this model, the natural indirect effect is given by:
When is independent of all other mediators (conditional on a and c), then this reduces to
It follows from this that the natural indirect effect equals the sum of the individual natural indirect effects when either none of the mediators is affected by the exposure, or none of the mediators affects the outcome, or the mediators are mutually independent (conditional on a and c) and there are no mediator–mediator interactions, but not generally otherwise.
We describe how the proposed weighting approach given above can be implemented in SAS statistical software (SAS Institute, Inc., Cary, North Carolina). Below we let c, a, m and y correspond to the observed confounders C, exposure A, mediator M and outcome Y, and assume, for the illustration, that A and Y are dichotomous.
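/* Step 1: exposure model. P_1 in the scored data is the fitted P(A = 1 | C). */
proc logistic data = mydata;
model a = c;
score data = mydata out = preda;
run;
data preda; set preda; pa1 = P_1; keep pa1; run;
/* Step 2: copies of the data with the exposure set to 0 and to 1. */
data mydata0; set mydata; a = 0; output; run;
data mydata1; set mydata; a = 1; output; run;
/* Step 3: outcome model, then predicted P(Y = 1) for everyone under a = 0 and a = 1. */
proc logistic data = mydata;
model y = a m c;
score data = mydata0 out = predy0;
score data = mydata1 out = predy1;
run;
data predy0; set predy0; py0 = P_1; keep py0; run;
data predy1; set predy1; py1 = P_1; keep py1; run;
/* Step 4: one-to-one merge (row order is unchanged) and inverse probability weights. */
data mydataw;
merge preda predy0 predy1 mydata;
w = a/pa1 + (1-a)/(1-pa1);
run;
The mean $E[Y_{1 M_0}]$ (except for standard errors) can now be estimated using:
/* Weighted mean of py1 among the unexposed (intercept-only weighted regression). */
proc reg data = mydataw;
where a = 0;
weight w;
model py1 = ;
run;
and the mean $E[Y_{0 M_1}]$ (except for standard errors) using:
/* Weighted mean of py0 among the exposed. */
proc reg data = mydataw;
where a = 1;
weight w;
model py0 = ;
run;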
Avin, C., Shpitser, I., and Pearl, J. (2005). Identifiability of path-specific effects. In: Proceedings of the International Joint Conferences on Artificial Intelligence, 357–363.
Baron, R. M. and Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51:1173–1182.
Imai, K. and Yamamoto, T. (2012). Identification and sensitivity analysis for multiple causal mechanisms: revisiting evidence from framing experiments. Political Analysis, 21:141–171.
Kotelchuck, M. (1994). An evaluation of the Kessner adequacy of prenatal care index and a proposed adequacy of prenatal care utilization index. American Journal of Public Health, 84:1414–1420.
Pearl, J. (2001). Direct and indirect effects. In: Proceedings of the Seventeenth Conference on Uncertainty and Artificial Intelligence. San Francisco: Morgan Kaufmann, 411–420.
Pearl, J. (2009). Causality: Models, Reasoning, and Inference. 2nd Edition. Cambridge: Cambridge University Press.
Robins, J. M. (2003). Semantics of causal DAG models and the identification of direct and indirect effects. In: Highly Structured Stochastic Systems, P. Green, N. L. Hjort, and S. Richardson (Eds.), 70–81. New York: Oxford University Press.
Tchetgen Tchetgen, E. J. and Shpitser, I. (2012). Semiparametric theory for causal mediation analysis: Efficiency bounds, multiple robustness, and sensitivity analysis. Annals of Statistics, 40(3):1816–1845.
Valeri, L. and VanderWeele, T. J. (2013). Mediation analysis allowing for exposure-mediator interactions and causal interpretation: Theoretical assumptions and implementation with SAS and SPSS macros. Psychological Methods, 18:137–150.
van der Laan, M. J. and Petersen, M. L. (2008). Direct effect models. International Journal of Biostatistics, 4:Article 23.
VanderWeele, T. J., Lantos, J. D., Siddique, J., and Lauderdale, D. S. (2009). A comparison of four prenatal care indices in birth outcome models: Comparable results for predicting small-for-gestational-age outcome but different results for preterm birth or infant mortality. Journal of Clinical Epidemiology, 62:438–445.
VanderWeele, T. J. and Vansteelandt, S. (2009). Conceptual issues concerning mediation, interventions and composition. Statistics and Its Interface, 2:457–468.
VanderWeele, T. J., Vansteelandt, S., and Robins, J. M. (2013). Effect decomposition in the presence of an exposure-induced mediator-outcome confounder. Epidemiology, in press.
Vansteelandt, S., Bekaert, M., and Lange, T. (2012). Imputation strategies for the estimation of natural direct and indirect effects. Epidemiologic Methods, 1:131–158.