Causal mediation analysis in presence of multiple mediators uncausally related

: Mediation analysis aims at disentangling the effects of a treatment on an outcome through alternative causal mechanisms and has become a popular practice in biomedical and social science applications. The causal framework based on counterfactuals is currently the standard approach to mediation, with important methodological advances introduced in the literature in the last decade, especially for simple mediation, that is with one mediator at the time. Among a variety of alternative approaches, Imai et al. showed theoretical results and developed an R package to deal with simple mediation as well as with multiple mediation involving multiple mediators conditionally independent given the treatment and baseline covariates. This approach does not allow to consider the often encountered situation in which an unobserved common cause induces a spurious correlation between the mediators. In this context, which we refer to as mediation with uncausally related mediators, we show that, under appropriate hypothesis, the natural direct and jointindirect effectsare non-parametrically identifiable. Moreover, we adopt the quasi-Bayesian algorithm developed by Imai et al. and propose a procedure based on the simulation of counterfactual distributions to estimate not only the direct and joint indirect effects but also the indirect effects through individual mediators. We study the properties of the proposed estimators through simulations. As an illustration, we apply our method on a real data set from a large cohort to assess the effect of hormone replacement treatment on breast cancer risk through three mediators, namely dense mammographic area, nondense area and body mass index.


Introduction
Causal mediation analysis comprises statistical methods to study the mechanisms underlying the relationships between a cause, an outcome and a set of intermediate variables. This approach has become increasingly popular in various domains such as biostatistics, epidemiology and social sciences. Mediation analysis applies to the situation depicted by the causal directed acyclic graph of Figure 1, where an exposure (or treatment) T affects an outcome Y either directly or through one or more intermediate variables referred to as mediators. The aim of the analysis is to assess the total causal effect of T on Y by decomposing it into a direct effect and an indirect effect through the mediator(s).
Mediation analysis originally developed within the setting of linear structural equation modelling (LSEM) [1][2][3]. Following the seminal works by Robins and Greenland [4] and Pearl [5], a formal framework based on counterfactual variables established itself as the standard approach to mediation analysis, with a growing methodological literature, see for instance [6][7][8][9] and the comprehensive book [10].
In this work, we adopt the point of view and formalism of [11] and [12], who put forward a general approach based on counterfactuals to define, identify and estimate causal mediation effects without assuming any specific statistical model in the particular case of a single mediator. The theoretical results in these articles are based on a strong set of assumptions known as Sequential Ignorability. These conditions are interpreted as the requirement that there must be no confounding of the T-Y, T-M and M-Y relationships after adjustment for the measured pretreatment covariates (i.e., confounders that are not affected by T) and T, and moreover that there must not be posttreatment confounding (i.e., confounders that are affected by T) between M and Y whatsoever, measured or unmeasured. In particular, [11,12] proved that under Sequential Ignorability, the average indirect effect is nonparametrically identified, see Theorem 2.1 in the next section, and proposed a sensitivity analysis to assess the robustness of estimates to violations of Sequential Ignorability. Moreover they introduced estimation algorithms for the effects of interest that are implemented in the widely used mediation R package [13].
When multiple mediators are involved in the mediation model, three cases may arise, as shown in Figure 2: in Figure 2(a) mediators are conditionally independent given the treatment and measured covariates (not depicted here), in Figure 2(b) mediators are causally ordered, that is one affects the other; in Figure 2(c) mediators are conditionally dependent given the treatment and measured covariates without being causally ordered. In the latter situation, we will talk about uncausally correlated mediators as opposed to the situation of Figure 2(b) where mediators are causally correlated. We will also refer to the cases depicted in Figure 2(a) and (c) as mediation with multiple causally unrelated mediators. Models in Figure 2(a) and (b) have been treated in the last few years [14][15][16] and will be commented further in the discussion section. Figure 2(c) corresponds to an Acyclic Directed Mixed Graph (ADMG) as introduced by [17] and [18]. Bidirected dotted edges indicate a non-causal correlation, due for instance to a latent common cause, as in Figure 3. Shpitser and coauthors define districts as the connected components of the graph restricted to the bidirected edges and describe a necessary and sufficient condition for the effects to be identified, that is expressed in terms of observational data. In the case of multiple mediation, this condition says that the effect mediated by a set S of mediators can be written as a function of the observations if and only if S is the union of some districts. In the case of Figure 2(c), this means that the direct effect (mediated by neither M nor W) and the joint effect (mediated by both M and W) can be written in terms of observations, but that the effect mediated only by M cannot.
The estimation of such individual indirect effects, each specific to a given mediator, is however of practical importance. To do so, [19] extend their above mentioned approach to multiple mediators. When mediators are  causally unrelated, and Sequential Ignorability holds, they suggested to process several single mediator analyses in parallel, one mediator at the time. Obviously, this approach leads to a biased estimate of the direct effect, because it forces the indirect effects via all other mediators to contribute to the direct effect. More subtly, this approach is not appropriate when mediators are uncausally correlated due to an unmeasured covariate U causally affecting both mediators M and W as in Figure 3. As a matter of fact, in this situation, U is an unobserved confounder of the relationship between M and Y and Sequential Ignorability does not hold. This key fact was remarked by [19] and [14], but no explicit solution to the problem was proposed other than conducting the above mentioned sensitivity analysis. In this article, we suggest that a possible solution to this problem goes through the estimation of the multivariate law of the mediators conditionally on the treatment. This allows taking into account the spurious correlation among mediators induced by the unobserved variable U. A recent paper by Kim et al. [20] describes an alternative approach in which the dependence between mediators is characterised by a Gaussian copula together with marginal linear models; direct effect and indirect effects through each mediator are estimated imputing unobserved counterfactuals using a fully Bayesian approach. However, this approach has been specifically developed for continuous outcomes, while our method does not assume any particular form for the outcome as long as each marginal model is well specified.
In this article, we focus on the scenario of multiple causally unrelated mediators (i.e., either independent, Figure 2(a), or uncausally correlated, Figure 2(c), mediators). In Section 2, we start by reviewing definitions and results for simple mediation following [12]. Then, in Section 3, we extend these definitions and theoretical results to the scenario of multiple causally unrelated mediators. To do so, we introduce new identification hypotheses called SIMMA and compare them to Sequential Ignorability in the multiple cases as discussed by [19]. We show that under SIMMA the direct effect and the joint indirect effect through the vector of all mediators can be expressed by a formula involving observed variables only, while the indirect effect through each individual mediator is given by a formula involving both observed and counterfactual variables. The former formulae lead to an unbiased estimation of the direct and joint indirect effects, in compliance with [18]. Moreover, under an additional assumption, we propose a procedure based on the simulation of counterfactual distributions to estimate the indirect effects through individual mediators. In Section 4, we conduct an empirical study to show that the method results in unbiased estimates of the direct and indirect effects. The R implementation of our method is available on GitHub, https://github.com/AllanJe/multimediate. Finally, in Section 5, we apply our method to a real dataset from a large cohort to assess the effect of hormone replacement treatment on breast cancer risk through three uncausally correlated mediators, namely dense mammographic area, non-dense area and body mass index.
For the sake of clarity, we list here the notations used in this article: -T ∈ {0, 1}: treatment -Z ∈ R K : vector of all mediators -M k ∈ R: kth mediator; when this is clear from the context, we will use the notation M = M k -W k ∈ R K−1 : complement of M k in Z; when this is clear from the context, we will use the notation W = W k -X ∈ R P : vector of pretreatment confounders -Y ∈ R or {0,1}: outcome -δ k (t): indirect effect of T mediated by M k -δ(t): indirect effect of T mediated by M -ζ(t): direct effect of T -δ,ζ: averages (δ(0) + δ(1))/2 and (ζ (0) + ζ (1))/2 -τ: total effect -PM k (t) δ k (t)/τ: proportion mediated by M k -Φ: the cumulative distribution function of the standard normal distribution N (0, 1) -A Γ : the transpose of a matrix or vector A -A Γj : the transpose of the jth row of matrix A.

Brief review of simple mediation
We begin by recalling the main results by [12] in the case of a simple mediator and a binary treatment; we will adopt the same notations. Let Y be the variable denoting the observed outcome, T the treatment or exposure (coded as 1 for treated or exposed and 0 for non-treated or non-exposed) and M a single intermediate variable on the causal path from the T to Y. Finally let X represent a vector of pretreatment confounders. The causal diagram in Figure 4 depicts the causal relation between the four variables.
The causal approach to mediation analysis requires two types of counterfactual variables. On one hand, we consider the potential mediator when the treatment is set to t, denoted M(t). On the other hand, we consider the potential outcome under the treatment status t and with the value of the mediator set to the potential value it would have under t′, denoted Y(t, M(t ′ )). We recall the definition of counterfactuals in the supplementary materials.
The three quantities of interest in simple mediation analysis are the average causal indirect effect denoted δ(t), the average direct effect ζ (t), for t ∈ {0, 1}, and the average total effect τ: Imai and collaborators showed that these effects can be identified regardless of a model assumption under two crucial hypotheses that go under the name of Sequential Ignorability Assumption (SIA): Theorem 2.1. [12]. Under SIA, the average indirect effect and the direct effect are identified non-parametrically and are given by In the setting of linear models, the two corollaries below follow, the first for a continuous outcome and the second for a binary outcome.

Extension to multiple causally unrelated mediators
In this subsection, we consider that K mediators intervene in the causal relationship between T and Y as in Figure 5. In particular, the following definitions and results apply when mediators are independent (Figure 2(a)) or uncausally correlated (Figure 2(c)).

Effect definitions
Let Z be the vector of all K ≥ 2 mediators and M k the mediator of interest. We denote by W k the complement of M k in Z, that is all mediators that are not of direct interest, and X the vector of pretreatment confounders.
The average indirect effect mediated by M k was defined by [19] as As a measure of the average joint indirect effect, that is the indirect effect mediated by all the mediators, we take Remark. Note that the joint indirect effect can be decomposed as A proof of this result can be found in Appendix A. Each of the 2 K direct effects is defined as For the sake of simplicity, among all these direct effects, we will consider only ζ(0, … , 0) and ζ(1, … , 1), denoted ζ (t), t ∈ {0, 1}.
The total effect τ is Note that τ is the sum of the joint indirect effect of treatment t and of the direct effect of treatment 1 − t:

Assumptions
Throughout the paper, we adopt the Stable Unit Treatment Value Assumption (SUTVA, [21] which implies that 1) there is no interference in the sense that potential mediator and outcome values of individual i do not depend on treatments of other individuals (i.e., . We augment the standard SUTVA to also assume that there are no multiple versions of mediators, that is if [22]. Our results are based on the following hypotheses that we called Sequential Ignorability for Multiple Mediators Assumption (SIMMA): for all possible values of t, t′, t″, m, w. A detailed explanation of SIMMA can be found in Appendix B.
Here, we recall that X is the vector of all the observed pretreatment covariates (by definition these variables are unaffected by the treatment). The first hypothesis implies that there must be no unobserved pretreatment confounders between the treatment and the outcome and between the treatment and the individual mediators after conditioning on all observed covariates. The second and third hypotheses exclude the existence of two distinct types of confounders between the mediators taken jointly and the outcome: the confounding by an unobserved pretreatment variable and the confounding by an observed or unobserved posttreatment variable.
Crucially, these hypotheses replace the second and third hypothesis that [19] make in the situation of multiple causally independent mediators, where a similar requirement applies to each counterfactual mediator separately and is interpreted as the randomisation of each mediator with respect to the outcome conditionally on the treatment arm (cf Appendix B). However, it is important to stress that assumption (B.4) is not more restrictive than Imai's hypotheses in the sense that it does not imply them, as we show in Appendix B. This hypothesis is the same as assumption 2) for multiple mediators in [14]. Our third assumption (B.5) is not included in [11] nor in [14] and is necessary to estimate the individual indirect effect of each mediator.
The reason for replacing Imai's hypotheses with (B.4) and (B.5) is that we are interested in the situation where M and W are uncausally correlated, typically because of a pretreatment variable U affecting both as in Figure 6(a). Note that if U is unobserved (i.e., it is not part of the variables in X) conditions (B.4) and (B.5) are not violated because the joint distribution of the mediators incorporates the influence of U on the individual mediators. On the contrary, such a U would violate the corresponding hypothesis in [19] because it constitutes an unobserved confounder of the relations between W and Y and M and Y.

Identifiability
In the following, the mediator of interest M can be any of the K mediators, so that the results below can be applied to each mediator. In particular, this will allow to express the indirect effect mediated by each mediator taken individually.
Our first result extends Theorem 2.1 to multiple mediators, not only when mediators are causally independent as done by [19], but also when they are uncausally correlated.
Theorem 3.1. Consider K mediators that can be either independent or uncausally correlated. Under SIMMA the following results hold.
The average indirect effect of the mediator of interest is given by: Moreover the joint indirect effect, the direct effect and the total effect are identified non-parametrically respectively by: In multiple mediation, Theorem 3.1 has the same role as Theorem 2.1 in simple mediation, because it shows that under proper assumptions, the (joint) indirect and direct effects are non-parametrically identified. In particular, from the equations above one can derive estimators for the joint indirect effect and for the direct effect, as already shown by [17]. In general, however, Eq. (3.1) does not allow to derive an estimator of the individual indirect effect of the mediator of interest, because the conditional distribution of (M(t′), W(t)) is not observable. Note that in the particular case where M is independent of W, Eq. (3.1) becomes which is the equation for δ(t) given by Theorem 2.1, thus allowing to identify the average indirect effect nonparametrically. This result was reported by [19]. A proof of Theorem 3.1 can be found in Appendix C.
The following two corollaries show identification formulae for the indirect and direct effects in the setting of the LSEM or when the mediating variables are Gaussian and Y is binary. Crucially, in the following corollaries, we assume that the correlations between the potential mediators are the same whatever the treatment governing the mediators: This hypothesis is indeed sufficient to identify the individual indirect effects through M from Eq. (3.1) in models where the joint distribution of the mediators is completely described by the expectation and covariance matrix, such as the multivariate Gaussian. In this particular situation, for all combinations of t ≠ t′, the expectation of (M(t), W(t′))|X x is given by the vector (E[M|T t, X x|], E(W|T t′, X x|)) and the covariance matrix is identified by the covariance matrix of (M|T t, X x) and (W|T t′, X x) i.e., of (M|T t, X x) and (W|T t, X x).

Continuous outcome
Corollary 3.2. With K mediators and P covariates, we assume the following linear model We assume that the K mediators are either independent or non-causally correlated. In the latter case, we assume that pairwise correlations between potential mediators do not depend on the treatments governing them, i.e., we assume condition (3.2). Under SIMMA, the indirect effect of the kth mediator is identified and given by: Moreover, the joint indirect effect is the sum of the average indirect effects by each mediator: The direct effect is also identified and given by A proof of Corollary 3.2 can be found in the Supplementary material. Note that an equivalent result for the joint indirect effect is shown in [14]. Also note that the additivity of the individual indirect effects into the joint direct effect (i.e., δ Z (t) k δ k (t)) holds only in the context of Corollary 3.2, otherwise it does not. We have already observed that if the K mediators are independent, the equation for the marginal indirect effect given by Theorem 3.1 (multiple analysis) reduces to the equation given by Theorem 2.1 (simple analysis). In this situation, Corollary 3.2 implies that in the LSEM setting, the indirect effects given by simple analyses can be summed up to obtain the joint indirect effect. Obviously, simple analyses do not allow to assess a comprehensive direct effect, because depending on the mediator of interest, each simple analysis will lead to a different direct effect. All these aspects will be illustrated through simulations in Section 4.

Binary outcome
We now address the case of a binary outcome. As for simple mediation, we consider either the probit regression or the logistic regression Corollary 3.3. Assume the following model with a binary outcome: . We assume that the K mediators are either independent or non-causally correlated. In the latter case, we assume that pairwise correlations between potential mediators do not depend on the treatments governing them as in condition (3.2). Under SIMMA, the effects of interest are given by: where for a probit regression we have and for a logit regression we have When the mediators are independent, we have for a probit regression and for a logistic regression A proof of Corollary 3.3 can be found in the supplementary materials.

Estimation algorithm
The proof of Theorem 3.1 can be generalised to prove that, under SIMMA, the densities of the counterfactual outcomes can be expressed as follows: 2) (and assuming that the joint distribution of mediators is completely determined by its expectation and covariance matrix), Eq. (3.9) makes it possible to sample Y(t, M(t′), W(t)) as well and therefore to estimate its expectation and the indirect effect through M. In particular, SIMMA and (3.2) allow to estimate the conditional covariance matrix of the counterfactual mediators for each possible combination of interventions as the covariance matrix of the mediators given the treatment and the pretreatment covariates. Accordingly we adapt the quasi-Bayesian algorithm presented by [11], to the situation of multiple mediators uncausally related, i.e., for independent and uncausally correlated mediators.
Algorithm. In order to estimate the effects of interest: (1) Fit parametric models for the observed outcome (given all the mediators, treatment and covariates), and mediators (given all the treatment and covariates), denoted respectively as Θ Y and Θ Obtain the estimate 2 of the covariance matrix between mediators given the treatment and the covariates.
(2) For each model, sample J values for each of its parameters according to their multivariate sampling distri- ). As in [11], we use the approximation based on the multivariate normal distribution centered at the estimates of the parameters and with the estimated asymptotic covariance matrix between the estimators.
(3) For each j = 1, … , J, repeat the followings steps: -Simulate the potential values of each mediator. In particular, for each of the K mediators, each pair (t, t′) ∈ {0, 1} 2 , and each individual i∈{1, … , n}, simulate R values of Z (kr) (ji) (t, t′) (M (kr) (ji) (t), W (kr) (ji) (t′)). When all mediators have the same treatment value, the vector of all mediators will be denoted as Z (r) (ji) (t) Z (kr) (ji) (t, t). Note that it is at this step that we take into account the correlation between mediators Σ 2 .
-Simulate the potential outcomes given the simulated values of the potential mediators, denoted as Y (r) (ji) (t, Z (kr) (ji) (t′, t″)) for each i, k and t, t′, t ″ ∈{0, 1}. -Estimate the causal mediation effects: (4) From the empirical distribution of each effect above, obtain point estimates together with p-values and confidence intervals.
Note that this algorithm does not implement the formulae given for the specific models of Corollaries 3.2 and 3.3.
We implemented this algorithm in the R package multimediate, currently available on GitHub. Our main function is based on the mediate() function of the package mediation [13] and makes it possible to work not only with continuous mediators but also binary and ordered categorical mediators using probit models.

Simulation studies
In this section, we validate our methodological results through empirical studies. In particular, we compare our estimates of the mediation causal effects to the true effects and to the estimates obtained by running simple mediation analyses, one for each mediator.

Data simulation method
Except for the LSEM framework, it is in general not straightforward to obtain the true mediation effect values from a causal generative model, that is from a set of causal structural equations. To overcome this difficulty, we start by simulating a large database of values for the treatment T and for all the counterfactual mediators M k (t), and outcomes Y(t, M 1 (t 1 ), …, M K (t K )), see Table 1 for an example. Then we simply compute the indirect effects δ k (t) and δ Z (t) and the direct effect ζ(t) as means, according to the definitions given in Section 3.1. The large size of the dataset guarantees that these Monte-Carlo estimates can be taken as the true values. In this study we generate a dataset of 10 6 observations, so that the estimate error is as small as 0.2% of the standard deviation of the effect of interest.
In order to obtain a subset of observations to test the considered estimation methods, we sample N individuals (i.e., rows) i = 1, … , N and for each of them we select only the values Y(T i ,Z i (T i )) and Z i (T i ) corresponding to the specific value of T i . More precisely: Tables 1 and 2 illustrate the simulation procedure.
For several simulation models, we estimate the different effects of interest by means of the general algorithm for multiple mediators described above in Section 3.6. We compare our estimates with both the true values and the estimates of two simple analyses (one for each mediator) obtained with the mediation package.
For comparative purposes, we analyse the simulations with our multiple mediation method, and also with the approach consisting in running simple analyses in parallel [19], and with the method described by [14] which we refer to as V&V in the figures. In the latter case, we not only report the estimates of the joint indirect and direct effects, but also the estimates of the mediator-specific indirect effects, even though the authors clearly explain that correlation between mediators would lead to bias.

Limitations of repeated simple analyses when the common cause of mediators is not measured
In this section, data are generated under the model described in Figure 6 Table : Simulated observed data with two independent mediators. Observations were extracted from Table .
Note that the correlation between the two mediators conditionally on the treatment (and not on U, Figure 2(c)), is equal to 0.7. When we have two causally independent mediators and U is observed, the approach by [19] is to perform two simple analyses as in Figure 6(b) and (c). However, when U is unobserved, the situation is like in Figure 2(c) with mediators showing residual correlation. In this case, conducting separate simple analyses is not appropriate because Sequential Ignorability assumptions (B.2) and (B.3) are violated [19].
Here, we illustrate this problem through simulations. For comparison purposes, we also show the results obtained with our method for multiple analysis and with the method by [14].
As expected, Tables 3 and 4 show that simple analyses adjusted for U give precise and accurate estimates of the indirect effects (but obviously not of the direct effect), while they give biased estimates when U is not taken into account. On the contrary, our method gives precise and accurate estimates of all effects with or without taking into account U, showing that it is still possible to conduct a mediation analysis to estimate all effects even when U is unobserved.
In the following subsection, we suppose that U is unobserved, as it is often the case in practical situations.

Empirical study of the properties of the proposed estimators
The previous section illustrated our method on a single simulation run. In this section, we perform a simulation-based study to assess the properties of the proposed estimation procedure. More specifically, we  compute bias, confidence interval coverage probability, mean square error (MSE) and variance of our estimators as means over 200 simulation runs for each considered parameter setting. We compare the estimates of several simple analyses, one for each mediator, to the estimates obtained with our multiple mediation analysis for different correlation levels. We consider two causal simulation models accounting for two types of outcome (continuous and logit binary), and two settings with two continuous causally unrelated mediators. Uncausally correlated mediators, Figure 2(c), are simulated from a multivariate normal distribution with fixed covariance matrix. The details of the simulation models can be found in Appendix D. Simulations according to model 1 (continuous outcome) are run for different values of correlation between the mediators and increasing sample size (N = 50, 200, 500, 1000). Results for bias and coverage probability can be seen in Figures 7-10. These figures clearly show that our approach allows an unbiased estimation, contrary to the simple analyses, for both the direct and indirect effects.
The empirical 95% confidence interval given by our method contains the real value in approximatively 95% of the runs, for both the indirect and direct effects and whatever the correlation between the mediators. On the contrary, simple analysis obtains fair coverages only when the correlation is almost null. As expected, the estimators of the individual indirect effects obtained with the method of [14] have the same behaviour as simple analysis estimates. Moreover, the estimators of the joint indirect and direct effects by the method of [14] behave similarly as ours, except that the coverage probability is constant for their method. Our estimators have low variance and low MSE for sample sizes larger than 200.
Simulations were also run for model 2 (binary outcome) for different values of correlation between the mediators with 1000 observational data. As illustrated by Figure 13 in the Appendix, the results for bias, coverage probability, variance and MSE confirm that our estimators are unbiased and have low variance and the expected coverage probability, thus outperforming simple analysis. It is worth noting that for positive correlations, the coverage probability of the confidence intervals of the individual indirect effects is unsatisfactory. This is likely due to the very low variance of the estimators.

Application
We applied our method to estimate the amount of causal effect of hormone replacement therapy (HRT) on breast cancer (BC) risk that is mediated by mammographic density (MD)specifically dense area (DA) and non-dense area (NDA)and body mass index (BMI) in postmenopausal women. The data come from the E3N French cohort study [23]. Based on more than 5000 cases diagnosed between baseline and 2008 [24], a nested case-control study was designed using incidence density sampling. For 640 invasive breast cancer cases with known laterality and at least one mammogram taken between baseline and age at diagnosis, one control was randomly selected from women who had not been diagnosed with breast cancer at the age when the matched case was diagnosed (reference age). After excluding women with missing value, 489 cases and 489 controls were available for the analysis. HRT, prescribed to relief menopausal symptoms, consists in providing women with hormones whose production naturally decreases with menopause [25]. One of the consequences of taking HRT is that women do not experience the decrease of DA, the increase of NDA and the increase of BMI typically occurring at menopause [26]. HRT use has been since long recognised to be a risk factor for BC [27]. Independent BC risk factors are also high postmenopausal BMI and high per age and per BMI MD [28,29]. In order to better understand the mutual relationship between HRT, MD and BMI in BC carcinogenesis, it is important to determine whether and eventually to which extent the effect of HRT on BC risk is due to its action on MD and BMI (mediated effect) and to which extent it is independent of MD and BMI (direct effect).
Based on evidence from association studies on breast cancer risk [30,31], we can reasonably assume that BMI and mammography density are uncausally correlated, being their correlation likely due to common genetic traits, as suggested by twin studies [30,32] and Mendelian randomization analysis [33]. We make the implicit assumption that HRT precedes the mediators and that these precede BC; Figure 11 depicts the causal assumptions made for the following mediation analysis.

Regression models
The continuous variables were normalised using the Box-Cox likelihood-like approach [34], t(M) M λ −1 λ , with λ equal to 0.38, 0.34 and −1.19 for DA, NDA and BMI respectively, as we can see in Figure 14. HRT was treated as a dichotomous variable whose levels were never versus ever users (past or current).
In preparation to our mediation analysis, we regressed each mediator on HRT and AGE (Table 5, models 1, 2 and 3 respectively) and BC on HRT and AGE with or without conditioning on the three mediators (respectively models 4a, 4b). As expected, HRT ever users had significantly higher values of DA and significantly lower of NDA and BMI (Table 5); DA and BMI were positively associated with BC risk, whereas NDA was negatively associated with risk (Table 5). HRT was positively associated with BC risk and the association decreased of the 20% in the log-OR scale when accounting for DA, NDA and BMI into the model (

Multiple mediation analysis
We applied our method with models 1 2, 3 and 4.b from Table 5 to estimate the causal mediated effect due to all mediators and the causal mediated effect due to each of them when accounting for their mutual correlation. As shown in Table 6 the causal mediated effects due to DA and NDA were positive, whereas the causal mediated effect due to BMI was negative; this resulted in a proportion of the total mediated effect of 22% (95% CI: 1 to    63%). Our finding that the effect of HRT is partially mediated by MD is consistent with previous reports in the literature [35,36]. So does the negative sign of the mediated effect by BMI [37,38]. MacKinnon et al. [39] described a situation with opposite signs mediated effects as inconsistent mediation models, as the effects may cancel out each other. In the present case, the negative mediated effect of BMI is not large enough to make the relation between HRT and BC non-significant.

Discussion
This article addresses the problem of estimating direct and indirect effects, including indirect effects through individual mediators, in the framework of multiple mediation with uncausally related mediators. Theoretical work of Shpitser and coauthors proved that in presence of latent variables not all mediation quantities are identified [17,18]. In particular, in the presence of a latent common cause between the mediators, indirect effects trough individual mediators cannot be expressed as functions of the observable data only. On the other hand, a common practice in multiple mediation is to perform several simple mediation analyses, one for each mediator, despite the introduction of a bias. Most of the approaches to mediation analysis are based on strong assumptions such as Sequential Ignorability [11,40], and several authors have tried to address the problem through different techniques. In the framework of multiple mediation with uncausally related mediators, we define a set of hypotheses, called SIMMA, under which we express the direct and the joint indirect effect as functions of observed variables and the indirect effect through individual mediators in terms of both observed and counterfactual variables. Coupled to a choice of model and the quasi-Bayesian algorithm developed by [11]; the latter formula gives an estimation method for the individual indirect effects. Note that we restricted ourselves to models with the additional hypothesis that the correlation between counterfactual mediators is the same whatever the treatment governing them. The development of methods for addressing the situation in which this additional hypothesis is violated is left to future work, together with the development of a sensitivity analysis for assessing the robustness to departures from SIMMA.
The method is implemented in R. Currently our program makes it possible to work with parametric models with continuous or ordered categorical mediators and continuous or binary outcomes. A package has been published on Github.
We applied our R program to validate the proposed method empirically. This simulation study shows that our method provides an unbiased estimate of the direct effect, while, as expected, estimates obtained by running simple mediation analyses one mediator at the time are biased, even in the case of independent mediators. Moreover, when mediators share an unobserved common cause, we show that our multiple analysis provide estimates of the direct effects through individual mediators that are less biased than the ones obtained from simple analyses one mediator at the time. The reason behind this improvement, is that our method, by considering the joint law of the mediators conditionally on the treatment and the law of the outcome conditionally on all the mediators, automatically takes into account the influence that the unobserved common cause U has on the mediators and the outcome. On the contrary, doing a simple analysis one mediator at the time is not appropriate in this setting because U confounds the relationship between each mediator and the outcome. Moreover, we show empirically that, contrary to repeated simple analyses, the proposed quasi-Bayesian algorithm provides confidence intervals with the expected coverage property.
Repeated individual mediator analyses are still a popular approach despite a growing literature warning about its limitations. Indeed, the presence of an unobserved common cause for the mediators is not the only situation in which such an approach is problematic. VanderWeele and Vansteelandt [14] observed that, even when mediators are uncausally related, it is not possible to decompose the joint indirect effect in the sum of individual indirect effects when their effect on the outcome is characterised by an interaction in the additive scale, a situation we excluded in our theoretical results. In this situation, [41] provided a three way decomposition of the joint indirect effect into individual natural indirect effects and an interactive effect. Interestingly, the assumptions required to show the identifiability of all the terms in this decomposition are similar to ours, with the only important difference that potential mediators are assumed to be conditionally independent given all observed covariates. More recently, [42] provided a decomposition of the total effect in the more general situation with both mediator-mediator and mediators-outcome interactions.
Another important setting where repeating simple analyses is the wrong approach to multiple mediation is when mediators are causally ordered as in Figure 2(b). In this situation, considering the vector of intermediate variables as one mediator and conducting a simple analysis will correctly estimate the joint indirect effect and the direct effect. However the former joint indirect effect is not equal to the sum of the individual indirect effects, each estimated with a simple analysis, because some paths are counted twice and the effect mediated by W is biased by M which acts as a posttreatment confounder of the W-Y relationship. More generally, unless strong conditions hold, it is not possible to identify all specific paths [43]. VanderWeele and Vansteelandt [14] introduced a sequential approach to identify the joint indirect effect, the direct effect, the effect mediated by M and the effect mediated by W but not M. The different steps in this strategy can be implemented using medflex, a recently introduced R package based on the natural effect model and imputation or weighting methods [44]. An alternative approach based on linear structural equations with varying coefficients was discussed by [19] and implemented in the mediation package. Nguyen et al. [45] presented a method based on the Inverse Odds Ratio Weighting (IOWR) approach introduced by [46]. This method is very flexible as it accommodates generalized linear models, quantile regression and survival models for the outcome and multiple continuous or categorical mediators; however, it does not allow to estimate the indirect effect through individual mediators, but only the joint indirect effect.
We conclude this brief overview of the literature around multiple mediation by underlining that our framework deals with natural direct and indirect effects. Vansteelandt and Daniel [47] recently introduced socalled interventional direct and path specific indirect effects that do add up to the total effect and are identifiable even when the mediators share unmeasured common causes or the causal dependence between mediators is unknown.
As an illustration of our method, we conducted a multiple mediation analysis on a real dataset from a large cohort to assess the effect of hormone replacement treatment on breast cancer risk through three non-sequential mediators, namely dense mammographic area, non-dense area and body mass index. The causal effects that we have estimated and reported can be interpreted as risk differences, that is differences in percentage points. For a binary outcome, it is however often preferred to measure risk changes in terms of odds ratios (OR). In a parallel work in progress aimed at the epidemiological community, we expand on the application of Section 5 and work out a method to compute the causal effects of interest in the OR scale following the definition by [8].
Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission. Even though intuitively it would sound reasonable to think that the indirect effect via the kth mediator δ k is the difference between the joint effect δ Z and the indirect effect by all other mediators η k , we show that this is not true in general.
We want to express δ Z according to K k 1 δ k . To do so, we start from δ k : (1) η k may be interpreted as the indirect effect by all mediators except the kth, when the treatment is fixed at t and the kth mediator is set to the value it would have under treatment 1 − t. Summing over the K mediators, we have: Thus the joint indirect effect can be rewritten as:

B Assumptions
According to [19], the Sequential Ignorability Assumption in the situation of multiple mediators that are causally unrelated is: In order to demonstrate Theorem 3.1 for the joint indirect effect δ Z , the direct effect ζ and the total effect τ, we start by rewriting the definitions in terms of counterfactuals: It is then sufficient to demonstrate that: It will then follow that: We have: Note that in this proof we have only used assumptions (B.1) and (B.4).

C.2 Indirect effect via the mediator of interest
It follows from the definition that: It is then sufficient to demonstrate that: We have: Note that in this proof we have used all SIMMA assumptions. In the case, where M and W are independent, we have:
With this choice of parameters, 30% of the sampled observations are cases. As we can see in Corollary 3.3, with binary outcome, causal effects are related to the covariance of mediators. Figure 12 shows how the true causal values change when correlation changes.