Figure 1 illustrates that the instrumental variable, *Z*, and the unmeasured confounder, *U*, are marginally independent because the path between *Z* and *U* is blocked by the collider, *T*. Once we condition on *T*, which we do when we estimate the treatment effect, we induce a correlation between *Z* and *U*, represented by the dashed arrows between *Z* and *U* in Figure 2(a). The dashed lines indicate that this association is induced and non-causal. The two arrows between *Z* and *U* indicate that this induced association represents two pathways: (1) the indirect path in the direction from *Z* to *U* ($Z\to U$) and (2) the indirect path in the direction from *U* to *Z* ($U\to Z$). The value of the path $Z\to U$ is represented by the term ${\mathrm{\lambda}}_{5}$ and can be interpreted as the regression coefficient for *Z* in a regression of *U* on *Z* and *T*. Similarly, the value of the path $U\to Z$ is represented by the term ${\mathrm{\lambda}}_{6}$ and can be interpreted as the regression coefficient for *U* in a regression of *Z* on *U* and *T*.

Figure 2. (a) Conditioning on *T* induces an association between *Z* and *U*. The dashed lines between *Z* and *U* illustrate that this association is indirect and non-causal. The parallel arrows illustrate that this induced association represents two pathways. (b) After conditioning on *T* and *Z*, all pathways from *T* to *Y* that occur through the instrument, *Z*, are blocked.
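
The induced association between *Z* and *U* can be checked numerically. The following is a minimal Python sketch, not part of the original analysis. It assumes a hypothetical linear model in which *Z* and *U* are independent standard normal variables and *T* = *aZ* + *bU* + noise with unit variance. The values of `a` and `b`, the sample size, and the helper names `simulate` and `ols` are illustrative assumptions.

```python
import numpy as np

def simulate(n=200_000, a=0.6, b=0.5, seed=0):
    """Draw (Z, U, T) from a hypothetical linear model Z -> T <- U."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)   # instrument
    u = rng.standard_normal(n)   # unmeasured confounder, drawn independently of Z
    # Noise scale chosen so that Var(T) = 1 (a and b are illustrative values).
    t = a * z + b * u + np.sqrt(1 - a**2 - b**2) * rng.standard_normal(n)
    return z, u, t

def ols(y, *xs):
    """Least-squares slopes of y on the regressors xs (intercept fitted, then dropped)."""
    X = np.column_stack([np.ones_like(y), *xs])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]

z, u, t = simulate()
marginal_corr = np.corrcoef(z, u)[0, 1]  # Z and U without conditioning on T
lambda5 = ols(u, z, t)[0]                # coefficient of Z in a regression of U on Z and T
lambda6 = ols(z, u, t)[0]                # coefficient of U in a regression of Z on U and T
print(marginal_corr, lambda5, lambda6)
```

Under these assumed values, the marginal correlation should be close to zero, while both conditional coefficients should be negative: in this particular model their population values are $-ab/(1-a^{2})\approx -0.47$ and $-ab/(1-b^{2})=-0.40$. The negative sign reflects the usual collider pattern when both causes raise *T*.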

There are now four paths from treatment, *T*, to the outcome, *Y*: (1) the direct path $T\to Y$, (2) the indirect path $T\leftarrow U\to Y$ (confounding by *U*), (3) the indirect path $T\leftarrow Z\to U\to Y$ (confounding by *U* and *Z*, created by the correlation between *U* and *Z* that occurs when conditioning on *T*), and (4) the indirect path $T\leftarrow U\to Z\to U\to Y$ (which also represents confounding by *U* and *Z*, created by the correlation between *U* and *Z* after conditioning on *T*). Were we to regress *Y* on *T*, the coefficient for *T* would represent the sum of the values of these four paths. Formally,
$\frac{\mathrm{\partial}}{\mathrm{\partial}t}E\left[Y|T=t\right]=\frac{\mathrm{\partial}}{\mathrm{\partial}t}\sum _{z}\sum _{u}E\left[Y|t,z,u\right]p(u|z,t)p(z|t)$(1)
$={\mathrm{\lambda}}_{1}+r({\mathrm{\lambda}}_{2}){\mathrm{\lambda}}_{3}+{\mathrm{\lambda}}_{5}{\mathrm{\lambda}}_{3}\frac{\mathrm{\partial}}{\mathrm{\partial}t}E[Z|t]$(2)
$={\mathrm{\lambda}}_{1}+r({\mathrm{\lambda}}_{2}){\mathrm{\lambda}}_{3}+r({\mathrm{\lambda}}_{4}){\mathrm{\lambda}}_{5}{\mathrm{\lambda}}_{3}+r({\mathrm{\lambda}}_{2}){\mathrm{\lambda}}_{6}{\mathrm{\lambda}}_{5}{\mathrm{\lambda}}_{3}.$(3)

Each of the terms in eq. (3) represents one of the four pathways from *T* to *Y* when we regress *Y* on *T*. The first term, ${\mathrm{\lambda}}_{1}$, is the direct effect of *T* on *Y* and represents the unconfounded treatment effect. The second term in eq. (3), $r({\mathrm{\lambda}}_{2}){\mathrm{\lambda}}_{3}$, represents the value of the path $T\leftarrow U\to Y$. The third term, $r({\mathrm{\lambda}}_{4}){\mathrm{\lambda}}_{5}{\mathrm{\lambda}}_{3}$, represents the value of the path $T\leftarrow Z\to U\to Y$. The last term, $r({\mathrm{\lambda}}_{2}){\mathrm{\lambda}}_{6}{\mathrm{\lambda}}_{5}{\mathrm{\lambda}}_{3}$, represents the value of the path $T\leftarrow U\to Z\to U\to Y$. The sum of the last three terms in eq. (3) represents the bias in the treatment effect when not controlling for the instrument, *Z*, and can be expressed as
${B}_{0}=r({\mathrm{\lambda}}_{2}){\mathrm{\lambda}}_{3}+r({\mathrm{\lambda}}_{4}){\mathrm{\lambda}}_{5}{\mathrm{\lambda}}_{3}+r({\mathrm{\lambda}}_{2}){\mathrm{\lambda}}_{6}{\mathrm{\lambda}}_{5}{\mathrm{\lambda}}_{3}.$(4)

After controlling for *Z*, all of the pathways between *T* and *Y* that occur through *Z* are blocked (Figure 2(b)). There are now only two paths between treatment and the outcome: (1) the direct path $T\to Y$ and (2) the confounding path $T\leftarrow U\to Y$. The coefficient for *T* in a regression of *Y* on *T* and *Z* can then be calculated as the sum of the values of these two pathways:
$\frac{\mathrm{\partial}}{\mathrm{\partial}t}E\left[Y|T=t,Z=z\right]=\frac{\mathrm{\partial}}{\mathrm{\partial}t}\sum _{u}E\left[Y|t,z,u\right]p(u|t,z)$(5)
$={\mathrm{\lambda}}_{1}+{\mathrm{\lambda}}_{3}\frac{\mathrm{\partial}}{\mathrm{\partial}t}E[U|t,z]$(6)
$={\mathrm{\lambda}}_{1}+r({\mathrm{\lambda}}_{2}){\mathrm{\lambda}}_{3}.$(7)

In eq. (7), ${\mathrm{\lambda}}_{1}$ represents the value of the direct effect of *T* on *Y* and the term $r({\mathrm{\lambda}}_{2}){\mathrm{\lambda}}_{3}$ represents the value of the indirect path $T\leftarrow U\to Y$. The bias in the treatment effect after controlling for the instrument, *Z*, can then be expressed as
${B}_{z}=r({\mathrm{\lambda}}_{2}){\mathrm{\lambda}}_{3}.$(8)

Pearl [12] shows that ${B}_{0}=(1-{\mathrm{\rho}}_{ZT}^{2}){B}_{z}$, where ${\mathrm{\rho}}_{ZT}$ is the correlation between *Z* and *T*. Since $0\le {\mathrm{\rho}}_{ZT}^{2}\le 1$, $|{B}_{0}|\le |{B}_{z}|$, with equality (for nonzero ${B}_{z}$) only if ${\mathrm{\rho}}_{ZT}=0$. In other words, not controlling for the instrument, *Z*, reduces the overall bias in the treatment effect that is caused by the unmeasured confounder, *U*. When we do not condition on *Z*, we allow an induced correlation to occur between *Z* and *U* upon conditioning on *T*. This induced correlation results in additional confounding (an induced confounding) that is smaller in magnitude than, and opposite in direction to, the confounding caused by *U* alone. Therefore, the overall confounding in the treatment effect is less than the confounding caused only by *U*. Controlling for *Z* eliminates the induced confounding and returns the total confounding to the value it would have been in the absence of the instrument.
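
The relationship between the two biases can be illustrated with a small simulation. The sketch below is an assumption-laden example rather than the analysis used in this paper. It posits a linear model with standardized *Z*, *U*, and *T*, in which *T* = *aZ* + *bU* + noise and *Y* = λ₁*T* + λ₃*U* + noise. The numerical values, the sample size, and the helper name `slope_of_t` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
a, b = 0.6, 0.5          # hypothetical effects of Z and U on T
lam1, lam3 = 1.0, 0.5    # hypothetical effects of T and U on Y

z = rng.standard_normal(n)   # instrument
u = rng.standard_normal(n)   # unmeasured confounder
# Noise scale chosen so that Var(T) = 1, which makes corr(Z, T) equal to a in the population.
t = a * z + b * u + np.sqrt(1 - a**2 - b**2) * rng.standard_normal(n)
y = lam1 * t + lam3 * u + rng.standard_normal(n)

def slope_of_t(y, *xs):
    """Least-squares coefficient of the first regressor (always T here)."""
    X = np.column_stack([np.ones_like(y), *xs])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

b0 = slope_of_t(y, t) - lam1       # estimated bias when Z is not controlled
bz = slope_of_t(y, t, z) - lam1    # estimated bias when Z is controlled
rho2 = np.corrcoef(z, t)[0, 1] ** 2
print(b0, bz, (1 - rho2) * bz)
```

In this assumed model the population biases are $b{\mathrm{\lambda}}_{3}=0.25$ without *Z* and $b{\mathrm{\lambda}}_{3}/(1-a^{2})\approx 0.39$ with *Z*, so the first and third printed values should roughly agree, up to sampling error.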

Pearl [12] provides an intuitive way of describing this effect, which is also discussed by Myers et al. [14]. According to Pearl, not conditioning on an instrumental variable allows the instrument to account for part of the variation in the treatment variable. This reduces the amount of variation in the treatment variable that is explained by the unmeasured confounder, thereby reducing the total amount of residual confounding caused by the unmeasured confounder.

While we have described bias amplification in terms of controlling for an instrumental variable, the same arguments apply when controlling for any variable affecting treatment (e.g. measured confounders). Controlling for a measured confounder, *X*, will remove the confounding due to *X*, but will also eliminate the induced reduction in confounding by *U*. This may increase or decrease the overall confounding, depending on the relative strengths of these two confounding effects. For a detailed discussion of bias amplification, we refer the reader to Pearl [12]. The purpose of this discussion is to emphasize and illustrate that bias amplification results from controlling for the induced correlations that occur between instrumental variables (or measured confounders) and unmeasured confounders upon conditioning on treatment. Viewing bias amplification from this perspective helps in discussing ways in which DRSs can be estimated to avoid controlling for these induced correlations while simultaneously controlling for the direct effects of measured confounders on the outcome.
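
To make the trade-off for a measured confounder concrete, the following sketch computes population regression coefficients directly from the covariance matrix of a hypothetical linear model, so no sampling error is involved. The model (*T* = *cX* + *bU* + noise with unit variance, *Y* = λ₁*T* + λₓ*X* + λ₃*U* + noise), the function name `biases`, and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def biases(c, b, lam_x, lam3, lam1=1.0):
    """Population bias in the coefficient of T, without and with adjustment for X.

    Hypothetical model: X and U are independent with unit variance,
    T = c*X + b*U + noise (Var(T) = 1), Y = lam1*T + lam_x*X + lam3*U + noise.
    """
    s = np.sqrt(1 - c**2 - b**2)
    # Each variable is written as loadings on four independent unit-variance
    # sources: (X, U, noise in T, noise in Y).
    x_row = np.array([1.0, 0.0, 0.0, 0.0])
    u_row = np.array([0.0, 1.0, 0.0, 0.0])
    t_row = c * x_row + b * u_row + s * np.array([0.0, 0.0, 1.0, 0.0])
    y_row = lam1 * t_row + lam_x * x_row + lam3 * u_row + np.array([0.0, 0.0, 0.0, 1.0])
    loadings = np.vstack([x_row, u_row, t_row, y_row])
    cov = loadings @ loadings.T          # covariance matrix, order X, U, T, Y
    X_, T_, Y_ = 0, 2, 3

    def t_coef(regressors):
        # Population OLS: solve Cov(regressors) beta = Cov(regressors, Y); T is listed first.
        idx = np.array(regressors)
        beta = np.linalg.solve(cov[np.ix_(idx, idx)], cov[idx, Y_])
        return beta[0]

    unadjusted = t_coef([T_]) - lam1
    adjusted = t_coef([T_, X_]) - lam1
    return unadjusted, adjusted

# Confounding by X and U in the same direction: adjusting for X lowers the bias.
print(biases(c=0.6, b=0.3, lam_x=0.8, lam3=0.5))
# Confounding by X offsets confounding by U: adjusting for X raises the bias.
print(biases(c=0.6, b=0.3, lam_x=-0.25, lam3=0.5))
```

In both assumed scenarios the adjusted bias is the same, $b{\mathrm{\lambda}}_{3}/(1-c^{2})$, because adjustment removes the contribution of *X* while amplifying the contribution of *U*. Whether this is an improvement depends on how large the removed contribution was and whether it offset or added to the confounding by *U*.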
