We will now explore how knowing *Q*, the “theoretical effect” of an infeasible intervention, can be useful to policy makers who care only about the impact of “feasible interventions.”

Consider a simple linear model, $I \to X \to Y$, with no unmeasured confounders and no direct link from *I* to *Y*. Let *a* and *b* stand for the structural coefficients associated with the two arrows, and let *X* be non-manipulable.

If we wish to predict the average causal effect $ACE(I)$ of intervention *I* (say a new diet) on *Y* (say life expectancy), then we have (after proper normalization)
$$ACE(I) = E[Y \mid do(I+1)] - E[Y \mid do(I)] = a \cdot b.$$
Thus, *b* constitutes an upper bound for $ACE(I)$. Yet, since *X* is not manipulable, the coefficient *b* is purely theoretical, and the manipulability critics will object to granting it "causal effect" status. Remarkably, this theoretical quantity informs our target quantity $ACE(I)$, which meets all criteria of feasibility and manipulability. Practically, if for some reason we are able to estimate *b* but not *a*, we have extremely valuable information about the magnitude of $ACE(I)$. In particular, if *b* is close to zero, we can categorically conclude that $ACE(I)$ must be close to zero as well. Such a prediction would be critical, for example, if intervention *I* is still in its developmental stage and our study instead measures a surrogate intervention $I'$, yielding coefficients $a'$ and $b'$. Our model dictates that the estimand $b'$ obtained under $I'$ remains unaltered as we move to *I*. Therefore, estimating the theoretical quantity $b'$ allows us to assess $ACE(I)$ from a study conducted under $I'$.
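The relation $ACE(I) = a \cdot b$ can be checked numerically. The following is a minimal simulation sketch; the coefficient values, noise distributions, and sample size are arbitrary choices for illustration, not part of the original model:

```python
import random

# Minimal sketch of the linear chain I -> X -> Y with hypothetical
# coefficients a, b and standard Gaussian disturbances.
random.seed(0)
a, b = 0.5, 2.0
n = 200_000

def mean_y_under(do_i):
    """Estimate E[Y | do(I = do_i)] by simulating the structural equations."""
    total = 0.0
    for _ in range(n):
        x = a * do_i + random.gauss(0, 1)   # X = a*I + eps_x
        y = b * x + random.gauss(0, 1)      # Y = b*X + eps_y
        total += y
    return total / n

# ACE(I) = E[Y | do(I+1)] - E[Y | do(I)], which the model says equals a*b
ace = mean_y_under(1.0) - mean_y_under(0.0)
print(round(ace, 1))  # close to a*b = 1.0
```

Because the intervention $do(I)$ overrides only the equation for *I*, the simulated difference recovers the product of the two structural coefficients.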

The basic structure of this knowledge transfer holds for nonlinear systems as well. For example, if the chain model above is governed by arbitrary functions $X = f(I, \epsilon_x)$ and $Y = g(X, \epsilon_y)$ (with $\epsilon_x$ independent of $\epsilon_y$), the overall causal effect of *I* on *Y* becomes a convolution of the two local causal effects. Formally,
$$E(Y \mid do(I)) = \sum_x P(x \mid do(I))\, E[Y \mid do(x)].$$
Thus, we can infer the causal effect of a practical intervention *I* by combining the theoretical effect of the non-manipulable variable *X* with the causal effect of *I* on *X*. Note again that if the theoretical effect of *X* on *Y* is zero (i.e., $E(Y \mid do(x))$ does not depend on $x$), the causal effect of the intervention *I* is also zero.
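As a toy illustration of this composition, suppose *X* is discrete: the overall effect is obtained by weighting the theoretical effect $E[Y \mid do(x)]$ by the intervention-induced distribution $P(x \mid do(I))$. All numbers below are fabricated for illustration:

```python
# Sketch of E[Y | do(I)] = sum_x P(x | do(I)) * E[Y | do(x)]
# for a discrete X in {0, 1, 2}; all values are made up.

# P(x | do(I = i)) for two hypothetical intervention levels i = 0, 1
p_x_do_i = {
    0: [0.6, 0.3, 0.1],
    1: [0.2, 0.5, 0.3],
}

# Theoretical effect of X on Y: E[Y | do(X = x)]
e_y_do_x = [1.0, 2.0, 4.0]

def e_y_do(i):
    """Combine the two local effects into E[Y | do(I = i)]."""
    return sum(p * ey for p, ey in zip(p_x_do_i[i], e_y_do_x))

print(round(e_y_do(0), 2))  # 0.6*1 + 0.3*2 + 0.1*4 = 1.6
print(round(e_y_do(1), 2))  # 0.2*1 + 0.5*2 + 0.3*4 = 2.4
```

Note that if `e_y_do_x` were constant in $x$, the two intervention levels would yield identical expectations, reproducing the zero-effect observation above.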

Let us move now from the simple chain to a more complex model (still linear) in which the arrow $X \to Y$ is replaced by a complex graph, rich with mediators and unobserved confounders. Linearity dictates that $ACE(I)$ will still be given by the product $a \cdot c$, where *a* is the same as before and *c* stands for the difference:
$$c = E[Y \mid do(x+1)] - E[Y \mid do(x)].$$
Thus, whenever we are able to identify the theoretical effect $Q = E(Y \mid do(x))$, we are also able to identify the causal effect of the intervention *I*. This statement may appear vacuous when the latter is identifiable directly from the model. However, when we consider again the task of predicting $ACE(I)$ from a surrogate study involving $I'$, the benefit of having $Q = E(Y \mid do(x))$ becomes clear. It is this theoretical effect that permits us to transfer knowledge between the two studies.
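The transfer step itself reduces to a one-line computation. In the sketch below, all numeric values are hypothetical: $c$ is identified in the surrogate study under $I'$, assumed invariant across studies by the model, and combined with the target-study coefficient $a$:

```python
# Hypothetical transfer of the theoretical effect c across studies.
# c is identified in the surrogate study (intervention I'); the model
# dictates it is unaltered when we move to the target intervention I.
c_surrogate = 1.5   # c = E[Y | do(x+1)] - E[Y | do(x)], from the I' study
a_target = 0.4      # effect of the target intervention I on X

# Linearity gives ACE(I) = a * c
ace_target = a_target * c_surrogate
print(round(ace_target, 2))  # 0.6
```

The invariance of $c$ is what licenses this multiplication; without it, nothing measured under $I'$ would constrain $ACE(I)$.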

To summarize these two aspects of *Q*, I will reiterate an example from [8] where smoking was taken to represent a variable that defies direct manipulation. In that context, we concluded that “if careful scientific investigations reveal that smoking has no effect on cancer, we can comfortably conclude that increasing cigarette taxes will not decrease cancer rates, and that it is futile for schools to invest resources in anti-smoking educational programs.”
