# Abstract

This paper offers an empirical interpretation of the do(*x*) operator when it is applied to non-manipulable variables, that is, variables whose values cannot be set directly by an experimenter.

## 1 Introduction

The structural causal modeling (SCM) framework described in [1], [2], [3] defines and computes quantities of the form *Q* = *P*(*y* | do(*x*)), usually read as “the causal effect of *X* on *Y*.” The computation of *Q* simulates a minimally invasive intervention that sets the value of *X* to *x* and leaves all other relationships unaltered. Several critics of SCM have voiced concerns about this interpretation of *Q* when *X* is non-manipulable; that is, when *X* is a variable whose value cannot be controlled directly by an experimenter [4], [5], [6], [7], [8]. Indeed, asking for the effect of setting *X* to a constant *x* makes perfect sense when *X* is a treatment, say “drug-1” or “diet-2,” but how can we imagine the action do(*x*) when *X* is non-manipulable, like gender, race, or even a state of a variable such as blood pressure or cholesterol level?^{[1]}

Mathematically, the expression *Q* = *P*(*y* | do(*x*)) is well defined for any variable *X* that is part of a causal model *M*, for it can be computed using the surgical procedure of the do-operator. Still, *Q* raises two questions when *X* is a state of a variable rather than an executable action. The first question is semantical: what information does *Q* convey, aside from being a mathematical property of our model? Since one cannot translate *Q* into a prediction about the effect of an executable action, what does *Q* tell us about reality that is not just an artifact of the model? Take for example the proposition: “The number of variables in the model is a prime number”; it is undeniably a property of *M*, but would hardly qualify as a feature of reality. The second question is empirical: even assuming that *Q* conveys an important feature of reality, how can we test it empirically? And if we cannot test it, is it part of science? I will address these two questions in the following sections.

## 2 The semantics of *Q*

Assume we are conducting an observational study guided by a model *M* in which *Q* is identifiable, and is evaluated to be

*Q* = *P*(*y* | do(*x*)) = *g*(*x*),

where *g*(*x*) is a function of *x*, computed from the joint distribution of the observed variables in the model. To what use can one put this information? I will discuss three distinct uses.

1. *Q* represents a theoretical limit on the causal effects of manipulable interventions that might become available in the future.
2. *Q* imposes constraints on the causal effects of currently manipulable variables.
3. *Q* serves as an auxiliary mathematical operation in the derivation of causal effects of manipulable variables.

### 2.1 *Q* as a limit on pending interventions

Consider a set of candidate interventions *I*₁, *I*₂, …, *I*ₙ whose effects on an outcome *Y* we wish to compare. Assume that these interventions are suspected of affecting *Y* through their effect on *X*, and that *X* is not directly manipulable. For example, *X* may stand for “obesity” and *Y* for “life expectancy.” Some of these interventions will have side effects on *Y* and some will not. Some will change *X* deterministically, say *X* = *f*(*I*), while others will affect *X* stochastically. The ideal intervention will, of course, have no side effect on the outcome *Y* and will affect *X* deterministically. Such an ideal intervention may not be feasible given the current state of technology, but may become feasible in the future. For example, cloud seeding made “rain” manipulable in our century, and genetic engineering may render gene variations manipulable in the future. If we simulate the impact of such an ideal intervention, one with no side effects and with a deterministic *f*, its resultant effect on *Y* will be *Q*.

Now suppose we manage to identify and estimate *Q* in an observational study. What does it tell us about the set of pending interventions *I*₁, *I*₂, …, *I*ₙ? *Q* gives us the ultimate effect that ANY intervention can possibly have on *Y* by leveraging *Y*’s dependence on *X*. This information may not be directly usable to a decision maker trying to assess the effectiveness of any given intervention, but it is valuable when deciding whether to search for new ways of manipulating *X*. Clearly, if *Q* is low, the exploration is futile, while if *Q* is high, the possibility exists that by finding a more effective modifier of *X*, we would obtain better control over *Y*.

Note that *Q* can be considered a “theoretical limit” and an “ultimate effect,” not in the sense of presenting a ceiling on the overall impact an intervention may have on *Y*, but rather as a ceiling on the *X*-attributable component of that impact. If some intervention, say *I*ₖ, has a greater impact on *Y* than that predicted through *Q*, we can safely conclude that much of that impact is due to side effects, not to *X*.
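To illustrate this ceiling numerically, here is a minimal simulation; all coefficients, probabilities, and the side-effect size are invented for illustration. An ideal intervention realizes the full *Q*, while a stochastic intervention with side effects delivers an *X*-attributable component strictly below *Q*.

```python
import numpy as np

# Minimal sketch (all numbers invented): Y responds to X and to a possible
# side effect of the intervention; Q is the effect of the ideal do(x).
rng = np.random.default_rng(3)
n = 300_000

def outcome(x, side=0.0):
    """Y := 2*X + side effect + noise."""
    return 2.0 * x + side + rng.normal(size=n)

# Ideal intervention: deterministic X, no side effects.
Q = outcome(1.0).mean() - outcome(0.0).mean()          # close to 2.0

# Feasible intervention: sets X=1 only with probability 0.6 and adds a
# side effect of +0.5 directly to Y.
x_stoch = (rng.random(n) < 0.6).astype(float)
impact = outcome(x_stoch, side=0.5).mean() - outcome(0.0).mean()

# The X-attributable part of the impact (impact minus the side effect,
# about 1.2) stays below the ceiling Q (about 2.0).
print(round(Q, 2), round(impact, 2))
```

Subtracting the known side effect from the observed impact leaves the *X*-attributable component, which cannot exceed *Q*.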

### 2.2 What *Q* tells us about the effects of feasible interventions

We will now explore how knowing *Q*, the “theoretical effect” of an infeasible intervention, can be useful to policy makers who care only about the impact of “feasible interventions.”

Consider a simple linear model in the form of a chain, *I* → *X* → *Y*, in which an intervention *I* affects *Y* only through *X*. Let *a* and *b* stand for the structural coefficients associated with the two arrows, and let *X* be non-manipulable.

If we wish to predict the average causal effect of *I* (say a new diet) on *Y* (say life expectancy), then we have (after proper normalization)

Effect of *I* on *Y* = *a* · *b*.

Thus, *b* constitutes an upper bound on the effect of *I* on *Y*. Since *X* is not manipulable, the coefficient *b* is purely theoretical, and the manipulativity critics will object to granting it a “causal effect” status. Oddly, this theoretical quantity does inform our target quantity, the effect of *I* on *Y*: if our study identifies *b*, but not *a*, we have extremely valuable information about the magnitude of the achievable effect. In particular, if *b* is close to zero, we can categorically conclude that acting on *X* is futile, no matter how strongly *I* affects *X*. This remains true even when *I* is still in its developmental stage, and our study involves measurement of a surrogate intervention *I*′. Therefore, estimating the theoretical quantity *b* is of direct practical consequence.
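The product rule for the chain can be checked by simulation; the sketch below uses invented coefficients and generic variable names.

```python
import numpy as np

# Simulation of the linear chain I -> X -> Y with structural
# coefficients a and b (values invented for illustration).
rng = np.random.default_rng(0)
a, b = 0.7, 0.5
n = 200_000

def simulate(i_value):
    """Generate Y under the intervention do(I = i_value)."""
    x = a * i_value + rng.normal(size=n)   # X := a*I + noise
    y = b * x + rng.normal(size=n)         # Y := b*X + noise
    return y

# Average causal effect of switching I from 0 to 1:
ace = simulate(1.0).mean() - simulate(0.0).mean()
print(round(ace, 2))  # close to a*b = 0.35
```

The estimated effect matches the product *a* · *b*, confirming that *b* caps the achievable effect whenever |*a*| ≤ 1.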

The basic structure of this knowledge transfer holds for nonlinear systems as well. For example, if the chain model above is governed by arbitrary functions, the causal effect of *I* on *Y* becomes a convolution of the two local causal effects. Formally,

*P*(*y* | do(*i*)) = Σₓ *P*(*x* | do(*i*)) *P*(*y* | do(*x*)).

Thus, we can infer the causal effect of a practical intervention *I* by combining the theoretical effect of a non-manipulable variable *X*, namely *P*(*y* | do(*x*)), with the causal effect of *I* on *X*. Note again that if the theoretical effect of *X* on *Y* is zero (i.e., *P*(*y* | do(*x*)) does not depend on *x*), the causal effect of the intervention *I* is also zero.
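For discrete variables, the convolution amounts to a matrix product of conditional-probability tables. A sketch with invented tables, including the zero-effect case just noted:

```python
import numpy as np

# In the chain I -> X -> Y, the interventional distribution composes as
# P(y|do(i)) = sum_x P(x|do(i)) * P(y|do(x)).  Tables below are invented.
P_x_do_i = np.array([[0.8, 0.2],   # row i=0: P(x|do(i=0))
                     [0.3, 0.7]])  # row i=1: P(x|do(i=1))
P_y_do_x = np.array([[0.9, 0.1],   # row x=0: P(y|do(x=0))
                     [0.4, 0.6]])  # row x=1: P(y|do(x=1))

P_y_do_i = P_x_do_i @ P_y_do_x     # matrix product implements the sum over x
print(P_y_do_i)

# If X has no effect on Y (identical rows of P(y|do(x))), then I has no
# effect on Y either: both rows of the result coincide.
P_null = np.array([[0.7, 0.3],
                   [0.7, 0.3]])
P_y_do_i_null = P_x_do_i @ P_null
print(P_y_do_i_null)
```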

Let us move now from the simple chain to a more complex model (still linear) in which *I* also affects *Y* directly, not only through *X*: the coefficient *a* on the arrow *I* → *X* is the same as before, and *c* stands for the difference between the total effect of *I* and its *X*-mediated component *a* · *b*, that is, the side effect of *I* on *Y*.

Thus, whenever we are able to identify the theoretical effect of *X* on *Y*, we can constrain the effect of the practical intervention *I*. This statement may appear to be empty when the latter is identifiable directly from the model. However, when we consider again the task of predicting the effect of a new intervention, not yet tried in any study, the value of the theoretical quantity becomes apparent.

To summarize these two aspects of *Q*, I will reiterate an example from [8] where smoking was taken to represent a variable that defies direct manipulation. In that context, we concluded that “if careful scientific investigations reveal that smoking has no effect on cancer, we can comfortably conclude that increasing cigarette taxes will not decrease cancer rates, and that it is futile for schools to invest resources in anti-smoking educational programs.”

### 2.3 do(*x*) as an auxiliary mathematical construct

In 2000 Phil Dawid published a paper entitled “Causal inference without counterfactuals” [11], in which he objected to the use of counterfactuals on philosophical grounds. His reason:

“By definition, we can never observe such [counterfactual] quantities, nor can we assess empirically the validity of any modeling assumption we may make about them, even though our conclusions may be sensitive to these assumptions.”

In my comment on Dawid’s paper [12], I agreed with Dawid’s insistence on empirical validity, but stressed the difference between pragmatic and dogmatic empiricism. A pragmatic empiricist insists on asking empirically testable queries, but leaves the choice of tools to convenience and imagination; the dogmatic empiricist requires that the entire analysis, including all auxiliary symbols and all intermediate steps, “involve only terms subject to empirical scrutiny.” As an extreme example, a strictly dogmatic empiricist would shun division by negative numbers because no physical object can be divided into a negative number of equal parts. In the context of causal inference, a pragmatic empiricist would welcome unobservable counterfactuals of individual units (e.g., the outcome a given unit would have attained had *X* been *x*), so long as they facilitate the derivation of empirically testable conclusions.

I now apply this distinction to our controversial construct *Q* which, in the opinion of some critics, is empirically ill-defined when *X* is non-manipulable. Let us regard *Q*—not as a causal effect or as a limit of causal effects—but as a purely mathematical construct which, like complex numbers, has no empirical content on its own, but permits us to derive empirically meaningful results.

For example, if we look at the derivation of the front-door estimand in [6], we find that the intermediate steps apply the do-operator to variables regardless of whether they are manipulable, yet the final expression is a do-free formula, estimable from observational data alone.
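For completeness, the front-door formula (a standard result, with *Z* a mediator shielded from the confounders of *X* and *Y*) reads:

```latex
P(y \mid do(x)) \;=\; \sum_{z} P(z \mid x) \sum_{x'} P(y \mid x', z)\, P(x')
```

Every intermediate step of its derivation manipulates do-expressions, yet none of them survives into the final estimand.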

Such auxiliary constructs are not rare in science. For example, although it is possible to derive De Moivre’s formulas for cos(*nθ*) and sin(*nθ*) using real trigonometry alone, the derivation becomes almost immediate once we admit the imaginary unit *i* and write (cos *θ* + *i* sin *θ*)ⁿ = cos(*nθ*) + *i* sin(*nθ*); the imaginary terms cancel out of the final, real-valued result.

## 3 Testing do(*x*) claims

We are now ready to tackle the final question posed in the introduction: granted that *Q* = *P*(*y* | do(*x*)) conveys useful information, how can we test its value empirically when *X* is non-manipulable?

Since *X* is non-manipulable, we must forgo verification of *Q* through the direct control of *X*, and settle instead for indirect tests, as is commonly done in observational studies. This calls for devising observational or experimental studies capable of refuting the claim *Q* = *g*(*x*).

Since the claim *Q* = *g*(*x*) follows from the structural assumptions embedded in the model *M*, confirming the testable implications of those assumptions constitutes a test for the equality *Q* = *g*(*x*).

Not all models have testable implications, but those that do advertise those implications in the model’s graph and invite standard statistical tests for verification. Typical are conditional-independence tests and equality constraints. For example, if *M* implies that two variables are independent given a third, a violation of that independence in the data refutes *M*, while its confirmation constitutes a (partial) test for *Q*.

If the model contains manipulable variables, then randomized control over those variables provides additional tests for the structure of *M*, and hence for the validity of *Q*. To illustrate, consider the front-door model of Fig. 1, where *I* is manipulable, *X* non-manipulable, and *U* an unobserved confounder. The model has no testable implications in observational studies. However, randomizing *I* yields an estimate of *P*(*y* | do(*i*)), which can be compared with the estimate delivered by the front-door formula from observational data; agreement between the two corroborates the model and, with it, the evaluation of *Q*.

### Figure 1

*(Figure not reproduced: the front-door model, with I manipulable, X non-manipulable, and U an unobserved confounder.)*
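The comparison test just described can be sketched in a simulated binary version of this model; all parameters are invented, and it is assumed that *U* confounds *I* and *Y* while *I* affects *Y* only through *X*.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

# Hypothetical binary front-door model: U confounds I and Y; I -> X -> Y.
u = rng.random(n) < 0.5
i = rng.random(n) < np.where(u, 0.8, 0.2)        # observed I depends on U
x = rng.random(n) < np.where(i, 0.9, 0.1)        # X depends only on I
y = rng.random(n) < 0.2 + 0.5 * x + 0.2 * u      # Y depends on X and U

# (1) "Experimental" answer: randomize I and measure P(Y=1 | do(I=1)).
i_r = rng.random(n) < 0.5
x_r = rng.random(n) < np.where(i_r, 0.9, 0.1)
y_r = rng.random(n) < 0.2 + 0.5 * x_r + 0.2 * u
p_do = y_r[i_r].mean()

# (2) Observational answer via the front-door formula (mediator X):
# P(y|do(i=1)) = sum_x P(x|i=1) sum_{i'} P(y|i',x) P(i')
p_fd = 0.0
for xv in (0, 1):
    p_x_given_i1 = (x[i] == xv).mean()
    inner = sum(y[(i == iv) & (x == xv)].mean() * (i == iv).mean()
                for iv in (0, 1))
    p_fd += p_x_given_i1 * inner

print(round(p_do, 3), round(p_fd, 3))  # the two estimates should nearly agree
```

Agreement of the two estimates corroborates the model; a systematic discrepancy would refute it, and with it the claimed value of *Q*.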

We see that, whereas direct tests of *Q* are infeasible when *X* is non-manipulable, indirect tests of the model behind *Q* often are feasible. Metaphorically, these tests can be likened to the way the planet Neptune was discovered in 1846: not by direct observation, but through the anomaly it caused in the trajectory of Uranus.

## 4 Non-manipulability and reinforcement learning

The role of models in handling non-manipulable variables has interesting parallels in machine learning applications, especially in the reinforcement learning (RL) variety [14], [15]. Skipping implementational details, an RL algorithm is given a set of actions, or interventions, say do(*x*), and learns, for each state *s* of the environment, the reward earned by performing the action do(*x*) in state *s*. This reward function can be written as *E*[*Y* | do(*x*), *s*], denoting by *Y* the stream of future payoffs received by acting do(*x*) in state *s*.

Through trial-and-error training of a neural network, the RL algorithm constructs a functional mapping between each state *s* and the next action to be taken. In the course of this construction, however, the algorithm evaluates a huge number of reward functions of the form *E*[*Y* | do(*x*), *s*], which for any fixed *s* are very similar to the function *Q* = *P*(*y* | do(*x*)) discussed in the preceding sections.

A question often asked about the RL framework is whether it is equivalent in power to SCM in terms of its ability to predict the effects of interventions.

The answer is a qualified YES. By deploying interventions in the training stage, RL allows us to infer the consequences of those interventions, but ONLY those interventions. It cannot go beyond and predict the effects of actions not tried in training. To do that, a causal model is required [16]. This limitation is equivalent to the one faced by researchers who deny legitimacy to do(*x*) whenever *X* is non-manipulable. In the RL context, however, the prohibition extends to manipulable variables as well, in case they were not activated in the training phase.
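The limitation can be made vivid with a toy tabular learner; the reward table below is entirely hypothetical. The reward column for an action never tried in training remains unknown, and no amount of model-blind averaging fills it in.

```python
import numpy as np

# Sketch of what model-blind RL learns: estimates of E[Y | do(x), s],
# available only for actions actually tried in training (toy numbers).
rng = np.random.default_rng(4)
n_states, n_actions = 3, 2
true_reward = np.array([[1.0, 2.0],
                        [0.5, 1.5],
                        [2.0, 0.0]])

estimate = np.full((n_states, n_actions), np.nan)
counts = np.zeros((n_states, n_actions))
sums = np.zeros((n_states, n_actions))
for _ in range(60_000):
    s = rng.integers(n_states)
    x = 0                                  # action x=1 is never tried
    sums[s, x] += true_reward[s, x] + rng.normal()
    counts[s, x] += 1

tried = counts > 0
estimate[tried] = sums[tried] / counts[tried]
print(estimate)  # the column for the untried action remains NaN
```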

A simple example illustrating this point is shown in Fig. 2, which depicts the causal structure of the environment prior to learning. *X* and *Z* are manipulable, while *U* is not; note the effect of *Z* on both *Y* and *X*. Suppose the training phase consisted of manipulating *Z* and recording the results. We now wish to infer the effect of an action do(*x*) that was never executed during training.

Thus, the freedom to manipulate *Z* and estimate its effects on *X* and *Y* enables us to evaluate the effect of an action do(*x*) that was never performed.
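A linear sketch of this inference, with invented coefficients and under the simplifying assumption that *Z* affects *Y* only through *X* (an instrumental-variable-style setup):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400_000

# Hypothetical linear environment: Z -> X -> Y, with unobserved U
# confounding X and Y. Training manipulated only Z.
u = rng.normal(size=n)
z = rng.normal(size=n)                 # Z was randomized during training
x = 0.8 * z + u + rng.normal(size=n)   # X responds to Z and to U
y = 0.5 * x - u + rng.normal(size=n)   # true effect of X on Y is 0.5

# Naive regression of Y on X is biased by the confounder U:
naive = np.cov(x, y)[0, 1] / np.var(x)

# Using the randomized Z as a surrogate experiment recovers the effect:
effect_zy = np.cov(z, y)[0, 1] / np.var(z)   # effect of Z on Y = 0.8 * 0.5
effect_zx = np.cov(z, x)[0, 1] / np.var(z)   # effect of Z on X = 0.8
recovered = effect_zy / effect_zx

print(round(naive, 2), round(recovered, 2))  # naive is biased; recovered ~ 0.5
```

The ratio of the two experimentally estimable slopes recovers the effect of the untried action do(*x*), while the model-blind regression does not.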

### Figure 2

*(Figure not reproduced: panel (a) shows the learning environment discussed above; panel (b) shows a variant in which the effect of the untried action is not identifiable.)*

To see the critical role that causal modeling plays in this exercise, note that the model in Fig. 2(b) does not permit such evaluation by any algorithm whatsoever, a fact verifiable from the model structure [17]. This means that a model-blind RL algorithm would be unable to tell whether the optimal choice of untried actions can be computed from those tried.

## 5 Conclusions

We have shown that causal effects associated with non-manipulable variables have empirical semantics along several dimensions. They provide theoretical limits, as well as valuable constraints over causal effects of manipulable variables. They facilitate the derivation of causal effects of manipulable variables and, finally, they can be tested for validity, albeit indirectly.

Doubts and trepidations concerning the effects of non-manipulable variables and their empirical content should give way to appreciating the important roles that these effects play in causal inference.

Turning attention to machine learning, we have shown parallels between estimating the effects of non-manipulable variables and learning the effect of feasible yet untried actions. The role of causal modeling was shown to be critical in both frameworks.

Armed with these clarifications, researchers need not be concerned with the distinction between manipulable and non-manipulable variables, except of course in the design of actual experiments. In the analytical stage, including model specification, identification, and estimation, all variables can be treated equally, and are therefore equally eligible to receive the do(*x*) operator.

**Funding source:** Defense Advanced Research Projects Agency

**Award Identifier / Grant number:** W911NF-16-057

**Funding source:** National Science Foundation

**Award Identifier / Grant number:** IIS-1302448

**Award Identifier / Grant number:** IIS-1527490

**Award Identifier / Grant number:** IIS-1704932

**Funding source:** Office of Naval Research

**Award Identifier / Grant number:** N00014-17-S-B001

**Funding statement:** This research was supported in part by grants from Defense Advanced Research Projects Agency [#W911NF-16-057], National Science Foundation [#IIS-1302448, #IIS-1527490, and #IIS-1704932], and Office of Naval Research [#N00014-17-S-B001].

# Acknowledgment

Discussions with Elias Bareinboim contributed substantially to this paper.

### References

1. Pearl J. Causal diagrams for empirical research. Biometrika. 1995;82:669–710.

2. Pearl J. On the consistency rule in causal inference: An axiom, definition, assumption, or a theorem? Epidemiology. 2010;21:872–5.

3. Pearl J. The seven tools of causal reasoning with reflections on machine learning. Commun ACM. 2019;62:54–60.

4. Cartwright N. Hunting Causes and Using Them: Approaches in Philosophy and Economics. New York, NY: Cambridge University Press; 2007.

5. Heckman J, Vytlacil E. Econometric Evaluation of Social Programs, Part I: Causal Models, Structural Models and Econometric Policy Evaluation. In: Handbook of Econometrics. vol. 6B. Amsterdam: Elsevier B.V.; 2007. p. 4779–874.

6. Pearl J. Causality: Models, Reasoning, and Inference. 2nd ed. New York: Cambridge University Press; 2009.

7. Hernán M. Does water kill? A call for less casual causal inferences. Ann Epidemiol. 2016;26:674–80.

8. Pearl J. Does obesity shorten life? Or is it the soda? On non-manipulable causes. J Causal Inference. Causal, Casual, and Curious Section. 2018;6. 10.1515/jci-2018-2001.

9. Hernán M, VanderWeele T. Compound treatments and transportability of causal inference. Epidemiology. 2011;22:368–77.

10. Pearl J. Physical and metaphysical counterfactuals: Evaluating disjunctive actions. J Causal Inference. Causal, Casual, and Curious Section. 2017;5. 10.1515/jci-2017-0018.

11. Dawid A. Causal inference without counterfactuals (with comments and rejoinder). J Am Stat Assoc. 2000;95:407–48.

12. Pearl J. Comment on A.P. Dawid’s, Causal inference without counterfactuals. J Am Stat Assoc. 2000;95:428–31.

13. Rosenbaum P, Rubin D. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70:41–55.

14. Sutton RS, Barto AG. Reinforcement learning: An introduction. Cambridge, MA: MIT Press; 1998.

15. Szepesvári C. Algorithms for reinforcement learning. San Rafael, CA: Morgan and Claypool; 2010.

16. Zhang J, Bareinboim E. Transfer learning in multi-armed bandits: A causal approach. In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17). 2017.

17. Bareinboim E, Pearl J. Causal inference by surrogate experiments: *z*-identifiability. In: de Freitas N, Murphy K, editors. Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence. Corvallis, OR: AUAI Press; 2012. p. 113–20.

18. Bareinboim E, Pearl J. Causal inference and the data-fusion problem. Proc Natl Acad Sci. 2016;113:7345–52.

**Published Online:** 2019-02-28

**Published in Print:** 2019-04-26

© 2019 Walter de Gruyter GmbH, Berlin/Boston