## Abstract

This paper provides an empirical interpretation of the $do(x)$ operator when applied to non-manipulable variables.

## 1 Introduction

The structural causal modeling (SCM) framework described in [1], [2], [3] defines and computes quantities of the form $Q = E[Y \mid do(X = x)]$, which are interpreted as the causal effect of *X* on *Y*. The computation of *Q* simulates a minimally invasive intervention that sets the value of *X* to *x* and leaves all other relationships unaltered. Several critics of SCM have voiced concerns about this interpretation of *Q* when *X* is non-manipulable; that is, when *X* is a variable whose value cannot be controlled directly by an experimenter [4], [5], [6], [7], [8]. Indeed, asking for the effect of setting *X* to a constant *x* makes perfect sense when *X* is a treatment, say “drug-1” or “diet-2,” but how can we imagine an action $do(X = x)$ when *X* is non-manipulable, like gender, race, or even a state of a variable such as blood pressure or cholesterol level?

Mathematically, the expression $Q = E[Y \mid do(X = x)]$ is well defined whenever *X* is part of a causal model *M*, for it can be computed using the surgical procedure of the $do$-operator. Nevertheless, *Q* raises two questions when *X* is a state of a variable. The first question is semantical: What information does *Q* convey aside from being a mathematical property of our model? Since one cannot translate *Q* into a prediction about the effect of an executable action, what does *Q* tell us about reality which is not just an artifact of the model? Take for example the proposition: “The number of variables in the model is a prime number”; it is undeniably a property of *M*, but would hardly qualify as a feature of reality. The second question raised is empirical: Even assuming that *Q* conveys an important feature of reality, how can we test it empirically? And if we cannot test it, is it part of science? I will address these two questions in the following sections.

## 2 The semantics of *Q*

Assume we are conducting an observational study guided by model *M* in which *Q* is identifiable, and is evaluated to be

$$Q = E[Y \mid do(X = x)] = q(x),$$

where $q(x)$ is a function of *x*, computed from the joint distribution of observed variables in the model. To what use can one put this information? I will discuss three distinct uses.

1. *Q* represents a theoretical limit on the causal effects of manipulable interventions that might become available in the future.

2. *Q* imposes constraints on the causal effects of currently manipulable variables.

3. *Q* serves as an auxiliary mathematical operation in the derivation of causal effects of manipulable variables.

### 2.1 *Q* as a limit on pending interventions

Consider a set $\mathcal{I}$ of manipulable interventions whose effects on *Y* we wish to compare. Assume that these interventions are suspected of affecting *Y* through their effect on *X*, and *X* is not directly manipulable. For example, *Y* stands for “life expectancy.” Some of these interventions will have side effects and some will not. Some will change *X* deterministically, such that $X = f(I)$, and some will change *X* stochastically. The ideal intervention will, of course, have no side effect on the outcome *Y* and will affect *X* deterministically. However, an ideal intervention may not be feasible given the current state of technology, but may become feasible in the future. For example, cloud seeding made “rain” manipulable in our century, and genetic engineering may render gene variations manipulable in the future. If we simulate the impact of such an ideal intervention, one with no side effects and with a deterministic *f*, its resultant effect on *Y* will be *Q*.

Now suppose we manage to identify and estimate *Q* in an observational study. What does it tell us about the set of pending interventions? *Q* gives us the ultimate effect ANY intervention can possibly have on *Y* by leveraging *Y*’s dependence on *X*. This information may not be directly usable to a decision maker trying to assess the effectiveness of any given intervention, but it does inform the decision of whether to explore new ways of modifying *X*. Clearly, if *Q* is low, the exploration is futile, while if *Q* is high, the possibility exists that by finding a more effective modifier of *X*, we would obtain better control over *Y*.

Note that *Q* can be considered a “theoretical limit” and an “ultimate effect” not in the sense of presenting a ceiling on the impact of an intervention on *Y*, but rather as a ceiling on the *X*-attributable component of that impact. If some intervention, say *I*, has a greater impact on *Y* than that predicted through *Q*, we can safely conclude that much of that impact is due to side effects, not due to *X*.

### 2.2 What *Q* tells us about the effects of feasible interventions

We will now explore how knowing *Q*, the “theoretical effect” of an infeasible intervention, can be useful to policy makers who care only about the impact of “feasible interventions.”

Consider a simple linear model, the chain $I \rightarrow X \rightarrow Y$. Let *a* and *b* stand for the structural coefficients associated with the two arrows, and let *X* be non-manipulable.

If we wish to predict the average causal effect $ACE(I)$ of an intervention *I* (say a new diet) on *Y* (say life expectancy), then we have (after proper normalization, so that $|a| \le 1$)

$$ACE(I) = ab, \qquad |ACE(I)| \le |b|.$$

Thus, *b* constitutes an upper bound for $ACE(I)$. Since *X* is not manipulable, the coefficient *b* is purely theoretical, and the manipulativity critics will object to granting it a “causal effect” status. Oddly, this theoretical quantity does inform our target quantity $ACE(I)$. Moreover, if we are able to identify *b*, but not *a*, we still have extremely valuable information about the magnitude of $ACE(I)$: if *b* is close to zero, we can categorically conclude that $ACE(I)$ is close to zero as well. This is useful when, for example, *I* is still in its developmental stage and our study involves measurement of a surrogate intervention. Therefore, estimating the theoretical quantity *b* tells us in advance how much *I* could achieve.
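The bound can be checked numerically. The sketch below is only an illustration: it assumes the chain $I \rightarrow X \rightarrow Y$ with structural equations $X = aI + \epsilon_1$ and $Y = bX + \epsilon_2$, independent standard-normal noise terms, and a contrast between $do(I = 1)$ and $do(I = 0)$. The function name `simulate_ace`, the noise scale, and the coefficient values are assumptions made for the example.

```python
import random


def simulate_ace(a, b, n=50_000, seed=0):
    """Monte Carlo estimate of E[Y | do(I=1)] - E[Y | do(I=0)]
    in the linear chain I -> X -> Y (assumed structural equations)."""
    rng = random.Random(seed)

    def mean_y_under_do(i):
        total = 0.0
        for _ in range(n):
            x = a * i + rng.gauss(0.0, 1.0)  # X responds to the intervention on I
            y = b * x + rng.gauss(0.0, 1.0)  # Y responds to X only (no side effects)
            total += y
        return total / n

    return mean_y_under_do(1.0) - mean_y_under_do(0.0)


if __name__ == "__main__":
    for a, b in [(0.8, 0.5), (0.8, 0.0)]:
        estimate = simulate_ace(a, b)
        print(f"a={a}, b={b}: simulated ACE(I)={estimate:.3f}, a*b={a * b:.3f}")
```

With `b = 0` the simulated effect of *I* vanishes up to sampling noise, whatever the value of `a`, which is the sense in which the theoretical coefficient constrains the practical intervention.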

The basic structure of this knowledge transfer holds for nonlinear systems as well. For example, if the chain model above is governed by arbitrary functions, the causal effect of *I* on *Y* becomes a convolution of the two local causal effects. Formally,

$$E[Y \mid do(I)] = \sum_x E[Y \mid do(x)] \, P(x \mid do(I)).$$

Thus, we can infer the causal effect of a practical intervention *I* by combining the theoretical effect of a non-manipulable variable *X* with the causal effect of *I* on *X*. Note again that if the theoretical effect of *X* on *Y* is zero (i.e., $E[Y \mid do(x)]$ does not depend on *x*), the causal effect of the intervention *I* is also zero.
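For a discrete chain the composition amounts to a weighted sum, as the following sketch suggests. It assumes *X* takes two levels and that both ingredients, $P(x \mid do(I))$ and $E[Y \mid do(x)]$, are already available; all numbers and names (`expected_y_under_do_i`, the diet labels, the life-expectancy values) are hypothetical.

```python
def expected_y_under_do_i(p_x_given_do_i, e_y_given_do_x):
    """E[Y | do(I)] = sum_x E[Y | do(x)] * P(x | do(I)) for a chain I -> X -> Y."""
    return sum(p * e_y_given_do_x[x] for x, p in p_x_given_do_i.items())


# Hypothetical effect of each diet on the (non-manipulable) level of X.
P_X_GIVEN_DO_I = {
    "old diet": {"low": 0.2, "high": 0.8},
    "new diet": {"low": 0.7, "high": 0.3},
}

# Hypothetical theoretical effect of X on life expectancy, E[Y | do(x)].
E_Y_GIVEN_DO_X = {"low": 80.0, "high": 70.0}

# A world in which X has no effect on Y: E[Y | do(x)] does not depend on x.
E_Y_FLAT = {"low": 75.0, "high": 75.0}


def effect_of_switching(e_y_given_do_x):
    """Difference in E[Y] between the new and the old diet."""
    new = expected_y_under_do_i(P_X_GIVEN_DO_I["new diet"], e_y_given_do_x)
    old = expected_y_under_do_i(P_X_GIVEN_DO_I["old diet"], e_y_given_do_x)
    return new - old


if __name__ == "__main__":
    print("effect when X matters:", effect_of_switching(E_Y_GIVEN_DO_X))
    print("effect when X is inert:", effect_of_switching(E_Y_FLAT))
```

When the table for $E[Y \mid do(x)]$ is flat, the two diets yield the same expected outcome no matter how strongly they shift *X*.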

Let us move now from the simple chain to a more complex model (still linear) where the arrow *a* is the same as before and *c* stands for the difference:

Thus, whenever we are able to identify the theoretical effect *I*. This statement may appear to be empty when the latter is identifiable directly from the model. However, when we consider again the task of predicting

To summarize these two aspects of *Q*, I will reiterate an example from [8] where smoking was taken to represent a variable that defies direct manipulation. In that context, we concluded that “if careful scientific investigations reveal that smoking has no effect on cancer, we can comfortably conclude that increasing cigarette taxes will not decrease cancer rates, and that it is futile for schools to invest resources in anti-smoking educational programs.”

### 2.3 $do(x)$ as an auxiliary mathematical construct

In 2000 Phil Dawid published a paper entitled “Causal inference without counterfactuals” [11] in which he objected to the use of counterfactuals on philosophical grounds. His reasons:

“By definition, we can never observe such [counterfactual] quantities, nor can we assess empirically the validity of any modeling assumption we may make about them, even though our conclusions may be sensitive to these assumptions.”

In my comment on Dawid’s paper [12], I agreed with Dawid’s insistence on empirical validity, but stressed the difference between pragmatic and dogmatic empiricism. A pragmatic empiricist insists on asking empirically testable queries, but leaves the choice of tools to convenience and imagination; the dogmatic empiricist requires that the entire analysis, including all auxiliary symbols and all intermediate steps, “involve only terms subject to empirical scrutiny.” As an extreme example, a strictly dogmatic empiricist would shun division by negative numbers because no physical object can be divided into a negative number of equal parts. In the context of causal inference, a pragmatic empiricist would welcome unobservable counterfactuals of individual units as auxiliary tools, so long as the final query remains empirically testable.

I now apply this distinction to our controversial construct *Q* which, in the opinion of some critics, is empirically ill-defined when *X* is non-manipulable. Let us regard *Q*—not as a causal effect or as a limit of causal effects—but as a purely mathematical construct which, like complex numbers, has no empirical content on its own, but permits us to derive empirically meaningful results.

For example, if we look at the derivation of the front-door estimate in

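Such auxiliary constructs are not rare in science. For example, although it is possible to derive De Moivre’s formula for $\cos n\theta$ using real-valued trigonometric identities alone, the derivation is far simpler when it passes through the complex quantity $e^{i\theta}$, which has no empirical content of its own.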

## 3 Testing $do(x)$ claims

We are now ready to tackle the final question posed in the introduction: Granted that *Q* conveys an important feature of reality, how can we test it empirically?

Since *X* is non-manipulable, we must forgo verification of *Q* through the direct control of *X*, and settle instead on indirect tests as is commonly done in observational studies. This calls for devising observational or experimental studies capable of refuting the claim that *Q* equals the value computed from the model.

Since the claim rests on the assumptions embedded in the model *M*, confirming the testable implications of those assumptions constitutes a test for the claimed value of *Q*.

Not all models have testable implications, but those that do advertise those implications in the model’s graph and invite standard statistical tests for verification. Typical are conditional independence tests and equality constraints. For example, if *M* implies a conditional independence among observed variables, testing that independence in the data is a test for *M*, and hence a test for *Q*.

If the model contains manipulable variables, then randomized controls over the manipulable variables provide additional tests for the structure of *M*, and hence for the validity of *Q*. To illustrate, consider the front-door model of Fig. 1, where *I* is manipulable, *X* non-manipulable, and *U* an unobserved confounder. The model has no testable implication in observational studies. However, randomizing *I* yields an estimate of $E[Y \mid do(I)]$, against which the model’s predictions can be checked.

We see that, whereas direct tests of *Q* are infeasible, indirect tests of the model that yields *Q* are available. Metaphorically, these tests can be likened to the way planet Neptune was discovered (1846): not by direct observation, but through the anomaly it caused in the trajectory of Uranus.
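One way such an indirect test might look is sketched below. The figure is not reproduced here, so the graph is an assumption: $I \rightarrow X \rightarrow Y$ with an unobserved *U* affecting both *I* and *Y*. Under that assumed model, *Q* is obtained from observational data by adjusting for *I*, and the model then predicts $P(Y = 1 \mid do(I)) = \sum_x P(x \mid I)\, Q(x)$, a prediction that a randomized trial on *I* can refute. All variables are binary and every probability table and function name is hypothetical; exact enumeration stands in for large-sample estimates.

```python
from itertools import product

P_U1 = 0.5                                   # P(U = 1)
P_I1_GIVEN_U = {0: 0.2, 1: 0.8}              # P(I = 1 | U)
P_Y1_GIVEN_XU = {(0, 0): 0.1, (1, 0): 0.6, (0, 1): 0.4, (1, 1): 0.9}  # key: (x, u)
# P(X = 1 | I, U), key (i, u). In the assumed model X listens to I only.
P_X1_VALID = {(0, 0): 0.1, (0, 1): 0.1, (1, 0): 0.9, (1, 1): 0.9}
# A world that violates the assumed model: U also affects X.
P_X1_VIOLATED = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.9}


def bern(p_one, value):
    """Probability that a binary variable with P(1) = p_one takes `value`."""
    return p_one if value == 1 else 1.0 - p_one


def observational_joint(p_x1):
    """P(i, x, y) under passive observation, with U summed out."""
    joint = {}
    for u, i, x, y in product((0, 1), repeat=4):
        p = (bern(P_U1, u) * bern(P_I1_GIVEN_U[u], i)
             * bern(p_x1[(i, u)], x) * bern(P_Y1_GIVEN_XU[(x, u)], y))
        joint[(i, x, y)] = joint.get((i, x, y), 0.0) + p
    return joint


def q_hat(joint, x):
    """Q(x) = P(Y=1 | do(x)) computed as sum_i P(y=1 | x, i) P(i)."""
    total = 0.0
    for i in (0, 1):
        p_i = sum(joint[(i, xx, yy)] for xx in (0, 1) for yy in (0, 1))
        p_ix = joint[(i, x, 0)] + joint[(i, x, 1)]
        total += joint[(i, x, 1)] / p_ix * p_i
    return total


def predicted_effect_of_i(joint, i):
    """Model-based prediction of P(Y=1 | do(I=i)) = sum_x P(x | i) Q(x)."""
    p_i = sum(joint[(i, x, y)] for x in (0, 1) for y in (0, 1))
    return sum((joint[(i, x, 0)] + joint[(i, x, 1)]) / p_i * q_hat(joint, x)
               for x in (0, 1))


def randomized_effect_of_i(p_x1, i):
    """P(Y=1 | do(I=i)) as a randomized trial on I would reveal it."""
    return sum(bern(P_U1, u) * bern(p_x1[(i, u)], x) * P_Y1_GIVEN_XU[(x, u)]
               for u in (0, 1) for x in (0, 1))


def discrepancy(p_x1, i=1):
    """Gap between the model's prediction and the randomized result."""
    joint = observational_joint(p_x1)
    return abs(predicted_effect_of_i(joint, i) - randomized_effect_of_i(p_x1, i))


if __name__ == "__main__":
    print("data generated by the assumed model:", discrepancy(P_X1_VALID))
    print("data violating the assumed model:   ", discrepancy(P_X1_VIOLATED))
```

In this toy setting the discrepancy vanishes when the data come from the assumed model and is clearly nonzero when they do not, so the experiment on *I* can refute the value assigned to *Q* without ever manipulating *X*.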

## 4 Non-manipulability and reinforcement learning

The role of models in handling a non-manipulable variable has interesting parallels in machine learning applications, especially in its reinforcement learning (RL) variety [14], [15]. Skipping implementational details, an RL algorithm is given a set of actions or interventions, say $A = \{a_1, a_2, \ldots, a_n\}$, and must learn, for each state *s* of the environment, an action $a \in A$ that maximizes the expected reward in state *s*. This reward function can be written as $E[Y \mid do(a), s]$, with *Y* the stream of future payoffs received by taking action *a* in state *s*.

Through trial and error training of a neural network, the RL algorithm constructs a functional mapping between each state *s* and the next action to be taken. In the course of this construction, however, the algorithm evaluates a huge number of reward functions of the form $E[Y \mid do(a), s]$, which for any fixed *s* are very similar to the function $Q = E[Y \mid do(X = x)]$.

A question often asked about the RL framework is whether it is equivalent in power to SCM in terms of its ability to predict the effects of interventions.

The answer is a qualified YES. By deploying interventions in the training stage, RL allows us to infer the consequences of those interventions, but ONLY those interventions. It cannot go beyond and predict the effects of actions not tried in training. To do that, a causal model is required [16]. This limitation is equivalent to the one faced by researchers who deny legitimacy to $do(x)$ when *X* is non-manipulable. In the RL context, however, the prohibition extends to manipulable variables as well, in case they were not activated in the training phase.

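A simple example illustrating this point is shown in Fig. 1, which depicts the causal structure of the environment prior to learning. *X* and *Z* are manipulable, but only *Z* was activated in training, which lets us estimate the effects of *Z* on both *Y* and *X*. We now wish to infer the effect of action $do(X = x)$, which was not tried in training.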

Thus, the freedom to manipulate *Z* and estimate its effects on *X* and *Y* enables us to evaluate the effect of an action $do(X = x)$ that was never tried in training.

To see the critical role that causal modeling plays in this exercise, note that the model in Fig. 2(b) does not permit such evaluation by any algorithm whatsoever, a fact verifiable from the model structure [17]. This means that a model-blind RL algorithm would be unable to tell whether the optimal choice of untried actions can be computed from those tried.
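A small numerical sketch may make the exercise concrete. Since the figures are not reproduced here, the graph below is an assumption chosen for illustration: $Z \rightarrow X \rightarrow Y$, with one unobserved factor `U1` shared by *Z* and *X* and another, `U2`, shared by *Z* and *Y*. Under that assumed structure, data gathered while randomizing *Z* suffice to recover the effect of the untried action $do(X = x)$ through $P(y \mid do(x)) = P(x, y \mid do(z)) / P(x \mid do(z))$, whereas the plain conditional $P(y \mid x)$ from passive observation does not. All probabilities and function names are hypothetical.

```python
from itertools import product

P_U1 = 0.5                                                   # P(U1 = 1)
P_U2 = 0.5                                                   # P(U2 = 1)
P_Z1 = {(0, 0): 0.1, (1, 0): 0.5, (0, 1): 0.5, (1, 1): 0.9}  # P(Z=1 | u1, u2)
P_X1 = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.6, (1, 1): 0.9}  # P(X=1 | z, u1)
P_Y1 = {(0, 0): 0.1, (0, 1): 0.6, (1, 0): 0.4, (1, 1): 0.9}  # P(Y=1 | x, u2)


def bern(p_one, value):
    """Probability that a binary variable with P(1) = p_one takes `value`."""
    return p_one if value == 1 else 1.0 - p_one


def joint_zxy(do_z=None):
    """P(z, x, y) with U1, U2 summed out.

    If do_z is given, Z is forced to that value, which cuts its
    dependence on U1 and U2 (the action tried in training).
    """
    joint = {}
    for u1, u2, z, x, y in product((0, 1), repeat=5):
        if do_z is None:
            p_z = bern(P_Z1[(u1, u2)], z)
        else:
            p_z = 1.0 if z == do_z else 0.0
        p = (bern(P_U1, u1) * bern(P_U2, u2) * p_z
             * bern(P_X1[(z, u1)], x) * bern(P_Y1[(x, u2)], y))
        joint[(z, x, y)] = joint.get((z, x, y), 0.0) + p
    return joint


def true_effect(x):
    """Ground truth P(Y=1 | do(X=x)), computed from the full model."""
    return sum(bern(P_U2, u2) * P_Y1[(x, u2)] for u2 in (0, 1))


def effect_from_z_experiment(x, z):
    """P(Y=1 | do(X=x)) inferred as P(x, y=1 | do(z)) / P(x | do(z))."""
    joint = joint_zxy(do_z=z)
    return joint[(z, x, 1)] / (joint[(z, x, 0)] + joint[(z, x, 1)])


def naive_conditional(x):
    """P(Y=1 | X=x) from passive observation, with no intervention at all."""
    joint = joint_zxy()
    numerator = sum(joint[(z, x, 1)] for z in (0, 1))
    denominator = sum(joint[(z, x, y)] for z in (0, 1) for y in (0, 1))
    return numerator / denominator


if __name__ == "__main__":
    for x in (0, 1):
        print(f"x={x}: truth={true_effect(x):.3f}, "
              f"from do(z=1) data={effect_from_z_experiment(x, 1):.3f}, "
              f"naive P(y|x)={naive_conditional(x):.3f}")
```

The justification for the ratio formula comes entirely from the assumed graph; a learner that records only the rewards of tried actions has no basis for deciding whether such a transfer is licensed.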

## 5 Conclusions

We have shown that causal effects associated with non-manipulable variables have empirical semantics along several dimensions. They provide theoretical limits, as well as valuable constraints over causal effects of manipulable variables. They facilitate the derivation of causal effects of manipulable variables and, finally, they can be tested for validity, albeit indirectly.

Doubts and trepidations concerning the effects of non-manipulable variables and their empirical content should give way to appreciating the important roles that these effects play in causal inference.

Turning attention to machine learning, we have shown parallels between estimating the effects of non-manipulable variables and learning the effect of feasible yet untried actions. The role of causal modeling was shown to be critical in both frameworks.

Armed with these clarifications, researchers need not be concerned with the distinction between manipulable and non-manipulable variables, except of course in the design of actual experiments. In the analytical stage, including model specification, identification and estimation, all variables can be treated equally, and are therefore equally eligible to receive the $do(x)$ operator.

**Funding statement:** This research was supported in part by grants from Defense Advanced Research Projects Agency [#W911NF-16-057], National Science Foundation [#IIS-1302448, #IIS-1527490, and #IIS-1704932], and Office of Naval Research [#N00014-17-S-B001].

## Acknowledgment

Discussions with Elias Bareinboim contributed substantially to this paper.

## References

1. Pearl J. Causal diagrams for empirical research. Biometrika. 1995;82:669–710. doi:10.1093/biomet/82.4.669

2. Pearl J. On the consistency rule in causal inference: An axiom, definition, assumption, or a theorem? Epidemiology. 2011;21:872–5. doi:10.1097/EDE.0b013e3181f5d3fd

3. Pearl J. The seven tools of causal reasoning with reflections on machine learning. Commun ACM. 2019;62:54–60. doi:10.1145/3241036

4. Cartwright N. Hunting Causes and Using Them: Approaches in Philosophy and Economics. New York, NY: Cambridge University Press; 2007. doi:10.1017/CBO9780511618758

5. Heckman J, Vytlacil E. Econometric Evaluation of Social Programs, Part I: Causal Models, Structural Models and Econometric Policy Evaluation. In: Handbook of Econometrics. vol. 6B. Amsterdam: Elsevier B.V.; 2007. p. 4779–874. doi:10.1016/S1573-4412(07)06070-9

6. Pearl J. Causality: Models, Reasoning, and Inference. 2nd ed. New York: Cambridge University Press; 2009. doi:10.1017/CBO9780511803161

7. Hernán M. Does water kill? A call for less casual causal inferences. Ann Epidemiol. 2016;26:674–80. doi:10.1016/j.annepidem.2016.08.016

8. Pearl J. Does obesity shorten life? Or is it the soda? On non-manipulable causes. J Causal Inference. Causal, Casual, and Curious Section. 2018;6. doi:10.1515/jci-2018-2001

9. Hernán M, VanderWeele T. Compound treatments and transportability of causal inference. Epidemiology. 2011;22:368–77. doi:10.1097/EDE.0b013e3182109296

10. Pearl J. Physical and metaphysical counterfactuals: Evaluating disjunctive actions. J Causal Inference. Causal, Casual, and Curious Section. 2017;5. doi:10.1515/jci-2017-0018

11. Dawid A. Causal inference without counterfactuals (with comments and rejoinder). J Am Stat Assoc. 2000;95:407–48. doi:10.1080/01621459.2000.10474210

12. Pearl J. Comment on A.P. Dawid’s, Causal inference without counterfactuals. J Am Stat Assoc. 2000;95:428–31. doi:10.2307/2669380

13. Rosenbaum P, Rubin D. The central role of propensity score in observational studies for causal effects. Biometrika. 1983;70:41–55. doi:10.21236/ADA114514

14. Sutton RS, Barto AG. Reinforcement learning: An introduction. Cambridge, MA: MIT press; 1998. doi:10.1109/TNN.1998.712192

15. Szepesvári C. Algorithms for reinforcement learning. San Rafael, CA: Morgan and Claypool; 2010. doi:10.2200/S00268ED1V01Y201005AIM009

16. Zhang J, Bareinboim E. Transfer learning in multi-armed bandits: A causal approach. In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17). Minneapolis, MN. 2017. doi:10.24963/ijcai.2017/186

17. Bareinboim E, Pearl J. Causal inference by surrogate experiments: *z*-identifiability. In: de Freitas N, Murphy K, editors. Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence. Corvallis, OR: AUAI Press; 2012. p. 113–20.

18. Bareinboim E, Pearl J. Causal inference and the data-fusion problem. Proc Natl Acad Sci. 2016;113:7345–52. doi:10.1073/pnas.1510507113

**Published Online:** 2019-02-28

**Published in Print:** 2019-04-26

© 2019 Walter de Gruyter GmbH, Berlin/Boston