# Simple yet sharp sensitivity analysis for unmeasured confounding

Jose M. Peña

## Abstract

We present a method for assessing the sensitivity of the true causal effect to unmeasured confounding. The method requires the analyst to set two intuitive parameters. Otherwise, the method is assumption free. The method returns an interval that contains the true causal effect and whose bounds are arbitrarily sharp, i.e., practically attainable. We show experimentally that our bounds can be tighter than those obtained by the method of Ding and VanderWeele, which, moreover, requires setting one more parameter than our method. Finally, we extend our method to bound the natural direct and indirect effects when there are measured mediators and unmeasured exposure–outcome confounding.

MSC 2010: 62D20

## 1 Introduction

Unmeasured confounding may bias the estimation of the true causal effect. One way to address this problem is through sensitivity analysis, i.e., reporting one or several intervals that include the true causal effect and whose bounds are functions of certain sensitivity parameter values provided by the analyst. These parameters are usually meant to quantify the association of the unmeasured confounders with the exposure and outcome. The previous work [1], hereafter DV, proposed a method for sensitivity analysis that has received considerable attention, as evidenced by the survey work [2]. See also the follow-up works [3,4,5,6]. The last of these shows that DV’s interval bounds are not always sharp or attainable, i.e., logically possible.

In this work, we introduce a new method for sensitivity analysis. Our method requires the analyst to set two sensitivity parameters. This is one parameter less than DV’s method. Otherwise, our method is assumption free. We derive the feasible region for our parameters and show that, unlike DV’s, our interval’s bounds are arbitrarily sharp, i.e., practically attainable. Moreover, we show through simulations that our bounds can be tighter than DV’s. This suggests that it may be advantageous to combine DV’s and our method, by computing both sets of bounds and reporting the tightest of them.

Our sensitivity analysis method includes the parameter-free method proposed in ref. [6], hereafter AS, as a special case. The widest interval that our method can return is actually AS’ interval: It is returned when the analyst chooses the least informative values for our sensitivity parameters. In other words, our bounds are always tighter than AS’. Similar to refs [1,6], we only consider binary outcomes. AS’ bounds coincide with Manski’s bounds for binary outcomes [7]. Our bounds, on the other hand, can be seen as an adaptation of Manski’s bounds for nonbinary outcomes to binary outcomes. We elaborate further on this later.

The rest of the paper is organized as follows. Section 2 presents our method for sensitivity analysis of the risk ratio. Section 3 extends it to the risk difference. Section 4 extends our method to the risk ratio/difference conditioned or averaged over measured covariates. Sections 5 and 6 illustrate our method on real and simulated data. Section 7 considers the case where the effect of the exposure on the outcome is mediated by measured covariates, and our method is adapted to bound the natural direct and indirect effects under exposure–outcome confounding. Finally, Section 8 closes with some discussion.

## 2 Bounds on the risk ratio

Consider the causal graph to the left in Figure 1, where E denotes the exposure, D denotes the outcome, and U denotes the set of unmeasured confounders. Let E and D be binary random variables. For simplicity, we assume that U is a categorical random vector, but our results also hold for ordinal and continuous confounders. Moreover, we treat U as a single categorical random variable whose levels are the Cartesian product of the levels of the components of the original U. We use upper-case letters to denote random variables, and the same letters in lowercase to denote their values.

Figure 1

Causal graph where U is unmeasured.

The causal graph to the left in Figure 1 represents a nonparametric structural equation model with independent errors, which defines a joint probability distribution $p(D, E, U)$. We make the usual positivity assumption that if $p(U = u) > 0$, then $p(E = e \mid U = u) > 0$, i.e., E is not a deterministic function of U, and thus, every individual in the subpopulations defined by the confounder can possibly be exposed or not exposed [8]. Then, the true risk ratio is defined as follows:

$$\mathrm{RR}^{\mathrm{true}} = \frac{p(D_1 = 1)}{p(D_0 = 1)}, \tag{1}$$

where $D_e$ denotes the counterfactual outcome when the exposure is set to level $E = e$. Since there is no confounding besides U, we have that $D_e \perp E \mid U$ for all $e$, and thus, we can write

$$\mathrm{RR}^{\mathrm{true}} = \frac{\sum_u p(D = 1 \mid E = 1, U = u)\, p(U = u)}{\sum_u p(D = 1 \mid E = 0, U = u)\, p(U = u)},$$

using first the law of total probability, then $D_e \perp E \mid U$, and finally, the law of counterfactual consistency, i.e., $E = e \Rightarrow D_e = D$. This quantity cannot be computed from the observed data, though, because U is unmeasured. The observed risk ratio is defined as follows:

$$\mathrm{RR}^{\mathrm{obs}} = \frac{p(D = 1 \mid E = 1)}{p(D = 1 \mid E = 0)}, \tag{2}$$

which is computable. However, $\mathrm{RR}^{\mathrm{true}}$ and $\mathrm{RR}^{\mathrm{obs}}$ do not coincide in general. In this section, we give bounds on $\mathrm{RR}^{\mathrm{true}}$ in terms of the observed data distribution and two sensitivity parameters.

We start by noting that

$$\begin{aligned} p(D_1 = 1) &= p(D_1 = 1 \mid E = 1)\, p(E = 1) + p(D_1 = 1 \mid E = 0)\, p(E = 0) \\ &= p(D = 1 \mid E = 1)\, p(E = 1) + p(D_1 = 1 \mid E = 0)\, p(E = 0), \end{aligned} \tag{3}$$

where the second equality follows from counterfactual consistency, and likewise,

$$p(D_0 = 1) = p(D_0 = 1 \mid E = 1)\, p(E = 1) + p(D = 1 \mid E = 0)\, p(E = 0). \tag{4}$$

If the analyst is able to confidently provide bounds on $p(D_1 = 1 \mid E = 0)$ and $p(D_0 = 1 \mid E = 1)$, then these can be used together with the observed data distribution to bound $\mathrm{RR}^{\mathrm{true}}$ via equations (3) and (4). However, bounding counterfactual probabilities directly may be difficult in some domains; otherwise, the analyst could simply bound equation (1) directly. Therefore, we propose instead to bound them in terms of $p(D \mid E, U)$. Specifically,

$$\begin{aligned} p(D_1 = 1 \mid E = 0) &= \sum_u p(D_1 = 1 \mid E = 0, U = u)\, p(U = u \mid E = 0) \\ &= \sum_u p(D = 1 \mid E = 1, U = u)\, p(U = u \mid E = 0) \\ &\le \max_{e,u} p(D = 1 \mid E = e, U = u), \end{aligned}$$

where the second equality follows from $D_e \perp E \mid U$ for all $e$, and counterfactual consistency. Likewise,

$$p(D_1 = 1 \mid E = 0) \ge \min_{e,u} p(D = 1 \mid E = e, U = u).$$

We believe that bounding the counterfactual probabilities by specifying these maximum and minimum probabilities may be easier in some domains than bounding the counterfactual probabilities directly, e.g., in domains where the identity of the confounders is known but their values are not, or in domains where neither the identity nor the values of the confounders are known but where there is a consensus on conservative estimates of the maximum and minimum probabilities (we elaborate further on conservative estimates later). This is also the motivation behind DV’s method, as it requires the analyst to quantify the relationship between E and U and the relationship between U and D . See Appendix B for a recap of DV’s sensitivity analysis.

Now, let us define

$$M = \max_{e,u} p(D = 1 \mid E = e, U = u),$$

and

$$m = \min_{e,u} p(D = 1 \mid E = e, U = u).$$

Then,

$$p(D = 1, E = 1) + p(E = 0)\, m \le p(D_1 = 1) \le p(D = 1, E = 1) + p(E = 0)\, M, \tag{5}$$

and

$$p(D = 1, E = 0) + p(E = 1)\, m \le p(D_0 = 1) \le p(D = 1, E = 0) + p(E = 1)\, M. \tag{6}$$

Therefore, combining equations (1), (5), and (6), we have that

$$\mathrm{LB} \le \mathrm{RR}^{\mathrm{true}} \le \mathrm{UB}, \tag{7}$$

where

$$\mathrm{LB} = \frac{p(D = 1, E = 1) + p(E = 0)\, m}{p(D = 1, E = 0) + p(E = 1)\, M},$$

and

$$\mathrm{UB} = \frac{p(D = 1, E = 1) + p(E = 0)\, M}{p(D = 1, E = 0) + p(E = 1)\, m}.$$

M and m are two sensitivity parameters whose values the analyst has to specify. By definition, these values must lie in the interval $[0, 1]$ and $M \ge m$. The observed data distribution constrains the valid values further. To see this, note that

$$p(D = 1 \mid E = e) = \sum_u p(D = 1 \mid E = e, U = u)\, p(U = u \mid E = e) \le M,$$

for all $e$, and likewise,

$$p(D = 1 \mid E = e) \ge m.$$

Let us define

$$M^* = \max_e p(D = 1 \mid E = e),$$

and

$$m^* = \min_e p(D = 1 \mid E = e).$$

Then,

$$M \ge M^*,$$

and

$$m \le m^*.$$

We can thus define the feasible region for M and m as $M^* \le M \le 1$ and $0 \le m \le m^*$.
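The bounds in equation (7) and the feasible region can be sketched in a few lines of Python. This is only an illustrative sketch, not the R code accompanying the paper: the function names `feasible_region` and `rr_bounds`, the tolerance `tol`, and the input values are assumptions made for the example. It also assumes $0 < p(E = 1) < 1$ and that the outcome is not impossible, so that the denominators involved are nonzero.

```python
def feasible_region(p_d1_e1, p_d1_e0, p_e1):
    """Return (M_star, m_star), the max and min over e of p(D=1 | E=e)."""
    p_e0 = 1.0 - p_e1
    risk_exposed = p_d1_e1 / p_e1      # p(D=1 | E=1)
    risk_unexposed = p_d1_e0 / p_e0    # p(D=1 | E=0)
    return max(risk_exposed, risk_unexposed), min(risk_exposed, risk_unexposed)


def rr_bounds(p_d1_e1, p_d1_e0, p_e1, M, m, tol=1e-9):
    """Return (LB, UB) for the true risk ratio, following equation (7).

    p_d1_e1 = p(D=1, E=1), p_d1_e0 = p(D=1, E=0), p_e1 = p(E=1).
    """
    M_star, m_star = feasible_region(p_d1_e1, p_d1_e0, p_e1)
    # Reject parameter values outside M* <= M <= 1 and 0 <= m <= m*.
    if not (M_star - tol <= M <= 1.0 and 0.0 <= m <= m_star + tol):
        raise ValueError("M and m must satisfy M* <= M <= 1 and 0 <= m <= m*")
    p_e0 = 1.0 - p_e1
    lb = (p_d1_e1 + p_e0 * m) / (p_d1_e0 + p_e1 * M)
    ub_den = p_d1_e0 + p_e1 * m
    # The upper bound is unbounded if its denominator vanishes.
    ub = (p_d1_e1 + p_e0 * M) / ub_den if ub_den > 0 else float("inf")
    return lb, ub


# Hypothetical inputs: p(E=1) = 0.4, p(D=1|E=1) = 0.12, p(D=1|E=0) = 0.10.
print(rr_bounds(0.048, 0.06, 0.4, M=0.34, m=0.05))
```

Passing the joint probabilities $p(D = 1, E = e)$ rather than the conditional ones mirrors the way LB and UB are written above; the feasibility check is a convenience and can be dropped if the analyst validates M and m elsewhere.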

We close this section with some observations about the bounds LB and UB. Theorem 1 in Appendix A shows that the bounds are arbitrarily sharp, meaning that there is a distribution arbitrarily close to the observed data distribution (and, thus, indistinguishable in practice on the basis of a finite sample) for which $\mathrm{RR}^{\mathrm{true}}$ and LB or UB are arbitrarily close. So, the bounds are arbitrarily close to being logically possible. Recall that DV’s bounds are not always logically possible [6]. Note also that

$$\mathrm{LB} \le \frac{p(E = 1)\, M + p(E = 0)\, m}{p(E = 0)\, m + p(E = 1)\, M} = 1,$$

and likewise, $\mathrm{UB} \ge 1$. Thus, our interval in equation (7) always includes the null causal effect $\mathrm{RR}^{\mathrm{true}} = 1$. Since our bounds are arbitrarily sharp, the null causal effect is practically attainable whenever our lower or upper bound equals 1. DV’s interval does not necessarily include the null causal effect. However, when DV’s lower or upper bound equals 1, the null causal effect is also attainable [6]. The fact that our interval always includes the null causal effect and DV’s may not does not mean that DV’s bounds are always closer to $\mathrm{RR}^{\mathrm{true}}$, as the experiments in Section 6 show.

DV’s method requires the analyst to describe the relationship between E and U with two sensitivity parameters and the relationship between U and D with one parameter, whereas our method requires the analyst to describe only the relationship between U and D with two parameters. Therefore, our method has one parameter less than DV’s method. As a consequence of not describing the relationship between E and U, our interval always includes the null causal effect. In other words, the undescribed relationship between E and U may be so strong as to nullify the causal effect, i.e., explain away the observed association between E and D. If we define the variation in $p(D = 1 \mid E, U)$ as $M - m$, then the pair of values $M = M^*$ and $m = m^*$ can be interpreted as the minimum variation in $p(D = 1 \mid E, U)$ that is needed to nullify the causal effect, regardless of the relationship between E and U. This resembles the interpretation of DV’s E-value, which is precisely defined as the minimum values of DV’s parameters that nullify the causal effect. See Appendix B for a recap of DV’s sensitivity analysis. There is, however, a major difference between the two interpretations. DV’s parameter values that are smaller than the E-value are insufficient to nullify the causal effect. There is no analogous result for our method, since we cannot consider less variation in $p(D = 1 \mid E, U)$ than $M^* - m^*$. In other words, no parameter values are insufficient to nullify the causal effect because our interval always includes the null causal effect in order to be valid regardless of the relationship between E and U.

Note that LB is decreasing in M and increasing in m, while the opposite is true for UB. Therefore, using conservative estimates of M and m (i.e., a value larger than the true M value and a value smaller than the true m value) results in a wider interval that still contains $\mathrm{RR}^{\mathrm{true}}$. Note also that AS’ bounds are a special case of our bounds when $M = 1$ and $m = 0$. See Appendix C for a recap of AS’ sensitivity analysis. Therefore, our bounds are always at least as tight as AS’, because our interval is widest when $M = 1$ and $m = 0$.

Note also that if $\mathrm{RR}^{\mathrm{obs}} \ge 1$, then $M^* = p(D = 1 \mid E = 1)$ and $m^* = p(D = 1 \mid E = 0)$, and thus, $\mathrm{LB} = 1$ when we set $M = M^*$ and $m = m^*$. Likewise, $\mathrm{UB} = 1$ if $\mathrm{RR}^{\mathrm{obs}} \le 1$ and we set $M = M^*$ and $m = m^*$.
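The last observation can be checked by direct substitution. The following worked derivation is a sketch that assumes $\mathrm{RR}^{\mathrm{obs}} \ge 1$ and uses only the definitions above; it additionally shows that the upper bound then equals the observed risk ratio, which is not stated explicitly in the text but agrees with the first cell of Table 1.

$$\begin{aligned} \mathrm{LB} &= \frac{p(D = 1, E = 1) + p(E = 0)\, p(D = 1 \mid E = 0)}{p(D = 1, E = 0) + p(E = 1)\, p(D = 1 \mid E = 1)} = \frac{p(D = 1, E = 1) + p(D = 1, E = 0)}{p(D = 1, E = 0) + p(D = 1, E = 1)} = 1, \\ \mathrm{UB} &= \frac{p(D = 1, E = 1) + p(E = 0)\, p(D = 1 \mid E = 1)}{p(D = 1, E = 0) + p(E = 1)\, p(D = 1 \mid E = 0)} = \frac{p(D = 1 \mid E = 1)\, [p(E = 1) + p(E = 0)]}{p(D = 1 \mid E = 0)\, [p(E = 0) + p(E = 1)]} = \mathrm{RR}^{\mathrm{obs}}. \end{aligned}$$

The case $\mathrm{RR}^{\mathrm{obs}} \le 1$ is symmetric, with the roles of LB and UB exchanged.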

Finally, our method requires specifying two probabilities, whereas DV’s method requires specifying three probability ratios. Which set of parameter values the analyst finds easier to specify may well depend on the domain under study. So, we will not argue in favor of either of them. However, we do want to describe a hypothetical scenario where setting our parameters may be easier. Let the causal graph to the left in Figure 1 model the effect of exercise (E) on cholesterol (D) when confounded by junior vs. senior age (U). The three random variables are binary, and U is unmeasured. Suppose that, although the exact probabilities are unknown, it is known that $p(D = 1 \mid E = 1, U = u) < p(D = 1 \mid E = 0, U = u)$ for $u \in \{0, 1\}$, and $p(D = 1 \mid E = e, U = 1) > p(D = 1 \mid E = e, U = 0)$ for $e \in \{0, 1\}$. In other words, exercise decreases the probability of high cholesterol among both juniors and seniors, and seniority increases the probability of high cholesterol among both those who exercise and those who do not. That is, these relationships show no qualitative effect modification by one covariate when keeping the other fixed. The work [9] argues that such relationships are common in epidemiology. Then, our sensitivity parameters simplify to $M = p(D = 1 \mid E = 0, U = 1)$ and $m = p(D = 1 \mid E = 1, U = 0)$, whereas DV’s parameter $\mathrm{RR}_{UD}$ reduces to

$$\mathrm{RR}_{UD} = \max \left\{ \frac{p(D = 1 \mid E = 0, U = 1)}{p(D = 1 \mid E = 0, U = 0)}, \frac{p(D = 1 \mid E = 1, U = 1)}{p(D = 1 \mid E = 1, U = 0)} \right\}.$$

Suppose that most juniors exercise, whereas most seniors do not. Then, the analyst may find it easier to set our parameters than DV’s, since the latter involve speculating about the rare cases of seniors who exercise and juniors who do not.

## 3 Bounds on the risk difference

The previous works [1,6] show that their bounds on the risk ratio can be adapted to bound the risk difference. Ours can also be adapted, as we show next. The true risk difference is defined as follows:

$$\mathrm{RD}^{\mathrm{true}} = p(D_1 = 1) - p(D_0 = 1) = \sum_u p(D = 1 \mid E = 1, U = u)\, p(U = u) - \sum_u p(D = 1 \mid E = 0, U = u)\, p(U = u).$$

Therefore, combining equations (5) and (6), we have that

$$\mathrm{LB} \le \mathrm{RD}^{\mathrm{true}} \le \mathrm{UB}, \tag{8}$$

with

$$\mathrm{LB} = p(D = 1, E = 1) + p(E = 0)\, m - p(D = 1, E = 0) - p(E = 1)\, M,$$

and

$$\mathrm{UB} = p(D = 1, E = 1) + p(E = 0)\, M - p(D = 1, E = 0) - p(E = 1)\, m.$$

Theorem 2 in Appendix A shows that our bounds for $\mathrm{RD}^{\mathrm{true}}$ are arbitrarily sharp. Finally, see Appendix D for an account of the relationship between our bounds and Manski’s bounds [7].
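The risk difference bounds in equation (8) admit an equally short sketch. As before, the function name `rd_bounds` and the numbers are hypothetical and chosen only for illustration; feasibility of M and m is assumed to have been checked by the caller.

```python
def rd_bounds(p_d1_e1, p_d1_e0, p_e1, M, m):
    """Return (LB, UB) for the true risk difference, following equation (8).

    p_d1_e1 = p(D=1, E=1), p_d1_e0 = p(D=1, E=0), p_e1 = p(E=1).
    """
    p_e0 = 1.0 - p_e1
    lb = p_d1_e1 + p_e0 * m - p_d1_e0 - p_e1 * M
    ub = p_d1_e1 + p_e0 * M - p_d1_e0 - p_e1 * m
    return lb, ub


# Hypothetical inputs: p(E=1) = 0.4, p(D=1|E=1) = 0.12, p(D=1|E=0) = 0.10.
lb, ub = rd_bounds(0.048, 0.06, 0.4, M=0.34, m=0.05)
# Subtracting the two expressions shows that the width is always M - m.
print(lb, ub, ub - lb)
```

Since the interval width reduces algebraically to $M - m$, the least informative choice $M = 1$ and $m = 0$ gives an interval of width 1.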

## 4 Conditioning and averaging over measured covariates

So far, our results have concerned the whole population. However, they also hold for the subpopulation C = c , where C is a set of measured covariates, provided that the causal graph to the left in Figure 1 is valid in that subpopulation. Note that U previously represented all the confounders between E and D , while it now represents all the confounders for the subpopulation C = c . To adapt our results to the subpopulation C = c , it suffices to condition on C = c in all the previous expressions. For instance, the true risk ratio in the subpopulation C = c is defined as follows:

$$\mathrm{RR}^{\mathrm{true}}_c = \frac{p(D_1 = 1 \mid C = c)}{p(D_0 = 1 \mid C = c)}. \tag{9}$$

Arguing as before, we have that

$$\mathrm{LB}_c \le \mathrm{RR}^{\mathrm{true}}_c \le \mathrm{UB}_c,$$

where

$$\mathrm{LB}_c = \frac{p(D = 1, E = 1 \mid C = c) + p(E = 0 \mid C = c)\, m_c}{p(D = 1, E = 0 \mid C = c) + p(E = 1 \mid C = c)\, M_c},$$

and

$$\mathrm{UB}_c = \frac{p(D = 1, E = 1 \mid C = c) + p(E = 0 \mid C = c)\, M_c}{p(D = 1, E = 0 \mid C = c) + p(E = 1 \mid C = c)\, m_c},$$

with sensitivity parameters

$$M_c = \max_{e,u} p(D = 1 \mid E = e, U = u, C = c),$$

and

$$m_c = \min_{e,u} p(D = 1 \mid E = e, U = u, C = c),$$

and feasible region $M^*_c \le M_c \le 1$ and $0 \le m_c \le m^*_c$, where

$$M^*_c = \max_e p(D = 1 \mid E = e, C = c),$$

and

$$m^*_c = \min_e p(D = 1 \mid E = e, C = c).$$

Finally, we show that $\mathrm{RR}^{\mathrm{true}}$ can be bounded as follows:

$$\min_c \mathrm{LB}_c \le \mathrm{RR}^{\mathrm{true}} \le \max_c \mathrm{UB}_c \tag{10}$$

by averaging over C in the numerator and denominator of equation (9). Specifically, assume for simplicity that C is categorical. Then,

$$\mathrm{RR}^{\mathrm{true}} = \frac{p(D_1 = 1)}{p(D_0 = 1)} = \frac{\sum_c p(D_1 = 1 \mid C = c)\, p(C = c)}{\sum_c p(D_0 = 1 \mid C = c)\, p(C = c)} = \frac{\sum_c \mathrm{RR}^{\mathrm{true}}_c\, p(D_0 = 1 \mid C = c)\, p(C = c)}{\sum_c p(D_0 = 1 \mid C = c)\, p(C = c)},$$

which implies the desired result, since the last expression is a weighted average of the $\mathrm{RR}^{\mathrm{true}}_c$. Analogous results can be derived for the true risk difference. We omit the details. These derivations have previously been reported for DV’s bounds [1, eAppendix 2.5]. We include them here for completeness. Which of the bounds in equations (7) and (10) is tightest depends on the sensitivity parameter values chosen. Of course, the analyst has to set more parameters in the latter case. In some domains, it may be reasonable to assume that some parameters are approximately constant across subpopulations.
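Equation (10) can be illustrated with a small sketch that computes the stratum-specific bounds and then takes their minimum and maximum. The function names, the dictionary layout, and the two strata below are hypothetical; note that the stratum weights $p(C = c)$ are not needed for this bound.

```python
def stratum_rr_bounds(p_d1_e1, p_d1_e0, p_e1, M_c, m_c):
    """Return (LB_c, UB_c); all probabilities are conditional on C = c."""
    p_e0 = 1.0 - p_e1
    lb = (p_d1_e1 + p_e0 * m_c) / (p_d1_e0 + p_e1 * M_c)
    ub = (p_d1_e1 + p_e0 * M_c) / (p_d1_e0 + p_e1 * m_c)
    return lb, ub


def averaged_rr_bounds(strata):
    """Bound the population risk ratio by min_c LB_c and max_c UB_c."""
    per_stratum = [stratum_rr_bounds(**s) for s in strata]
    return (min(lb for lb, _ in per_stratum),
            max(ub for _, ub in per_stratum))


# Two hypothetical strata, each with its own sensitivity parameters.
strata = [
    dict(p_d1_e1=0.048, p_d1_e0=0.06, p_e1=0.4, M_c=0.30, m_c=0.05),
    dict(p_d1_e1=0.10, p_d1_e0=0.05, p_e1=0.5, M_c=0.40, m_c=0.05),
]
print(averaged_rr_bounds(strata))
```

If some parameters are assumed constant across subpopulations, the same `M_c` or `m_c` value can simply be repeated in every stratum.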

## 5 Real data example

In this section, we illustrate our method for sensitivity analysis on the real data provided by ref. [10]. This work studied the association between smoking and mortality. The works [1,6] also used these data to illustrate their methods. Specifically, we use the same data as ref. [6], which correspond to the association between smoking and total mortality, and for which $\mathrm{RR}^{\mathrm{obs}} = 1.28$. See ref. [6] for a detailed description of the data. We extend the R code provided in ref. [6] with our method. The resulting code is available at https://www.dropbox.com/s/4rxfux9tt95ldjz/sensitivityAnalysis3.R?dl=0.

Table 1 displays our interval for different M and m values in the feasible region $0.12 = M^* \le M \le 1$ and $0 \le m \le m^* = 0.1$. Figure 2 complements the table with the contour plot of LB as a function of M and m. A similar plot can be produced for UB. An analyst can use the table and plot to determine a lower and/or upper bound for $\mathrm{RR}^{\mathrm{true}}$, given plausible values of M and m. The table and plot illustrate some of the observations made before. Specifically, the null causal effect $\mathrm{RR}^{\mathrm{true}} = 1$ is included in all the intervals. The lower bound of the intervals is decreasing in M and increasing in m, whereas the opposite is true for the upper bound. The narrowest interval is achieved when $M = M^*$ and $m = m^*$, and the widest when $M = 1$ and $m = 0$. The lower bound of the narrowest interval is 1, because $\mathrm{RR}^{\mathrm{obs}} \ge 1$. Moreover, all the bounds in the table and plot are arbitrarily sharp (see Theorem 1 in Appendix A).

Table 1

Intervals for different values of the sensitivity parameters M and m in the feasible region $M^* \le M \le 1$ and $0 \le m \le m^*$. Recall that $M = \max_{e,u} p(D = 1 \mid E = e, U = u)$, $m = \min_{e,u} p(D = 1 \mid E = e, U = u)$, $M^* = \max_e p(D = 1 \mid E = e)$, and $m^* = \min_e p(D = 1 \mid E = e)$. In this case, $M^* = 0.12$ and $m^* = 0.1$.

| $m$ (rows) / $M$ (columns) | 0.12 | 0.34 | 0.56 | 0.78 | 1 |
|---|---|---|---|---|---|
| 0.1 | (1.00, 1.28) | (0.41, 1.76) | (0.26, 2.25) | (0.19, 2.73) | (0.15, 3.22) |
| 0.07 | (0.96, 1.59) | (0.39, 2.19) | (0.25, 2.79) | (0.18, 3.40) | (0.14, 4.00) |
| 0.05 | (0.91, 2.10) | (0.37, 2.89) | (0.23, 3.69) | (0.17, 4.49) | (0.13, 5.29) |
| 0.02 | (0.87, 3.09) | (0.35, 4.27) | (0.22, 5.45) | (0.16, 6.63) | (0.13, 7.80) |
| 0 | (0.82, 5.90) | (0.34, 8.15) | (0.21, 10.4) | (0.15, 12.6) | (0.12, 14.9) |

Figure 2

Contour plot of the lower bound LB as a function of the sensitivity parameters M and m in the feasible region $M^* \le M \le 1$ and $0 \le m \le m^*$. Recall that $M = \max_{e,u} p(D = 1 \mid E = e, U = u)$, $m = \min_{e,u} p(D = 1 \mid E = e, U = u)$, $M^* = \max_e p(D = 1 \mid E = e)$, and $m^* = \min_e p(D = 1 \mid E = e)$. In this case, $M^* = 0.12$ and $m^* = 0.1$.

## 6 Simulated experiments

The previous work [6] compares DV’s and AS’ bounds through simulations. In this section, we add our bounds to the comparison by extending the R code provided in ref. [6]. The resulting code is available at https://www.dropbox.com/s/4rxfux9tt95ldjz/sensitivityAnalysis3.R?dl=0. Accordingly, we follow ref. [6] and consider a single binary confounder U, and generate distributions $p(D, E, U)$ from the model

$$\begin{aligned} p(E = 1) &= \operatorname{expit}(\phi), \\ p(U = 1 \mid E) &= \operatorname{expit}(\alpha + \beta E), \\ p(D = 1 \mid E, U) &= \operatorname{expit}(\gamma + \delta E + \psi U), \end{aligned}$$

where $\operatorname{expit}(x) = 1 / (1 + \exp(-x))$ is the inverse logit (a.k.a. logistic) function, and $\{\beta, \delta, \psi\}$ are independently distributed as $N(0, \sigma^2)$. We consider $\sigma = 1, 3$ in the experiments. Note that $\sigma$ determines the probability of having confounding and causal effects of large magnitude. When this occurs, the sensitivity parameters may take large values, which results in wide intervals. In the experiments, the parameters $\{\phi, \alpha, \gamma\}$ are set to obtain certain marginal probabilities $\{p(U = 1), p(E = 1), p(D = 1)\}$ specified below. For each combination of parameters, we generate 1,000 distributions $p(D, E, U)$ from the aforementioned model.
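One draw from this model, together with the resulting true risk ratio and our bounds at the true values of M and m, might be sketched as follows. This is not the R code referenced above: the function name `simulate_once` and its return format are hypothetical, and for brevity the intercepts `phi`, `alpha`, and `gamma` are passed in directly rather than calibrated to target marginal probabilities as in the experiments.

```python
import math
import random


def expit(x):
    """Inverse logit function."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_once(phi, alpha, gamma, sigma, rng):
    """Draw one distribution p(D, E, U) and return the true RR and bounds."""
    beta, delta, psi = (rng.gauss(0.0, sigma) for _ in range(3))
    p_e = {1: expit(phi), 0: 1.0 - expit(phi)}
    # p_u_e[e][u] = p(U=u | E=e); p_d[(e, u)] = p(D=1 | E=e, U=u).
    p_u_e = {e: {1: expit(alpha + beta * e), 0: 1.0 - expit(alpha + beta * e)}
             for e in (0, 1)}
    p_d = {(e, u): expit(gamma + delta * e + psi * u)
           for e in (0, 1) for u in (0, 1)}
    # Marginal p(U=u), then counterfactual risks p(D_e=1) by standardization.
    p_u = {u: sum(p_e[e] * p_u_e[e][u] for e in (0, 1)) for u in (0, 1)}
    risk = {e: sum(p_d[(e, u)] * p_u[u] for u in (0, 1)) for e in (0, 1)}
    # Observed joint probabilities p(D=1, E=e).
    p_d1_e = {e: p_e[e] * sum(p_d[(e, u)] * p_u_e[e][u] for u in (0, 1))
              for e in (0, 1)}
    # True sensitivity parameters, then the bounds of equation (7).
    M, m = max(p_d.values()), min(p_d.values())
    lb = (p_d1_e[1] + p_e[0] * m) / (p_d1_e[0] + p_e[1] * M)
    ub = (p_d1_e[1] + p_e[0] * M) / (p_d1_e[0] + p_e[1] * m)
    return {"rr_true": risk[1] / risk[0], "lb": lb, "ub": ub}


rng = random.Random(0)
print(simulate_once(phi=-1.0, alpha=-1.0, gamma=-2.0, sigma=3.0, rng=rng))
```

Repeating the call many times and comparing `lb` and `ub` against `rr_true` on the log scale would give quantities analogous to the mean absolute distances reported in the tables, although the numbers would not match those of the paper because the intercepts are not calibrated here.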

Tables 2 and 3 summarize the results. Our bounds are more conservative than DV’s but less so than AS’, as can be seen in the columns $\tilde{\Delta}$, $\bar{\Delta}$, and $\Delta$. DV’s bounds are usually tighter than AS’ and ours, as can be seen in the columns $\tilde{p}$ and $\bar{p}$. However, our bounds are tighter than DV’s in a substantial fraction of the runs for some settings, e.g., see $\bar{p}$ for the upper bound with $\sigma = 3$. We do not compare AS’ and our bounds directly because, as discussed earlier, our bounds are always at least as tight as AS’. A plausible explanation of why DV’s bounds are usually tighter than ours is that the former include information about the association between E and U through one of the sensitivity parameters, while the latter do not. A plausible explanation of why our bounds are sometimes tighter than DV’s is the following: When the confounding and causal effects are large in magnitude (something that is more likely to occur with $\sigma = 3$ than with $\sigma = 1$), DV’s parameters may take large values and, thus, DV’s intervals may be too wide, since DV’s bounds are not always sharp. This is less of a problem for our intervals as they cannot be arbitrarily wide, because our bounds are arbitrarily sharp. Therefore, if possible, it may be advantageous to compute both DV’s and our bounds and report the tightest of them.

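Table 2

Simulation results with $\sigma = 1$. $\tilde{p}$ and $\bar{p}$ are the proportions of times that AS’ bounds and our bounds are tighter than DV’s bounds, respectively. $\tilde{\Delta}$, $\bar{\Delta}$, and $\Delta$ are the mean absolute distance between the log of the bound and the log of the true risk ratio for AS’ bounds, our bounds, and DV’s bounds, respectively. Columns prefixed LB and UB refer to the lower and upper bound, respectively.

| $p(U=1)$ | $p(E=1)$ | $p(D=1)$ | LB $\tilde{p}$ | LB $\bar{p}$ | LB $\tilde{\Delta}$ | LB $\bar{\Delta}$ | LB $\Delta$ | UB $\tilde{p}$ | UB $\bar{p}$ | UB $\tilde{\Delta}$ | UB $\bar{\Delta}$ | UB $\Delta$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 0.05 | 0.05 | 0 | 0.10 | 3.67 | 0.79 | 0.15 | 0 | 0.16 | 3.04 | 0.71 | 0.17 |
| 0.05 | 0.05 | 0.20 | 0 | 0.12 | 3.18 | 0.62 | 0.13 | 0 | 0.16 | 1.71 | 0.59 | 0.15 |
| 0.05 | 0.20 | 0.05 | 0 | 0.10 | 3.23 | 0.73 | 0.13 | 0 | 0.15 | 3.10 | 0.72 | 0.17 |
| 0.05 | 0.20 | 0.20 | 0 | 0.12 | 2.22 | 0.59 | 0.13 | 0 | 0.14 | 1.76 | 0.58 | 0.13 |
| 0.20 | 0.05 | 0.05 | 0 | 0.06 | 3.68 | 0.78 | 0.14 | 0 | 0.13 | 3.05 | 0.67 | 0.17 |
| 0.20 | 0.05 | 0.20 | 0 | 0.08 | 3.18 | 0.65 | 0.12 | 0 | 0.15 | 1.71 | 0.59 | 0.16 |
| 0.20 | 0.20 | 0.05 | 0 | 0.04 | 3.26 | 0.77 | 0.14 | 0 | 0.10 | 3.07 | 0.69 | 0.17 |
| 0.20 | 0.20 | 0.20 | 0 | 0.06 | 2.23 | 0.63 | 0.13 | 0 | 0.14 | 1.71 | 0.55 | 0.14 |

Table 3

Simulation results with $\sigma = 3$. $\tilde{p}$ and $\bar{p}$ are the proportions of times that AS’ bounds and our bounds are tighter than DV’s bounds, respectively. $\tilde{\Delta}$, $\bar{\Delta}$, and $\Delta$ are the mean absolute distance between the log of the bound and the log of the true risk ratio for AS’ bounds, our bounds, and DV’s bounds, respectively. Columns prefixed LB and UB refer to the lower and upper bound, respectively.

| $p(U=1)$ | $p(E=1)$ | $p(D=1)$ | LB $\tilde{p}$ | LB $\bar{p}$ | LB $\tilde{\Delta}$ | LB $\bar{\Delta}$ | LB $\Delta$ | UB $\tilde{p}$ | UB $\bar{p}$ | UB $\tilde{\Delta}$ | UB $\bar{\Delta}$ | UB $\Delta$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 0.05 | 0.05 | 0.00 | 0.14 | 3.78 | 1.97 | 0.59 | 0.13 | 0.33 | 3.39 | 1.78 | 0.82 |
| 0.05 | 0.05 | 0.20 | 0.01 | 0.17 | 3.18 | 1.43 | 0.50 | 0.21 | 0.36 | 2.16 | 1.44 | 0.70 |
| 0.05 | 0.20 | 0.05 | 0.01 | 0.18 | 3.60 | 1.87 | 0.50 | 0.05 | 0.24 | 3.61 | 1.88 | 0.77 |
| 0.05 | 0.20 | 0.20 | 0.05 | 0.22 | 2.35 | 1.34 | 0.56 | 0.12 | 0.25 | 2.24 | 1.46 | 0.58 |
| 0.20 | 0.05 | 0.05 | 0.00 | 0.11 | 3.93 | 2.07 | 0.59 | 0.12 | 0.34 | 3.44 | 1.67 | 0.78 |
| 0.20 | 0.05 | 0.20 | 0.00 | 0.13 | 3.25 | 1.60 | 0.49 | 0.25 | 0.40 | 2.19 | 1.47 | 0.78 |
| 0.20 | 0.20 | 0.05 | 0.00 | 0.12 | 3.81 | 2.06 | 0.59 | 0.05 | 0.28 | 3.57 | 1.78 | 0.76 |
| 0.20 | 0.20 | 0.20 | 0.03 | 0.17 | 2.42 | 1.49 | 0.56 | 0.19 | 0.38 | 2.13 | 1.40 | 0.72 |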

The experiments mentioned earlier assume that the analyst knows the true sensitivity parameter values, which is rarely the case. More realistic experiments make use of parameter values that are more conservative than the true values. Following ref. [6], we repeat the aforementioned experiments using DV’s parameter values that are 15% larger than the true values. Likewise, we use M and m values that are, respectively, 15% larger and 15% smaller than the true values. As discussed in Section 2, this should make our bounds more conservative. Likewise for DV’s bounds [6]. Tables 4 and 5 report exactly how much more conservative the bounds become. Specifically, the columns $\bar{\Delta}$ and $\Delta$ show that the bounds are slightly more conservative than before but not much, which leads us to conclude that neither DV’s nor our bounds are overly sensitive to conservative estimates of the parameters. Still, our bounds are tighter than DV’s in a considerable fraction of the runs, as shown in the column $\bar{p}$ in Tables 4 and 5. That $\bar{p}$ is slightly smaller in these tables than in Tables 2 and 3 can arguably be attributed to the experimental setting being advantageous for DV’s bounds. Our argument is as follows. One of DV’s sensitivity parameters (see Appendix B) is expressed as follows:

$$\mathrm{RR}_{UD} = \max_e \frac{\max_u p(D = 1 \mid E = e, U = u)}{\min_u p(D = 1 \mid E = e, U = u)}.$$

Then, $\mathrm{RR}_{UD} \le M / m$. Consider those simulations where $\mathrm{RR}_{UD} = M / m$. In those simulations, M and m are replaced by the conservative estimates $1.15 M$ and $0.85 m$ to compute our bounds. So, $M / m$ corresponds to $\frac{1.15}{0.85} \frac{M}{m}$ in those simulations. Thus, one may argue that $\mathrm{RR}_{UD}$ should be replaced by $\frac{1.15}{0.85} \mathrm{RR}_{UD}$ to compute DV’s bounds in those simulations. However, it is replaced by the less conservative $1.15\, \mathrm{RR}_{UD}$. Alternatively, one may argue that using $1.15\, \mathrm{RR}_{UD}$ in those simulations corresponds to using $1.15 M$ and $m$, instead of the more conservative $0.85 m$.
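To make the size of this asymmetry concrete, the two inflation factors can be compared numerically. This is plain arithmetic on the percentages stated above:

$$\frac{1.15}{0.85} \approx 1.353 > 1.15,$$

so, in the simulations where $\mathrm{RR}_{UD} = M / m$, the ratio underlying our bounds is inflated by roughly 35%, whereas DV’s parameter is inflated by only 15%.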

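Table 4

Simulation results with $\sigma = 1$, and parameter values that are 15% more conservative than the true values. $\tilde{p}$ and $\bar{p}$ are the proportions of times that AS’ bounds and our bounds are tighter than DV’s bounds, respectively. $\tilde{\Delta}$, $\bar{\Delta}$, and $\Delta$ are the mean absolute distance between the log of the bound and the log of the true risk ratio for AS’ bounds, our bounds, and DV’s bounds, respectively. Columns prefixed LB and UB refer to the lower and upper bound, respectively.

| $p(U=1)$ | $p(E=1)$ | $p(D=1)$ | LB $\tilde{p}$ | LB $\bar{p}$ | LB $\tilde{\Delta}$ | LB $\bar{\Delta}$ | LB $\Delta$ | UB $\tilde{p}$ | UB $\bar{p}$ | UB $\tilde{\Delta}$ | UB $\bar{\Delta}$ | UB $\Delta$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 0.05 | 0.05 | 0 | 0.07 | 3.67 | 0.95 | 0.23 | 0.00 | 0.13 | 3.04 | 0.85 | 0.26 |
| 0.05 | 0.05 | 0.20 | 0 | 0.08 | 3.18 | 0.78 | 0.21 | 0.01 | 0.12 | 1.71 | 0.73 | 0.23 |
| 0.05 | 0.20 | 0.05 | 0 | 0.06 | 3.23 | 0.88 | 0.21 | 0.00 | 0.12 | 3.10 | 0.86 | 0.26 |
| 0.05 | 0.20 | 0.20 | 0 | 0.08 | 2.22 | 0.74 | 0.21 | 0.00 | 0.10 | 1.76 | 0.72 | 0.21 |
| 0.20 | 0.05 | 0.05 | 0 | 0.00 | 3.68 | 0.94 | 0.22 | 0.00 | 0.10 | 3.05 | 0.81 | 0.26 |
| 0.20 | 0.05 | 0.20 | 0 | 0.01 | 3.18 | 0.80 | 0.20 | 0.01 | 0.10 | 1.71 | 0.73 | 0.24 |
| 0.20 | 0.20 | 0.05 | 0 | 0.01 | 3.26 | 0.92 | 0.22 | 0.00 | 0.08 | 3.07 | 0.83 | 0.25 |
| 0.20 | 0.20 | 0.20 | 0 | 0.01 | 2.23 | 0.78 | 0.21 | 0.00 | 0.09 | 1.71 | 0.69 | 0.23 |

Table 5

Simulation results with $\sigma = 3$, and parameter values that are 15% more conservative than the true values. $\tilde{p}$ and $\bar{p}$ are the proportions of times that AS’ bounds and our bounds are tighter than DV’s bounds, respectively. $\tilde{\Delta}$, $\bar{\Delta}$, and $\Delta$ are the mean absolute distance between the log of the bound and the log of the true risk ratio for AS’ bounds, our bounds, and DV’s bounds, respectively. Columns prefixed LB and UB refer to the lower and upper bound, respectively.

| $p(U=1)$ | $p(E=1)$ | $p(D=1)$ | LB $\tilde{p}$ | LB $\bar{p}$ | LB $\tilde{\Delta}$ | LB $\bar{\Delta}$ | LB $\Delta$ | UB $\tilde{p}$ | UB $\bar{p}$ | UB $\tilde{\Delta}$ | UB $\bar{\Delta}$ | UB $\Delta$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 0.05 | 0.05 | 0.01 | 0.14 | 3.78 | 2.10 | 0.71 | 0.14 | 0.30 | 3.39 | 1.91 | 0.94 |
| 0.05 | 0.05 | 0.20 | 0.01 | 0.15 | 3.18 | 1.56 | 0.61 | 0.27 | 0.37 | 2.16 | 1.55 | 0.82 |
| 0.05 | 0.20 | 0.05 | 0.01 | 0.17 | 3.60 | 2.01 | 0.62 | 0.06 | 0.23 | 3.61 | 2.02 | 0.89 |
| 0.05 | 0.20 | 0.20 | 0.06 | 0.21 | 2.35 | 1.45 | 0.67 | 0.14 | 0.25 | 2.24 | 1.58 | 0.69 |
| 0.20 | 0.05 | 0.05 | 0.00 | 0.07 | 3.93 | 2.20 | 0.70 | 0.14 | 0.32 | 3.44 | 1.81 | 0.90 |
| 0.20 | 0.05 | 0.20 | 0.00 | 0.08 | 3.25 | 1.72 | 0.61 | 0.29 | 0.39 | 2.19 | 1.59 | 0.90 |
| 0.20 | 0.20 | 0.05 | 0.01 | 0.09 | 3.81 | 2.19 | 0.70 | 0.06 | 0.25 | 3.57 | 1.91 | 0.88 |
| 0.20 | 0.20 | 0.20 | 0.04 | 0.13 | 2.42 | 1.61 | 0.68 | 0.21 | 0.35 | 2.13 | 1.52 | 0.84 |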

## 7 Bounds for mediation

So far, we have focused on bounding the total causal effect of the exposure E on the outcome D . However, if the relationship between E and D is mediated by some measured covariates Z , then it may also be interesting to bound the natural direct and indirect effects. This section adapts our sensitivity analysis method for this purpose. Specifically, we have previously considered the causal graph to the left in Figure 1. We now consider the refined causal graph to the right in the figure. Note that there is unmeasured exposure–outcome confounding ( U ), but no unmeasured exposure–mediator or mediator–outcome confounding.[1] We defer the study of the latter two cases to a future work. The previous work [3] adapted DV’s method to bound the natural direct and indirect effects under unmeasured mediator–outcome confounding but no unmeasured exposure–mediator or exposure–outcome confounding, which is always justified when the exposure is randomly assigned. In nonrandomized studies like our work, the type of unmeasured confounding assumed can be justified only by substantive knowledge.

As before, let D and E be binary, and Z and U be categorical. The true natural direct effect is defined as follows:

$$\mathrm{RR}^{\mathrm{true}}_{\mathrm{NDE}} = \frac{p(D_{1 Z_0} = 1)}{p(D_{0 Z_0} = 1)},$$

where $Z_e$ denotes the counterfactual value of the mediator when the exposure is set to level $E = e$, and $D_{e Z_{e'}}$ denotes the counterfactual outcome when the exposure and mediator are set to levels $E = e$ and $Z_{e'}$, respectively. Note that the causal graph to the right in Figure 1 implies cross-world counterfactual independence, i.e., $D_{ez} \perp Z_{e'}$ for all $e$, $e'$, and $z$. The work [11] shows that we can then write

$$\mathrm{RR}^{\mathrm{true}}_{\mathrm{NDE}} = \frac{\sum_z p(D_{1z} = 1)\, p(Z_0 = z)}{\sum_z p(D_{0z} = 1)\, p(Z_0 = z)}.$$

Since there is no exposure–mediator confounding, we have that $Z_e \perp E$ for all $e$ and, thus, we can write

$$\mathrm{RR}^{\mathrm{true}}_{\mathrm{NDE}} = \frac{\sum_z p(D_{1z} = 1)\, p(Z = z \mid E = 0)}{\sum_z p(D_{0z} = 1)\, p(Z = z \mid E = 0)}, \tag{11}$$

using the law of counterfactual consistency. Since there is no unmeasured confounding besides U, we have that $D_{ez} \perp (E, Z) \mid U$ for all $e$ and $z$ and, thus, we can write

$$\mathrm{RR}^{\mathrm{true}}_{\mathrm{NDE}} = \frac{\sum_z \sum_u p(D = 1 \mid E = 1, Z = z, U = u)\, p(U = u)\, p(Z = z \mid E = 0)}{\sum_z \sum_u p(D = 1 \mid E = 0, Z = z, U = u)\, p(U = u)\, p(Z = z \mid E = 0)},$$

using first the law of total probability, then $D_{ez} \perp (E, Z) \mid U$ and, finally, the law of counterfactual consistency. For the previous quantity to be well defined, we make the positivity assumption that if $p(U = u) > 0$, then $p(E = e \mid U = u) > 0$ and $p(Z = z \mid E = e) > 0$. Still, the previous quantity cannot be computed from the observed data. We give below bounds on $\mathrm{RR}^{\mathrm{true}}_{\mathrm{NDE}}$ in terms of the observed data distribution and two sensitivity parameters.

We start by noting that

$$\begin{aligned} p(D_{1z} = 1) &= p(D_{1z} = 1 \mid E = 1)\, p(E = 1) + p(D_{1z} = 1 \mid E = 0)\, p(E = 0) \\ &= p(D_{1z} = 1 \mid E = 1, Z = z)\, p(E = 1) + p(D_{1z} = 1 \mid E = 0, Z = z)\, p(E = 0) \\ &= p(D = 1 \mid E = 1, Z = z)\, p(E = 1) + p(D_{1z} = 1 \mid E = 0, Z = z)\, p(E = 0), \end{aligned}$$

where the second equality follows from $D_{ez} \perp Z \mid E$ for all $e$ and $z$, and the third from counterfactual consistency. Moreover,

$$\begin{aligned} p(D_{1z} = 1 \mid E = 0, Z = z) &= \sum_u p(D_{1z} = 1 \mid E = 0, Z = z, U = u)\, p(U = u \mid E = 0, Z = z) \\ &= \sum_u p(D = 1 \mid E = 1, Z = z, U = u)\, p(U = u \mid E = 0, Z = z) \\ &\le \max_{e,z,u} p(D = 1 \mid E = e, Z = z, U = u), \end{aligned}$$

where the second equality follows from $D_{ez} \perp (E, Z) \mid U$ for all $e$ and $z$, and counterfactual consistency. Likewise,

$$p(D_{1z} = 1 \mid E = 0, Z = z) \ge \min_{e,z,u} p(D = 1 \mid E = e, Z = z, U = u).$$

Now, let us define

$$M = \max_{e,z,u} p(D = 1 \mid E = e, Z = z, U = u),$$

and

$$m = \min_{e,z,u} p(D = 1 \mid E = e, Z = z, U = u).$$

Then,

$$p(D = 1 \mid E = 1, Z = z)\, p(E = 1) + p(E = 0)\, m \le p(D_{1z} = 1) \le p(D = 1 \mid E = 1, Z = z)\, p(E = 1) + p(E = 0)\, M. \tag{12}$$

Likewise,

$$p(D_{0z} = 1) = p(D_{0z} = 1 \mid E = 1, Z = z)\, p(E = 1) + p(D = 1 \mid E = 0, Z = z)\, p(E = 0),$$

and thus,

$$p(D = 1 \mid E = 0, Z = z)\, p(E = 0) + p(E = 1)\, m \le p(D_{0z} = 1) \le p(D = 1 \mid E = 0, Z = z)\, p(E = 0) + p(E = 1)\, M. \tag{13}$$

Therefore, combining equations (11)–(13), we have that

$$\mathrm{LB}_{\mathrm{NDE}} \le \mathrm{RR}^{\mathrm{true}}_{\mathrm{NDE}} \le \mathrm{UB}_{\mathrm{NDE}}, \tag{14}$$

where

$$\mathrm{LB}_{\mathrm{NDE}} = \frac{\sum_z [p(D = 1 \mid E = 1, Z = z)\, p(E = 1) + p(E = 0)\, m]\, p(Z = z \mid E = 0)}{\sum_z [p(D = 1 \mid E = 0, Z = z)\, p(E = 0) + p(E = 1)\, M]\, p(Z = z \mid E = 0)},$$

and

$$\mathrm{UB}_{\mathrm{NDE}} = \frac{\sum_z [p(D = 1 \mid E = 1, Z = z)\, p(E = 1) + p(E = 0)\, M]\, p(Z = z \mid E = 0)}{\sum_z [p(D = 1 \mid E = 0, Z = z)\, p(E = 0) + p(E = 1)\, m]\, p(Z = z \mid E = 0)}.$$

As mentioned earlier, $M$ and $m$ are two sensitivity parameters whose values have to be set by the analyst. The feasible region for these parameters is $M^* \leq M \leq 1$ and $0 \leq m \leq m^*$, where

$$
M^* = \max_{e, z} p(D = 1 \mid E = e, Z = z),
$$

and

$$
m^* = \min_{e, z} p(D = 1 \mid E = e, Z = z).
$$
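As a minimal sketch of how equation (14) might be evaluated in practice, the Python function below computes $LB_{\text{NDE}}$ and $UB_{\text{NDE}}$ for a binary exposure and a discrete mediator, after checking that the chosen parameters lie in the feasible region. The function name `nde_bounds`, its argument layout, and all numbers in the usage lines are illustrative assumptions rather than part of the method's description.

```python
def nde_bounds(p_d, p_e1, p_z_e0, M, m):
    """Bounds on the true natural direct effect (risk ratio scale).

    p_d[e][z] : p(D=1 | E=e, Z=z) for e in {0, 1} and each mediator value z
    p_e1      : p(E=1)
    p_z_e0[z] : p(Z=z | E=0)
    M, m      : sensitivity parameters set by the analyst (hypothetical inputs)
    """
    p_e0 = 1.0 - p_e1

    # Feasible region: max_{e,z} p(D=1|E=e,Z=z) <= M <= 1 and
    # 0 <= m <= min_{e,z} p(D=1|E=e,Z=z).
    observed = [p for row in p_d for p in row]
    if not (max(observed) <= M <= 1.0 and 0.0 <= m <= min(observed)):
        raise ValueError("M and m lie outside the feasible region")

    def average_over_z(term):
        # Weighted sum over z with weights p(Z=z | E=0).
        return sum(term(z) * p_z_e0[z] for z in range(len(p_z_e0)))

    # Bounds on p(D_{1z}=1), equation (12), averaged over z.
    num_lo = average_over_z(lambda z: p_d[1][z] * p_e1 + p_e0 * m)
    num_hi = average_over_z(lambda z: p_d[1][z] * p_e1 + p_e0 * M)
    # Bounds on p(D_{0z}=1), equation (13), averaged over z.
    den_lo = average_over_z(lambda z: p_d[0][z] * p_e0 + p_e1 * m)
    den_hi = average_over_z(lambda z: p_d[0][z] * p_e0 + p_e1 * M)

    return num_lo / den_hi, num_hi / den_lo


# Made-up observed distribution with a binary mediator.
p_d = [[0.10, 0.20],   # p(D=1 | E=0, Z=0), p(D=1 | E=0, Z=1)
       [0.30, 0.40]]   # p(D=1 | E=1, Z=0), p(D=1 | E=1, Z=1)
lb, ub = nde_bounds(p_d, p_e1=0.5, p_z_e0=[0.6, 0.4], M=0.5, m=0.05)
print(lb, ub)
```

The sketch assumes the denominator of the upper bound is positive, which can fail when $m = 0$ and some observed risk among the unexposed is zero.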

Finally, the true natural indirect effect is defined as follows:

$$
RR^{\text{true}}_{\text{NIE}} = \frac{p(D_{1 Z_1} = 1)}{p(D_{1 Z_0} = 1)}.
$$

Under the aforementioned assumptions, we can write

$$
RR^{\text{true}}_{\text{NIE}} = \frac{\sum_z \sum_u p(D = 1 \mid E = 1, Z = z, U = u)\, p(U = u)\, p(Z = z \mid E = 1)}{\sum_z \sum_u p(D = 1 \mid E = 1, Z = z, U = u)\, p(U = u)\, p(Z = z \mid E = 0)}.
$$

Repeating the aforementioned reasoning, we can bound the incomputable $RR^{\text{true}}_{\text{NIE}}$ in terms of the observed data distribution and the sensitivity parameters $M$ and $m$ as follows:

$$
LB_{\text{NIE}} \leq RR^{\text{true}}_{\text{NIE}} \leq UB_{\text{NIE}},
$$

where

$$
LB_{\text{NIE}} = \frac{\sum_z \left[ p(D = 1 \mid E = 1, Z = z)\, p(E = 1) + p(E = 0)\, m \right] p(Z = z \mid E = 1)}{\sum_z \left[ p(D = 1 \mid E = 1, Z = z)\, p(E = 1) + p(E = 0)\, M \right] p(Z = z \mid E = 0)},
$$

and

$$
UB_{\text{NIE}} = \frac{\sum_z \left[ p(D = 1 \mid E = 1, Z = z)\, p(E = 1) + p(E = 0)\, M \right] p(Z = z \mid E = 1)}{\sum_z \left[ p(D = 1 \mid E = 1, Z = z)\, p(E = 1) + p(E = 0)\, m \right] p(Z = z \mid E = 0)}.
$$
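The bounds on the natural indirect effect can be evaluated along the same lines. The following Python sketch is only illustrative: the name `nie_bounds`, the list-based inputs, and the example numbers are assumptions, and the function does not check the feasible region because that check also needs the observed risks among the unexposed.

```python
def nie_bounds(p_d1, p_e1, p_z_e1, p_z_e0, M, m):
    """Bounds on the true natural indirect effect (risk ratio scale).

    p_d1[z]   : p(D=1 | E=1, Z=z)
    p_e1      : p(E=1)
    p_z_e1[z] : p(Z=z | E=1)
    p_z_e0[z] : p(Z=z | E=0)
    M, m      : sensitivity parameters, assumed to be feasible
    """
    p_e0 = 1.0 - p_e1
    # Lower and upper bounds on p(D_{1z}=1) for each z, as in equation (12).
    lo = [p * p_e1 + p_e0 * m for p in p_d1]
    hi = [p * p_e1 + p_e0 * M for p in p_d1]

    def average(values, weights):
        return sum(v * w for v, w in zip(values, weights))

    lb = average(lo, p_z_e1) / average(hi, p_z_e0)
    ub = average(hi, p_z_e1) / average(lo, p_z_e0)
    return lb, ub


# Made-up inputs: the mediator distribution shifts with the exposure.
lb, ub = nie_bounds([0.30, 0.40], p_e1=0.5,
                    p_z_e1=[0.2, 0.8], p_z_e0=[0.6, 0.4],
                    M=0.5, m=0.05)
print(lb, ub)
```

Note that the numerator and denominator differ only in the mediator weights, which is what makes this an indirect effect.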

Theorem 3 in Appendix A shows that our bounds for $RR^{\text{true}}_{\text{NDE}}$ are arbitrarily sharp. One can prove the result for $RR^{\text{true}}_{\text{NIE}}$ in much the same way. One can also extend our bounds to the risk difference scale and to conditioning and averaging over measured covariates. We omit the details.

## 8 Discussion

In this work, we have introduced a new method for assessing the sensitivity of the risk ratio to unmeasured confounding. Our method requires the analyst to set two intuitive parameters. Otherwise, our method makes no parametric or modelling assumptions about the causal relationships under consideration. The resulting bounds on the risk ratio are guaranteed to be arbitrarily sharp. Moreover, we have adapted our method to bound the risk difference and the natural direct and indirect effects, even when conditioning on or averaging over measured covariates. We have illustrated our method on real data and shown via simulations that it can produce tighter bounds than DV’s method [1]. Therefore, it may be good practice to apply both methods and report the tightest bounds obtained. This presumes that the analyst knows the true sensitivity parameter values for both methods or, more realistically, some conservative estimates of them. Otherwise, there is no reason to prefer the tightest bounds, as they may exclude the true causal effect. For the same reason, if the analyst can confidently produce conservative estimates for one method but not for the other, then it may be sensible to use only the former. Recall that our method requires estimating two probabilities, whereas DV’s method requires estimating three probability ratios. Which method admits confident conservative estimates may well depend on the domain under study. Therefore, we believe that our method and DV’s complement each other, whether combined or used separately.

Our bounds on the natural direct and indirect effects assume that there is only unmeasured exposure–outcome confounding. In the future, we would like to extend them to other types of confounding.

## Acknowledgements

We thank the associate editor and reviewers for their comments, which helped us to improve our work. We gratefully acknowledge the financial support from the Swedish Research Council (ref. 2019-00245).

Conflict of interest: The author states no conflict of interest.

## Theorem 1

The bounds in equation (7) are arbitrarily sharp.

## Proof

Let the set $\{M, m, p(D, E)\}$ represent the observed data distribution and sensitivity parameter values at hand. We assume that $M$ and $m$ belong to the feasible region. To show that the lower bound is arbitrarily sharp, we construct a distribution $p'(D, E, U)$ that marginalizes to the set $\{M', m', p'(D, E)\}$ such that (i) $\{M, m, p(D, E)\}$ and $\{M', m', p'(D, E)\}$ are arbitrarily close, and (ii) $LB$ and $RR'^{\text{true}}$ are arbitrarily close. Primed quantities refer to the constructed distribution.

• Let $p'(E) = p(E)$.

• Let $U$ be binary with $p'(U = 1 \mid E = 1) = p'(U = 0 \mid E = 0) = 1 - \varepsilon$, where $\varepsilon$ is an arbitrary number such that $0 < \varepsilon < 1$. The purpose of $\varepsilon$ is to ensure that the positivity assumption holds.

• Let

$$
\begin{aligned}
p'(D = 1 \mid E = 1, U = 1) &= p(D = 1 \mid E = 1) \\
p'(D = 1 \mid E = 1, U = 0) &= m \\
p'(D = 1 \mid E = 0, U = 1) &= M \\
p'(D = 1 \mid E = 0, U = 0) &= p(D = 1 \mid E = 0).
\end{aligned}
$$

Note that $M \geq \max_e p(D = 1 \mid E = e)$ and $m \leq \min_e p(D = 1 \mid E = e)$, because $M$ and $m$ belong to the feasible region. Then, $M' = M$ and $m' = m$. Note also that

$$
p'(U = 1) = \sum_e p'(U = 1 \mid E = e)\, p'(E = e) = \varepsilon\, p(E = 0) + (1 - \varepsilon)\, p(E = 1),
$$

and thus, $p'(U = 1)$ can be made arbitrarily close to $p(E = 1)$ by choosing $\varepsilon$ sufficiently close to 0. Likewise,

$$
p'(D = 1 \mid E = 1) = \sum_u p'(D = 1 \mid E = 1, U = u)\, p'(U = u \mid E = 1) = \varepsilon\, m + (1 - \varepsilon)\, p(D = 1 \mid E = 1),
$$

and thus, $p'(D = 1 \mid E = 1)$ can be made arbitrarily close to $p(D = 1 \mid E = 1)$ by choosing $\varepsilon$ sufficiently close to 0. Likewise for $p'(D = 1 \mid E = 0)$ and $p(D = 1 \mid E = 0)$. Therefore, $LB$ and $RR'^{\text{true}}$ can be made arbitrarily close by choosing $\varepsilon$ sufficiently close to 0:

$$
\begin{aligned}
LB &= \frac{p(D = 1 \mid E = 1)\, p(E = 1) + p(E = 0)\, m}{p(D = 1 \mid E = 0)\, p(E = 0) + p(E = 1)\, M} \\
&\approx \frac{p(D = 1 \mid E = 1)\, p'(U = 1) + p'(U = 0)\, m}{p(D = 1 \mid E = 0)\, p'(U = 0) + p'(U = 1)\, M} \\
&= \frac{p'(D = 1 \mid E = 1, U = 1)\, p'(U = 1) + p'(U = 0)\, p'(D = 1 \mid E = 1, U = 0)}{p'(D = 1 \mid E = 0, U = 0)\, p'(U = 0) + p'(U = 1)\, p'(D = 1 \mid E = 0, U = 1)} = RR'^{\text{true}}.
\end{aligned}
$$

That the upper bound is arbitrarily sharp can be proven analogously, after the swap $p'(D = 1 \mid E = 1, U = 0) = M$ and $p'(D = 1 \mid E = 0, U = 1) = m$. □
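The construction in this proof can be checked numerically. The sketch below is a hedged illustration, not part of the proof: it builds the binary confounder described above for a given $\varepsilon$, evaluates the lower bound from the observed quantities and the true risk ratio under the constructed distribution, and lets one watch the two approach each other. The function name `theorem1_gap` and the input values are assumptions made for illustration.

```python
def theorem1_gap(p_d_e1, p_d_e0, p_e1, M, m, eps):
    """Return (LB, RR_true) for the distribution built in the proof.

    p_d_e1, p_d_e0 : observed p(D=1 | E=1) and p(D=1 | E=0)
    p_e1           : observed p(E=1)
    M, m           : feasible sensitivity parameters
    eps            : the positivity constant, 0 < eps < 1
    """
    p_e0 = 1.0 - p_e1

    # Lower bound computed from the observed distribution.
    lb = (p_d_e1 * p_e1 + p_e0 * m) / (p_d_e0 * p_e0 + p_e1 * M)

    # Marginal of the constructed confounder:
    # p(U=1) = eps * p(E=0) + (1 - eps) * p(E=1).
    p_u1 = eps * p_e0 + (1.0 - eps) * p_e1
    p_u0 = 1.0 - p_u1

    # Constructed outcome risks p(D=1 | E=e, U=u).
    risk_e1_u1, risk_e1_u0 = p_d_e1, m
    risk_e0_u1, risk_e0_u0 = M, p_d_e0

    # True risk ratio: standardize each exposure arm over p(U).
    rr_true = ((risk_e1_u1 * p_u1 + risk_e1_u0 * p_u0)
               / (risk_e0_u0 * p_u0 + risk_e0_u1 * p_u1))
    return lb, rr_true


for eps in (0.1, 0.01, 0.001):
    lb, rr_true = theorem1_gap(0.3, 0.1, 0.4, M=0.5, m=0.05, eps=eps)
    print(eps, lb, rr_true, abs(lb - rr_true))
```

With these made-up inputs the gap shrinks as $\varepsilon$ decreases, which is the behaviour the proof relies on.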

## Theorem 2

The bounds in equation (8) are arbitrarily sharp.

## Proof

Consider the same distribution as in the proof of Theorem 1. To show that the lower bound is arbitrarily sharp, it suffices to note that $LB$ and $RD'^{\text{true}}$ can be made arbitrarily close by choosing $\varepsilon$ sufficiently close to 0:

$$
\begin{aligned}
LB &= [m - p(D = 1 \mid E = 0)]\, p(E = 0) + p(E = 1)\, [p(D = 1 \mid E = 1) - M] \\
&\approx [m - p(D = 1 \mid E = 0)]\, p'(U = 0) + p'(U = 1)\, [p(D = 1 \mid E = 1) - M] \\
&= [p'(D = 1 \mid E = 1, U = 0) - p'(D = 1 \mid E = 0, U = 0)]\, p'(U = 0) \\
&\quad + p'(U = 1)\, [p'(D = 1 \mid E = 1, U = 1) - p'(D = 1 \mid E = 0, U = 1)] = RD'^{\text{true}}.
\end{aligned}
$$

That the upper bound is arbitrarily sharp can be proven analogously, after the swap $p'(D = 1 \mid E = 1, U = 0) = M$ and $p'(D = 1 \mid E = 0, U = 1) = m$. □

## Theorem 3

The bounds in equation (14) are arbitrarily sharp.

## Proof

Let the set $\{M, m, p(D, E, Z)\}$ represent the observed data distribution and sensitivity parameter values at hand. We assume that $M$ and $m$ belong to the feasible region. To show that the lower bound is arbitrarily sharp, we construct a distribution $p'(D, E, U, Z)$ that marginalizes to the set $\{M', m', p'(D, E, Z)\}$, such that (i) $\{M, m, p(D, E, Z)\}$ and $\{M', m', p'(D, E, Z)\}$ are arbitrarily close, and (ii) $LB_{\text{NDE}}$ and $RR'^{\text{true}}_{\text{NDE}}$ are arbitrarily close. Primed quantities refer to the constructed distribution.

• Let $p'(E, Z) = p(E, Z)$.

• Let $U$ be binary with $p'(U = 1 \mid E = 1, Z = z) = p'(U = 0 \mid E = 0, Z = z) = 1 - \varepsilon$ for all $z$, where $\varepsilon$ is an arbitrary number such that $0 < \varepsilon < 1$. The purpose of $\varepsilon$ is to ensure that the positivity assumption holds.

• For all $z$, let

$$
\begin{aligned}
p'(D = 1 \mid E = 1, Z = z, U = 1) &= p(D = 1 \mid E = 1, Z = z) \\
p'(D = 1 \mid E = 1, Z = z, U = 0) &= m \\
p'(D = 1 \mid E = 0, Z = z, U = 1) &= M \\
p'(D = 1 \mid E = 0, Z = z, U = 0) &= p(D = 1 \mid E = 0, Z = z).
\end{aligned}
$$

Note that $M \geq \max_{e, z} p(D = 1 \mid E = e, Z = z)$ and $m \leq \min_{e, z} p(D = 1 \mid E = e, Z = z)$, because $M$ and $m$ belong to the feasible region. Then, $M' = M$ and $m' = m$. Note also that

$$
p'(U = 1) = \sum_e p'(U = 1 \mid E = e)\, p'(E = e) = \varepsilon\, p(E = 0) + (1 - \varepsilon)\, p(E = 1),
$$

and thus, $p'(U = 1)$ can be made arbitrarily close to $p(E = 1)$ by choosing $\varepsilon$ sufficiently close to 0. Likewise,

$$
p'(D = 1 \mid E = 1, Z = z) = \sum_u p'(D = 1 \mid E = 1, Z = z, U = u)\, p'(U = u \mid E = 1, Z = z) = \varepsilon\, m + (1 - \varepsilon)\, p(D = 1 \mid E = 1, Z = z),
$$

and thus, $p'(D = 1 \mid E = 1, Z = z)$ can be made arbitrarily close to $p(D = 1 \mid E = 1, Z = z)$ by choosing $\varepsilon$ sufficiently close to 0. Likewise for $p'(D = 1 \mid E = 0, Z = z)$ and $p(D = 1 \mid E = 0, Z = z)$. Therefore, $LB_{\text{NDE}}$ and $RR'^{\text{true}}_{\text{NDE}}$ can be made arbitrarily close by choosing $\varepsilon$ sufficiently close to 0:

$$
\begin{aligned}
LB_{\text{NDE}} &= \frac{\sum_z \left[ p(D = 1 \mid E = 1, Z = z)\, p(E = 1) + p(E = 0)\, m \right] p(Z = z \mid E = 0)}{\sum_z \left[ p(D = 1 \mid E = 0, Z = z)\, p(E = 0) + p(E = 1)\, M \right] p(Z = z \mid E = 0)} \\
&\approx \frac{\sum_z \left[ p(D = 1 \mid E = 1, Z = z)\, p'(U = 1) + p'(U = 0)\, m \right] p(Z = z \mid E = 0)}{\sum_z \left[ p(D = 1 \mid E = 0, Z = z)\, p'(U = 0) + p'(U = 1)\, M \right] p(Z = z \mid E = 0)} \\
&= \frac{\sum_z \left[ p'(D = 1 \mid E = 1, Z = z, U = 1)\, p'(U = 1) + p'(U = 0)\, m \right] p(Z = z \mid E = 0)}{\sum_z \left[ p'(D = 1 \mid E = 0, Z = z, U = 0)\, p'(U = 0) + p'(U = 1)\, M \right] p(Z = z \mid E = 0)} = RR'^{\text{true}}_{\text{NDE}},
\end{aligned}
$$

because $p'(D = 1 \mid E = 1, Z = z, U = 0) = m$ and $p'(D = 1 \mid E = 0, Z = z, U = 1) = M$.

That the upper bound is arbitrarily sharp can be proven analogously, after the swap $p'(D = 1 \mid E = 1, Z = z, U = 0) = M$ and $p'(D = 1 \mid E = 0, Z = z, U = 1) = m$. □

## B DV’s sensitivity analysis

The previous work [1] proves that $RR^{\text{true}}$ can be bounded in terms of $RR^{\text{obs}}$ and the sensitivity parameters $RR_{UD}$, $RR_{E_0 U}$, and $RR_{E_1 U}$, whose values the analyst has to specify. Specifically and using the notation in ref. [6] for conciseness, ref. [1] proves that

$$
RR^{\text{obs}} / BF_1 \leq RR^{\text{true}} \leq RR^{\text{obs}} \cdot BF_0,
$$

with

$$
BF_e = \frac{RR_{E_e U}\, RR_{UD}}{RR_{E_e U} + RR_{UD} - 1},
$$

where

$$
RR_{UD} = \max_e \frac{\max_u p(D = 1 \mid E = e, U = u)}{\min_u p(D = 1 \mid E = e, U = u)},
$$

and

$$
RR_{E_e U} = \max_u \frac{p(U = u \mid E = e)}{p(U = u \mid E = 1 - e)}.
$$

Moreover, assume that $RR^{\text{obs}} > 1$. Otherwise, consider $1 / RR^{\text{obs}}$. The previous work [4] defines the $E$-value as follows:

$$
E\text{-value} = \min_{\{RR_{E_1 U},\, RR_{UD}\} \,:\, BF_1 \geq RR^{\text{obs}}} \max \{RR_{E_1 U}, RR_{UD}\}
$$

and shows that

$$
E\text{-value} = RR^{\text{obs}} + \sqrt{RR^{\text{obs}} (RR^{\text{obs}} - 1)}.
$$
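As a rough illustration of the quantities in this appendix, the Python sketch below computes the bounding factor, DV's interval, and the closed-form $E$-value. The function names and the example numbers are assumptions made for illustration; the code assumes a positive observed risk ratio and sensitivity parameters of at least 1.

```python
import math


def bounding_factor(rr_eu, rr_ud):
    """BF_e for given RR_{E_e U} and RR_{UD} (both assumed >= 1)."""
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1.0)


def dv_bounds(rr_obs, rr_e0u, rr_e1u, rr_ud):
    """Interval RR_obs / BF_1 <= RR_true <= RR_obs * BF_0."""
    lower = rr_obs / bounding_factor(rr_e1u, rr_ud)
    upper = rr_obs * bounding_factor(rr_e0u, rr_ud)
    return lower, upper


def e_value(rr_obs):
    """Closed-form E-value; risk ratios below 1 are inverted first."""
    if rr_obs < 1.0:
        rr_obs = 1.0 / rr_obs
    return rr_obs + math.sqrt(rr_obs * (rr_obs - 1.0))


# Made-up example: an observed risk ratio of 2.
e = e_value(2.0)
print(e)
# Setting both parameters to the E-value yields a bounding factor equal
# to the observed risk ratio, so the lower bound reaches the null value 1.
print(bounding_factor(e, e))
print(dv_bounds(2.0, rr_e0u=1.5, rr_e1u=e, rr_ud=e))
```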

## C AS’ sensitivity analysis

The previous work [6] proposes the following parameter-free bounds of $RR^{\text{true}}$ in terms of $RR^{\text{obs}}$ and the observed data distribution:

$$
RR^{\text{obs}} / \widetilde{BF}_1 \leq RR^{\text{true}} \leq RR^{\text{obs}} \cdot \widetilde{BF}_0,
$$

where

$$
\widetilde{BF}_e = \frac{p(D = 1 \mid E = 1 - e)\, p(E = 1 - e) + p(E = e)}{p(D = 1 \mid E = 1 - e)\, p(E = e)}.
$$

The previous work [6] also adapts the previous bounds to the risk difference scale:

$$
RD^{\text{obs}} - \widetilde{BF}_1 \leq RD^{\text{true}} \leq RD^{\text{obs}} + \widetilde{BF}_0, \tag{15}
$$

where

$$
RD^{\text{obs}} = p(D = 1 \mid E = 1) - p(D = 1 \mid E = 0),
$$

and

$$
\widetilde{BF}_e = p(E = 1 - e)\, p(D = 1 \mid E = e) + p(E = e)\, (1 - p(D = 1 \mid E = 1 - e)).
$$
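The parameter-free bounds above need only three observed quantities, so they are easy to evaluate. The following Python sketch shows one possible way to do it; the name `as_bounds`, the dictionary return value, and the example numbers are illustrative assumptions, and the code assumes that both observed risks are positive.

```python
def as_bounds(p_d_e1, p_d_e0, p_e1):
    """Parameter-free bounds on RR_true and RD_true.

    p_d_e1, p_d_e0 : observed p(D=1 | E=1) and p(D=1 | E=0), both > 0
    p_e1           : observed p(E=1)
    """
    p_d = {0: p_d_e0, 1: p_d_e1}
    p_e = {0: 1.0 - p_e1, 1: p_e1}

    def bf_ratio(e):
        # Bounding factor on the risk ratio scale.
        o = 1 - e
        return (p_d[o] * p_e[o] + p_e[e]) / (p_d[o] * p_e[e])

    def bf_difference(e):
        # Bounding factor on the risk difference scale.
        o = 1 - e
        return p_e[o] * p_d[e] + p_e[e] * (1.0 - p_d[o])

    rr_obs = p_d_e1 / p_d_e0
    rd_obs = p_d_e1 - p_d_e0
    return {
        "rr": (rr_obs / bf_ratio(1), rr_obs * bf_ratio(0)),
        "rd": (rd_obs - bf_difference(1), rd_obs + bf_difference(0)),
    }


# Made-up observed distribution.
bounds = as_bounds(p_d_e1=0.3, p_d_e0=0.1, p_e1=0.4)
print(bounds["rr"], bounds["rd"])
```

On the risk difference scale the interval always has width 1, since the two bounding factors sum to 1.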

## D Manski’s sensitivity analysis

The previous work [7] bounds $RD^{\text{true}}$ under the assumption that $D_0$ and $D_1$ take values in known intervals. The bounds apply to nonbinary outcomes. So, we momentarily drop the assumption that $D$ is binary. The bounds in ref. [7] are derived as follows. Suppose it is known that $D_1$ takes values in the interval $[K_{10}, K_{11}]$. Then,

$$
K_{10} \leq \mathbb{E}[D_1 \mid E = 0] \leq K_{11}.
$$

Consequently,

$$
\mathbb{E}[D_1] = \mathbb{E}[D_1 \mid E = 0]\, p(E = 0) + \mathbb{E}[D_1 \mid E = 1]\, p(E = 1),
$$

can be bounded as follows:

$$
K_{10}\, p(E = 0) + \mathbb{E}[D \mid E = 1]\, p(E = 1) \leq \mathbb{E}[D_1] \leq K_{11}\, p(E = 0) + \mathbb{E}[D \mid E = 1]\, p(E = 1),
$$

by counterfactual consistency. Analogous bounds can be derived for $\mathbb{E}[D_0]$ under the assumption that $D_0$ takes values within the interval $[K_{00}, K_{01}]$. Consequently,

$$
RD^{\text{true}} = \mathbb{E}[D_1] - \mathbb{E}[D_0],
$$

can be bounded as follows:

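$$
\begin{aligned}
& K_{10}\, p(E = 0) + \mathbb{E}[D \mid E = 1]\, p(E = 1) - \mathbb{E}[D \mid E = 0]\, p(E = 0) - K_{01}\, p(E = 1) \leq RD^{\text{true}} \\
& \quad \leq K_{11}\, p(E = 0) + \mathbb{E}[D \mid E = 1]\, p(E = 1) - \mathbb{E}[D \mid E = 0]\, p(E = 0) - K_{00}\, p(E = 1).
\end{aligned} \tag{16}
$$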

When the outcome is binary, as in this work, $D_0$ and $D_1$ are definitionally bounded with $K_{00} = K_{10} = 0$ and $K_{01} = K_{11} = 1$, and the bounds take a simpler form:

$$
\begin{aligned}
& p(D = 1 \mid E = 1)\, p(E = 1) - p(D = 1 \mid E = 0)\, p(E = 0) - p(E = 1) \leq RD^{\text{true}} \\
& \quad \leq p(E = 0) + p(D = 1 \mid E = 1)\, p(E = 1) - p(D = 1 \mid E = 0)\, p(E = 0).
\end{aligned} \tag{17}
$$

Note that these bounds coincide with AS’ bounds (equation (15)), and thus, with our bounds when $M = 1$ and $m = 0$ (equation (8)).

Note also that, when $D$ is binary, equations (8) and (16) coincide if we let $M = K_{01} = K_{11}$ and $m = K_{00} = K_{10}$. Therefore, one may say that our bounds are an adaptation of Manski’s bounds in equation (16) to binary outcomes, one that retains the sensitivity parameters (unlike the direct application of Manski’s bounds to binary outcomes in equation (17)), albeit with a different meaning: they now bound $p(D \mid E, U)$ rather than the support of $D_0$ and $D_1$. Retaining the sensitivity parameters is important because it is due to these parameters that our bounds can be made tighter than AS’ and, thus, than those in equation (17).
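The coincidences noted in this appendix can be checked numerically. The sketch below is illustrative only: `manski_rd_bounds` implements equation (16), while `rd_bounds` uses a form of equation (8) that is an assumption here, read off the lower bound in the proof of Theorem 2 and its stated swap of $M$ and $m$ for the upper bound. Function names and input values are made up.

```python
def manski_rd_bounds(mean_d_e1, mean_d_e0, p_e1, k10, k11, k00, k01):
    """Equation (16): D_1 lies in [k10, k11] and D_0 lies in [k00, k01]."""
    p_e0 = 1.0 - p_e1
    observed = mean_d_e1 * p_e1 - mean_d_e0 * p_e0
    lower = k10 * p_e0 + observed - k01 * p_e1
    upper = k11 * p_e0 + observed - k00 * p_e1
    return lower, upper


def rd_bounds(p_d_e1, p_d_e0, p_e1, M, m):
    """Assumed form of equation (8), taken from the proof of Theorem 2."""
    p_e0 = 1.0 - p_e1
    lower = (m - p_d_e0) * p_e0 + p_e1 * (p_d_e1 - M)
    upper = (M - p_d_e0) * p_e0 + p_e1 * (p_d_e1 - m)
    return lower, upper


p_d_e1, p_d_e0, p_e1 = 0.3, 0.1, 0.4   # made-up observed distribution

# Binary outcome with the trivial support [0, 1]: equation (17).
print(manski_rd_bounds(p_d_e1, p_d_e0, p_e1, k10=0, k11=1, k00=0, k01=1))
print(rd_bounds(p_d_e1, p_d_e0, p_e1, M=1.0, m=0.0))

# Informative parameters: M = K01 = K11 and m = K00 = K10.
print(manski_rd_bounds(p_d_e1, p_d_e0, p_e1, k10=0.05, k11=0.5, k00=0.05, k01=0.5))
print(rd_bounds(p_d_e1, p_d_e0, p_e1, M=0.5, m=0.05))
```

Each pair of printed intervals should agree up to floating-point rounding, and the second pair is narrower than the first, which illustrates why retaining the sensitivity parameters matters.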

## References

[1] Ding P, VanderWeele TJ. Sensitivity analysis without assumptions. Epidemiology. 2016;27:368–77. doi:10.1097/EDE.0000000000000457.

[2] Blum MR, Tan YJ, Ioannidis JPA. Use of E-values for addressing confounding in observational studies – an empirical assessment of the literature. Int J Epidemiol. 2020;49:1482–94. doi:10.1093/ije/dyz261.

[3] Ding P, VanderWeele TJ. Sharp sensitivity bounds for mediation under unmeasured mediator–outcome confounding. Biometrika. 2016;103:483–90. doi:10.1093/biomet/asw012.

[4] VanderWeele TJ, Ding P. Sensitivity analysis in observational research: introducing the e-value. Annals Internal Med. 2017;167:268–74. doi:10.7326/M16-2607.

[5] VanderWeele TJ, Ding P, Mathur M. Technical considerations in the use of the e-value. J Causal Infer. 2019;7:1–11. doi:10.1515/jci-2018-0007.

[6] Sjölander A. A note on a sensitivity analysis for unmeasured confounding, and the related e-value. J Causal Infer. 2020;8:229–48. doi:10.1515/jci-2020-0012.

[7] Manski CF. Nonparametric bounds on treatment effects. Am Economic Rev. 1990;80:319–23.

[8] Hernán MA, Robins JM. Causal Inference: What If. Boca Raton: Chapman & Hall/CRC; 2020.

[9] Ogburn EL, VanderWeele TJ. On the nondifferential misclassification of a binary confounder. Epidemiology. 2012;23:433–9. doi:10.1097/EDE.0b013e31824d1f63.

[10] Hammond EC, Horn D. Smoking and death rates – report on forty-four months of follow-up of 187,783 men. J Am Med Assoc. 1958;166:1159–72, 1294–308. doi:10.1001/jama.1958.02990100047009.

[11] Pearl J. Direct and indirect effects. In Proceedings of the 17th Conference on Uncertainty in Artificial Intelligence; 2001. p. 411–20. doi:10.1145/3501714.3501736.