# Novel bounds for causal effects based on sensitivity parameters on the risk difference scale

Arvid Sjölander and Ola Hössjer

# Abstract

Unmeasured confounding is an important threat to the validity of observational studies. A common way to deal with unmeasured confounding is to compute bounds for the causal effect of interest, that is, a range of values that is guaranteed to include the true effect, given the observed data. Recently, bounds have been proposed that are based on sensitivity parameters, which quantify the degree of unmeasured confounding on the risk ratio scale. These bounds can be used to compute an E-value, that is, the degree of confounding required to explain away an observed association, on the risk ratio scale. We complement and extend this previous work by deriving analogous bounds, based on sensitivity parameters on the risk difference scale. We show that our bounds can also be used to compute an E-value, on the risk difference scale. We compare our novel bounds with previous bounds through a real data example and a simulation study.

MSC 2010: 92B15

## 1 Introduction

The estimation of causal effects in observational (non-randomized) studies is often hampered by unmeasured confounding. A common way to deal with unmeasured confounding is to compute bounds for the causal effect of interest, that is, a range of values that is guaranteed to include the true effect, given the observed data. Such bounds have, for instance, been derived for causal effects in randomized trials with non-compliance [1], causal effects in the presence of truncation by death [2], controlled and natural direct effects [3,4] and causal interactions [5,6].

A common feature of these bounds is that they are typically “assumption-free,” in the sense that they make no parametric assumptions about the relations between observed and unobserved variables. As a consequence, the bounds are often relatively wide and may thus not be very informative. In contrast, Ding and VanderWeele (DV) [7] proposed to parametrize the degree of unmeasured confounding, using two sensitivity parameters that quantify the maximal strength of association between the exposure and the confounders, and between the outcome and the confounders. DV derived bounds for the causal exposure effect, as functions of the sensitivity parameters and the observed data distribution. By using subject matter knowledge to set the sensitivity parameters to plausible values, bounds are obtained that can be substantially narrower than the assumption-free bounds. Many other sensitivity analyses have been proposed in the literature, but these typically require rather special conditions, such as a single binary confounder [e.g., refs 8,9,10] or no exposure–confounder interaction [e.g., refs 11,12,13]; we refer to ref. [7] for a thorough review.

The sensitivity parameters proposed by DV are defined on the risk ratio scale. However, if the target causal effect is a risk difference, then it may be natural to specify the sensitivity parameters on the risk difference scale as well. In their eAppendix, DV [7] provided bounds for the causal risk difference using sensitivity parameters on the risk difference scale. However, these bounds are restricted to a categorical confounder with a known number of levels. This is an important limitation, since in practice one would almost never know the dimension of the unknown confounders, and these may also contain a combination of categorical and continuous variables. Furthermore, some subject matter experts may find it more intuitive to speculate about the degree of unmeasured confounding on the risk difference scale, regardless of the chosen scale for the target causal effect.

In this article we address these limitations. We derive bounds for causal effects that can be written as a contrast between the counterfactual probability of the outcome if the exposure is present and absent, respectively, for everybody in the population. The bounds that we derive are functions of sensitivity parameters that quantify the degree of unmeasured confounding on the risk difference scale.

The article is organized as follows. In Section 2, we establish basic notation, definitions and assumptions and define the target causal effect. In Section 3, we derive assumption-free bounds in our setting; these serve as a benchmark to which other bounds can be compared. In Section 4, we briefly review the key results of ref. [7], and in Section 5 we present our novel bounds. Both DV’s and our bounds are functions of several sensitivity parameters. In Section 6 we discuss how the parameter space can be reduced, to facilitate interpretation and communication. In Section 7, we illustrate the theory with a real data example on smoking and mortality, and in Section 8, we compare DV’s and our bounds in a small simulation study. We provide concluding remarks in Section 9.

## 2 Notation, definitions and assumptions

We adopt the notation of ref. [7], with some modifications. Let $E$ and $D$ denote the binary exposure and outcome, respectively. Let $p(E=e)=p_e$ denote the marginal probability of $E=e$, and let $p(D=1\mid E=e)=q_e$ denote the conditional probability of $D=1$, given $E=e$. Since $p_0+p_1=1$, the observed data distribution $p(D,E)$ consists of three free parameters: $p_1$, $q_0$ and $q_1$. The statistical association between $E$ and $D$ is defined as some contrast between $q_1$ and $q_0$, e.g.,

$\psi = g(q_1) - g(q_0),$

where g is a monotone link function. The identity, log and logit links give the risk difference, log risk ratio and log odds ratio, respectively.

Let $D(e)$ be the potential outcome for a given subject, with the exposure set to level $E=e$ for that subject [14,15]. The potential outcome $D(e)$ is connected to the observed outcome $D$ through the relation

(1) $E = e \Rightarrow D(e) = D,$

which is often referred to as “consistency” [15]. Let p { D ( e ) = 1 } = q e be the counterfactual probability of the outcome, with the exposure been set to E = e for everyone [14,15]. The target parameter is the causal effect of the exposure, which is defined as some contrast between q 1 and q 0 , e.g.,

(2) $\psi^* = g(q_1^*) - g(q_0^*).$

Generally, $\psi^* \neq \psi$ due to unmeasured confounding. Let $U$ denote a set of unmeasured confounders, which (together with the measured covariates $C$) is assumed to be sufficient for confounding control. Formally, we assume that

(3) $D(e) \perp\!\!\!\perp E \mid U,$

which is often referred to as “conditional (given $U$) exchangeability” [15]. In terms of the joint distribution $p(D,E,U)$, the parameter $q_e^*$ is given by

(4) $q_e^* = E[p\{D(e)=1 \mid U\}] = E[p\{D(e)=1 \mid E=e, U\}] = E\{p(D=1 \mid E=e, U)\},$

where the first equality follows from the law of total probability, the second from conditional exchangeability (3) and the third from consistency (1).

## 3 Assumption-free bounds

Manski [16] and Robins [17] derived assumption-free bounds for causal effects as follows. Using the law of total probability, we can decompose $q_e^*$ into

(5) $q_e^* = p\{D(e)=1 \mid E=e\} p_e + p\{D(e)=1 \mid E=1-e\} p_{1-e} = q_e p_e + p\{D(e)=1 \mid E=1-e\} p_{1-e},$

where the second equality follows from consistency (1); cf. also ref. [18]. In this expression, only $p\{D(e)=1 \mid E=1-e\}$ is unknown. Setting this counterfactual probability to its lower and upper limits, 0 and 1, gives

(6) $q_e p_e \leq q_e^* \leq q_e p_e + p_{1-e}.$

These bounds are assumption-free in the sense that they are valid regardless of the relations between $U$ and $(D,E)$; we refer to these bounds as the “AF bounds.” Replacing $q_0^*$ and $q_1^*$ in (2) with their upper and lower bounds, respectively, gives a lower bound for $\psi^*$. Similarly, replacing $q_0^*$ and $q_1^*$ in (2) with their lower and upper bounds, respectively, gives an upper bound for $\psi^*$. For instance, the implied AF bounds for the causal risk difference $q_1^* - q_0^*$ are

$q_1 p_1 - (q_0 p_0 + p_1) \leq q_1^* - q_0^* \leq (q_1 p_1 + p_0) - q_0 p_0.$

The width of the AF bounds for the causal risk difference is constant and equal to

$\{(q_1 p_1 + p_0) - q_0 p_0\} - \{q_1 p_1 - (q_0 p_0 + p_1)\} = p_0 + p_1 = 1,$

regardless of the observed data distribution. This is not the case for other effect measures, e.g., the causal risk ratio.
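The constant-width property can be checked directly. The following Python sketch (the function name is ours) computes the AF bounds for the causal risk difference from $(p_1, q_1, q_0)$:

```python
def af_bounds_rd(p1, q1, q0):
    """Assumption-free (AF) bounds for the causal risk difference q1* - q0*.

    Applies the bounds implied by (6): the unknown counterfactual
    probabilities are set to their worst-case limits 0 and 1.
    """
    p0 = 1.0 - p1
    lower = q1 * p1 - (q0 * p0 + p1)
    upper = (q1 * p1 + p0) - q0 * p0
    return lower, upper
```

Whatever the inputs, the width `upper - lower` equals $p_0 + p_1 = 1$, as stated above.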

## 4 DV’s bounds

We note that several of the results reviewed below were not presented in the main text of ref. [7], but in its eAppendix. Some of the results were also not stated explicitly, but follow more or less implicitly from the derivations and arguments in that eAppendix.

DV proposed to compute bounds for $q_0^*$, $q_1^*$ and $\psi^*$ by specifying sensitivity parameters that quantify the maximal strength of association between $E$ and $U$, and between $U$ and $D$, respectively. Formally, these parameters are defined as

(7) $\mathrm{RR}_{EU}^{e} = \max_u \dfrac{p(U=u \mid E=e)}{p(U=u \mid E=1-e)}$

and

$\mathrm{RR}_{UD}^{e} = \dfrac{\max_u\{p(D=1 \mid E=e, U=u)\}}{\min_u\{p(D=1 \mid E=e, U=u)\}}.$

Sjölander [19] showed that, given $p_1$, $q_0$ and $q_1$, the parameters $\mathrm{RR}_{EU}^{e}$ and $\mathrm{RR}_{UD}^{e}$ may take any value in the range $[1, \infty)$. Define

(8) $B_{e'e} = \dfrac{\mathrm{RR}_{EU}^{e'} \times \mathrm{RR}_{UD}^{e}}{\mathrm{RR}_{EU}^{e'} + \mathrm{RR}_{UD}^{e} - 1}.$

Ding and VanderWeele [7] showed (Section 5 of their eAppendix) that, given $\mathrm{RR}_{EU}^{e'}$ and $\mathrm{RR}_{UD}^{e}$, $q_e^*$ is bounded by

(9) $q_e(p_e + p_{1-e}/B_{ee}) \leq q_e^* \leq q_e(p_e + p_{1-e} B_{(1-e)e}).$

We refer to the bounds in (9) as the “DV bounds.” By providing guesses of $\mathrm{RR}_{EU}^{e}$ and $\mathrm{RR}_{UD}^{e}$ for $e \in \{0,1\}$, the analyst can use the relation in (9) to compute bounds for $q_0^*$ and $q_1^*$.
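A sketch of the bounding factor (8) and the DV bounds (9); the function names are ours:

```python
def bounding_factor(rr_eu, rr_ud):
    """B_{e'e} of (8), combining RR_EU^{e'} and RR_UD^{e}."""
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1.0)

def dv_bounds(p_e, q_e, rr_eu_e, rr_eu_other, rr_ud_e):
    """DV bounds (9) for q_e*: the lower bound uses B_{ee},
    the upper bound uses B_{(1-e)e}."""
    p_other = 1.0 - p_e
    b_ee = bounding_factor(rr_eu_e, rr_ud_e)
    b_oe = bounding_factor(rr_eu_other, rr_ud_e)
    return q_e * (p_e + p_other / b_ee), q_e * (p_e + p_other * b_oe)
```

With all sensitivity parameters equal to 1 (no confounding), both bounds collapse to $q_e$.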

Notably, the bounds in (9) are not sharp. To see this, note that the upper bound is monotonically increasing in $B_{(1-e)e}$, which in turn is monotonically increasing in $\mathrm{RR}_{EU}^{1-e}$ and $\mathrm{RR}_{UD}^{e}$. When $\mathrm{RR}_{EU}^{1-e}$ and $\mathrm{RR}_{UD}^{e}$ approach infinity, both $B_{(1-e)e}$ and the upper bound for $q_e^*$ approach infinity as well. However, since $q_e^*$ is a probability, it is logically bounded above by 1. In fact, Sjölander [19] showed that the DV bounds in (9) may occasionally be wider than the AF bounds in (6), even when the DV bounds do not exceed the a priori possible range $[0,1]$.

As noted by Sjölander [19], these problems can easily be solved by replacing the DV bounds with the AF bounds whenever the latter are narrower. We thus obtain the modified bounds

(10) $\max\{q_e p_e,\; q_e(p_e + p_{1-e}/B_{ee})\} \leq q_e^* \leq \min\{q_e p_e + p_{1-e},\; q_e(p_e + p_{1-e} B_{(1-e)e})\},$

which we refer to as the “DVS bounds.”
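The truncation in (10) can be layered on top of the DV bounds; a self-contained sketch (names ours):

```python
def dvs_bounds(p_e, q_e, rr_eu_e, rr_eu_other, rr_ud_e):
    """DVS bounds (10): DV bounds (9) truncated by the AF bounds (6) for q_e*."""
    p_other = 1.0 - p_e
    b_ee = rr_eu_e * rr_ud_e / (rr_eu_e + rr_ud_e - 1.0)
    b_oe = rr_eu_other * rr_ud_e / (rr_eu_other + rr_ud_e - 1.0)
    dv_lower = q_e * (p_e + p_other / b_ee)
    dv_upper = q_e * (p_e + p_other * b_oe)
    # truncate by the AF bounds, which can never be exceeded
    return max(q_e * p_e, dv_lower), min(q_e * p_e + p_other, dv_upper)
```

For very large sensitivity parameters the DV upper bound exceeds 1, whereas the DVS upper bound never exceeds the AF limit $q_e p_e + p_{1-e}$.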

As before, replacing $q_0^*$ and $q_1^*$ in (2) with their upper and lower bounds, respectively, gives a lower bound for $\psi^*$. Similarly, replacing $q_0^*$ and $q_1^*$ in (2) with their lower and upper bounds, respectively, gives an upper bound for $\psi^*$. For instance, the implied DVS bounds for the causal risk difference $q_1^* - q_0^*$ are

$q_1^* - q_0^* \geq \max\{q_1 p_1,\; q_1(p_1 + p_0/B_{11})\} - \min\{q_0 p_0 + p_1,\; q_0(p_0 + p_1 B_{10})\},$

$q_1^* - q_0^* \leq \min\{q_1 p_1 + p_0,\; q_1(p_1 + p_0 B_{01})\} - \max\{q_0 p_0,\; q_0(p_0 + p_1/B_{00})\}.$

In the absence of unmeasured confounding, $\mathrm{RR}_{EU}^{e} = \mathrm{RR}_{UD}^{e} = 1$ for $e \in \{0,1\}$, so that $B_{e'e} = 1$ for $(e', e) \in \{0,1\}^2$, $q_e^* = q_e$ for $e \in \{0,1\}$ and $\psi^* = \psi$.

In many situations, one may observe a positive association between the exposure and the outcome ($q_1 > q_0$), and one may wonder how much confounding is required to fully “explain away” this association. VanderWeele and Ding [20] labeled this degree of confounding the “E-value.” Formally, VanderWeele and Ding [20] defined the E-value as the smallest common value $\mathrm{RR}_{EU}^{1} = \mathrm{RR}_{UD}^{0} = \mathrm{RR}_{UD}^{1}$ such that the lower bound for $q_1^*$ in (9) is equal to the upper bound for $q_0^*$ in (9). They showed that the E-value thus defined is equal to $q_1/q_0 + \sqrt{(q_1/q_0)(q_1/q_0 - 1)}$.
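The closed-form E-value can be computed directly from the observed risk ratio; a minimal sketch (function name ours):

```python
import math

def e_value_rr(rr):
    """E-value of VanderWeele and Ding for an observed risk ratio rr = q1/q0 >= 1."""
    return rr + math.sqrt(rr * (rr - 1.0))
```

For example, an observed risk ratio of 2 yields an E-value of $2 + \sqrt{2}$, i.e., confounding associations of roughly 3.4 on the risk ratio scale would be needed to explain the association away.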

Ding and VanderWeele [7] also derived bounds that use sensitivity parameters on the risk difference scale (Section 6 of their eAppendix). However, these bounds are restricted to the causal risk difference, and they require that the confounder $U$ is categorical with a known number of levels. In the special case when $U$ is binary, the bounds are given by

(11) $q_1^* - q_0^* \geq q_1 - q_0 - \mathrm{RD}_{EU}^{\text{binary}}(\mathrm{RD}_{UD}^{0,\text{binary}} p_1 + \mathrm{RD}_{UD}^{1,\text{binary}} p_0)$

(12) $q_1^* - q_0^* \leq q_1 - q_0 + \mathrm{RD}_{EU}^{\text{binary}}(\mathrm{RD}_{UD}^{0,\text{binary}} p_1 + \mathrm{RD}_{UD}^{1,\text{binary}} p_0),$

where the sensitivity parameters $\mathrm{RD}_{EU}^{\text{binary}}$ and $\mathrm{RD}_{UD}^{e,\text{binary}}$ are defined as

$\mathrm{RD}_{EU}^{\text{binary}} = p(U=1 \mid E=1) - p(U=1 \mid E=0)$

and

$\mathrm{RD}_{UD}^{e,\text{binary}} = p(D=1 \mid E=e, U=1) - p(D=1 \mid E=e, U=0).$

We refer to the bounds in (11) and (12) as the “binary DV” bounds.
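A sketch of the binary DV bounds (11)–(12); recall that these are valid only when $U$ is binary (function name ours):

```python
def binary_dv_bounds_rd(p1, q1, q0, rd_eu_bin, rd_ud0_bin, rd_ud1_bin):
    """Binary DV bounds (11)-(12) for the causal risk difference q1* - q0*.

    The bounds are symmetric around the observed risk difference, shifted
    by RD_EU^binary * (RD_UD^{0,binary} p1 + RD_UD^{1,binary} p0).
    """
    p0 = 1.0 - p1
    shift = rd_eu_bin * (rd_ud0_bin * p1 + rd_ud1_bin * p0)
    observed = q1 - q0
    return observed - shift, observed + shift
```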

## 5 Novel bounds based on sensitivity parameters on the risk difference scale

Similar to DV, we propose to compute bounds for $q_0^*$, $q_1^*$ and $\psi^*$ by specifying sensitivity parameters that quantify the maximal strength of association between $E$ and $U$ and between $U$ and $D$. However, our parameters measure these associations on the risk difference scale rather than the risk ratio scale, and our bounds are not restricted to the causal risk difference and do not require the confounder $U$ to be categorical with a known number of levels. We define

$\mathrm{RD}_{EU} = \max_u\{p(E=e \mid U=u)\} - \min_u\{p(E=e \mid U=u)\}$

and

$\mathrm{RD}_{UD}^{e} = \max_u\{p(D=1 \mid E=e, U=u)\} - \min_u\{p(D=1 \mid E=e, U=u)\}.$

If $E$ and $U$ are independent, then $p(E=e \mid U=u) = p(E=e)$, so that $\mathrm{RD}_{EU} = 0$. Conversely, the larger the discrepancy between $\max_u\{p(E=e \mid U=u)\}$ and $\min_u\{p(E=e \mid U=u)\}$, the larger the potential for a strong association between $E$ and $U$. In this sense, $\mathrm{RD}_{EU}$ measures the “maximal association” between $E$ and $U$, on the risk difference scale. In a similar sense, $\mathrm{RD}_{UD}^{e}$ measures the maximal conditional association between $D$ and $U$, given $E$.

The parameter $\mathrm{RD}_{UD}^{e}$ is an obvious analogue to $\mathrm{RR}_{UD}^{e}$ and a natural generalization of $\mathrm{RD}_{UD}^{e,\text{binary}}$. However, the parameter $\mathrm{RD}_{EU}$ differs from $\mathrm{RR}_{EU}^{e}$ and $\mathrm{RD}_{EU}^{\text{binary}}$ in that it compares conditional probabilities of the form $p(E \mid U)$ rather than of the form $p(U \mid E)$. This modification is mainly for technical reasons, to make the calculation of bounds feasible. However, specifying (functions of) $p(E \mid U)$ may also be easier for a subject matter expert than specifying (functions of) $p(U \mid E)$, since the direction of causality goes from $U$ to $E$, not the other way around. We note that $\mathrm{RD}_{EU}$ is not a function of $e$, since

$\max_u\{p(E=1 \mid U=u)\} - \min_u\{p(E=1 \mid U=u)\} = \max_u\{1 - p(E=0 \mid U=u)\} - \min_u\{1 - p(E=0 \mid U=u)\} = -\min_u\{p(E=0 \mid U=u)\} + \max_u\{p(E=0 \mid U=u)\} = \max_u\{p(E=0 \mid U=u)\} - \min_u\{p(E=0 \mid U=u)\}.$

By definition, $\mathrm{RD}_{EU}$ and $\mathrm{RD}_{UD}^{e}$ are restricted to the range $[0,1]$. However, when speculating about plausible values for these parameters, it is also important to know whether the observed data further restrict the range of possible parameter values, or whether the parameters restrict each other. As a simple example of how such restrictions may arise, consider the multinomial distribution. Even though the individual cell probabilities in a multinomial distribution can each be anywhere between 0 and 1, they cannot simultaneously be anywhere between 0 and 1, since they sum to 1. For instance, knowing that one cell probability is equal to 0.9 restricts the other cell probabilities to the range $[0, 0.1]$. In contrast, the mean and the variance of the normal distribution do not impose any such restrictions on each other. The following theorem, which we prove in Appendix A, establishes that the observed data distribution does not restrict the sensitivity parameters, and that the sensitivity parameters do not restrict each other; this property is referred to as “variation independence” [21,22].

## Theorem 1

$\mathrm{RD}_{EU}$, $\mathrm{RD}_{UD}^{0}$ and $\mathrm{RD}_{UD}^{1}$ are variation independent of each other, and of the observed data distribution parameters $p_1$, $q_0$ and $q_1$.

The following theorem, which we prove in Appendix B, shows how our proposed sensitivity parameters can be used to construct bounds for $q_e^*$.

## Theorem 2

Given $(p_e, q_e, \mathrm{RD}_{EU}, \mathrm{RD}_{UD}^{e})$, $q_e^*$ is bounded by

(13) $q_e + \min_{b_e, d_e} \max\left\{\dfrac{a_e d_e}{d_e + p_e}, \dfrac{b_e c_e}{c_e + p_e}\right\} \leq q_e^* \leq q_e + \max_{b_e, d_e} \min\left\{\dfrac{b_e d_e}{d_e + p_e}, \dfrac{a_e c_e}{c_e + p_e}\right\},$

where

$a_e = b_e - \mathrm{RD}_{UD}^{e}, \quad c_e = d_e - \mathrm{RD}_{EU},$

and

$\max(0, \mathrm{RD}_{UD}^{e} - q_e) \leq b_e \leq \min(1 - q_e, \mathrm{RD}_{UD}^{e}), \quad \max(0, \mathrm{RD}_{EU} - p_e) \leq d_e \leq \min(1 - p_e, \mathrm{RD}_{EU}).$

We refer to the bounds in (13) as the SH bounds. Minimizing and maximizing the expressions in (13) with respect to $(b_e, d_e)$ is a constrained optimization problem, which can be solved with standard software, e.g., the optim function in R. However, since unconstrained optimization is computationally simpler than constrained optimization, it may be preferable to transform $(b_e, d_e)$ into unconstrained parameters. One example of such a transformation is given by

$b_e' = \mathrm{logit}\left\{\dfrac{b_e - \min(b_e)}{\max(b_e) - \min(b_e)}\right\}, \quad d_e' = \mathrm{logit}\left\{\dfrac{d_e - \min(d_e)}{\max(d_e) - \min(d_e)}\right\},$

with the inverse relations

$b_e = \mathrm{expit}(b_e')\{\max(b_e) - \min(b_e)\} + \min(b_e), \quad d_e = \mathrm{expit}(d_e')\{\max(d_e) - \min(d_e)\} + \min(d_e),$

where $(b_e', d_e') \in (-\infty, \infty)$, expit is the inverse of the logit function, and the limits $(\min(b_e), \max(b_e), \min(d_e), \max(d_e))$ are given in Theorem 2.
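As an alternative to numerical optimization after the logit transformation, the two-dimensional problem in (13) can also be approximated by a brute-force grid search over $(b_e, d_e)$; the Python sketch below is illustrative only (the function name, grid size, and the small epsilon guarding the boundary case $c_e + p_e = 0$ are ours):

```python
def sh_bounds(p_e, q_e, rd_eu, rd_ud, n=200):
    """Approximate SH bounds (13) for q_e* by grid search over (b_e, d_e).

    An illustrative sketch; the paper instead suggests numerical
    optimization (e.g. R's optim) after transforming to unconstrained
    parameters.
    """
    def grid(lo, hi):
        # limits of b_e and d_e as given in Theorem 2
        if hi <= lo:
            return [lo]
        return [lo + (hi - lo) * i / (n - 1) for i in range(n)]

    eps = 1e-12  # guards the boundary case c_e + p_e = 0
    bs = grid(max(0.0, rd_ud - q_e), min(1.0 - q_e, rd_ud))
    ds = grid(max(0.0, rd_eu - p_e), min(1.0 - p_e, rd_eu))
    lo_term, up_term = float("inf"), float("-inf")
    for b in bs:
        a = b - rd_ud          # a_e <= 0
        for d in ds:
            c = d - rd_eu      # c_e <= 0
            t_lo = max(a * d / (d + p_e + eps), b * c / (c + p_e + eps))
            t_up = min(b * d / (d + p_e + eps), a * c / (c + p_e + eps))
            lo_term = min(lo_term, t_lo)
            up_term = max(up_term, t_up)
    return q_e + lo_term, q_e + up_term
```

At $\mathrm{RD}_{EU} = \mathrm{RD}_{UD}^{e} = 0$ the bounds collapse to $q_e$, and at the value 1 they recover the AF bounds, as discussed below.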

As before, replacing $q_0^*$ and $q_1^*$ in (2) with their upper and lower bounds, respectively, gives a lower bound for $\psi^*$. Similarly, replacing $q_0^*$ and $q_1^*$ in (2) with their lower and upper bounds, respectively, gives an upper bound for $\psi^*$. For instance, the implied SH bounds for the causal risk difference $q_1^* - q_0^*$ are

$q_1^* - q_0^* \geq q_1 + \min_{b_1, d_1} \max\left\{\dfrac{a_1 d_1}{d_1 + p_1}, \dfrac{b_1 c_1}{c_1 + p_1}\right\} - q_0 - \max_{b_0, d_0} \min\left\{\dfrac{b_0 d_0}{d_0 + p_0}, \dfrac{a_0 c_0}{c_0 + p_0}\right\},$

$q_1^* - q_0^* \leq q_1 + \max_{b_1, d_1} \min\left\{\dfrac{b_1 d_1}{d_1 + p_1}, \dfrac{a_1 c_1}{c_1 + p_1}\right\} - q_0 - \min_{b_0, d_0} \max\left\{\dfrac{a_0 d_0}{d_0 + p_0}, \dfrac{b_0 c_0}{c_0 + p_0}\right\}.$

When setting the sensitivity parameters $\mathrm{RD}_{EU}$ and $\mathrm{RD}_{UD}^{e}$ to the extreme value 0 (minimal confounding), the upper and lower limits of $b_e$ and $d_e$ collapse to 0, so that $a_e$ and $c_e$ become equal to 0 as well. Hence, both the lower and upper bounds in (13) become equal to $q_e$, so that the statistical association $\psi$ becomes equal to the causal effect $\psi^*$, as expected. At the other extreme, when setting $\mathrm{RD}_{EU}$ and $\mathrm{RD}_{UD}^{e}$ to 1 (maximal confounding), the lower and upper limits of $b_e$ collapse to $1 - q_e$ and the lower and upper limits of $d_e$ collapse to $1 - p_e$, so that $a_e$ and $c_e$ become equal to $-q_e$ and $-p_e$, respectively. After a little algebra, the bounds in (13) become equal to the AF bounds in (6). This is also expected, since the AF bounds can never be exceeded, regardless of the amount of confounding.

Using the bounds in (13), we may define the E-value analogously to ref. [20]. Specifically, for a positive association between $E$ and $D$ ($q_1 > q_0$), we define the E-value as the smallest common value $\mathrm{RD}_{EU} = \mathrm{RD}_{UD}^{0} = \mathrm{RD}_{UD}^{1}$ such that the lower bound for $q_1^*$ in (13) is equal to the upper bound for $q_0^*$ in (13). This E-value does not have a simple analytic expression, but it can easily be found numerically.

## 6 Reduction of the parameter space

The DVS bounds, the binary DV bounds and the SH bounds all require specification of certain sensitivity parameters, intended to quantify the degree of unmeasured confounding. Specifying such parameters becomes challenging, even for a subject matter expert, if the number of parameters is large. Furthermore, one may often want to vary the sensitivity parameters over plausible ranges, and present the bounds as functions of the parameters within these ranges, to convey the sensitivity of the observed associations to various degrees of confounding. In order for such a sensitivity analysis to be feasible and transparent, the number of sensitivity parameters has to be small.

Computing the DVS bounds for both $q_0^*$ and $q_1^*$ requires specification of four sensitivity parameters ($\mathrm{RR}_{EU}^{0}$, $\mathrm{RR}_{EU}^{1}$, $\mathrm{RR}_{UD}^{0}$ and $\mathrm{RR}_{UD}^{1}$), whereas the binary DV bounds and the SH bounds require only three sensitivity parameters ($\mathrm{RD}_{EU}^{\text{binary}}$, $\mathrm{RD}_{UD}^{0,\text{binary}}$ and $\mathrm{RD}_{UD}^{1,\text{binary}}$ for the binary DV bounds, and $\mathrm{RD}_{EU}$, $\mathrm{RD}_{UD}^{0}$ and $\mathrm{RD}_{UD}^{1}$ for the SH bounds). This may be viewed as a relative advantage of the binary DV bounds and the SH bounds. However, even three sensitivity parameters may be awkward to handle from a practical perspective. To further reduce the dimensionality of the sensitivity parameter space, one may replace $\mathrm{RR}_{EU}^{0}$ and $\mathrm{RR}_{EU}^{1}$ with

$\mathrm{RR}_{EU} = \max(\mathrm{RR}_{EU}^{0}, \mathrm{RR}_{EU}^{1}),$

$\mathrm{RR}_{UD}^{0}$ and $\mathrm{RR}_{UD}^{1}$ with

$\mathrm{RR}_{UD} = \max(\mathrm{RR}_{UD}^{0}, \mathrm{RR}_{UD}^{1}),$

$\mathrm{RD}_{UD}^{0,\text{binary}}$ and $\mathrm{RD}_{UD}^{1,\text{binary}}$ with

$\mathrm{RD}_{UD}^{\text{binary}} = \max(\mathrm{RD}_{UD}^{0,\text{binary}}, \mathrm{RD}_{UD}^{1,\text{binary}}),$

and $\mathrm{RD}_{UD}^{0}$ and $\mathrm{RD}_{UD}^{1}$ with

$\mathrm{RD}_{UD} = \max(\mathrm{RD}_{UD}^{0}, \mathrm{RD}_{UD}^{1}).$

In fact, even though Ding and VanderWeele [7] used the parameters $\mathrm{RR}_{UD}^{0}$, $\mathrm{RR}_{UD}^{1}$, $\mathrm{RD}_{UD}^{0,\text{binary}}$ and $\mathrm{RD}_{UD}^{1,\text{binary}}$ in their derivations (e.g., in Sections 2.2 and 6 of their eAppendix), they replaced these with $\mathrm{RR}_{UD}$ and $\mathrm{RD}_{UD}^{\text{binary}}$ when presenting their final results.

The replacements above reduce the number of sensitivity parameters to two, for the DVS bounds, the binary DV bounds and the SH bounds alike. It is easy to show that the bounds remain valid under these replacements, but that they may become wider. For instance, define $e' = \arg\max_e \{\mathrm{RR}_{UD}^{e}\}$. Replacing $\mathrm{RR}_{UD}^{0}$ and $\mathrm{RR}_{UD}^{1}$ with $\mathrm{RR}_{UD}$ has no effect on the DVS bounds for $q_{e'}^*$, since $\mathrm{RR}_{UD}^{e'} = \mathrm{RR}_{UD}$ by definition. However, the replacement widens the DVS bounds for $q_{1-e'}^*$, and thus also the DVS bounds for any contrast $\psi^*$. This can be seen by noting that $B_{e'e}$ is monotonically increasing in $\mathrm{RR}_{UD}^{e}$ for $e \in \{0,1\}$, and that the lower and upper bounds in (10) are monotonically decreasing and increasing in $B_{ee}$ and $B_{(1-e)e}$, respectively. Similar arguments can be made for the other replacements above.

## 7 Real data example

We use the same example as in ref. [19] to illustrate the theory. The data for this example are borrowed from the studies of Hammond and Horn [23,24,25], who studied the association between smoking and mortality. These authors carried out several different analyses; in particular, they found an extremely strong association between smoking and lung cancer. We focus here on the association between smoking and total mortality, which is more moderate. We provide R code for the analysis in Appendix C.

Hammond and Horn stratified all their analyses on age. We here re-analyze the data from their oldest stratum, which consists of 29,105 subjects who were between 65 and 69 years old at enrollment into the study. Among these, 6,287 subjects reported that they had never smoked. We consider these 6,287 subjects as being unexposed ($E=0$) and the remaining 22,818 subjects as being exposed ($E=1$). During a total follow-up period of 44 months, the number of subjects who died ($D=1$) were 613 and 2,837 among the unexposed and exposed subjects, respectively. These figures give $p_1 = 22{,}818/29{,}105 = 0.784$, $q_1 = 2{,}837/22{,}818 = 0.124$ and $q_0 = 613/6{,}287 = 0.098$.

The risk difference for these data is $q_1 - q_0 = 0.027$; hence, smoking appears to elevate the risk of death during follow-up by 2.7 percentage points. However, this statistical association is only adjusted for (i.e., stratified by) age and may thus be partly or fully explained by unmeasured confounding. To appreciate the maximum possible impact of confounding, we compute the AF bounds for the causal risk difference $q_1^* - q_0^*$, which are equal to $(-0.708, 0.292)$. Hence, without making any assumptions about the degree of unmeasured confounding, the causal risk difference can be as small as $-0.708$ and as large as 0.292.
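The point estimates and AF bounds above can be reproduced from the raw counts; a minimal sketch:

```python
# Hammond and Horn data, oldest age stratum (65-69 years at enrollment)
n_total = 29105
n_unexposed = 6287                  # never-smokers, E = 0
n_exposed = n_total - n_unexposed   # 22,818 smokers, E = 1
deaths_unexposed = 613
deaths_exposed = 2837

p1 = n_exposed / n_total            # P(E = 1)
q1 = deaths_exposed / n_exposed     # P(D = 1 | E = 1)
q0 = deaths_unexposed / n_unexposed # P(D = 1 | E = 0)
p0 = 1.0 - p1

rd = q1 - q0                            # observed risk difference
af_lower = q1 * p1 - (q0 * p0 + p1)     # AF lower bound for q1* - q0*
af_upper = (q1 * p1 + p0) - q0 * p0     # AF upper bound
```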

The AF bounds are very wide. To narrow them down, we may consider a range of plausible values for the degree of unmeasured confounding. The contour plots in the top row of Figure 1 illustrate the DVS lower (left panel) and upper (right panel) bounds for the causal risk difference, as functions of $\log \mathrm{RR}_{EU}$ and $\log \mathrm{RR}_{UD}$. For a common value of the logarithmized sensitivity parameters up to 0.625 (indicated by the dashed lines in the left panel), the lower bound exceeds 0, thus implying the presence of a positive causal effect. The E-value, in terms of $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$, is thus equal to $\exp(0.625) = 1.868$.
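The quoted E-value can be verified numerically. With a common sensitivity value $z$, the relevant bounding factors in (8) reduce to $B = z^2/(2z-1)$, and we can bisect for the smallest $z$ at which the DVS lower bound for the causal risk difference reaches 0. A sketch (variable names ours):

```python
import math

# observed data parameters from the Hammond and Horn example
p1 = 22818 / 29105
q1 = 2837 / 22818
q0 = 613 / 6287
p0 = 1.0 - p1

def dvs_rd_lower(z):
    """DVS lower bound (10) for q1* - q0* at common sensitivity value z."""
    b = z * z / (2.0 * z - 1.0)  # B_{11} = B_{10} when all RR parameters equal z
    lower_q1 = max(q1 * p1, q1 * (p1 + p0 / b))
    upper_q0 = min(q0 * p0 + p1, q0 * (p0 + p1 * b))
    return lower_q1 - upper_q0

# bisection: the lower bound is positive near z = 1 and decreases in z
lo_z, hi_z = 1.000001, 10.0
for _ in range(60):
    mid = 0.5 * (lo_z + hi_z)
    if dvs_rd_lower(mid) > 0.0:
        lo_z = mid
    else:
        hi_z = mid
e_value = 0.5 * (lo_z + hi_z)
```

The bisection lands at $e\_value \approx 1.868$, i.e., $\log(e\_value) \approx 0.625$, matching the dashed lines in Figure 1.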

### Figure 1

Top row: The DVS lower (left panel) and upper (right panel) bounds for the causal risk difference, as functions of $\log \mathrm{RR}_{EU}$ and $\log \mathrm{RR}_{UD}$. Bottom row: The SH lower (left panel) and upper (right panel) bounds for the causal risk difference, as functions of $\mathrm{RD}_{EU}$ and $\mathrm{RD}_{UD}$. The E-values are indicated by the dashed lines in the left panels.

The contour plots in the bottom row of Figure 1 illustrate the SH lower (left panel) and upper (right panel) bounds for the causal risk difference, as functions of $\mathrm{RD}_{EU}$ and $\mathrm{RD}_{UD}$. The E-value, in terms of $\mathrm{RD}_{EU}$ and $\mathrm{RD}_{UD}$, is equal to 0.13; up to this degree of unmeasured confounding, the observed data imply the presence of a positive causal effect.

## 8 Simulation

We carried out a small simulation study to compare the AF, DVS, binary DV and SH bounds. We considered a categorical confounder $U$ with $K$ levels. We generated distributions $p(D,E,U)$ from the model

$r_u \sim \mathrm{Exp}(1) \text{ for } u \in \{0, \ldots, K-1\},$

$p(U=u) = r_u \Big/ \sum_{u'=0}^{K-1} r_{u'},$

$p(E=1 \mid U=u) \sim \mathrm{Unif}(0,1) \text{ for } u \in \{0, \ldots, K-1\},$

$p(D=1 \mid E=e, U=u) \sim \mathrm{Unif}(0,1) \text{ for } e \in \{0,1\},\ u \in \{0, \ldots, K-1\}.$

Technically, $p(U=u)$ has a Dirichlet distribution with $K$ parameters all equal to 1. In this way, the distribution $\{p(U=0), \ldots, p(U=K-1)\}$ is drawn uniformly from the $K$-dimensional unit simplex [26]. We initially considered a binary confounder, i.e., $K=2$. In this special case, $p(U=1)$ and $p(U=0) = 1 - p(U=1)$ are drawn uniformly over the interval $(0,1)$. We generated 10,000 distributions $p(D,E,U)$. For each generated distribution, we computed the AF, DVS, binary DV and SH bounds for the causal risk difference $q_1^* - q_0^*$, and the AF, DVS and SH bounds for the causal risk ratio $q_1^*/q_0^*$. We used the true values of all sensitivity parameters in the computation of the bounds. The simulation was carried out twice, without and with the parameter reduction described in Section 6.
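One simulation replicate can be drawn as follows. The sketch below uses Python's standard library (the paper's own simulations were presumably done in R; function names are ours), and also computes the true sensitivity parameters that serve as inputs to the bounds:

```python
import random

def draw_distribution(K, rng):
    """Draw one p(D,E,U) from the simulation model:
    p(U) ~ Dirichlet(1,...,1), obtained via normalized Exp(1) draws,
    p(E=1|U=u) ~ Unif(0,1) and p(D=1|E=e,U=u) ~ Unif(0,1)."""
    r = [rng.expovariate(1.0) for _ in range(K)]
    total = sum(r)
    p_u = [x / total for x in r]
    p_e1_u = [rng.random() for _ in range(K)]
    p_d1_eu = [[rng.random() for _ in range(K)] for _ in range(2)]
    return p_u, p_e1_u, p_d1_eu

def true_sensitivity_params(p_e1_u, p_d1_eu):
    """RD_EU and RD_UD^e implied by a drawn distribution."""
    rd_eu = max(p_e1_u) - min(p_e1_u)
    rd_ud = [max(row) - min(row) for row in p_d1_eu]
    return rd_eu, rd_ud
```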

The top row of Figure 2 shows the width distribution of the bounds for the causal risk difference (left panel) and risk ratio (right panel), without parameter reduction. We observe that the DVS, binary DV and SH bounds are generally much narrower than the AF bounds. We further observe that the width distributions of the DVS and SH bounds are quite similar. Thus, with correct specification of their respective sensitivity parameters, the DVS and SH bounds seem to be equally informative. The binary DV bounds are generally narrower than both the DVS bounds and the SH bounds for the causal risk difference. This is not surprising, given that the binary DV bounds use the strong additional information/assumption that U is binary, and are only valid under this assumption.

### Figure 2

Width of the bounds for the causal risk difference (left panel) and the causal risk ratio (right panel), without and with parameter reduction (top row and bottom row, respectively).

The bottom row of Figure 2 shows the results with parameter reduction. We observe that the DVS, binary DV and SH bounds are now wider, but still much narrower than the AF bounds, in general. We further observe that the SH bounds tend to be narrower than the DVS bounds. This is expected, since the parameter reduction eliminates two sensitivity parameters from the DVS bounds, but only one sensitivity parameter from the SH bounds. Thus, one would expect that the parameter reduction generally has a stronger influence on the DVS bounds than on the SH bounds.

We finally repeated the simulation for each $K \in \{2, \ldots, 10\}$. For this part of the simulation, we omitted the binary DV bounds, since these only apply when the confounder $U$ is binary. Figure 3 shows the median width of the AF bounds (solid lines), DVS bounds (dashed lines) and SH bounds (dotted lines) for the causal risk difference (left panel) and the causal risk ratio (right panel), when not using the parameter reduction (top row) and using the parameter reduction (bottom row). We observe that the width of the AF bounds is constant for the causal risk difference ($=1$) and appears fairly constant for the causal risk ratio as the number of confounder levels increases. However, the width of the DVS and SH bounds tends to increase with the number of confounder levels, both for the causal risk difference and the risk ratio. This is not surprising, given that the sensitivity parameters for these bounds are defined by contrasting the confounder levels that are most extreme, in the sense that these levels maximize/minimize the exposure and outcome prevalences. The more levels there are, the larger the discrepancy between the most extreme values (in a probabilistic sense), and hence the larger the sensitivity parameters and the wider the bounds.

### Figure 3

Median width of the AF bounds (solid lines), DVS bounds (dashed lines) and SH bounds (dotted lines) for the causal risk difference (left panel) and the causal risk ratio (right panel), without and with parameter reduction (top row and bottom row, respectively), as functions of the number of confounder levels K .

## 9 Conclusions

In this article, we have derived bounds for causal effects, using sensitivity parameters that quantify the degree of unmeasured confounding on the risk difference scale. These bounds can subsequently be used to compute an “E-value,” that is, a minimal amount of confounding required to explain away an observed statistical association. Thus, our work complements and extends the work of Ding and VanderWeele [7], who derived similar bounds using sensitivity parameters on the risk ratio scale.

Our work has important practical implications. When the target causal effect is a risk difference, it may be more natural to use sensitivity parameters on the risk difference scale than on the risk ratio scale. Furthermore, some subject matter experts may find it more intuitive to speculate about the degree of unmeasured confounding on the risk difference scale, regardless of the chosen scale for the target causal effect.

Our proposed bounds are functions of the observed data distribution parameters ( p 1 , q 0 , q 1 ) . In practice, these are not known but have to be estimated from data, which gives estimated bounds. An important extension would thus be to develop methods that take the sampling variability of the estimated bounds into account. This could either be done with analytic arguments, as in ref. [3], or with bootstrap simulation, as in refs [5,6].

Conflict of interest: Authors state no conflict of interest.

## Appendix A Proof of Theorem 1

Let $\{p_1, q_0, q_1, \mathrm{RD}_{EU}, \mathrm{RD}_{UD}^{0}, \mathrm{RD}_{UD}^{1}\}$ be a fixed and arbitrary set of parameter values. To prove Theorem 1, we need to show that it is always possible to construct a valid distribution $p(D,E,U)$ whose sensitivity parameters are equal to $\mathrm{RD}_{EU}$, $\mathrm{RD}_{UD}^{0}$ and $\mathrm{RD}_{UD}^{1}$, and which marginalizes to $\{p_1, q_0, q_1\}$. We construct one such distribution $p(D,E,U)$ in the following steps.

1. Let $U$ be categorical with levels $u_1, \ldots, u_6$, with distribution

$p(U=u_1) = (1-p_1)/3, \quad p(U=u_2) = p_1/3, \quad p(U=u_3) = (1-q_0)/3, \quad p(U=u_4) = q_0/3, \quad p(U=u_5) = (1-q_1)/3, \quad p(U=u_6) = q_1/3.$

This distribution is valid, since $0 \leq p(U=u_i) \leq 1$ for $i \in \{1, \ldots, 6\}$ and $\sum_{i=1}^{6} p(U=u_i) = 1$.

2. Let

$p(E=1 \mid U=u_1) = p_1 - p_1 \mathrm{RD}_{EU}, \quad p(E=1 \mid U=u_2) = p_1 + (1-p_1)\mathrm{RD}_{EU}, \quad p(E=1 \mid U=u_i) = p_1 \text{ for } i \in \{3,4,5,6\}.$

This distribution is valid, since $0 \leq p(E=1 \mid U=u_i) \leq 1$ for $i \in \{1, \ldots, 6\}$. We also have that

$\max_u\{p(E=1 \mid U=u)\} - \min_u\{p(E=1 \mid U=u)\} = p(E=1 \mid U=u_2) - p(E=1 \mid U=u_1) = \mathrm{RD}_{EU}$

and

$p(E=1) = \sum_{i=1}^{6} p(E=1 \mid U=u_i)\, p(U=u_i) = p_1.$

3. Let

$p(D=1 \mid E=0, U=u_3) = q_0 - q_0 \mathrm{RD}_{UD}^{0}, \quad p(D=1 \mid E=0, U=u_4) = q_0 + (1-q_0)\mathrm{RD}_{UD}^{0}, \quad p(D=1 \mid E=0, U=u_i) = q_0 \text{ for } i \in \{1,2,5,6\}.$

This distribution is valid, since 0 p ( D = 1 E = 0 , U = u i ) 1 for i { 1 , , 6 } . We also have that

RD U D 0 = p ( D = 1 E = 0 , U = u 4 ) p ( D = 1 E = 0 , U = u 3 ) = RD U D 0

and

p ( D = 1 E = 0 ) = i = 1 6 p ( D = 1 E = 0 , U = u i ) p ( U = u i ) = q 0 .

4. Let

$$\begin{aligned} p(D=1 \mid E=1, U=u_5) &= q_1 - q_1\,\mathrm{RD}^*_{UD1},\\ p(D=1 \mid E=1, U=u_6) &= q_1 + (1-q_1)\,\mathrm{RD}^*_{UD1},\\ p(D=1 \mid E=1, U=u_i) &= q_1 \quad \text{for } i \in \{1,2,3,4\}. \end{aligned}$$

This distribution is valid, since $0 \leq p(D=1 \mid E=1, U=u_i) \leq 1$ for $i \in \{1, \ldots, 6\}$. We also have that

$$\mathrm{RD}_{UD1} = p(D=1 \mid E=1, U=u_6) - p(D=1 \mid E=1, U=u_5) = \mathrm{RD}^*_{UD1}$$

and

$$p(D=1 \mid E=1) = \sum_{i=1}^6 p(D=1 \mid E=1, U=u_i)\,p(U=u_i \mid E=1) = q_1.$$
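As a sanity check on the construction (a sketch of ours, in Python, not part of the paper), the following builds the six-level distribution of steps 1–4 for arbitrary admissible parameter values and verifies numerically that the induced risk differences equal the targets and that the distribution marginalizes to $(p_1, q_0, q_1)$:

```python
# Observed-data parameters and target sensitivity parameters
# (arbitrary values in [0, 1], chosen for illustration).
p1, q0, q1 = 0.3, 0.2, 0.6
RDEU, RDUD0, RDUD1 = 0.4, 0.25, 0.5

# Step 1: distribution of U over six levels u1, ..., u6.
pU = [(1 - p1) / 3, p1 / 3, (1 - q0) / 3, q0 / 3, (1 - q1) / 3, q1 / 3]
assert abs(sum(pU) - 1) < 1e-12

# Step 2: p(E = 1 | U).
pE1 = [p1 - p1 * RDEU, p1 + (1 - p1) * RDEU, p1, p1, p1, p1]
# Step 3: p(D = 1 | E = 0, U).
qE0 = [q0, q0, q0 - q0 * RDUD0, q0 + (1 - q0) * RDUD0, q0, q0]
# Step 4: p(D = 1 | E = 1, U).
qE1 = [q1, q1, q1, q1, q1 - q1 * RDUD1, q1 + (1 - q1) * RDUD1]

# The induced sensitivity parameters equal the targets ...
assert abs((max(pE1) - min(pE1)) - RDEU) < 1e-12
assert abs((max(qE0) - min(qE0)) - RDUD0) < 1e-12
assert abs((max(qE1) - min(qE1)) - RDUD1) < 1e-12

# ... and the joint distribution marginalizes to (p1, q0, q1),
# with conditional weights p(U = u | E = e) obtained by Bayes' rule.
margE1 = sum(pe * pu for pe, pu in zip(pE1, pU))
margq0 = sum(q * (1 - pe) * pu for q, pe, pu in zip(qE0, pE1, pU)) / (1 - margE1)
margq1 = sum(q * pe * pu for q, pe, pu in zip(qE1, pE1, pU)) / margE1
assert abs(margE1 - p1) < 1e-12
assert abs(margq0 - q0) < 1e-12
assert abs(margq1 - q1) < 1e-12
```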

## Appendix B Proof of Theorem 2

Define $p_{eu} = p(E=e \mid U=u)$, $q_{eu} = p(D=1 \mid E=e, U=u)$, $\delta_{eu} = p_{eu} - p_e$ and $\Delta_{eu} = q_{eu} - q_e$. To prove Theorem 2, we minimize and maximize the counterfactual risk $q_e^*$ under the constraints on the joint distribution $p(D, E, U)$ implied by setting $\{p_e, q_e, \mathrm{RD}_{EU}, \mathrm{RD}_{UDe}\}$ to fixed values. These constraints are as follows:

(14) $E(p_{eU}) = p_e$,

(15) $E(q_{eU} \mid E=e) = q_e$,

(16) $\max_u(q_{eu}) - \min_u(q_{eu}) = \mathrm{RD}_{UDe}$,

(17) $\max_u(p_{eu}) - \min_u(p_{eu}) = \mathrm{RD}_{EU}$.

From (4) we have that

(18) $q_e^* = E(q_{eU}) = E(\Delta_{eU}) + q_e$,

so minimizing/maximizing $q_e^*$ is equivalent to minimizing/maximizing $E(\Delta_{eU})$. We have that

(19) $E(\delta_{eU}) = E(p_{eU}) - p_e = 0$,

where the second equality follows from (14). We also have that

(20) $q_e = E(q_{eU} \mid E=e) = \int q_{eu}\, p(U=u \mid E=e)\, \mathrm{d}u = \int q_{eu}\, p_{eu}\, p(U=u)\, p_e^{-1}\, \mathrm{d}u = p_e^{-1} E(q_{eU}\, p_{eU})$,

where the first equality follows from (15) and the third from Bayes’ rule. We now have that

(21) $E(\Delta_{eU}) = p_e^{-1} E(q_{eU}\, p_e) - p_e^{-1} E(q_{eU}\, p_{eU}) = -p_e^{-1} E(q_{eU}\, \delta_{eU}) = -p_e^{-1} \operatorname{cov}(\Delta_{eU}, \delta_{eU})$,

where the first equality follows from (4) and (20), and the third from (19). Thus, minimizing/maximizing $E(\Delta_{eU})$ is equivalent to maximizing/minimizing the covariance $\operatorname{cov}(\Delta_{eU}, \delta_{eU})$. Define

$$\begin{aligned} a_e &= \min_u(\Delta_{eu}) = \min_u(q_{eu}) - q_e, & b_e &= \max_u(\Delta_{eu}) = \max_u(q_{eu}) - q_e,\\ c_e &= \min_u(\delta_{eu}) = \min_u(p_{eu}) - p_e, & d_e &= \max_u(\delta_{eu}) = \max_u(p_{eu}) - p_e. \end{aligned}$$

For fixed values of $(a_e, b_e, c_e, d_e, E(\Delta_{eU}), E(\delta_{eU}))$, it follows from Theorem 1 of ref. [27], and (19), that

$$\operatorname{cov}(\Delta_{eU}, \delta_{eU}) \geq \max[-\{b_e - E(\Delta_{eU})\}\{d_e - E(\delta_{eU})\}, -\{E(\Delta_{eU}) - a_e\}\{E(\delta_{eU}) - c_e\}] = \max\{-b_e d_e + d_e E(\Delta_{eU}), -a_e c_e + c_e E(\Delta_{eU})\}$$

and

$$\operatorname{cov}(\Delta_{eU}, \delta_{eU}) \leq \min[\{E(\Delta_{eU}) - a_e\}\{d_e - E(\delta_{eU})\}, \{b_e - E(\Delta_{eU})\}\{E(\delta_{eU}) - c_e\}] = \min\{-a_e d_e + d_e E(\Delta_{eU}), -b_e c_e + c_e E(\Delta_{eU})\}.$$
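These covariance bounds can be sanity-checked by simulation. The following sketch (ours, in Python, not from the paper or ref. [27]) draws bounded variables, treats their empirical distribution as the distribution in question, and verifies that the empirical covariance respects the bounds, with generic $X$ and $Y$ playing the roles of $\Delta_{eU}$ and $\delta_{eU}$:

```python
import random

# Monte-Carlo check of the sharp covariance bounds for X in [aX, bX]
# and Y in [aY, bY] with means mX and mY (our paraphrase of the bounds
# of ref. [27] used in the proof):
#   cov(X, Y) <= min{(bX - mX)(mY - aY), (mX - aX)(bY - mY)}
#   cov(X, Y) >= max{-(bX - mX)(bY - mY), -(mX - aX)(mY - aY)}
rng = random.Random(1)
for _ in range(1000):
    n = 50
    xs = [rng.random() for _ in range(n)]
    ys = [rng.random() for _ in range(n)]
    mX, mY = sum(xs) / n, sum(ys) / n
    cov = sum((x - mX) * (y - mY) for x, y in zip(xs, ys)) / n
    aX, bX = min(xs), max(xs)
    aY, bY = min(ys), max(ys)
    upper = min((bX - mX) * (mY - aY), (mX - aX) * (bY - mY))
    lower = max(-(bX - mX) * (bY - mY), -(mX - aX) * (mY - aY))
    assert lower - 1e-12 <= cov <= upper + 1e-12
```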

Combining with (21) gives the following constraints on $E(\Delta_{eU})$:

(22) $E(\Delta_{eU}) \leq -p_e^{-1} \max\{-b_e d_e + d_e E(\Delta_{eU}), -a_e c_e + c_e E(\Delta_{eU})\}$

and

(23) $E(\Delta_{eU}) \geq -p_e^{-1} \min\{-a_e d_e + d_e E(\Delta_{eU}), -b_e c_e + c_e E(\Delta_{eU})\}.$

Consider first the constraint in (22). Let $x_1$ and $x_2$ be the solutions to the equations

$$E(\Delta_{eU}) = -p_e^{-1}\{-b_e d_e + d_e E(\Delta_{eU})\}$$

and

$$E(\Delta_{eU}) = -p_e^{-1}\{-a_e c_e + c_e E(\Delta_{eU})\},$$

which are given by

$$x_1 = \frac{b_e d_e}{d_e + p_e}$$

and

$$x_2 = \frac{a_e c_e}{c_e + p_e}.$$

For fixed values of $(a_e, b_e, c_e, d_e)$ we thus have that

$$E(\Delta_{eU}) \leq \min(x_1, x_2).$$

Consider now the constraint in (23). Let $x_3$ and $x_4$ be the solutions to the equations

$$E(\Delta_{eU}) = -p_e^{-1}\{-a_e d_e + d_e E(\Delta_{eU})\}$$

and

$$E(\Delta_{eU}) = -p_e^{-1}\{-b_e c_e + c_e E(\Delta_{eU})\},$$

which are given by

$$x_3 = \frac{a_e d_e}{d_e + p_e}$$

and

$$x_4 = \frac{b_e c_e}{c_e + p_e}.$$

For fixed values of $(a_e, b_e, c_e, d_e)$ we thus have that

$$E(\Delta_{eU}) \geq \max(x_3, x_4).$$

Accounting for the uncertainty in $(a_e, b_e, c_e, d_e)$ gives that

$$\min_{b_e, d_e}\{\max(x_3, x_4)\} \leq E(\Delta_{eU}) \leq \max_{b_e, d_e}\{\min(x_1, x_2)\}.$$

The outer min/max in this expression are only taken over $(b_e, d_e)$, since it follows from (16) and (17) that $a_e = b_e - \mathrm{RD}_{UDe}$ and $c_e = d_e - \mathrm{RD}_{EU}$. Combining with (18) gives the bounds in Theorem 2.

We end the proof by deriving the limits of $(b_e, d_e)$ given in Theorem 2. Since $q_{eu}$ is a probability, we have that $\min_u(q_{eu}) \geq 0$. Combining with (16) gives that $\max_u(q_{eu}) = \mathrm{RD}_{UDe} + \min_u(q_{eu}) \geq \mathrm{RD}_{UDe}$. Furthermore, since the expected value of a random variable cannot be larger than the maximal value of the variable, it follows from (15) that $\max_u(q_{eu}) \geq q_e$. We thus have that

$$\max_u(q_{eu}) \geq \max(q_e, \mathrm{RD}_{UDe}),$$

so that

$$b_e \geq \max(0, \mathrm{RD}_{UDe} - q_e).$$

Similarly, since $q_{eu}$ is a probability, we have that $\max_u(q_{eu}) \leq 1$. Furthermore, since the expected value of a random variable cannot be smaller than the minimal value of the variable, it follows from (15) that $\min_u(q_{eu}) \leq q_e$. Combining with (16) gives that $\max_u(q_{eu}) = \mathrm{RD}_{UDe} + \min_u(q_{eu}) \leq \mathrm{RD}_{UDe} + q_e$. We thus have that

$$\max_u(q_{eu}) \leq \min(1, \mathrm{RD}_{UDe} + q_e),$$

so that

$$b_e \leq \min(1 - q_e, \mathrm{RD}_{UDe}).$$

By analogous arguments we have that

$$d_e \geq \max(0, \mathrm{RD}_{EU} - p_e)$$

and

$$d_e \leq \min(1 - p_e, \mathrm{RD}_{EU}).$$
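The optimization over $(b_e, d_e)$ can be illustrated with a simple grid search in place of the numerical optimizer that `boundsSHfun` in Appendix C uses for the same purpose. The sketch below (ours, in Python) searches within the limits just derived, with $a_e = b_e - \mathrm{RD}_{UDe}$ and $c_e = d_e - \mathrm{RD}_{EU}$, and returns the implied bounds on $q_e^* = q_e + E(\Delta_{eU})$:

```python
def bounds_sh_grid(pe, qe, RDEU, RDUDe, ngrid=200):
    # Limits for be and de derived at the end of Appendix B:
    #   max(0, RDUDe - qe) <= be <= min(1 - qe, RDUDe)
    #   max(0, RDEU - pe)  <= de <= min(1 - pe, RDEU)
    bmin, bmax = max(RDUDe - qe, 0.0), min(1 - qe, RDUDe)
    dmin, dmax = max(RDEU - pe, 0.0), min(1 - pe, RDEU)
    lo, hi = float("inf"), float("-inf")
    for i in range(ngrid + 1):
        be = bmin + (bmax - bmin) * i / ngrid
        ae = be - RDUDe
        for j in range(ngrid + 1):
            de = dmin + (dmax - dmin) * j / ngrid
            ce = de - RDEU
            if de + pe <= 0 or ce + pe <= 0:
                continue  # numerical guard against division by zero
            x1 = be * de / (de + pe)
            x2 = ae * ce / (ce + pe)
            x3 = ae * de / (de + pe)
            x4 = be * ce / (ce + pe)
            hi = max(hi, min(x1, x2))  # upper limit of E(Delta_eU)
            lo = min(lo, max(x3, x4))  # lower limit of E(Delta_eU)
    return qe + lo, qe + hi
```

For instance, with both sensitivity parameters equal to zero the bounds collapse to the observed risk $q_e$, as they should, while positive sensitivity parameters open up an interval around it.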

## Appendix C R code

```r
#---UTILITY FUNCTIONS---

#Assumption-free bounds c(l0, u0, l1, u1) for the counterfactual risks
boundsAFfun <- function(p0, p1, q0, q1) {
  l0 <- q0*p0
  u0 <- q0*p0+p1
  l1 <- q1*p1
  u1 <- q1*p1+p0
  c(l0, u0, l1, u1)
}

#Bounds based on sensitivity parameters on the risk ratio scale,
#truncated by the assumption-free bounds
boundsDVSfun <- function(p0, p1, q0, q1, RREU0, RREU1, RRUD0, RRUD1) {
  B00 <- RREU0*RRUD0/(RREU0+RRUD0-1)
  B01 <- RREU0*RRUD1/(RREU0+RRUD1-1)
  B10 <- RREU1*RRUD0/(RREU1+RRUD0-1)
  B11 <- RREU1*RRUD1/(RREU1+RRUD1-1)
  l0 <- q0*(p0+p1/B00)
  u0 <- q0*(p0+p1*B10)
  l1 <- q1*(p1+p0/B11)
  u1 <- q1*(p1+p0*B01)
  b <- boundsAFfun(p0, p1, q0, q1)
  c(max(l0, b[1]), min(u0, b[2]), max(l1, b[3]), min(u1, b[4]))
}

#Proposed bounds (Theorem 2), by numerical optimization over (be, de)
boundsSHfun <- function(p0, p1, q0, q1, RDEU, RDUD0, RDUD1) {
  x12 <- function(par, RDUDe, RDEU, pe, bmin, bmax, dmin, dmax) {
    be <- plogis(par[1])*(bmax-bmin)+bmin
    de <- plogis(par[2])*(dmax-dmin)+dmin
    ae <- be-RDUDe
    ce <- de-RDEU
    x1 <- be*de/(de+pe)
    x2 <- ae*ce/(ce+pe)
    x <- c(x1, x2)
    min(x)
  }
  x34 <- function(par, RDUDe, RDEU, pe, bmin, bmax, dmin, dmax) {
    be <- plogis(par[1])*(bmax-bmin)+bmin
    de <- plogis(par[2])*(dmax-dmin)+dmin
    ae <- be-RDUDe
    ce <- de-RDEU
    x3 <- ae*de/(de+pe)
    x4 <- be*ce/(ce+pe)
    x <- c(x3, x4)
    max(x)
  }
  maxit <- 500
  reltol <- sqrt(.Machine$double.eps)
  b0min <- max(-q0+RDUD0, 0)
  b0max <- min(1-q0, RDUD0)
  d0min <- max(-p0+RDEU, 0)
  d0max <- min(1-p0, RDEU)
  opt120 <- optim(par=c(0, 0), fn=x12, RDUDe=RDUD0, RDEU=RDEU, pe=p0,
    bmin=b0min, bmax=b0max, dmin=d0min, dmax=d0max,
    control=list(fnscale=-1, maxit=maxit, reltol=reltol))
  opt340 <- optim(par=c(0, 0), fn=x34, RDUDe=RDUD0, RDEU=RDEU, pe=p0,
    bmin=b0min, bmax=b0max, dmin=d0min, dmax=d0max,
    control=list(maxit=maxit, reltol=reltol))
  b1min <- max(-q1+RDUD1, 0)
  b1max <- min(1-q1, RDUD1)
  d1min <- max(-p1+RDEU, 0)
  d1max <- min(1-p1, RDEU)
  opt121 <- optim(par=c(0, 0), fn=x12, RDUDe=RDUD1, RDEU=RDEU, pe=p1,
    bmin=b1min, bmax=b1max, dmin=d1min, dmax=d1max,
    control=list(fnscale=-1, maxit=maxit, reltol=reltol))
  opt341 <- optim(par=c(0, 0), fn=x34, RDUDe=RDUD1, RDEU=RDEU, pe=p1,
    bmin=b1min, bmax=b1max, dmin=d1min, dmax=d1max,
    control=list(maxit=maxit, reltol=reltol))
  l0 <- q0+opt340$value
  u0 <- q0+opt120$value
  l1 <- q1+opt341$value
  u1 <- q1+opt121$value
  c(l0, u0, l1, u1)
}

#Bounds for the risk difference with a binary confounder
boundsDVbinaryfun <- function(p0, p1, q0, q1, RDEUbinary, RDUD0binary, RDUD1binary) {
  l <- q1-q0-RDEUbinary*(RDUD1binary*p0+RDUD0binary*p1)
  u <- q1-q0+RDEUbinary*(RDUD1binary*p0+RDUD0binary*p1)
  c(l, u)
}

#E-value on the risk ratio scale
EDVfun <- function(q0, q1) q1/q0+sqrt(q1/q0*(q1/q0-1))

#E-value on the risk difference scale: the common value of the
#sensitivity parameters at which the lower bound for the risk
#difference equals 0
ESHfun <- function(p0, p1, q0, q1) {
  rootfun <- function(E) {
    b <- boundsSHfun(p0, p1, q0, q1, E, E, E)
    b[3]-b[2]
  }
  uniroot(f=rootfun, lower=0, upper=0.3)$root
}
```

```r
#---REAL DATA ANALYSIS---

#Data from Hammond and Horn, 1954 and 1958
#Only age-group 65-69
n <- 29105 #From Hammond and Horn 1954, Table 3
nE0 <- 6287 #From Hammond and Horn 1954, Table 3
nE1 <- n-nE0
p0 <- nE0/n
p1 <- 1-p0
nD1 <- 3450 #From Hammond and Horn 1958, Table 2
nD1.E0 <- 613 #From Hammond and Horn 1958, Table 2
nD1.E1 <- nD1-nD1.E0
q0 <- nD1.E0/nE0
q1 <- nD1.E1/nE1
print("p1")
print(p1)
print("q1")
print(q1)
print("q0")
print(q0)
print("Risk difference")
print(q1-q0)
b <- boundsAFfun(p0, p1, q0, q1)
print("Assumption-free bounds")
print(c(b[3]-b[2], b[4]-b[1]))
RREU <- seq(1, 35, 0.1)
RRUD <- seq(1, 35, 0.1)
DVSl <- matrix(nrow=length(RREU), ncol=length(RRUD))
DVSu <- matrix(nrow=length(RREU), ncol=length(RRUD))
for(i in 1:length(RREU)) {
  for(j in 1:length(RRUD)) {
    b <- boundsDVSfun(p0, p1, q0, q1, RREU[i], RREU[i], RRUD[j], RRUD[j])
    DVSl[i, j] <- b[3]-b[2]
    DVSu[i, j] <- b[4]-b[1]
  }
}
EDV <- EDVfun(q0, q1)
print("E-value DV")
print(EDV)
RDEU <- seq(0, 0.3, 0.01)
RDUD <- seq(0, 0.3, 0.01)
SHl <- matrix(nrow=length(RDEU), ncol=length(RDUD))
SHu <- matrix(nrow=length(RDEU), ncol=length(RDUD))
for(i in 1:length(RDEU)) {
  for(j in 1:length(RDUD)) {
    b <- boundsSHfun(p0, p1, q0, q1, RDEU[i], RDUD[j], RDUD[j])
    SHl[i, j] <- b[3]-b[2]
    SHu[i, j] <- b[4]-b[1]
  }
}
ESH <- ESHfun(p0, p1, q0, q1)
print("E-value SH")
print(ESH)
windows()
op <- par(mfrow=c(2, 2))
contour(x=log(RREU), y=log(RRUD), z=DVSl, xlab=expression(logRR[EU]),
  ylab=expression(logRR[UD]), labcex=0.7)
lines(x=c(log(EDV), log(EDV)), y=c(-1, log(EDV)), lty="dashed")
lines(x=c(-1, log(EDV)), y=c(log(EDV), log(EDV)), lty="dashed")
contour(x=log(RREU), y=log(RRUD), z=DVSu, xlab=expression(logRR[EU]),
  ylab=expression(logRR[UD]), labcex=0.7)
contour(x=RDEU, y=RDUD, z=SHl, xlab=expression(RD[EU]),
  ylab=expression(RD[UD]), labcex=0.7)
lines(x=c(ESH, ESH), y=c(-1, ESH), lty="dashed")
lines(x=c(-1, ESH), y=c(ESH, ESH), lty="dashed")
contour(x=RDEU, y=RDUD, z=SHu, xlab=expression(RD[EU]),
  ylab=expression(RD[UD]), labcex=0.7)
par(op)
```

```r
#---SIMULATION---

set.seed(1)
N <- 10000
AF <- matrix(nrow=N, ncol=4)
DVS <- matrix(nrow=N, ncol=4)
SH <- matrix(nrow=N, ncol=4)
DVbinary <- matrix(nrow=N, ncol=2)
DVS2 <- matrix(nrow=N, ncol=4)
SH2 <- matrix(nrow=N, ncol=4)
DVbinary2 <- matrix(nrow=N, ncol=2)
qstar <- matrix(nrow=N, ncol=2)
K <- 2
for(i in 1:N) {
  r <- -log(runif(K))
  r <- r/sum(r)
  p1U <- runif(K)
  p0U <- 1-p1U
  q0U <- runif(K)
  q1U <- runif(K)
  p1 <- sum(p1U*r)
  p0 <- 1-p1
  q0 <- sum(q0U*p0U*r/p0)
  q1 <- sum(q1U*p1U*r/p1)
  qstar[i, ] <- c(sum(q0U*r), sum(q1U*r))
  AF[i, ] <- boundsAFfun(p0, p1, q0, q1)
  RREU0 <- max((p0U*r/p0)/(p1U*r/p1))
  RREU1 <- max((p1U*r/p1)/(p0U*r/p0))
  RRUD0 <- max(q0U)/min(q0U)
  RRUD1 <- max(q1U)/min(q1U)
  DVS[i, ] <- boundsDVSfun(p0, p1, q0, q1, RREU0, RREU1, RRUD0, RRUD1)
  RREU <- max(RREU0, RREU1)
  RRUD <- max(RRUD0, RRUD1)
  DVS2[i, ] <- boundsDVSfun(p0, p1, q0, q1, RREU, RREU, RRUD, RRUD)
  RDEU <- max(p1U)-min(p1U)
  RDUD0 <- max(q0U)-min(q0U)
  RDUD1 <- max(q1U)-min(q1U)
  SH[i, ] <- boundsSHfun(p0, p1, q0, q1, RDEU, RDUD0, RDUD1)
  RDUD <- max(RDUD0, RDUD1)
  SH2[i, ] <- boundsSHfun(p0, p1, q0, q1, RDEU, RDUD, RDUD)
  RDEUbinary <- max(p1U*r/p1-p0U*r/p0)
  DVbinary[i, ] <- boundsDVbinaryfun(p0, p1, q0, q1, RDEUbinary, RDUD0, RDUD1)
  DVbinary2[i, ] <- boundsDVbinaryfun(p0, p1, q0, q1, RDEUbinary, RDUD, RDUD)
}
widthAFRD <- (AF[, 4]-AF[, 1])-(AF[, 3]-AF[, 2])
widthDVSRD <- (DVS[, 4]-DVS[, 1])-(DVS[, 3]-DVS[, 2])
widthDVbinaryRD <- DVbinary[, 2]-DVbinary[, 1]
widthSHRD <- (SH[, 4]-SH[, 1])-(SH[, 3]-SH[, 2])
widthsRD <- data.frame(width=c(widthAFRD, widthDVSRD, widthDVbinaryRD, widthSHRD),
  bounds=c(rep("AF", N), rep("DVS", N), rep("binary DV", N), rep("SH", N)))
widthAFRR <- (AF[, 4]/AF[, 1])-(AF[, 3]/AF[, 2])
widthDVSRR <- (DVS[, 4]/DVS[, 1])-(DVS[, 3]/DVS[, 2])
widthSHRR <- (SH[, 4]/SH[, 1])-(SH[, 3]/SH[, 2])
widthsRR <- data.frame(width=c(widthAFRR, widthDVSRR, widthSHRR),
  bounds=c(rep("AF", N), rep("DVS", N), rep("SH", N)))
widthDVS2RD <- (DVS2[, 4]-DVS2[, 1])-(DVS2[, 3]-DVS2[, 2])
widthDVbinary2RD <- DVbinary2[, 2]-DVbinary2[, 1]
widthSH2RD <- (SH2[, 4]-SH2[, 1])-(SH2[, 3]-SH2[, 2])
widths2RD <- data.frame(width=c(widthAFRD, widthDVS2RD, widthDVbinary2RD, widthSH2RD),
  bounds=c(rep("AF", N), rep("DVS", N), rep("binary DV", N), rep("SH", N)))
widthDVS2RR <- (DVS2[, 4]/DVS2[, 1])-(DVS2[, 3]/DVS2[, 2])
widthSH2RR <- (SH2[, 4]/SH2[, 1])-(SH2[, 3]/SH2[, 2])
widths2RR <- data.frame(width=c(widthAFRR, widthDVS2RR, widthSH2RR),
  bounds=c(rep("AF", N), rep("DVS", N), rep("SH", N)))
windows(width=10)
op <- par(mfrow=c(2, 2))
boxplot(formula=width~bounds, data=widthsRD, ylim=c(0, 1), outline=FALSE)
boxplot(formula=width~bounds, data=widthsRR, ylim=c(0, 12), outline=FALSE)
boxplot(formula=width~bounds, data=widths2RD, ylim=c(0, 1), outline=FALSE)
boxplot(formula=width~bounds, data=widths2RR, ylim=c(0, 12), outline=FALSE)
par(op)
KK <- 2:10
widthAFRD <- vector(length=length(KK))
widthDVSRD <- vector(length=length(KK))
widthSHRD <- vector(length=length(KK))
widthAFRR <- vector(length=length(KK))
widthDVSRR <- vector(length=length(KK))
widthSHRR <- vector(length=length(KK))
widthDVS2RD <- vector(length=length(KK))
widthSH2RD <- vector(length=length(KK))
widthDVS2RR <- vector(length=length(KK))
widthSH2RR <- vector(length=length(KK))
for(j in 1:length(KK)) {
  K <- KK[j]
  AF <- matrix(nrow=N, ncol=4)
  DVS <- matrix(nrow=N, ncol=4)
  SH <- matrix(nrow=N, ncol=4)
  DVS2 <- matrix(nrow=N, ncol=4)
  SH2 <- matrix(nrow=N, ncol=4)
  qstar <- matrix(nrow=N, ncol=2)
  for(i in 1:N) {
    r <- -log(runif(K))
    r <- r/sum(r)
    p1U <- runif(K)
    p0U <- 1-p1U
    q0U <- runif(K)
    q1U <- runif(K)
    p1 <- sum(p1U*r)
    p0 <- 1-p1
    q0 <- sum(q0U*p0U*r/p0)
    q1 <- sum(q1U*p1U*r/p1)
    qstar[i, ] <- c(sum(q0U*r), sum(q1U*r))
    AF[i, ] <- boundsAFfun(p0, p1, q0, q1)
    RREU0 <- max((p0U*r/p0)/(p1U*r/p1))
    RREU1 <- max((p1U*r/p1)/(p0U*r/p0))
    RRUD0 <- max(q0U)/min(q0U)
    RRUD1 <- max(q1U)/min(q1U)
    DVS[i, ] <- boundsDVSfun(p0, p1, q0, q1, RREU0, RREU1, RRUD0, RRUD1)
    RREU <- max(RREU0, RREU1)
    RRUD <- max(RRUD0, RRUD1)
    DVS2[i, ] <- boundsDVSfun(p0, p1, q0, q1, RREU, RREU, RRUD, RRUD)
    RDEU <- max(p1U)-min(p1U)
    RDUD0 <- max(q0U)-min(q0U)
    RDUD1 <- max(q1U)-min(q1U)
    SH[i, ] <- boundsSHfun(p0, p1, q0, q1, RDEU, RDUD0, RDUD1)
    RDUD <- max(RDUD0, RDUD1)
    SH2[i, ] <- boundsSHfun(p0, p1, q0, q1, RDEU, RDUD, RDUD)
  }
  widthAFRD[j] <- median((AF[, 4]-AF[, 1])-(AF[, 3]-AF[, 2]))
  widthDVSRD[j] <- median((DVS[, 4]-DVS[, 1])-(DVS[, 3]-DVS[, 2]))
  widthSHRD[j] <- median((SH[, 4]-SH[, 1])-(SH[, 3]-SH[, 2]))
  widthAFRR[j] <- median((AF[, 4]/AF[, 1])-(AF[, 3]/AF[, 2]))
  widthDVSRR[j] <- median((DVS[, 4]/DVS[, 1])-(DVS[, 3]/DVS[, 2]))
  widthSHRR[j] <- median((SH[, 4]/SH[, 1])-(SH[, 3]/SH[, 2]))
  widthDVS2RD[j] <- median((DVS2[, 4]-DVS2[, 1])-(DVS2[, 3]-DVS2[, 2]))
  widthSH2RD[j] <- median((SH2[, 4]-SH2[, 1])-(SH2[, 3]-SH2[, 2]))
  widthDVS2RR[j] <- median((DVS2[, 4]/DVS2[, 1])-(DVS2[, 3]/DVS2[, 2]))
  widthSH2RR[j] <- median((SH2[, 4]/SH2[, 1])-(SH2[, 3]/SH2[, 2]))
}
windows()
op <- par(mfrow=c(2, 2))
matplot(KK, cbind(widthAFRD, widthDVSRD, widthSHRD), type="l", lty=1:3,
  col="black", ylab="median width", xlab="K")
matplot(KK, cbind(widthAFRR, widthDVSRR, widthSHRR), type="l", lty=1:3,
  col="black", ylab="median width", xlab="K")
matplot(KK, cbind(widthAFRD, widthDVS2RD, widthSH2RD), type="l", lty=1:3,
  col="black", ylab="median width", xlab="K")
matplot(KK, cbind(widthAFRR, widthDVS2RR, widthSH2RR), type="l", lty=1:3,
  col="black", ylab="median width", xlab="K")
par(op)
```

## References

[1] Balke A, Pearl J. Bounds on treatment effects from studies with imperfect compliance. J Am Stat Assoc. 1997;92(439):1171–6.

[2] Zhang JL, Rubin DB. Estimation of causal effects via principal stratification when some outcomes are truncated by death. J Edu Behav Stat. 2003;28(4):353–68.

[3] Cai Z, Kuroki M, Pearl J, Tian J. Bounds on direct effects in the presence of confounded intermediate variables. Biometrics. 2008;64(3):695–701.

[4] Sjölander A. Bounds on natural direct effects in the presence of confounded intermediate variables. Statist Med. 2009;28(4):558–71.

[5] Sjölander A, Lee W, Källberg H, Pawitan Y. Bounds on sufficient-cause interaction. Europ J Epidemiol. 2014;29(11):813–20.

[6] Sjölander A, Lee W, Källberg H, Pawitan Y. Bounds on causal interactions for binary outcomes. Biometrics. 2014;70(3):500–5.

[7] Ding P, VanderWeele TJ. Sensitivity analysis without assumptions. Epidemiology. 2016;27(3):368–77.

[8] Rosenbaum PR, Rubin DB. Assessing sensitivity to an unobserved binary covariate in an observational study with binary outcome. J R Stat Soc B (Methodological). 1983;45(2):212–8.

[9] Yanagawa T. Case-control studies: assessing the effect of a confounding factor. Biometrika. 1984;71(1):191–94.

[10] Imbens GW. Sensitivity to exogeneity assumptions in program evaluation. Am Econ Rev. 2003;93(2):126–32.

[11] Schlesselman JJ. Assessing effects of confounding variables. Am J Epidemiol. 1978;108(1):3–8.

[12] Lin DY, Psaty BM, Kronmal RA. Assessing the sensitivity of regression results to unmeasured confounders in observational studies. Biometrics. 1998;54(3):948–63.

[13] VanderWeele TJ, Arah OA. Bias formulas for sensitivity analysis of unmeasured confounding for general outcomes, treatments, and confounders. Epidemiology. 2011;22(1):42–52.

[14] Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Edu Psychol. 1974;66(5):688–701.

[15] Pearl J. Causality: models, reasoning, and inference. 2nd edition. New York: Cambridge University Press; 2009.

[16] Manski CF. Nonparametric bounds on treatment effects. Am Econ Rev. 1990;80(2):319–23.

[17] Robins JM. The analysis of randomized and non-randomized AIDS treatment trials using a new approach to causal inference in longitudinal studies. In: Sechrest L, Freeman H, Mulley A, editors. Health service research methodology: a focus on AIDS. US Public Health Service, National Center for Health Services Research; 1989. p. 113–59.

[18] Tian J, Pearl J. Probabilities of causation: bounds and identification. Ann Math Artif Intel. 2000;28(1):287–313.

[19] Sjölander A. A note on a sensitivity analysis for unmeasured confounding, and the related E-value. J Causal Infer. 2020;8(1):229–48.

[20] VanderWeele TJ, Ding P. Sensitivity analysis in observational research: introducing the E-value. Ann Internal Med. 2017;167(4):268–74.

[21] Vansteelandt S, VanderWeele TJ, Tchetgen EJ, Robins JM. Multiply robust inference for statistical interactions. J Am Stat Assoc. 2008;103(484):1693–704.

[22] Tchetgen Tchetgen EJ, Robins JM, Rotnitzky A. On doubly robust estimation in a semiparametric odds ratio model. Biometrika. 2010;97(1):171–80.

[23] Hammond EC, Horn D. The relationship between human smoking habits and death rates: a follow-up study of 187,766 men. J Am Med Assoc. 1954;155(15):1316–28.

[24] Hammond EC, Horn D. Smoking and death rates: report on forty-four months of follow-up of 187,783 men. Part I: total mortality. J Am Med Assoc. 1958;166(10):1159–72.

[25] Hammond EC, Horn D. Smoking and death rates: report on forty-four months of follow-up of 187,783 men. Part II: death rates by cause. J Am Med Assoc. 1958;166(11):1294–308.

[26] Kroese DP, Taimre T, Botev ZI. Handbook of Monte Carlo methods. New Jersey: John Wiley & Sons; 2013.

[27] Hössjer O, Sjölander A. Sharp lower and upper bounds for the covariance of bounded random variables. 2021. arXiv:2106.10037.