Yasutaka Chiba

# Abstract

In randomized trials in which two treatment arms are compared with a binary outcome, the causal effect can be identified by assuming that the two treatment arms are exchangeable. In trials with an ordinal outcome with more than two categories, the causal effect can be identified by assuming that the potential outcomes are independent and that the two treatment arms are exchangeable. In this article, we propose a Bayesian approach to causal inference that does not rely on these two assumptions. To this end, we use a randomization-based approach and response types. The likelihood function is derived from the physical randomization, in which subjects who belong to a response type are randomly assigned to the treatment or control, with no modeling assumption on the outcome. Our approach yields not only the posterior distribution of the causal effect but also that of the number of subjects in each response type. The proposed approach is illustrated with two examples from randomized clinical trials.

## 1 Introduction

The main purpose of randomized trials is to draw inferences regarding causal effects. When two treatment arms are compared with a binary outcome, causal effects can be identified by assuming that the two treatment arms are exchangeable. This assumption means that the risk of the event in the treatment arm would have been the same as the risk of the event in the control arm had subjects in the treatment arm been assigned to the control arm [1], [2], and it is often assumed under random assignment. In trials with an ordinal outcome with more than two categories, we can identify causal effects by assuming that the potential outcomes are independent, in addition to assuming that the two treatment arms are exchangeable. The assumption of independent potential outcomes means that the potential outcome if a subject was assigned to the treatment arm is independent of the potential outcome if the subject was assigned to the control arm [3]. In general, it is impossible to identify causal effects under the frequentist approach without making these two assumptions or other strict assumptions. Therefore, in this article, we propose a Bayesian approach to causal inference that relies on neither of these assumptions nor on any modeling assumption. To this end, we apply the randomization-based approach, in which the trial subjects are viewed as a finite population of interest and probabilities arise only through the random assignment [4], [5]. Consequently, we do not require the observed data to be a random sample from an infinite population and do not need a large-sample approximation. We further use response types, where a response type is the pair of potential outcomes for a subject under the treatment and control conditions.

For the case of a binary outcome, Ding and Miratrix [6] developed a Bayesian approach, using the randomization-based approach and response types. Their approach requires that researchers fix the number of subjects who belong to one of four response types. In this article, we discuss not only a binary outcome but also an ordinal outcome with more than two categories. Furthermore, we do not fix the number of subjects who belong to one of the response types. Therefore, the approach proposed here can be regarded as an extension of Ding and Miratrix [6].

In Section 2, we introduce notation and definitions. The Bayesian approach to causal inference is described in Section 3. In Section 4, we illustrate our approach using data from two randomized clinical trials. We conclude the article with a discussion in Section 5.

## 2 Notation and definitions

Throughout this article, $X$ denotes the assigned treatment: $X=1$ if the subject was assigned to the treatment arm and $X=0$ if the subject was assigned to the control arm. $Y$ denotes the ordinal outcome with $J$ categories labeled $0, \ldots, J-1$, where $0$ and $J-1$ represent the worst and best categories, respectively. Furthermore, we denote the potential outcome for a subject with $X=x$ as $Y(x)$ [7], [8], which is the outcome that would occur if the subject were assigned to arm $x$. It is not possible to know the values of both $Y(1)$ and $Y(0)$. If the subject is assigned to the treatment arm ($X=1$), then we observe $Y(1)$ but not $Y(0)$. Conversely, if the subject is assigned to the control arm ($X=0$), then $Y(1)$ is not observed and $Y(0)$ is observed; i.e., $Y = Y(1)X + Y(0)(1-X)$.
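The consistency relation $Y = Y(1)X + Y(0)(1-X)$ can be made concrete with a small sketch. The potential-outcome pairs and assignments below are invented purely for illustration; in a real trial only one member of each pair is ever observed.

```python
# Hypothetical potential outcomes (Y(1), Y(0)) for five subjects.
# These pairs are assumptions for illustration; they are never jointly observed.
potential = [(2, 0), (1, 1), (0, 1), (2, 2), (1, 0)]
assignment = [1, 0, 0, 1, 1]  # hypothetical X for each subject


def observed_outcome(y1, y0, x):
    """Consistency: Y = Y(1) X + Y(0) (1 - X)."""
    return y1 * x + y0 * (1 - x)


observed = [observed_outcome(y1, y0, x)
            for (y1, y0), x in zip(potential, assignment)]
print(observed)  # outcomes seen under the assigned arms
```

Each subject contributes only the potential outcome of the arm actually assigned, which is why the response type $\{Y(1), Y(0)\}$ itself stays unobserved.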

We define the causal parameters for an ordinal outcome in terms of the response type, which is a pair of potential outcomes for a subject under treatment and control conditions; i.e., $\{Y(1), Y(0)\}$. Thus, $\Pr\{Y(1)=k, Y(0)=l\}$ ($k, l = 0, \ldots, J-1$) denotes the proportion of subjects whose potential outcome is $k$ under the treatment condition and $l$ under the control condition.

Recently, some new causal parameters were developed (e.g., [9], [10], [11]). Volfovsky et al. [9] studied the following conditional median as a causal measure for an ordinal outcome:

$$M_l = \operatorname{median}\{Y(1) \mid Y(0) = l\},$$

which yields $J$ conditional medians. There may be some cases in which $M_l > l$ and $M_{l'} < l'$ ($l \ne l'$). In such cases, it is difficult to determine whether the treatment is superior to the control. Although this causal measure may be valuable as a local causal measure, as Lu et al. [10] noted, it is not a direct measure of the treatment effect.

Lu et al. [10] proposed the use of the following two causal parameters:

$$\tau = \Pr\{Y(1) \ge Y(0)\} \quad \text{and} \quad \eta = \Pr\{Y(1) > Y(0)\}.$$

The use of $\tau$ alone may be misleading because $\tau = 1$ (and $\eta = 0$) under the sharp causal null hypothesis [12] that $Y(1) = Y(0)$ for all subjects. Nevertheless, $1-\tau = \Pr\{Y(1) < Y(0)\} = 0$ under the sharp causal null hypothesis. Thus, $1-\tau$ indicates the probability that the control is beneficial over the treatment, whereas $\eta$ indicates the probability that the treatment is beneficial over the control. Owing to the symmetry of the treatment and control labels, Lu et al. [10] considered $\tau$ and $\eta$ equally useful and suggested using both in practice. Certainly, if $\eta = 1-\tau = 0.5$, one cannot conclude that the treatment is more beneficial than the control. Nevertheless, if $\eta = 0.2$ and $1-\tau = 0$, it may be concluded that the treatment is more beneficial than the control.

Chiba [11] proposed the following causal parameter:

$$\theta_x = \Pr\{Y(x) \ge Y(1-x)\} - \Pr\{Y(x) = Y(1-x) = 0\}$$

for $x = 0, 1$, which is similar to, but slightly different from, the causal parameter proposed by Lu et al. [10]. For $x=1$, $\theta_x$ can be interpreted as the proportion of subjects for whom the treatment would not be more harmful than the control; similarly, for $x=0$, it is the proportion of subjects for whom the control would not be more harmful than the treatment. For the case of a binary outcome ($J=2$), which is the simplest ordinal outcome, $\theta_x$ is equivalent to the well-defined causal risk $\Pr\{Y(x)=1\}$, while $\tau$ and $\eta$ are not. Chiba [11] proposed comparing $\theta_1$ and $\theta_0$ as the causal measure, as in the case of a binary outcome.

### Table 1

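$2 \times J$ contingency table constructed from the unobserved numbers $n_{kl,x}$.

| Treatment | Outcome $Y=0$ | … | $Y=j$ | … | $Y=J-1$ | Total |
|---|---|---|---|---|---|---|
| $X=1$ | $\sum_l n_{0l,1}$ | … | $\sum_l n_{jl,1}$ | … | $\sum_l n_{(J-1)l,1}$ | $\sum_k \sum_l n_{kl,1}$ |
| $X=0$ | $\sum_k n_{k0,0}$ | … | $\sum_k n_{kj,0}$ | … | $\sum_k n_{k(J-1),0}$ | $\sum_k \sum_l n_{kl,0}$ |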

Again, Lu et al. [10] suggested using the two symmetric parameters to evaluate the causal effect of treatment. However, to determine whether the treatment is beneficial over the control, it is simpler to use a single causal measure than two parameters. Therefore, we consider the relative treatment effect that can be expressed as follows on the difference scale:

$$\Pr\{Y(1) > Y(0)\} - \Pr\{Y(1) < Y(0)\}, \tag{1}$$

which is equal to $\theta_1 - \theta_0$ and $\eta - (1-\tau)$. This quantity indicates by how much the proportion of subjects for whom the treatment would be more beneficial (or not more harmful) than the control exceeds the proportion of subjects for whom the control would be more beneficial (or not more harmful) than the treatment. Obviously, (1) $> 0$ if the treatment is superior to the control, and (1) $< 0$ if the treatment is inferior to the control. Under the sharp causal null hypothesis, (1) $= 0$. By using the relative treatment effect (1), researchers can clearly assess whether the treatment is beneficial in comparison with the control. We note that (1) can also be expressed as

$$\Pr\{Y(1) > Y(0)\} - \Pr\{Y(1) < Y(0)\} = \sum_{k=1}^{J-1} \sum_{l=0}^{k-1} \left[ \Pr\{Y(1)=k, Y(0)=l\} - \Pr\{Y(1)=l, Y(0)=k\} \right].$$
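As a numerical check of the identities among $\tau$, $\eta$, $\theta_x$, and (1), the following sketch evaluates all of them from a joint distribution of response types. The $3 \times 3$ distribution `P` is hypothetical and chosen only for illustration; the function name `causal_parameters` is likewise an assumption of this sketch.

```python
from fractions import Fraction


def causal_parameters(P):
    """P[k][l] = Pr{Y(1)=k, Y(0)=l}; returns (tau, eta, theta1, theta0, effect)."""
    J = len(P)
    cells = [(k, l) for k in range(J) for l in range(J)]
    tau = sum(P[k][l] for k, l in cells if k >= l)   # Pr{Y(1) >= Y(0)}
    eta = sum(P[k][l] for k, l in cells if k > l)    # Pr{Y(1) >  Y(0)}
    theta1 = tau - P[0][0]                           # theta_x with x = 1
    theta0 = sum(P[k][l] for k, l in cells if l >= k) - P[0][0]  # x = 0
    # Double-sum representation of (1)
    effect = sum(P[k][l] - P[l][k] for k in range(1, J) for l in range(k))
    return tau, eta, theta1, theta0, effect


# Hypothetical joint distribution for J = 3 (entries sum to 1).
P = [[Fraction(x, 20) for x in row] for row in [[3, 1, 1], [4, 2, 1], [5, 2, 1]]]
tau, eta, theta1, theta0, effect = causal_parameters(P)
print(tau, eta, theta1, theta0, effect)
```

Exact fractions are used so that the equalities $\theta_1 - \theta_0 = \eta - (1-\tau) = (1)$ hold without rounding error.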

For a binary outcome ($J=2$), (1) degenerates into $\Pr\{Y(1)=1\} - \Pr\{Y(0)=1\}$. Therefore, under the exchangeability assumption that $\Pr\{Y(x)=j\} = \Pr\{Y(x)=j \mid X=1\} = \Pr\{Y(x)=j \mid X=0\}$ for $x=0,1$ [1], [2], (1) can be identified as $\Pr(Y=1 \mid X=1) - \Pr(Y=1 \mid X=0)$. However, for $J \ge 3$, the exchangeability assumption is not sufficient for (1) to be identified since $\Pr\{Y(1)=k, Y(0)=l\}$ cannot be identified under this assumption alone. In general, to identify (1), we must further make the assumption of independent potential outcomes, that is, that $Y(1)$ is independent of $Y(0)$ [3], or other strict assumptions. Under the assumptions of exchangeability and independent potential outcomes, (1) can be identified as
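For completeness, the binary reduction can be written out. With $J=2$, only the pair $(k,l)=(1,0)$ contributes to the double-sum form of (1); adding and subtracting $\Pr\{Y(1)=1, Y(0)=1\}$ then gives the difference of marginal risks:

$$\begin{aligned}
\Pr\{Y(1)>Y(0)\} - \Pr\{Y(1)<Y(0)\}
 &= \Pr\{Y(1)=1, Y(0)=0\} - \Pr\{Y(1)=0, Y(0)=1\} \\
 &= \bigl[\Pr\{Y(1)=1, Y(0)=0\} + \Pr\{Y(1)=1, Y(0)=1\}\bigr] \\
 &\quad - \bigl[\Pr\{Y(1)=0, Y(0)=1\} + \Pr\{Y(1)=1, Y(0)=1\}\bigr] \\
 &= \Pr\{Y(1)=1\} - \Pr\{Y(0)=1\}.
\end{aligned}$$

No such cancellation into marginal quantities is available once $J \ge 3$, which is why the joint probabilities themselves are needed there.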

$$\sum_{k=1}^{J-1} \sum_{l=0}^{k-1} \left\{ \Pr(Y=k \mid X=1) \Pr(Y=l \mid X=0) - \Pr(Y=l \mid X=1) \Pr(Y=k \mid X=0) \right\}. \tag{2}$$

This is equivalent to the relative treatment effect version of stochastic superiority [13], [14], [15].
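A plug-in estimate of (2) replaces each conditional probability with the corresponding observed proportion. The sketch below does this for the observed counts reported in Tables 3 and 5 of Section 4; the function name `crude_difference` is an assumption of this sketch, not notation from the article.

```python
def crude_difference(m1, m0):
    """Plug-in estimate of (2) from observed counts m1[j] (X=1) and m0[j] (X=0)."""
    n1, n0 = sum(m1), sum(m0)
    J = len(m1)
    total = 0.0
    for k in range(1, J):
        for l in range(k):
            total += (m1[k] / n1) * (m0[l] / n0) - (m1[l] / n1) * (m0[k] / n0)
    return total


binary = crude_difference([33, 6], [27, 13])         # counts from Table 3
ordinal = crude_difference([3, 7, 12], [12, 3, 7])   # counts from Table 5
print(round(binary, 3), round(ordinal, 3))
```

For the binary table the double sum has a single term and reduces to the ordinary risk difference $6/39 - 13/40$.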

In this article, we consider the sample causal effect version of (1). Let $n$ denote the total number of subjects in a sample and $n_{kl}$ denote the unobserved number of subjects with response type $\{Y(1), Y(0)\} = (k, l)$ ($k, l = 0, \ldots, J-1$), where $\sum_{k=0}^{J-1} \sum_{l=0}^{J-1} n_{kl} = n$. The sample causal effect corresponding to (1) can then be expressed as

$$\frac{1}{n} \sum_{k=1}^{J-1} \sum_{l=0}^{k-1} (n_{kl} - n_{lk}). \tag{3}$$
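The sample causal effect (3) is a simple function of the $J \times J$ table of response-type counts. A minimal sketch follows, evaluated on the combination $(n_{00}, n_{01}, n_{10}, n_{11}) = (2, 0, 2, 4)$ that appears in the hypothetical example of Section 3.2; the function name is an assumption of this sketch.

```python
def sample_effect(N):
    """Sample causal effect (3) for a J x J table N[k][l] of response-type counts."""
    J = len(N)
    n = sum(sum(row) for row in N)
    return sum(N[k][l] - N[l][k] for k in range(1, J) for l in range(k)) / n


# Rows index Y(1) = k, columns index Y(0) = l.
print(sample_effect([[2, 0], [2, 4]]))  # (n10 - n01) / n = (2 - 0) / 8
```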

In Section 3, we present a model-free Bayesian approach to the causal inference of (3) that does not rely on the assumptions of exchangeability or independent potential outcomes. In randomized trials, the exchangeability assumption is a standard assumption, and it is often taken for granted. Nevertheless, as Hernán and Robins [2] noted, we are generally unable to determine whether $\Pr\{Y(x)=j \mid X=1\} = \Pr\{Y(x)=j \mid X=0\}$ holds in a sample.

Here, we assume that $n_{kl,x}$ of the $n_{kl}$ subjects are randomly assigned to the arm with $X=x$. We then construct a $2 \times J$ contingency table for the unobserved numbers $n_{kl,x}$, as shown in Table 1. We also construct a $2 \times J$ contingency table for the observed numbers, as shown in Table 2. In this $2 \times J$ contingency table for the sample, $m_{xj}$ is the observed number of subjects in the category with $(X, Y) = (x, j)$. Finally, we define $N \equiv (n_{00}, \ldots, n_{kl}, \ldots, n_{(J-1)(J-1)})$ and $N_x \equiv (n_{00,x}, \ldots, n_{kl,x}, \ldots, n_{(J-1)(J-1),x})$. In the following section, a combination of $(n_{00}, \ldots, n_{kl}, \ldots, n_{(J-1)(J-1)})$ is referred to as "an $N$."

### Table 2

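$2 \times J$ contingency table constructed from the observed numbers $m_{xj}$.

| Treatment | Outcome $Y=0$ | … | $Y=j$ | … | $Y=J-1$ | Total |
|---|---|---|---|---|---|---|
| $X=1$ | $m_{10}$ | … | $m_{1j}$ | … | $m_{1(J-1)}$ | $\sum_j m_{1j}$ |
| $X=0$ | $m_{00}$ | … | $m_{0j}$ | … | $m_{0(J-1)}$ | $\sum_j m_{0j}$ |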

## 3 Bayesian inference of causal effects

In Section 3.1, we present the region of N, in which the likelihood function is nonzero. In Section 3.2, we present a Bayesian approach to make inferences about (3), using the region given in Section 3.1.

### 3.1 Region of N

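The $2 \times J$ contingency table for the unobserved numbers $n_{kl,x}$ (Table 1) corresponds to the table for the observed numbers $m_{xj}$ (Table 2). Therefore, as shown in Chiba [11], $n_{kl}$ must satisfy the following inequality:

$$\sum_{a=1}^{j'} \sum_{b=1}^{j-j'} n_{k_a l_b} \le \sum_{a=1}^{j'} m_{1k_a} + \sum_{b=1}^{j-j'} m_{0l_b}, \tag{4}$$

where $2 \le j \le 2J-1$ and $1 \le j' \le \min\{j-1, J\}$. Here, $k_a$ is an arbitrary outcome category among the $j'$ categories with $X=1$, and the tuple $(k_1, \ldots, k_{j'})$ takes on values in $\{0, \ldots, J-1\}$ with $k_1 \ne \cdots \ne k_{j'}$. Similarly, $l_b$ is an arbitrary outcome category among the $(j-j')$ categories with $X=0$, and the tuple $(l_1, \ldots, l_{j-j'})$ takes on values in $\{0, \ldots, J-1\}$ with $l_1 \ne \cdots \ne l_{j-j'}$. Because the $l_b$ are distinct categories, $j-j' \le J$ as well. Inequality (4) has to hold for all possible choices of $(k_1, \ldots, k_{j'})$ and $(l_1, \ldots, l_{j-j'})$. The right-hand side of (4) is the sum of $m_{xj}$ over $j$ categories, with $j'$ categories for $X=1$ and $(j-j')$ categories for $X=0$. The left-hand side is the sum of $n_{k_a l_b} = n_{k_a l_b,1} + n_{k_a l_b,0}$ over these $j$ categories. For example, for two categories ($j=2$) with one category for $X=1$ ($j'=1$) and one category for $X=0$ ($j-j'=1$),

$$m_{1k_1} + m_{0l_1} = \sum_{b=1}^{J} n_{k_1 l_b,1} + \sum_{a=1}^{J} n_{k_a l_1,0} = n_{k_1 l_1} + \sum_{b=2}^{J} n_{k_1 l_b,1} + \sum_{a=2}^{J} n_{k_a l_1,0} \ge n_{k_1 l_1},$$

where $0 \le k_1, l_1 \le J-1$. This inequality implies

$$\binom{J}{1} \times \binom{J}{1} = J^2$$

inequalities with $n_{00} \le m_{10} + m_{00}, \ldots, n_{kl} \le m_{1k} + m_{0l}, \ldots, n_{(J-1)(J-1)} \le m_{1(J-1)} + m_{0(J-1)}$. Similarly, for three categories ($j=3$) with two categories for $X=1$ ($j'=2$) and one category for $X=0$ ($j-j'=1$),

$$m_{1k_1} + m_{1k_2} + m_{0l_1} = \sum_{b=1}^{J} n_{k_1 l_b,1} + \sum_{b=1}^{J} n_{k_2 l_b,1} + \sum_{a=1}^{J} n_{k_a l_1,0} \ge n_{k_1 l_1} + n_{k_2 l_1},$$

which implies

$$\binom{J}{2} \times \binom{J}{1} = \frac{J^2 (J-1)}{2}$$

inequalities. In the same way, the inequalities that $n_{kl}$ must satisfy are derived for all $j$ satisfying $2 \le j \le 2J-1$ and all $j'$ satisfying $1 \le j' \le \min\{j-1, J\}$. Inequality (4) expresses all of these inequalities in one formula.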
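One way to read (4) computationally is as one inequality per pair of nonempty category subsets, $K$ for $X=1$ and $L$ for $X=0$, excluding the pair in which both subsets contain all $J$ categories (since $j \le 2J-1$). The sketch below follows this reading, which is an interpretation of the distinct-category condition rather than code from the article; all function names are assumptions.

```python
from itertools import combinations


def nonempty_subsets(J):
    """All nonempty subsets of the outcome categories {0, ..., J-1}."""
    cats = range(J)
    return [c for size in range(1, J + 1) for c in combinations(cats, size)]


def constraints(J):
    """Pairs (K, L) indexing the inequalities in (4).

    K holds the j' categories for X=1 and L the (j - j') categories for X=0;
    the pair with both subsets full is skipped because j <= 2J - 1.
    """
    subs = nonempty_subsets(J)
    return [(K, L) for K in subs for L in subs if len(K) + len(L) <= 2 * J - 1]


def satisfies_4(N, m1, m0):
    """Check (4) for a J x J table N[k][l] against observed counts m1[j], m0[j]."""
    for K, L in constraints(len(m1)):
        lhs = sum(N[k][l] for k in K for l in L)
        rhs = sum(m1[k] for k in K) + sum(m0[l] for l in L)
        if lhs > rhs:
            return False
    return True


print(len(constraints(2)), len(constraints(3)))
```

For $J=2$ this produces the eight inequalities of the binary case, and the same function applies unchanged to $J \ge 3$.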

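In the case of a binary outcome ($J=2$), (4) has to hold for $(j, j') = (2,1)$, $(3,1)$, and $(3,2)$, because $2 \le j \le 3$ and $1 \le j' \le \min\{j-1, 2\}$. Then, (4) implies the following three inequalities:

1. $n_{k_1 l_1} \le m_{1k_1} + m_{0l_1}$ for $(j, j') = (2,1)$,

2. $n_{k_1 l_1} + n_{k_1 l_2} \le m_{1k_1} + m_{0l_1} + m_{0l_2}$ ($l_1 \ne l_2$) for $(j, j') = (3,1)$,

3. $n_{k_1 l_1} + n_{k_2 l_1} \le m_{1k_1} + m_{1k_2} + m_{0l_1}$ ($k_1 \ne k_2$) for $(j, j') = (3,2)$,

where $k_1, k_2, l_1, l_2 = 0, 1$; thus,

$$\begin{aligned}
n_{00} &\le m_{10} + m_{00}, & n_{00} + n_{01} &\le m_{10} + m_{00} + m_{01}, \\
n_{01} &\le m_{10} + m_{01}, & n_{10} + n_{11} &\le m_{11} + m_{00} + m_{01}, \\
n_{10} &\le m_{11} + m_{00}, & n_{00} + n_{10} &\le m_{10} + m_{11} + m_{00}, \\
n_{11} &\le m_{11} + m_{01}, & n_{01} + n_{11} &\le m_{10} + m_{11} + m_{01}.
\end{aligned} \tag{5}$$

Consequently, $(n_{00}, n_{01}, n_{10}, n_{11})$ must lie in the region satisfying these eight inequalities and $n_{00} + n_{01} + n_{10} + n_{11} = n$. Using this region of $(n_{00}, n_{01}, n_{10}, n_{11})$, we can derive

$$-\frac{m_{10} + m_{01}}{n} \le \frac{n_{10} - n_{01}}{n} \le \frac{m_{11} + m_{00}}{n},$$

which are the sharp nonparametric bounds of the causal risk difference, stated in previous papers [16], [17]. In general, for a $2 \times J$ contingency table, the region of $N$ is determined as $N \in F$ with

$$F = \left\{ N : \sum_{k=0}^{J-1} \sum_{l=0}^{J-1} n_{kl} = n \ \text{and (4)} \right\}.$$

Even in the case of $J \ge 3$, we can derive the sharp nonparametric bounds of (3) by examining all $N$s in this region numerically [11].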
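For a binary outcome the region $F$ is small enough to enumerate directly. The following sketch lists every $(n_{00}, n_{01}, n_{10}, n_{11})$ satisfying the eight inequalities in (5) and the total-count constraint, then reads off the extreme values of $(n_{10} - n_{01})/n$. It uses the small hypothetical table $(m_{00}, m_{01}, m_{10}, m_{11}) = (2, 2, 1, 3)$ that the article also uses in Section 3.2; the function name is an assumption.

```python
def region_F_binary(m10, m11, m00, m01):
    """Enumerate (n00, n01, n10, n11) summing to n that satisfy the inequalities in (5)."""
    n = m10 + m11 + m00 + m01
    region = []
    # The loop limits enforce the single-cell bounds on n00, n01, and n10.
    for n00 in range(min(n, m10 + m00) + 1):
        for n01 in range(min(n - n00, m10 + m01) + 1):
            for n10 in range(min(n - n00 - n01, m11 + m00) + 1):
                n11 = n - n00 - n01 - n10
                if (n11 <= m11 + m01
                        and n00 + n01 <= m10 + m00 + m01
                        and n10 + n11 <= m11 + m00 + m01
                        and n00 + n10 <= m10 + m11 + m00
                        and n01 + n11 <= m10 + m11 + m01):
                    region.append((n00, n01, n10, n11))
    return region


F = region_F_binary(m10=1, m11=3, m00=2, m01=2)
effects = [(n10 - n01) / 8 for _, n01, n10, _ in F]
print(len(F), min(effects), max(effects))
```

The minimum and maximum coincide with $-(m_{10}+m_{01})/n = -3/8$ and $(m_{11}+m_{00})/n = 5/8$, the sharp bounds given above.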

### 3.2 Proposed Bayesian approach

We assume that the number of subjects assigned to each arm, $\sum_{k=0}^{J-1} \sum_{l=0}^{J-1} n_{kl,x}$, is fixed to the actual assigned number $m_{x+} = \sum_{j=0}^{J-1} m_{xj}$. Then, for an $N$ in the region $N \in F$, the probability that $n_{kl,1}$ of the $n_{kl}$ subjects are randomly assigned to the treatment arm ($X=1$) can be expressed as

$$\prod_{k=0}^{J-1} \prod_{l=0}^{J-1} \binom{n_{kl}}{n_{kl,1}} \bigg/ \binom{n}{m_{1+}}.$$

This can be regarded as a natural extension of the hypergeometric distribution to the response type version. Unfortunately, as we cannot know the value of $n_{kl,1}$, even if $N$ is fixed to a set of values of $(n_{00}, \ldots, n_{kl}, \ldots, n_{(J-1)(J-1)})$ in the region $N \in F$, we cannot calculate this probability from the observed data. However, if we limit $N_1$ to the region $N_1 \in F_1$ with

$$F_1 = \left\{ N_1 : \sum_{l=0}^{J-1} n_{kl,1} = m_{1k} \ \text{and} \ \sum_{k=0}^{J-1} n_{kl,1} = \sum_{k=0}^{J-1} n_{kl} - m_{0l} \right\}, \tag{6}$$

where the set $F_1$ is conditional on $N$, then we can make Bayesian inferences about (3). This region is derived because each category in Table 2 corresponds to that in Table 1. The latter equation is derived from $m_{0l} = \sum_{k=0}^{J-1} n_{kl,0} = \sum_{k=0}^{J-1} (n_{kl} - n_{kl,1})$.
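The region $F_1$ can be enumerated by brute force for small tables: every cell $n_{kl,1}$ ranges over $0, \ldots, n_{kl}$, and only the tables meeting the row and column equalities in (6) are kept. The sketch below is a deliberately simple, exponential-time illustration (not an efficient algorithm), with hypothetical function and argument names.

```python
from itertools import product


def region_F1(N, m1, m0):
    """Enumerate tables N1[k][l] satisfying the two equalities in (6), given N."""
    J = len(N)
    cells = [(k, l) for k in range(J) for l in range(J)]  # row-major order
    # Column targets: sum_k n_{kl,1} = sum_k n_{kl} - m_{0l}
    col_target = [sum(N[k][l] for k in range(J)) - m0[l] for l in range(J)]
    found = []
    for values in product(*(range(N[k][l] + 1) for k, l in cells)):
        N1 = [list(values[k * J:(k + 1) * J]) for k in range(J)]
        rows_ok = all(sum(N1[k]) == m1[k] for k in range(J))
        cols_ok = all(sum(N1[k][l] for k in range(J)) == col_target[l]
                      for l in range(J))
        if rows_ok and cols_ok:
            found.append(N1)
    return found


# N = (n00, n01, n10, n11) = (2, 0, 2, 4) with (m00, m01, m10, m11) = (2, 2, 1, 3)
print(region_F1([[2, 0], [2, 4]], m1=[1, 3], m0=[2, 2]))
```

For this $N$ the only admissible table is $(n_{00,1}, n_{01,1}, n_{10,1}, n_{11,1}) = (1, 0, 1, 2)$.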

Using this region of $N_1$, we can express the likelihood function for $N_1$, $f(N_1 \mid N)$, as

$$f(N_1 \mid N) = \sum_{N_1 \in F_1} \prod_{k=0}^{J-1} \prod_{l=0}^{J-1} \binom{n_{kl}}{n_{kl,1}} \bigg/ \binom{n}{m_{1+}} \tag{7}$$

in the region $N \in F$, and $f(N_1 \mid N) = 0$ outside $N \in F$. After the prior probability $\Pr(N)$ is determined, the posterior probability $\Pr(N \mid N_1)$ is calculated from

$$\Pr(N \mid N_1) = \frac{f(N_1 \mid N) \Pr(N)}{\sum_{N \in F} f(N_1 \mid N) \Pr(N)}.$$

The posterior distribution of (3) can be derived by summing the posterior probability $\Pr(N \mid N_1)$ over all $N$s that yield the same value of (3). For example, the posterior probability of (3) $= 0$ is derived by summing $\Pr(N \mid N_1)$ over all $N$ whose combination of $n_{kl}$ and $n_{lk}$ gives $\sum_{k=1}^{J-1} \sum_{l=0}^{k-1} (n_{kl} - n_{lk})/n = 0$. Similarly, we can derive the posterior distribution of $n_{kl}$, which is the number of subjects who belong to the response type with $\{Y(1), Y(0)\} = (k, l)$. It is important to note that the likelihood is derived entirely from the physical randomization with no modeling assumption on the outcome, and we do not require the assumptions of exchangeability and independent potential outcomes.

To complement the above explanation of the proposed approach, let us consider the simple hypothetical example of a $2 \times 2$ contingency table with $(m_{00}, m_{01}, m_{10}, m_{11}) = (2, 2, 1, 3)$. For this table, there are 60 combinations of $(n_{00}, n_{01}, n_{10}, n_{11})$ that satisfy $n_{00} + n_{01} + n_{10} + n_{11} = 8$ and (4) (i.e., the eight inequalities in (5)). For example, one of the 60 combinations is $(n_{00}, n_{01}, n_{10}, n_{11}) = (2, 0, 2, 4)$, which yields the causal risk difference $(n_{10} - n_{01})/n = (2 - 0)/8 = 0.25$. In other words, the 60 combinations of $(n_{00}, n_{01}, n_{10}, n_{11})$, including $(2, 0, 2, 4)$, are in the region $N \equiv (n_{00}, n_{01}, n_{10}, n_{11}) \in F$, and the other combinations are outside this region. For each combination in the region $N \in F$, we search for combinations of $(n_{00,1}, n_{01,1}, n_{10,1}, n_{11,1})$ satisfying the two equations in (6). For $(n_{00}, n_{01}, n_{10}, n_{11}) = (2, 0, 2, 4)$, there is one such combination, $(n_{00,1}, n_{01,1}, n_{10,1}, n_{11,1}) = (1, 0, 1, 2)$. This implies that only $(1, 0, 1, 2)$ is in the region $N_1 \equiv (n_{00,1}, n_{01,1}, n_{10,1}, n_{11,1}) \in F_1$ for $(n_{00}, n_{01}, n_{10}, n_{11}) = (2, 0, 2, 4)$. Then, we calculate the likelihood $f(N_1 \mid N)$ for $(n_{00}, n_{01}, n_{10}, n_{11}) = (2, 0, 2, 4)$ and $(n_{00,1}, n_{01,1}, n_{10,1}, n_{11,1}) = (1, 0, 1, 2)$ from (7). Similarly, the likelihood is calculated for all combinations of $(n_{00}, n_{01}, n_{10}, n_{11})$, where the likelihood is zero outside $N \in F$. After the prior probability $\Pr(N)$ is determined, the posterior probability $\Pr(N \mid N_1)$ is calculated for the 60 combinations in the region $N \in F$. Under a non-informative prior distribution that assigns equal probability to every combination of $(n_{00}, n_{01}, n_{10}, n_{11})$, the posterior probability for $(n_{00}, n_{01}, n_{10}, n_{11}) = (2, 0, 2, 4)$ is calculated as 0.035. The posterior probability for the causal risk difference $(n_{10} - n_{01})/n = 0.25$ is calculated by summing the posterior probabilities for all combinations of $(n_{00}, n_{01}, n_{10}, n_{11})$ with $(n_{10} - n_{01})/n = 0.25$. After calculating the posterior probabilities for the other values of $(n_{10} - n_{01})/n$, we obtain the posterior distribution of $(n_{10} - n_{01})/n$.
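The hypothetical example can be reproduced with a short enumeration. The sketch below is one possible implementation under a uniform prior over $F$, written for the binary case only; variable and function names are assumptions of this sketch. In the binary case, $N_1$ has a single free cell once the constraints in (6) are imposed, which the inner loop exploits.

```python
from itertools import product
from math import comb

m10, m11, m00, m01 = 1, 3, 2, 2           # hypothetical table from the text
n, m1_plus = m10 + m11 + m00 + m01, m10 + m11


def in_region_F(n00, n01, n10, n11):
    """The eight inequalities in (5)."""
    return (n00 <= m10 + m00 and n01 <= m10 + m01
            and n10 <= m11 + m00 and n11 <= m11 + m01
            and n00 + n01 <= m10 + m00 + m01
            and n10 + n11 <= m11 + m00 + m01
            and n00 + n10 <= m10 + m11 + m00
            and n01 + n11 <= m10 + m11 + m01)


def likelihood(N):
    """Likelihood (7): sum over N1 in F1 of prod C(n_kl, n_kl,1) / C(n, m1+)."""
    n00, n01, n10, n11 = N
    total = 0
    for a in range(min(n00, m10) + 1):     # a = n00,1
        b = m10 - a                        # n01,1 from the first row equality
        c = (n00 + n10 - m00) - a          # n10,1 from the first column equality
        d = m11 - c                        # n11,1 from the second row equality
        if 0 <= b <= n01 and 0 <= c <= n10 and 0 <= d <= n11:
            total += comb(n00, a) * comb(n01, b) * comb(n10, c) * comb(n11, d)
    return total / comb(n, m1_plus)


F = [N for N in product(range(n + 1), repeat=4)
     if sum(N) == n and in_region_F(*N)]
weights = {N: likelihood(N) for N in F}    # uniform prior: posterior ∝ likelihood
norm = sum(weights.values())
posterior = {N: w / norm for N, w in weights.items()}

# Posterior of the causal risk difference (n10 - n01) / n
effect_posterior = {}
for (n00, n01, n10, n11), p in posterior.items():
    key = (n10 - n01) / n
    effect_posterior[key] = effect_posterior.get(key, 0.0) + p

print(len(F), round(posterior[(2, 0, 2, 4)], 3))
```

Running this gives 60 combinations in $F$ and a posterior probability of about 0.035 for $(2, 0, 2, 4)$, in agreement with the values quoted above.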

Finally, we note that Chiba [11] discussed the setting of Bernoulli trials in the context of the frequentist approach. In that setting, the number of subjects who were assigned to an arm was not fixed. Instead, the number depended on the ratio of random assignment. In the context of our Bayesian approach, when the assignment ratio is $1:r$, the likelihood function for $N_1$, $g(N_1 \mid N)$, can be expressed as

$$g(N_1 \mid N) = \sum_{N_1 \in F_1} \prod_{k=0}^{J-1} \prod_{l=0}^{J-1} \binom{n_{kl}}{n_{kl,1}} \left( \frac{1}{1+r} \right)^{n_{kl,1}} \left( \frac{r}{1+r} \right)^{n_{kl} - n_{kl,1}}$$

in the region $N \in F$, and $g(N_1 \mid N) = 0$ outside $N \in F$. Although the likelihood function $g(N_1 \mid N)$ is not equal to $f(N_1 \mid N)$, it is simple to verify that both functions yield the same posterior probability.
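The verification can be sketched in one line. For every $N_1 \in F_1$, the row equalities in (6) give $\sum_k \sum_l n_{kl,1} = m_{1+}$ and hence $\sum_k \sum_l (n_{kl} - n_{kl,1}) = n - m_{1+} = m_{0+}$, so the Bernoulli factors depend on neither $N$ nor $N_1$:

$$g(N_1 \mid N) = \left( \frac{1}{1+r} \right)^{m_{1+}} \left( \frac{r}{1+r} \right)^{m_{0+}} \sum_{N_1 \in F_1} \prod_{k=0}^{J-1} \prod_{l=0}^{J-1} \binom{n_{kl}}{n_{kl,1}} = \left( \frac{1}{1+r} \right)^{m_{1+}} \left( \frac{r}{1+r} \right)^{m_{0+}} \binom{n}{m_{1+}} f(N_1 \mid N).$$

The factor multiplying $f(N_1 \mid N)$ is the same for every $N$, so it cancels between the numerator and the denominator of the posterior probability.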

## 4 Illustration

We will now illustrate our proposed Bayesian approach using data from two randomized clinical trials. We analyze a trial with a binary outcome in Section 4.1 and a trial with an ordinal outcome with three categories in Section 4.2.

### 4.1 Example 1: Trial with a binary outcome

Harms et al. [18] reported the results of a randomized clinical trial of preventive antibacterial therapy for patients who had suffered an acute ischemic stroke. The purpose of this trial was to evaluate the effectiveness of moxifloxacin for preventing post-stroke infections. Seventy-nine patients were randomly assigned to either the moxifloxacin arm (X=1) or the placebo arm (X=0), with an assignment ratio of 1:1. The primary endpoint was infection within 11 days. The results of this trial are summarized in Table 3.

### Table 3

Results from the randomized clinical trial of preventive antibacterial therapy: infection within 11 days.

| Arm | No infection ($Y=0$) | Infection ($Y=1$) | Total |
|---|---|---|---|
| Moxifloxacin ($X=1$) | 33 | 6 | 39 |
| Placebo ($X=0$) | 27 | 13 | 40 |

### Table 4

Estimates calculated from the data in Table 3.

| Measure | Estimate |
|---|---|
| Maximum a posteriori estimator | −13/79 = −0.165 |
| 95 % credible interval | (−23/79, 0/79) = (−0.291, 0.000) |
| Expected a posteriori estimator | −0.152 |
| 95 % highest density region | (−23/79, −1/79) = (−0.291, −0.013) |
| Crude risk difference (2)$^{a}$ | 6/39 − 13/40 = −0.171 |
| Exact 95 % confidence interval$^{a}$ | (−27/79, 2/79) = (−0.342, 0.025) |

$^{a}$ Crude risk difference (2) and the 95 % confidence interval were calculated under the exchangeability assumption.

### Figure 1

Posterior distribution of the causal risk difference (3), derived from data in Table 3.

Figure 1 shows the posterior distribution of (3) in the region $N \in F$, derived by applying the Bayesian approach to the data in Table 3 with a non-informative prior distribution, so that the probability for each $N$ was equal. In Table 4, we show specific estimated statistical measures, such as the maximum a posteriori estimate (MAP), 95 % credible interval (CI), expected a posteriori estimate (EAP), and 95 % highest density region (HDR). As a reference, in this table, we also show the crude risk difference under the exchangeability assumption, i.e., the estimate of $\Pr(Y=1 \mid X=1) - \Pr(Y=1 \mid X=0)$, which is equal to (2) for $J=2$, and the exact 95 % confidence interval [19]. The MAP and EAP estimates were more conservative than the crude risk difference, and the widths of the CI and HDR were narrower than the width of the exact confidence interval.
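Given a discrete posterior distribution of (3), summaries of this kind can be computed as sketched below. The article does not spell out its exact definitions of the credible interval and the HDR, so the conventions used here (an equal-tailed interval, and the range spanned by the most probable values needed to reach the level) are assumptions, as are the function names and the small hypothetical posterior.

```python
from fractions import Fraction


def map_estimate(post):
    """Value with the largest posterior probability."""
    return max(post, key=post.get)


def eap_estimate(post):
    """Posterior mean."""
    return sum(v * p for v, p in post.items())


def equal_tailed_interval(post, level):
    """Trim each tail while the trimmed mass stays at or below (1 - level) / 2."""
    alpha = (1 - level) / 2

    def first_kept(ordered):
        cum = 0
        for v in ordered:
            cum += post[v]
            if cum > alpha:
                return v
        return ordered[-1]

    values = sorted(post)
    return first_kept(values), first_kept(values[::-1])


def highest_density_range(post, level):
    """Range spanned by the most probable values whose mass reaches the level."""
    chosen, mass = [], 0
    for v in sorted(post, key=post.get, reverse=True):
        chosen.append(v)
        mass += post[v]
        if mass >= level:
            break
    return min(chosen), max(chosen)


# Hypothetical posterior on a grid of effect values (probabilities sum to 1).
post = {Fraction(-2, 8): Fraction(1, 20), Fraction(-1, 8): Fraction(3, 20),
        Fraction(0): Fraction(8, 20), Fraction(1, 8): Fraction(6, 20),
        Fraction(2, 8): Fraction(2, 20)}
level = Fraction(19, 20)
print(map_estimate(post), eap_estimate(post))
print(equal_tailed_interval(post, level), highest_density_range(post, level))
```

Exact fractions avoid borderline rounding when the accumulated mass lands exactly on the level. If the selected values are not contiguous, reporting only their range is a simplification.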

We also show the posterior distributions of $n_{kl}$ ($k, l = 0, 1$) in the appendix.

### 4.2 Example 2: Trial with an ordinal outcome with three categories

Fox et al. [20] reported the results of a randomized clinical trial of preventive antiemetic therapy for patients with germ cell tumors or small-cell lung cancer. The purpose of this trial was to evaluate the effectiveness of combining ondansetron (OND) with dexamethasone and chlorpromazine (ODC) for preventing emetic episodes in patients receiving cisplatin. Forty-four patients were randomly assigned to either the ODC (X=1) or OND alone (X=0) arms, with a 1:1 assignment ratio. The responses were classified into three categories: complete response (if no emesis occurred during the study period, the response was classified as complete; Y=2); major response (if one to two emetic episodes occurred during the study period, the response was classified as major; Y=1); and minor or no response (if at least three emetic episodes occurred during the study period, the response was classified as minor; Y=0). The results are summarized in Table 5.

### Table 5

Results from the randomized clinical trial of preventive antiemetic therapy: antiemetic response throughout the study period.

| Arm | Minor or no ($Y=0$) | Major ($Y=1$) | Complete ($Y=2$) | Total |
|---|---|---|---|---|
| ODC$^{a}$ ($X=1$) | 3 | 7 | 12 | 22 |
| OND$^{b}$ ($X=0$) | 12 | 3 | 7 | 22 |

$^{a}$ Combination of ondansetron, dexamethasone, and chlorpromazine.

$^{b}$ Ondansetron alone.

### Table 6

Estimates calculated from the data in Table 5.

| Measure | Estimate |
|---|---|
| Maximum a posteriori estimator | 14/44 = 0.318 |
| 95 % credible interval | (4/44, 23/44) = (0.091, 0.523) |
| Expected a posteriori estimator | 0.313 |
| 95 % highest density region | (4/44, 23/44) = (0.091, 0.523) |
| Crude difference (2)$^{a}$ | 0.382 |
| Exact 95 % confidence interval$^{a}$ | (−5/44, 34/44) = (−0.114, 0.773) |

$^{a}$ Crude difference (2) and the 95 % confidence interval were calculated under the assumption of exchangeability and independent potential outcomes.

### Figure 2

Posterior distribution of the causal effect on the difference scale (3), derived from data in Table 5.

Figure 2 shows the posterior distribution of (3) in the region $N \in F$, derived by applying the Bayesian approach to the data in Table 5 with a non-informative prior distribution, so that the probability for each $N$ was equal. In Table 6, we show the MAP estimate, 95 % CI, EAP estimate, and 95 % HDR. As a reference, in this table, we also show the estimate of (2) for $J=3$ under the assumptions of exchangeability and independent potential outcomes, and the exact 95 % confidence interval [11]. As with the results from Example 1, the MAP and EAP estimates were more conservative than the crude difference (2), and the widths of the CI and HDR were narrower than the width of the exact confidence interval. The differences between the widths were more notable than in Example 1.

We also show the posterior distributions of $n_{kl}$ ($k, l = 0, 1, 2$) in the appendix.

## 5 Discussion

We have developed a new Bayesian method for making causal inferences from the results of randomized trials with ordinal outcomes, including binary outcomes. The advantage of this approach is that it requires neither the assumption of exchangeability, which is often made under random assignment, nor that of independent potential outcomes, which is often made to identify causal effects for $J \ge 3$. We also do not require any modeling assumptions. This advantage holds in comparison with approaches used to make inferences about the other causal measures introduced in Section 2. Volfovsky et al. [9] required a full parametric model and fixed the correlation between $Y(1)$ and $Y(0)$ to make a Bayesian inference of the conditional median $M_l = \operatorname{median}\{Y(1) \mid Y(0) = l\}$. Lu et al. [10] required the exchangeability assumption to derive the closed forms of sharp bounds for $\tau = \Pr\{Y(1) \ge Y(0)\}$ and $\eta = \Pr\{Y(1) > Y(0)\}$. In comparison, our approach does not require these constraints to make a Bayesian inference of (3).

Our approach also has the advantage that it can make an inference about the number of subjects in each response type. Such an inference is potentially useful for a detailed consideration of the characteristics of the treatment in the target population. For example, in the population for Example 1 in Section 4, we can infer that roughly 60 % of subjects might not be infected regardless of whether they received moxifloxacin or the placebo (see Appendix).

One disadvantage of our proposed approach is that the computational effort increases dramatically with the sample size and the number of categories for the outcome. In Example 1 in Section 4 with a $2 \times 2$ contingency table for which the sample size was 79, there were 23,798 $N$s in the region $N \in F$, and we required less than one second to derive the posterior distribution shown in Figure 1. However, in Example 2 with a $2 \times 3$ contingency table, although the sample size was 44, there were 104 million $N$s in the region $N \in F$, and we required 100 minutes to derive Figure 2. Although our approach is feasible for a small sample, it may not be feasible for a large sample, especially for an ordinal outcome with more than two categories. Further studies are required to develop a calculation method with less computational effort, for example, by proposing an efficient algorithm or by deriving an approximation formula. An immediate approach would be to sample the $N$s rather than enumerating all of them, although this could still be difficult as there may be no simple uniform sampler.
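The growth can be gauged from a simple upper bound. Before the inequalities in (4) are applied, the candidates for $N$ are all tables of $J^2$ nonnegative integers summing to $n$, and their number follows from the standard stars-and-bars count. The sketch below evaluates this bound for the two examples; it is not the size of $F$ itself, which is smaller, and the function name is an assumption.

```python
from math import comb


def n_compositions(n, J):
    """Number of nonnegative integer tables (n_kl) with J*J cells summing to n."""
    cells = J * J
    return comb(n + cells - 1, cells - 1)


print(n_compositions(79, 2))  # candidates before applying (4), Example 1
print(n_compositions(44, 3))  # candidates before applying (4), Example 2
```

Moving from four to nine cells increases the candidate count by several orders of magnitude even though the sample is smaller, which matches the jump in computing time reported above.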


Funding statement: This work was supported partially by Grant-in-Aid for Scientific Research (No. 15K00057) from Japan Society for the Promotion of Science.

# Acknowledgment

## Appendix

In this appendix, we present the posterior distributions of components of $N$ derived from the data in Tables 3 and 5. Figure A.1 shows the posterior distributions of $n_{kl}$ ($k, l = 0, 1$) derived from the data in Table 3. The EAP estimates were

$$\begin{pmatrix} n_{00} & n_{01} \\ n_{10} & n_{11} \end{pmatrix} = \begin{pmatrix} 46.3 & 19.0 \\ 7.0 & 6.8 \end{pmatrix};$$

the percentages for all 79 subjects were

$$\begin{pmatrix} 58.6\,\% & 24.0\,\% \\ 8.8\,\% & 8.6\,\% \end{pmatrix}.$$

Figure A.2 shows the posterior distributions of $n_{kl}$ ($k, l = 0, 1, 2$) derived from the data in Table 5. The posterior distributions of $n_{21}$, $n_{11}$, and $n_{22}$ were the same as those of $n_{00}$, $n_{02}$, and $n_{10}$, respectively. This is because, in Table 5, the numbers in the categories with $(X, Y) = (1,0)$, $(1,1)$, and $(1,2)$ are the same as those with $(X, Y) = (0,1)$, $(0,2)$, and $(0,0)$, respectively. Therefore, in Figure A.2, we show the posterior distributions of $n_{21}$, $n_{11}$, and $n_{22}$ in the same panels as those of $n_{00}$, $n_{02}$, and $n_{10}$, respectively. The EAP estimates were

$$\begin{pmatrix} n_{00} & n_{01} & n_{02} \\ n_{10} & n_{11} & n_{12} \\ n_{20} & n_{21} & n_{22} \end{pmatrix} = \begin{pmatrix} 3.2 & 1.6 & 2.5 \\ 6.8 & 2.5 & 4.8 \\ 12.6 & 3.2 & 6.8 \end{pmatrix};$$

the percentages for all 44 subjects were

$$\begin{pmatrix} 7.3\,\% & 3.5\,\% & 5.7\,\% \\ 15.5\,\% & 5.7\,\% & 10.8\,\% \\ 28.7\,\% & 7.3\,\% & 15.5\,\% \end{pmatrix}.$$

### Figure A.1

Posterior distributions of $n_{kl}$ ($k, l = 0, 1$) derived from the data in Table 3; (a) $n_{00}$, (b) $n_{01}$, (c) $n_{10}$, and (d) $n_{11}$.

### Figure A.2

Posterior distributions of $n_{kl}$ ($k, l = 0, 1, 2$) derived from the data in Table 5; (a) $n_{00}$ and $n_{21}$, (b) $n_{01}$, (c) $n_{02}$ and $n_{11}$, (d) $n_{10}$ and $n_{22}$, (e) $n_{12}$, and (f) $n_{20}$.

### References

1. Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding. Int J Epidemiol. 1986;15:413–9. doi:10.1093/ije/15.3.413

2. Hernán MA, Robins JM. Causal inference. Boca Raton: Chapman and Hall/CRC; 2018.

3. Hayden D, Pauler DK, Schoenfeld D. An estimator for treatment comparisons amongst survivors in randomized trials. Biometrics. 2005;61:305–10. doi:10.1111/j.0006-341X.2005.030227.x

4. Rosenbaum P. Observational studies. New York: Springer; 2002. doi:10.1007/978-1-4757-3692-2

5. Rigdon J, Hudgens M. Randomization inference for treatment effects on a binary outcome. Stat Med. 2015;34:924–35. doi:10.1002/sim.6384

6. Ding P, Miratrix LW. Model-free causal inference of binary experimental data. Available at https://arxiv.org/abs/1705.08526.

7. Rubin DB. Bayesian inference for causal effects: the role of randomization. Ann Stat. 1978;6:34–58. doi:10.1214/aos/1176344064

8. Rubin DB. Formal models of statistical inference for causal effects. J Stat Plan Inference. 1990;25:279–92. doi:10.1016/0378-3758(90)90077-8

9. Volfovsky A, Airoldi EM, Rubin DB. Causal inference for ordinal outcomes. Available at https://arxiv.org/abs/1501.01234.

10. Lu J, Ding P, Dasgupta T. Treatment effects on ordinal outcomes: causal estimands and sharp bounds. Available at https://arxiv.org/abs/1507.01542.

11. Chiba Y. Sharp nonparametric bounds and randomization inference for treatment effects on an ordinal outcome. Stat Med. 2017;36:3966–75. doi:10.1002/sim.7400

12. Greenland S. On the logical justification of conditional tests for two-by-two contingency tables. Am Stat. 1992;45:248–51.

13. Klotz JH. The Wilcoxon ties, and the computer. J Am Stat Assoc. 1966;61:772–87. doi:10.1080/01621459.1966.10480904

14. Vargha A, Delaney HD. The Kruskal-Wallis test and stochastic homogeneity. J Educ Behav Stat. 1998;59:137–42.

15. Agresti A. Analysis of ordinal categorical data. 2nd ed. New Jersey: John Wiley and Sons; 2010. doi:10.1002/9780470594001

16. Manski CF. Nonparametric bounds on treatment effects. Am Econ Rev. 1990;80:319–23.

17. Pearl J. Causal inference from indirect experiments. Artif Intell Med. 1995;7:561–82. doi:10.1016/0933-3657(95)00027-3

18. Harms H, Prass K, Meisel C, et al. Preventive antibacterial therapy in acute ischemic stroke: a randomized controlled trial. PLoS ONE. 2008;3:e2158. doi:10.1371/journal.pone.0002158

19. Chiba Y. Exact tests for the weak causal null hypothesis on a binary outcome in randomized trials. J Biometr Biostat. 2015;6:244.

20. Fox SM, Einhorn LH, Cox E, Powell N, Abdy A. Ondansetron versus ondansetron, dexamethasone, and chlorpromazine in the prevention of nausea and vomiting associated with multiple-day cisplatin chemotherapy. J Clin Oncol. 1993;11:2391–5. doi:10.1200/JCO.1993.11.12.2391