Abstract
In causal inference, a variety of causal effect estimands have been studied, including the sample, uncensored, target, conditional, optimal subpopulation, and optimal weighted average treatment effects. Ad hoc methods have been developed for each estimand based on inverse probability weighting (IPW) and on outcome regression modeling, but these may be sensitive to model misspecification, practical violations of positivity, or both. The contribution of this article is twofold. First, we formulate the generalized average treatment effect (GATE) to unify these causal estimands as well as their IPW estimates. Second, we develop a method based on Kernel optimal matching (KOM) to optimally estimate GATE and to find the GATE most easily estimable by KOM, which we term the Kernel optimal weighted average treatment effect. KOM provides uniform control on the conditional mean squared error of a weighted estimator over a class of models while simultaneously controlling for precision. We study its theoretical properties and evaluate its comparative performance in a simulation study. We illustrate the use of KOM for GATE estimation in two case studies: comparing spine surgical interventions and studying the effect of peer support on people living with HIV.
1 Introduction
One of the primary goals of causal inference is to estimate the average causal effect of a treatment or intervention on an outcome under study. A common causal estimand of interest is the sample average treatment effect (SATE), which is the average effect of a treatment on an outcome among all individuals in the sample. Often, however, we may be interested in other averages. For example, Buchanan et al. [1] and Stuart [2] consider the target average treatment effect (TATE) on a population or sample distinct from the study sample and propose the use of inverse probability of sampling weights. Similarly, if outcome data are only available for some units, Cain and Cole [3] and Robins and Finkelstein [4] propose the use of inverse probability of censoring weights to generalize the results to the whole sample. Other estimands of interest focus on particular subgroups of the sample, such as the sample average treatment effect on the treated (SATT), the conditional average treatment effect [5,6], and the complete-case SATE [7]. In particular, Crump et al. [8] propose the optimal SATE (OSATE) and, as in ref. [9], the optimal weighted average treatment effect (OWATE), in which the average treatment effect is restricted to, or weighted by, regions of overlap in covariate distributions to make estimation easier.
Ad hoc methods, such as those based on inverse probability weighting (IPW) [10,11,12,13] and outcome regression modeling, have been widely used to estimate these causal estimands. However, due to their sensitivity to model misspecification, these methods may lead to biased estimates. In addition, IPW-based methods depend heavily on the positivity assumption, and practical violations of this assumption lead to extreme weights and high variance [14,15,16,17]. In Section S2.1 of the Supplementary Material, we discuss these issues in detail, along with related work that aims to overcome them and alternative methodologies for estimating the aforementioned causal estimands.
In this article, we start by presenting a general causal estimand, the generalized average treatment effect (GATE), which unifies all the causal estimands previously presented and motivates the formulation of new ones. We then present and apply Kernel optimal matching (KOM) [18,19] to optimally estimate GATE. KOM provides weights that simultaneously mitigate the possible effect of model misspecification and control for possible practical positivity violations [19]. We do so by minimizing the worst-case conditional mean squared error (CMSE) of the weighted estimator in estimating GATE over the space of weights. The proposed methodology has several attractive characteristics. First, KOM can be used to optimally estimate a variety of well-known causal estimands, as well as to find new ones such as the Kernel optimal weighted average treatment effect (KOWATE). In Section 3.3, we show that various causal estimands can be estimated by simply modifying the optimization problem formulation we give for KOM, which is fed to an off-the-shelf solver. Second, minimizing the worst-case CMSE of the weighted estimator leads to better accuracy, precision, and total error. We show this in our simulation study in Section 4. Third, by optimally balancing covariates, KOM mitigates the effect of possible model misspecification. In Section 4, we show that both the absolute bias and the root mean squared error (RMSE) of the weighted estimator that uses KOM weights are consistently lower across levels of misspecification. Fourth, the weights are obtained by using off-the-shelf solvers for convex-quadratic optimization. Finally, KOM is implemented in an open-source R package.[1]
In Section 2, we introduce notation, specify assumptions, and define GATE, the estimand of interest, and its weighted estimator. We then introduce KOM for GATE, describe its theoretical properties, and present some practical guidelines on its use (Section 3). In Section 4, we present the results of a simulation study aimed at comparing the performance of KOM with IPW, overlap weights, truncated weights, and outcome regression modeling with respect to absolute bias and RMSE across levels of practical positivity violations and levels of misspecification. In Section 5, using real-world data, we apply KOM to the evaluation of the effect of spine surgical interventions on the Oswestry disability index (ODI) among patients with lumbar stenosis or lumbar spondylolisthesis, and to the evaluation of the effect of peer support on CD4 cell count in two target populations of healthier patients. We conclude with some remarks in Section 6.
2 Generalized average treatment effect
We consider an observational study consisting of
where
Examples of causal estimands, the corresponding weights
Estimand | | |
---|---|---|
SATE | | |
SATT | | |
TATE | | |
OWATE | | |
OSATE | | |
Notes:
To estimate GATE in equation (2.1), we propose to use the following weighted estimator:
For instance, the usual IPW estimator for SATE is given by plugging in
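As an illustration of the weighted estimator with IPW weights, the following is a minimal sketch rather than the article's exact formula, which is not reproduced in this text. It assumes a Horvitz-Thompson-style normalization by the sample size and the standard IPW weights for SATE, namely the inverse propensity score for treated units and the inverse of one minus the propensity score for controls. The function names and the toy data are hypothetical.

```python
def ipw_weights_sate(treatment, propensity):
    """Standard IPW weights for SATE: 1/e(x) if treated, 1/(1 - e(x)) if control."""
    return [1.0 / e if t == 1 else 1.0 / (1.0 - e)
            for t, e in zip(treatment, propensity)]


def weighted_difference(outcome, treatment, weights):
    """Weighted treated total minus weighted control total, divided by n.

    Assumption: normalization by the sample size n; the article's exact
    normalization is not shown in this text.
    """
    n = len(outcome)
    treated = sum(w * y for y, t, w in zip(outcome, treatment, weights) if t == 1)
    control = sum(w * y for y, t, w in zip(outcome, treatment, weights) if t == 0)
    return (treated - control) / n


# Toy data (hypothetical): a constant propensity score of 0.5 gives every unit weight 2.
outcome = [3.0, 1.0, 4.0, 2.0]
treatment = [1, 0, 1, 0]
propensity = [0.5, 0.5, 0.5, 0.5]

weights = ipw_weights_sate(treatment, propensity)
estimate = weighted_difference(outcome, treatment, weights)
print(weights, estimate)
```

With a constant propensity score of 0.5 the estimate reduces to the difference in group means. Propensity scores close to 0 or 1 would instead inflate individual weights, which is the practical positivity problem discussed in Section 1.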
2.1 Identification of GATE by weighting
In this section, we provide a general formulation of the weights that make
Assumption 2.1
(Ignorable treatment assignment)
Assumption 2.2
(Ignorable sampling)
Assumption 2.3
(Treatment overlap) The propensity score
Assumption 2.4
(Selection overlap) The sampling probability
Letting
Assumption 2.5
(Honest weights)
Assumption 2.5 requires
In the next lemma, we define the generalized IPW weights,
Lemma 2.1
Define the generalized IPW weights
where
This is a well-known result for SATE [13] and TATE [22,1]. If we assume appropriate bounds on the norms of
In the next section, we introduce KOM for estimating GATE, which, instead of plugging estimated propensities into the weighted estimator, provides weights that minimize the CMSE of
3 Kernel optimal matching for estimating GATE
In this section, we start by decomposing the CMSE of the weighted estimator,
3.1 Decomposing the CMSE of $\hat{\tau}_W$
Recall that, in Section 2, we defined
where
Theorem 3.1
Under consistency, noninterference and Assumptions 2.1–2.5,
In the next section, we show how to find weights that minimize equation (3.1). The main challenge in this task is that the functions
3.2 Worst-case CMSE
To overcome the issue that we do not know the
where
where
is the worst-case imbalance in the
There are many possible ways to choose the seminorm
Given a positive semidefinite (PSD) kernel
Theorem 3.2
Let
where
Based on Theorem 3.2, letting the RKHS given by the kernel
Note that there is freedom to scale this objective. In particular,
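The reason a kernel makes the worst-case imbalance tractable can be checked numerically. The sketch below is a generic illustration, not the article's exact imbalance term: for arbitrary coefficients c_i (hypothetical stand-ins for the difference between the weights being optimized and the target weights), the supremum of sum_i c_i f(X_i) over the unit ball of a reproducing kernel Hilbert space equals sqrt(c' K c), where K is the Gram matrix. The Gaussian kernel, the bandwidth, and all names and data are assumptions made for illustration.

```python
import math
import random


def gaussian_kernel(x, z, bandwidth=1.0):
    """Gaussian (RBF) kernel on scalars."""
    return math.exp(-((x - z) ** 2) / (2.0 * bandwidth ** 2))


def gram_matrix(points, bandwidth=1.0):
    return [[gaussian_kernel(p, q, bandwidth) for q in points] for p in points]


def quadratic_form(left, K, right):
    """Compute left' K right for list-based vectors and matrix."""
    return sum(a * K[i][j] * b
               for i, a in enumerate(left)
               for j, b in enumerate(right))


# Hypothetical covariate values and imbalance coefficients.
points = [-1.0, 0.0, 0.5, 2.0]
coeffs = [0.4, -0.25, 0.1, -0.25]
K = gram_matrix(points)

# Closed form: sup over the RKHS unit ball of sum_i c_i f(x_i) = sqrt(c' K c).
closed_form = math.sqrt(quadratic_form(coeffs, K, coeffs))

# The maximizer f* = sum_i c_i k(x_i, .) / closed_form has unit norm and attains it.
attained = quadratic_form(coeffs, K, coeffs) / closed_form

# Random unit-norm functions f = sum_j b_j k(x_j, .) never exceed the closed form
# (Cauchy-Schwarz in the inner product induced by K).
rng = random.Random(0)
random_imbalances = []
for _ in range(200):
    b = [rng.uniform(-1.0, 1.0) for _ in points]
    norm = math.sqrt(quadratic_form(b, K, b))
    random_imbalances.append(quadratic_form(coeffs, K, b) / norm)

print(closed_form, attained, max(random_imbalances))
```

Because the worst case reduces to a quadratic form in the coefficients, minimizing it over weights is a convex quadratic problem rather than a search over functions.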
3.3 Minimizing the worst-case CMSE
In the previous two sections, we showed that the CMSE of
3.3.1 Fixed $V_{1:n}$
Let
where
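To give a feel for the kind of convex-quadratic problem involved, here is a simplified, single-arm sketch. It is not the article's optimization problem, whose exact form is not reproduced in this text. It assumes fixed uniform target weights 1/n, a Gaussian kernel, weights on the treated units constrained to the simplex, and an objective equal to the squared kernel imbalance plus a penalty `lam` on the squared weights. Projected gradient descent is used only as a dependency-free stand-in for an off-the-shelf quadratic programming solver; all names and data are hypothetical.

```python
import math


def gaussian_kernel(x, z, bandwidth=1.0):
    return math.exp(-((x - z) ** 2) / (2.0 * bandwidth ** 2))


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:
            theta = candidate
    return [max(v_i - theta, 0.0) for v_i in v]


def imbalance_vector(w, treated_idx, target):
    """a_i = (weight if unit i is treated, else 0) minus the target weight."""
    a = [-v for v in target]
    for w_j, i in zip(w, treated_idx):
        a[i] += w_j
    return a


def objective(w, treated_idx, K, target, lam):
    """Squared kernel imbalance a' K a plus a penalty on the squared weights."""
    a = imbalance_vector(w, treated_idx, target)
    n = len(a)
    quad = sum(a[i] * K[i][j] * a[j] for i in range(n) for j in range(n))
    return quad + lam * sum(w_j * w_j for w_j in w)


def balance_weights(x, t, lam=0.01, bandwidth=1.0, iters=500):
    n = len(x)
    K = [[gaussian_kernel(xi, xj, bandwidth) for xj in x] for xi in x]
    treated_idx = [i for i in range(n) if t[i] == 1]
    target = [1.0 / n] * n  # assumed fixed targets (uniform over the sample)
    w = [1.0 / len(treated_idx)] * len(treated_idx)
    # Step size 1/L, with L bounding the largest Hessian eigenvalue via row sums.
    row_sum = max(sum(K[i][j] for j in treated_idx) for i in treated_idx)
    step = 1.0 / (2.0 * (row_sum + lam))
    for _ in range(iters):
        a = imbalance_vector(w, treated_idx, target)
        grad = [2.0 * sum(K[i][j] * a[j] for j in range(n)) + 2.0 * lam * w_j
                for w_j, i in zip(w, treated_idx)]
        w = project_to_simplex([w_j - step * g for w_j, g in zip(w, grad)])
    return w, treated_idx, K, target


# Toy data (hypothetical): treated units are concentrated at larger covariate values.
x = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
t = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
lam = 0.01

weights, treated_idx, K, target = balance_weights(x, t, lam=lam)
uniform = [1.0 / len(treated_idx)] * len(treated_idx)
uniform_value = objective(uniform, treated_idx, K, target, lam)
optimized_value = objective(weights, treated_idx, K, target, lam)
print(weights, uniform_value, optimized_value)
```

In this sketch, changing the target weights changes the estimand while the solver stays the same, which mirrors the point that different estimands are obtained by modifying the problem that is fed to the solver. The penalty `lam` trades imbalance against the spread of the weights.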
3.3.2 Variable $V_{1:n}$
We can also let
When
The solution to the optimization problem (3.6) provides both weights
When we use
Summary of causal estimands, the corresponding of GATE weights
Estimand | | Type | |
---|---|---|---|
SATE | | Fixed ( | |
SATT | | Fixed ( | |
TATE | | Fixed ( | |
OWATE | | Fixed ( | |
KOWATE | | Variable | |
KOSATE | | Variable | |
What target populations are KOWATE and KOSATE choosing? The idea is to pick the subpopulation that is easiest to estimate by KOM. This subpopulation will emphasize areas with better overlap, where overlap is characterized in terms of worst-case moment imbalances as defined by the kernels, rather than in terms of (unknown) propensity scores.
We illustrate this in a simple simulated example described in Figure 1. Specifically, Figure 1 shows scatterplots between two confounders, one on the vertical axis and one on the horizontal axis, weighted by the weights

Weights
When targeting SATE, we consider a fixed
The effect of fusion-plus-laminectomy on ODI
 | SATE | KOSATE | KOWATE | Unadjusted |
---|---|---|---|---|
Estimate (SE) | 1.33 (3.98) | 2.54 (2.56) | 3.03 (2.38) | 5.09* (2.31) |
3.4 Consistency
In this section, we study the consistency of the proposed weighted estimator with respect to the true causal estimand GATE (for
Theorem 3.3
Suppose
The aforementioned theorem shows that for any GATE estimand, under appropriate assumptions, the KOM estimate is root-
The assumption about the kernel can be automatically satisfied by using a bounded kernel, such as the Gaussian or Matérn kernels. The assumption of
To apply Theorem 3.3 to the case of variable
4 Simulations
In this section, we present the results of a simulation study aimed at comparing KOM with IPW, overlap weights, truncated IPW, and outcome regression modeling in estimating GATE with respect to absolute bias and root MSE, across levels of practical positivity violations and across levels of misspecification. In summary, KOM showed a consistently low absolute bias and RMSE across all of the considered scenarios.
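For reference, the two performance metrics can be computed from a set of simulated estimates as in the sketch below. It assumes that absolute bias means the absolute value of the average estimate minus the true effect, and that RMSE is the square root of the average squared error over replications. The estimates and the true effect shown are hypothetical and are not results from the article.

```python
import math


def absolute_bias(estimates, truth):
    """Absolute difference between the mean estimate and the true effect."""
    return abs(sum(estimates) / len(estimates) - truth)


def rmse(estimates, truth):
    """Root mean squared error of the estimates around the true effect."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


# Hypothetical estimates from four simulation replications.
estimates = [1.8, 2.1, 2.4, 1.9]
truth = 2.0

bias_value = absolute_bias(estimates, truth)
rmse_value = rmse(estimates, truth)
print(bias_value, rmse_value)
```

The RMSE combines bias and variability, so a method can have low absolute bias and still a high RMSE when a few extreme weights make its estimates unstable.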
4.1 Setup
We considered a sample size of
4.1.1 Estimating GATE across levels of practical positivity violation
To evaluate the performance of the proposed methodology across levels of practical positivity violation, we let
4.1.2 Estimating GATE across levels of misspecification
We also evaluated the performance of the proposed methodology across levels of misspecification. To do so, we used the variables
4.2 Results
In this section, we discuss the results of our simulation study. In summary, KOM outperformed IPW, overlap, truncated weights, and outcome regression modeling with respect to absolute bias and RMSE in estimating GATE across levels of practical positivity violation under both moderate and strong misspecification.
4.2.1 Results across levels of practical positivity violations and model misspecification
Reference [19] presented KOM for SATE. The authors showed that KOM outperformed IPW, truncated IPW, propensity score matching, regression adjustment, CBPS, and SBW with respect to bias and MSE across most of the considered levels of practical positivity violation and considered scenarios. In addition, the authors showed that KOM for SATE outperformed the other methods especially under strong practical positivity violation. Figure 2 shows the absolute bias (left panels) and RMSE (right panels) of SATE estimated by using KOM (KOM-SATE; solid-black), KOSATE estimated by using KOM (solid-dark-gray), KOWATE estimated by using KOM (solid-light-gray), SATE estimated by using IPW (long-dashed-black), OSATE estimated by using truncated weights (long-dashed-dark-gray), OWATE estimated by using overlap weights (long-dashed-light-gray), SATE estimated by using outcome regression modeling (OM; dotted-black), and a simple mean difference (Crude; dotted-light-gray), with estimated optimal

(Estimated optimal
5 Application case studies
In this section, we present an empirical application of the proposed methodology. We apply KOM in the evaluation of two spine surgical interventions. In addition, in a second empirical application presented in Section S5 of the Supplementary Material, we apply KOM in the evaluation of peer support on CD4 cell count at 12 months after trial recruitment among patients affected by HIV, in two target populations where patients were healthier compared to those of the trial population.
5.1 The effect of fusion-plus-laminectomy on ODI
In this section, we apply KOM in the evaluation of two spine surgical interventions, laminectomy alone versus fusion-plus-laminectomy, on the Oswestry Disability Index (ODI), among patients with lumbar stenosis or lumbar spondylolisthesis. Briefly, lumbar stenosis is caused by the narrowing of the space around the spinal cord in the lumbar spine [23]. Lumbar spondylolisthesis is caused by the slippage of one vertebra on another. These pathologies lead to low back and leg pain, ultimately limiting the quality of life of affected patients [24]. When these pathologies can no longer be controlled by medications or physical therapy, surgical interventions may be needed. Typically, patients with lumbar stenosis are treated with laminectomy alone, while those with lumbar spondylolisthesis are treated with fusion-plus-laminectomy [23,25,26]. In addition, laminectomy alone is performed on patients with leg pain, while fusion-plus-laminectomy is performed on patients with mechanical back pain [23]. This surgical practice leads to a practical positivity violation.
Unlike other medical areas, where randomized controlled trials are the gold standard for evaluating interventions, randomized controlled trials are rarely used to evaluate surgical interventions. This is due to practical and methodological issues [27]. Lately, a number of large real-world datasets have collected information about surgical interventions and outcomes. However, these datasets are purely observational, and confounding must be carefully taken into account. Furthermore, the assumption of correct model specification is hardly ever met. To overcome these challenges, in this section, we evaluate the effect of fusion-plus-laminectomy on ODI by estimating SATE, KOWATE, and KOSATE using KOM.
5.1.1 Study population
We used data from a single-institutional subset of the Spine QOD registry [28]. QOD was launched in 2012 with the goal of evaluating the effectiveness of spine surgery interventions in improving quality of life, pain, and disability. The registry contains clinical and demographic information as well as patient-reported outcomes. We restrict our study to patients who had their first spine surgery intervention, i.e., primary surgery. Demographic and clinical information was collected at the patient interview that took place before the surgical intervention. The outcome under study, ODI, was collected at 3-month follow-up. The study subset was composed of 313 patients. Two hundred forty-nine (79%) received laminectomy alone and 64 (21%) received fusion-plus-laminectomy. We identified the following variables as potential confounders: biological sex (female vs male), lumbar stenosis (yes vs no), lumbar spondylolisthesis (yes vs no), back pain (score from 0 to 10), leg pain (score from 0 to 10), activity at home (yes vs no), and activity outside home (yes vs no). As previously described, spine surgical practice may lead to a practical violation of the positivity assumption. For example, in our subset, less than 1% of patients with low-to-moderate leg pain were treated with fusion-plus-laminectomy.
5.1.2 Models setup
We estimate SATE by solving optimization problem (3.3) with
5.1.3 Results
In this section, we present the results of our analysis. Previous randomized trials showed no statistically significant difference between laminectomy alone and fusion-plus-laminectomy on ODI [31,32]. The proposed methodology consistently showed results similar to those of [31,32]. Specifically, Table 3 shows point estimates and standard errors with respect to SATE, KOWATE, and KOSATE. While the unadjusted method, i.e., the naive method regressing the outcome on the treatment only, shows a significant effect of fusion-plus-laminectomy on ODI, the adjusted estimates of SATE, KOWATE, and KOSATE show no statistically significant effect. Standard errors are lower for KOWATE and KOSATE than for SATE. Figure 3 shows the covariate balance with respect to SATE (top panel), KOSATE (middle panel), and KOWATE (lower panel). The black dots show the level of balance after weighting, while the light-gray dots show the unadjusted balance. KOWATE provides the lowest covariate balance compared with SATE and KOSATE. Finally, on the basis of the results obtained by applying KOM, we conclude that fusion-plus-laminectomy has no statistically significant effect on ODI.

Covariate balance with respect to SATE (top panel), KOSATE (middle panel), and KOWATE (lower panel). The black dots reflect the level of balance after weighting for SATE, KOSATE, and KOWATE weights, while the light-gray dots show the unadjusted balance.
5.1.4 What populations are KOWATE and KOSATE choosing?
As discussed in Section 3.3.2, by changing the set of weights
Description of populations obtained when using SATE, KOWATE, and KOSATE
 | SATE | KOWATE | KOSATE |
---|---|---|---|
Sample size | 313 | 313 | 249 |
 | n (%) | n (%) | n (%) |
Lumbar spondylolisthesis (Y) | 53.0 (16.9) | 83.7 (26.7) | 43.0 (17.3) |
Lumbar stenosis (Y) | 183.0 (58.5) | 155.3 (49.6) | 133.0 (53.4) |
Physical activity outside home (N) | 47.0 (15.0) | 16.7 (5.3) | 26.0 (10.4) |
Physical activity inside home (N) | 42.0 (13.4) | 14.2 (4.5) | 12.0 (4.8) |
Gender (M) | 188.0 (60.1) | 192.6 (61.5) | 152.0 (61.0) |
 | Mean (SD) | Mean (SD) | Mean (SD) |
Leg pain | 6.98 (2.87) | 6.88 (2.64) | 7.12 (2.51) |
Back pain | 6.01 (3.29) | 5.70 (3.17) | 6.03 (3.12) |
6 Conclusion
In this article, we presented a general causal estimand, GATE, that unifies previously proposed causal estimands, such as SATE, OWATE, OSATE, and TATE, among others, and motivates the formulation of new ones. We also presented and applied KOM to optimally estimate GATE. KOM directly and optimally controls both bias and variance, which mitigates possible model misspecification while controlling precision. In addition, by simply modifying the optimization problem that is fed to an off-the-shelf solver, the proposed method targets different causal estimands of interest. Furthermore, by automatically learning the structure of the data, KOM can balance linear, nonlinear, additive, and nonadditive covariate relationships. One future direction may be to extend KOM for GATE to the longitudinal setting with time-dependent confounders, extending the work of [33] to more general estimands. Another future direction may be to extend the analysis (specifically, Lemma 2.1 and Theorem 3.3) to the stratified setting where we condition on
One issue with KOWATE and KOSATE is their interpretation. Here, we provide a rationale for considering KOWATE and KOSATE, especially when overlap is lacking. First, similar to truncated IPW and overlap weights, the target population obtained by KOWATE and KOSATE is clinically relevant. This is because it highlights the portion of the sample where the treatment is actually applied [37]. For instance, in the study population described in Section 5, less than 1% of patients with low-to-moderate leg pain were treated with fusion-plus-laminectomy. This suggests that it would be more clinically relevant to target a population that does not include subjects with low-to-moderate leg pain who receive fusion-plus-laminectomy. This is particularly important when emulating a target randomized trial, where propensity scores close to 0 or 1 would suggest that important inclusion or exclusion criteria may not have been followed [38]. Second, in the case of a homogeneous treatment effect, as in our simulation setting, the true effect is the same for SATE, KOWATE, and KOSATE. For instance, assuming that the effect of fusion-plus-laminectomy is homogeneous, a claim that it has no effect on ODI can be interpreted as a claim about SATE, the average effect of a treatment on an outcome among all individuals in the sample. Similar reasoning can be applied to the interpretation of confidence intervals. However, when overlap is lacking, KOWATE and KOSATE may be helpful in providing more precise estimates (as shown in Section S4.4 of the Supplementary Material). In the case of heterogeneous effects, the problem of large weights is exacerbated, and, consequently, conditional KOWATE and KOSATE could be used. Third, although this does not solve the issue of interpretation, deviation of estimands from SATE is common in statistical data analysis practice. For instance, many matching algorithms exclude subjects depending on some tuning parameters [39]. Another widely used technique, weight trimming/truncation, also alters the estimand. While these deviations from SATE are made intentionally to deal with lack of overlap, weight truncation, for instance, may introduce bias. Similar to other recently proposed successful techniques, such as overlap weights, the proposed estimands and methodology provide a less ad hoc way to deal with these issues.
Acknowledgments
The authors thank Mattias Larsson, Nguyen Thi Kim Chuc, Do Duy Cuong, and Vu Van Tam for providing access to the HIV dataset. This article is based upon work supported by the National Science Foundation under Grants Nos. 1656996 and 1740822.
Conflict of interest: Authors state no conflict of interest.
References
[1] Buchanan AL, Hudgens MG, Cole SR, Mollan KR, Sax PE, Daar ES, et al. Generalizing evidence from randomized trials using inverse probability of sampling weights. J R Statist Soc A (Statist Soc). 2018;181(4):1193–209. doi:10.1111/rssa.12357
[2] Stuart EA. Matching methods for causal inference: a review and a look forward. Statist Sci. 2010;25(1):1–21. http://dx.doi.org/10.1214/09-STS313
[3] Cain LE, Cole SR. Inverse probability-of-censoring weights for the correction of time-varying noncompliance in the effect of randomized highly active antiretroviral therapy on incident AIDS or death. Statist Med. 2009;28(12):1725–38. doi:10.1002/sim.3585
[4] Robins JM, Finkelstein DM. Correcting for noncompliance and dependent censoring in an AIDS clinical trial with inverse probability of censoring weighted (IPCW) log-rank tests. Biometrics. 2000;56(3):779–88. doi:10.1111/j.0006-341X.2000.00779.x
[5] Crump RK, Hotz VJ, Imbens GW, Mitnik OA. Nonparametric tests for treatment effect heterogeneity. Rev Econom Statist. 2008;90(3):389–405. doi:10.3386/t0324
[6] Cai T, Tian L, Wong PH, Wei L. Analysis of randomized comparative clinical trial data for personalized treatment selections. Biostatistics. 2010;12(2):270–82. doi:10.1093/biostatistics/kxq060
[7] Seaman SR, White IR. Review of inverse probability weighting for dealing with missing data. Statist Meth Med Res. 2013;22(3):278–95. doi:10.1177/0962280210395740
[8] Crump RK, Hotz VJ, Imbens GW, Mitnik OA. Dealing with limited overlap in estimation of average treatment effects. Biometrika. 2009;96(1):187–99. doi:10.1093/biomet/asn055
[9] Li F, Morgan KL, Zaslavsky AM. Balancing covariates via propensity score weighting. J Am Statist Assoc. 2018;113(521):390–400. doi:10.1080/01621459.2016.1260466
[10] Horvitz DG, Thompson DJ. A generalization of sampling without replacement from a finite universe. J Am Statist Assoc. 1952;47(260):663–85. doi:10.1080/01621459.1952.10483446
[11] Robins JM, Rotnitzky A, Zhao LP. Estimation of regression coefficients when some regressors are not always observed. J Am Statist Assoc. 1994;89(427):846–66. doi:10.1080/01621459.1994.10476818
[12] Robins JM. Marginal structural models versus structural nested models as tools for causal inference. In: Statistical models in epidemiology, the environment, and clinical trials. New York, NY: Springer; 2000. p. 95–133. doi:10.1007/978-1-4612-1284-3_2
[13] Lunceford JK, Davidian M. Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study. Statist Med. 2004;23(19):2937–60. doi:10.1002/sim.7231
[14] Robins JM, Rotnitzky A, Zhao LP. Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. J Am Statist Assoc. 1995;90(429):106–21. doi:10.1080/01621459.1995.10476493
[15] Scharfstein DO, Rotnitzky A, Robins JM. Adjusting for nonignorable drop-out using semiparametric nonresponse models. J Am Statist Assoc. 1999;94(448):1096–120. doi:10.1080/01621459.1999.10473862
[16] Robins J, Sued M, Lei-Gomez Q, Rotnitzky A. Comment: Performance of double-robust estimators when “inverse probability” weights are highly variable. Statist Sci. 2007;22(4):544–59. doi:10.1214/07-STS227D
[17] Kang JD, Schafer JL. Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data. Statist Sci. 2007;22(4):523–39.
[18] Kallus N. Generalized Optimal Matching Methods for Causal Inference. 2016. arXiv:1612.08321.
[19] Kallus N, Pennicooke B, Santacatterina M. More Robust Estimation of Sample Average Treatment Effects Using Kernel Optimal Matching in an Observational Study of Spine Surgical Interventions. 2018. arXiv:1811.04274.
[20] Imbens GW, Rubin DB. Causal inference in statistics, social, and biomedical sciences. Cambridge: Cambridge University Press; 2015. doi:10.1017/CBO9781139025751
[21] Kallus N. More efficient policy learning via optimal retargeting. J Am Statist Assoc. 2021;116(534):646–58. doi:10.1080/01621459.2020.1788948
[22] Stuart EA, Cole SR, Bradshaw CP, Leaf PJ. The use of propensity scores to assess the generalizability of results from randomized trials. J R Statist Soc A (Statist Soc). 2011;174(2):369–86. doi:10.1111/j.1467-985X.2010.00673.x
[23] Resnick DK, Watters WC, Mummaneni PV, Dailey AT, Choudhri TF, Eck JC, et al. Guideline update for the performance of fusion procedures for degenerative disease of the lumbar spine. Part 10: lumbar fusion for stenosis without spondylolisthesis. J Neurosurgery: Spine. 2014;21(1):62–6. doi:10.3171/2014.4.SPINE14275
[24] Waterman BR, Belmont Jr PJ, Schoenfeld AJ. Low back pain in the United States: incidence and risk factors for presentation in the emergency setting. Spine J. 2012;12(1):63–70. doi:10.1016/j.spinee.2011.09.002
[25] Eck JC, Sharan A, Ghogawala Z, Resnick DK, Watters III WC, Mummaneni PV, et al. Guideline update for the performance of fusion procedures for degenerative disease of the lumbar spine. Part 7: lumbar fusion for intractable low-back pain without stenosis or spondylolisthesis. J Neurosurgery: Spine. 2014;21(1):42–7. doi:10.3171/2014.4.SPINE14270
[26] Raad M, Donaldson CJ, El Dafrawy MH, Sciubba DM, Riley III LH, Neuman BJ, et al. Trends in isolated lumbar spinal stenosis surgery among working US adults aged 40–64 years, 2010–2014. J Neurosurgery: Spine. 2018;29(2):169–75. doi:10.3171/2018.1.SPINE17964
[27] Carey TS. Randomized controlled trials in surgery: an essential component of scientific progress. Spine. 1999;24(23):2553. doi:10.1097/00007632-199912010-00020
[28] NeuroPoint Alliance I. QOD spine surgery registry; 2018. http://www.neuropoint.org/registries/qod-spine/
[29] Hernán MA, Brumback B, Robins JM. Marginal structural models to estimate the joint causal effect of nonrandomized treatments. J Am Statist Assoc. 2001;96(454):440–8. doi:10.1198/016214501753168154
[30] Freedman DA. On the so-called Huber sandwich estimator and robust standard errors. Am Statistician. 2006;60(4):299–302. doi:10.1017/CBO9780511815874.019
[31] Försth P, Ólafsson G, Carlsson T, Frost A, Borgström F, Fritzell P, et al. A randomized, controlled trial of fusion surgery for lumbar spinal stenosis. New England J Med. 2016;374(15):1413–23. doi:10.1056/NEJMoa1513721
[32] Ghogawala Z, Dziura J, Butler WE, Dai F, Terrin N, Magge SN, et al. Laminectomy plus fusion versus laminectomy alone for lumbar spondylolisthesis. New England J Med. 2016;374(15):1424–34. doi:10.1056/NEJMoa1508788
[33] Kallus N, Santacatterina M. Optimal balancing of time-dependent confounders for marginal structural models. 2018 June. arXiv e-prints. arXiv:1806.01083.
[34] Chernozhukov V, Chetverikov D, Demirer M, Duflo E, Hansen C, Newey W, et al. Double/debiased machine learning for treatment and structural parameters. Econometrics J. 2018;21(1):C1–C68. doi:10.1111/ectj.12097
[35] Hazlett C. Kernel balancing: a flexible non-parametric weighting procedure for estimating causal effects. 2018. Available at SSRN 2746753.
[36] Wong RK, Chan KCG. Kernel-based covariate functional balancing for observational studies. Biometrika. 2018;105(1):199–213. doi:10.1093/biomet/asx069
[37] Li F, Thomas LE, Li F. Addressing extreme propensity scores via the overlap weights. Am J Epidemiol. 2019;188(1):250–7. doi:10.1093/aje/kwy201
[38] Hernán MA, Robins JM. Using big data to emulate a target trial when a randomized trial is not available. Am J Epidemiol. 2016;183(8):758–64. doi:10.1093/aje/kwv254
[39] Visconti G, Zubizarreta JR. Handling limited overlap in observational studies with cardinality matching. Observat Stud. 2018;4:217–49. doi:10.1353/obs.2018.0012
© 2022 Nathan Kallus and Michele Santacatterina, published by De Gruyter
This work is licensed under the Creative Commons Attribution 4.0 International License.