Published online by De Gruyter, January 4, 2022

Doubly robust adaptive LASSO for effect modifier discovery

Asma Bahamyirou, Mireille E. Schnitzer, Edward H. Kennedy, Lucie Blais and Yi Yang

Abstract

Effect modification occurs when the effect of a treatment on an outcome differs according to the level of some pre-treatment variable (the effect modifier). Assessing an effect modifier is not a straightforward task, even for a subject matter expert. In this paper, we propose a two-stage procedure to automatically select effect-modifying variables in a Marginal Structural Model (MSM) with a single time point exposure, based on the two nuisance quantities (the conditional outcome expectation and the propensity score). We highlight the performance of our proposal in a simulation study. Finally, to illustrate the tractability of our proposed methods, we apply them to a set of pregnancy data. We estimate the conditional expected difference between the counterfactual birth weight if all women were exposed to inhaled corticosteroids during pregnancy and the counterfactual birth weight if all women were not, using data on asthma medication use during pregnancy.


Corresponding author: Mireille E. Schnitzer, Faculté de pharmacie, Université de Montréal, Pavillon Jean-Coutu, 2940 ch de la Polytechnique, Office #2236, Montreal, QC, Canada, E-mail:

Funding source: Natural Sciences and Engineering Research Council of Canada; Canadian Institutes of Health Research; Université de Montréal

Award Identifier / Grant number: Unassigned


  1. Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: This work was supported by the Natural Sciences and Engineering Research Council of Canada (Discovery Grant and Accelerator Supplement to MES), the Canadian Institutes of Health Research (New Investigator Salary Award to MES) and the Faculté de pharmacie at Université de Montréal (funding for AB and MES).

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.

Appendix

In the Appendix, we give the numerical results of the simulation study, the baseline characteristics of our pregnancy data, the results of our application and the proofs of the two lemmas (Table 9).

Figure 5:

Percent coverage of the selective confidence interval associated with $V_1$ and $V_3$ for different sample sizes. Notation: Qcgc: models for $\bar{Q}$ and $g$ are correctly specified; Qc: $\bar{Q}$ is correctly specified; gc: $g$ is correctly specified; HAL: $\bar{Q}$ and $g$ are estimated with HAL.

Table 1:

Simulation results (Data generating scenario 1).

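| Coef | EM | $\hat{\beta}_V$ (n = 1000) | %sel (n = 1000) | %Cov (n = 1000) | FCR (n = 1000) | $\hat{\beta}_V$ (n = 10,000) | %sel (n = 10,000) | %Cov (n = 10,000) | FCR (n = 10,000) |
|---|---|---|---|---|---|---|---|---|---|
| (1) $\bar{Q}$ & $g$ models are correctly specified | | | | | | | | | |
| $V_1$ | T | 0.46 | 98 | 96 | 5 | 0.49 | 100 | 95 | 6 |
| $V_2$ | F | 0.00 | 21 | | | 0.00 | 12 | | |
| $V_3$ | T | 0.98 | 100 | 95 | | 0.99 | 100 | 95 | |
| $V_4$ | F | 0.00 | 21 | | | 0.00 | 13 | | |
| (2) $\bar{Q}$ model is correctly specified | | | | | | | | | |
| $V_1$ | T | 0.46 | 99 | 96 | 6 | 0.49 | 100 | 95 | 6 |
| $V_2$ | F | 0.00 | 21 | | | 0.00 | 11 | | |
| $V_3$ | T | 0.98 | 100 | 94 | | 0.99 | 100 | 96 | |
| $V_4$ | F | 0.00 | 19 | | | 0.00 | 12 | | |
| (3) $g$ model is correctly specified | | | | | | | | | |
| $V_1$ | T | 0.31 | 55 | 95 | 2 | 0.47 | 99 | 100 | 2 |
| $V_2$ | F | 0.01 | 19 | | | 0.00 | 14 | | |
| $V_3$ | T | 0.83 | 92 | 100 | | 0.99 | 100 | 99 | |
| $V_4$ | F | 0.00 | 26 | | | 0.00 | 22 | | |
| (4) $\bar{Q}$ & $g$ models are estimated using HAL | | | | | | | | | |
| $V_1$ | T | 0.46 | 99 | 95 | 6 | 0.49 | 100 | 95 | 6 |
| $V_2$ | F | 0.00 | 21 | | | 0.00 | 12 | | |
| $V_3$ | T | 0.98 | 100 | 94 | | 1.00 | 100 | 95 | |
| $V_4$ | F | 0.00 | 22 | | | 0.00 | 13 | | |
| (5) Naive linear model | | | | | | | | | |
| $V_1$ | T | 0.69 | 95 | 83 | 19 | 0.69 | 100 | 11 | 65 |
| $V_2$ | F | 0.15 | 12 | 88 | | 0.15 | 67 | 33 | |
| $V_3$ | T | 1.35 | 100 | 56 | | 1.36 | 100 | 0 | |
| $V_4$ | F | 0.01 | 37 | 96 | | 0.00 | 47 | 95 | |
| (6) Linear model correctly specified | | | | | | | | | |
| $V_1$ | T | 0.50 | 97 | 96 | 5 | 0.50 | 100 | 95 | 4 |
| $V_2$ | F | 0.00 | 6 | 94 | | 0.00 | 5 | 95 | |
| $V_3$ | T | 1.00 | 100 | 95 | | 1.00 | 100 | 95 | |
| $V_4$ | F | 0.00 | 4 | 96 | | 0.00 | 4 | 96 | |

  1. Estimates taken over 1000 generated datasets. $\hat{\beta}_V$: average estimated value of the coefficients of the MSM; %sel: percent selection of variables × 100; %Cov: percent coverage of the selective confidence interval × 100 (standard CI for the linear model case); FCR: false coverage rate × 100; EM: T (variable is an effect modifier) and F (variable is not an effect modifier). The true values of the coefficients are $\beta_V = (0.5, 0, 1, 0)$.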
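To make the summary columns concrete, here is a minimal Python sketch of how %sel, %Cov and FCR could be computed from simulation replicates. The definitions used are assumptions, since the footnote does not spell them out: %Cov is taken as coverage among replicates in which the variable was selected, and FCR as the average, over replicates, of the proportion of selected variables whose interval misses the truth. The function name `summarize` and the four toy replicates are hypothetical and are not the paper's data.

```python
def summarize(replicates, truth):
    """Summarize simulation replicates.

    replicates: list of dicts mapping a selected variable index j to its
                confidence interval (low, high); unselected variables are absent.
    truth:      list of true coefficient values.
    Returns (%sel per variable, %Cov per variable given selection, FCR in %).
    """
    p = len(truth)
    n_rep = len(replicates)
    selected = [0] * p
    covered = [0] * p
    fcr_sum = 0.0
    for cis in replicates:
        misses = 0
        for j, (low, high) in cis.items():
            selected[j] += 1
            if low <= truth[j] <= high:
                covered[j] += 1
            else:
                misses += 1
        # Proportion of selected intervals that miss (0 if nothing is selected).
        fcr_sum += misses / max(len(cis), 1)
    pct_sel = [100.0 * s / n_rep for s in selected]
    pct_cov = [100.0 * c / s if s else None for c, s in zip(covered, selected)]
    return pct_sel, pct_cov, 100.0 * fcr_sum / n_rep


# Toy replicates (hypothetical numbers, for illustration only).
truth = [0.5, 0.0, 1.0, 0.0]
replicates = [
    {0: (0.3, 0.7), 2: (0.8, 1.2)},
    {0: (0.6, 0.9), 2: (0.7, 1.1), 3: (-0.1, 0.2)},
    {2: (0.9, 1.3)},
    {0: (0.2, 0.6), 1: (0.1, 0.4), 2: (0.85, 1.15)},
]
pct_sel, pct_cov, fcr = summarize(replicates, truth)
print(pct_sel, pct_cov, fcr)
```

Under these assumed definitions, a variable that is never selected has no coverage value, which is one possible reading of the empty %Cov cells in the tables.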

Table 2:

Simulation results (Data generating scenario 2).

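| Coef | EM | $\hat{\beta}_V$ (n = 1000) | %sel (n = 1000) | %Cov (n = 1000) | FCR (n = 1000) | $\hat{\beta}_V$ (n = 10,000) | %sel (n = 10,000) | %Cov (n = 10,000) | FCR (n = 10,000) |
|---|---|---|---|---|---|---|---|---|---|
| (1) $\bar{Q}$ & $g$ models are correctly specified | | | | | | | | | |
| $V_1$ | T | 0.47 | 99 | 96 | 5 | 0.49 | 100 | 95 | 5 |
| $V_2$ | F | 0.00 | 20 | | | 0.00 | 13 | | |
| $V_3$ | T | 0.98 | 100 | 95 | | 1.00 | 100 | 95 | |
| $V_4$ | F | 0.00 | 23 | | | 0.00 | 12 | | |
| (2) $\bar{Q}$ model is correctly specified | | | | | | | | | |
| $V_1$ | T | 0.47 | 99 | 97 | 5 | 0.49 | 100 | 94 | 6 |
| $V_2$ | F | 0.00 | 20 | | | 0.00 | 11 | | |
| $V_3$ | T | 0.99 | 100 | 95 | | 1.00 | 100 | 95 | |
| $V_4$ | F | 0.00 | 21 | | | 0.00 | 11 | | |
| (3) $g$ model is correctly specified | | | | | | | | | |
| $V_1$ | T | 0.32 | 55 | 99 | 2 | 0.47 | 99 | 99 | 2 |
| $V_2$ | F | 0.01 | 19 | | | 0.00 | 14 | | |
| $V_3$ | T | 0.85 | 94 | 98 | | 0.99 | 100 | 99 | |
| $V_4$ | F | −0.01 | 24 | | | 0.00 | 21 | | |
| (4) $\bar{Q}$ & $g$ models are estimated using HAL | | | | | | | | | |
| $V_1$ | T | 0.47 | 98 | 97 | 5 | 0.49 | 100 | 95 | 7 |
| $V_2$ | F | 0.00 | 22 | | | 0.00 | 12 | | |
| $V_3$ | T | 0.98 | 100 | 94 | | 1.00 | 100 | 95 | |
| $V_4$ | F | 0.00 | 22 | | | 0.00 | 12 | | |
| (6) Linear model correctly specified | | | | | | | | | |
| $V_1$ | T | 0.50 | 89 | 96 | 5 | 0.50 | 100 | 95 | 5 |
| $V_2$ | F | 0.00 | 6 | 94 | | 0.00 | 6 | 94 | |
| $V_3$ | T | 1.00 | 100 | 94 | | 1.00 | 100 | 95 | |
| $V_4$ | F | 0.00 | 4 | 97 | | 0.00 | 4 | 96 | |

  1. Estimates taken over 1000 generated datasets. $\hat{\beta}_V$: coefficients of the MSM; %sel: percent selection of variables × 100; %Cov: percent coverage of the selective confidence interval × 100; FCR: false coverage rate × 100; EM: T (variable is an effect modifier) and F (variable is not an effect modifier). The true values of the coefficients are $\beta_V = (0.5, 0, 1, 0)$.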

Table 3:

Simulation results (Data generating scenario 3).

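| Coef | EM | $\hat{\beta}_V$ (n = 1000) | %sel (n = 1000) | %Cov (n = 1000) | FCR (n = 1000) | $\hat{\beta}_V$ (n = 10,000) | %sel (n = 10,000) | %Cov (n = 10,000) | FCR (n = 10,000) |
|---|---|---|---|---|---|---|---|---|---|
| (1) $\bar{Q}$ & $g$ models are correctly specified | | | | | | | | | |
| $V_1$ | T | 0.44 | 94 | 97 | 5 | 0.49 | 100 | 96 | 5 |
| $V_2$ | F | 0.00 | 23 | | | 0.00 | 16 | | |
| $V_3$ | T | 0.97 | 100 | 95 | | 1.00 | 100 | 97 | |
| $V_4$ | F | 0.00 | 23 | | | 0.00 | 17 | | |
| (2) $\bar{Q}$ model is correctly specified | | | | | | | | | |
| $V_1$ | T | 0.45 | 96 | 97 | 6 | 0.50 | 100 | 94 | 7 |
| $V_2$ | F | 0.00 | 20 | | | 0.00 | 13 | | |
| $V_3$ | T | 0.98 | 100 | 93 | | 1.00 | 100 | 95 | |
| $V_4$ | F | 0.00 | 22 | | | 0.00 | 12 | | |
| (3) $g$ model is correctly specified | | | | | | | | | |
| $V_1$ | T | 0.34 | 74 | 100 | 3 | 0.49 | 100 | 100 | 4 |
| $V_2$ | F | 0.01 | 23 | | | 0.00 | 18 | | |
| $V_3$ | T | 0.91 | 99 | 97 | | 0.99 | 100 | 96 | |
| $V_4$ | F | 0.00 | 25 | | | 0.00 | 24 | | |
| (4) $\bar{Q}$ & $g$ models are estimated using HAL | | | | | | | | | |
| $V_1$ | T | 0.45 | 95 | 95 | 6 | 0.49 | 100 | 95 | 5 |
| $V_2$ | F | 0.00 | 24 | | | 0.00 | 16 | | |
| $V_3$ | T | 0.98 | 100 | 94 | | 1.00 | 100 | 96 | |
| $V_4$ | F | 0.00 | 23 | | | 0.00 | 16 | | |
| (5) Naive linear model | | | | | | | | | |
| $V_1$ | T | 0.60 | 89 | 93 | 10 | 0.59 | 100 | 63 | 43 |
| $V_2$ | F | 0.10 | 76 | 92 | | 0.10 | 35 | 65 | |
| $V_3$ | T | 1.21 | 100 | 81 | | 1.21 | 100 | 58 | |
| $V_4$ | F | 0.01 | 38 | 96 | | −0.00 | 44 | 96 | |
| (6) Linear model correctly specified | | | | | | | | | |
| $V_1$ | T | 0.50 | 98 | 96 | 5 | 0.50 | 100 | 95 | 5 |
| $V_2$ | F | 0.00 | 4 | 96 | | 0.00 | 5 | 95 | |
| $V_3$ | T | 1.00 | 100 | 95 | | 1.00 | 100 | 95 | |
| $V_4$ | F | 0.00 | 5 | 95 | | 0.00 | 5 | 95 | |

  1. Estimates taken over 1000 generated datasets. $\hat{\beta}_V$: coefficients of the MSM; %sel: percent selection of variables × 100; %Cov: percent coverage of the selective confidence interval × 100; FCR: false coverage rate × 100; EM: T (variable is an effect modifier) and F (variable is not an effect modifier). The true values of the coefficients are $\beta_V = (0.5, 0, 1, 0)$.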

Table 4:

Simulation results for smaller sample size (n = 100).

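| Coef | EM | $\hat{\beta}_V$ (Scenario 1) | %sel (Scenario 1) | Cov (Scenario 1) | FCR (Scenario 1) | $\hat{\beta}_V$ (Scenario 2) | %sel (Scenario 2) | Cov (Scenario 2) | FCR (Scenario 2) | $\hat{\beta}_V$ (Scenario 3) | %sel (Scenario 3) | Cov (Scenario 3) | FCR (Scenario 3) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (1) $\bar{Q}$ & $g$ models are correctly specified | | | | | | | | | | | | | |
| $V_1$ | T | 0.39 | 52 | 87 | 8 | 0.34 | 49 | 88 | 9 | 0.30 | 41 | 89 | 10 |
| $V_2$ | F | −0.01 | 22 | | | −0.01 | 25 | | | 0.02 | 24 | | |
| $V_3$ | T | 0.85 | 86 | 94 | | 0.78 | 80 | 96 | | 0.78 | 71 | 93 | |
| $V_4$ | F | 0.01 | 28 | | | 0.00 | 25 | | | 0.00 | 24 | | |
| (2) $\bar{Q}$ model is correctly specified | | | | | | | | | | | | | |
| $V_1$ | T | 0.38 | 53 | 91 | 7 | 0.36 | 50 | 88 | 8 | 0.29 | 41 | 89 | 10 |
| $V_2$ | F | −0.03 | 27 | | | 0.00 | 21 | | | 0.01 | 20 | | |
| $V_3$ | T | 0.83 | 85 | 98 | | 0.79 | 8 | 97 | | 0.76 | 72 | 93 | |
| $V_4$ | F | −0.02 | 25 | | | 0.00 | 27 | | | 0.00 | 21 | | |
| (3) $g$ model is correctly specified | | | | | | | | | | | | | |
| $V_1$ | T | 0.24 | 20 | 97 | 9 | 0.24 | 25 | 98 | 6 | 0.26 | 25 | 91 | 9 |
| $V_2$ | F | 0.04 | 16 | | | 0.04 | 1 | | | 0.04 | 26 | | |
| $V_3$ | T | 0.51 | 29 | 90 | | 0.59 | 45 | 95 | | 0.68 | 47 | 88 | |
| $V_4$ | F | 0.01 | 21 | | | 0.02 | 23 | | | 0.00 | 25 | | |
| (4) $\bar{Q}$ & $g$ models are estimated using HAL | | | | | | | | | | | | | |
| $V_1$ | T | 0.39 | 54 | 83 | 10 | 0.36 | 51 | 85 | 9 | 0.32 | 45 | 79 | 11 |
| $V_2$ | F | 0.00 | 30 | | | 0.01 | 27 | | | 0.00 | 27 | | |
| $V_3$ | T | 0.84 | 87 | 96 | | 0.79 | 81 | 96 | | 0.80 | 82 | 95 | |
| $V_4$ | F | 0.00 | 27 | | | 0.01 | 27 | | | −0.02 | 24 | | |

  1. Estimates taken over 500 generated datasets. $\hat{\beta}_V$: coefficients of the MSM; %sel: percent selection of variables × 100; Cov: percent coverage of the selective confidence interval × 100; FCR: false coverage rate × 100; EM: T (variable is an effect modifier) and F (variable is not an effect modifier). The true values of the coefficients are $\beta_V = (0.5, 0, 1, 0)$.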

Table 5:

Simulation results (Data generating scenario 1 with 50 noise covariates).

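| Coef | EM | $\hat{\beta}_V$ (n = 1000) | %sel (n = 1000) | %Cov (n = 1000) | FCR (n = 1000) | $\hat{\beta}_V$ (n = 10,000) | %sel (n = 10,000) | %Cov (n = 10,000) | FCR (n = 10,000) |
|---|---|---|---|---|---|---|---|---|---|
| (1) Estimates related to the potential EM that are not noise covariates | | | | | | | | | |
| $V_1$ | T | 0.43 | 100 | 100 | 15 | 0.48 | 100 | 100 | 15 |
| $V_2$ | F | 0.00 | 14 | | | 0.00 | 15 | | |
| $V_3$ | T | 0.95 | 100 | 91 | | 0.99 | 100 | 90 | |
| $V_4$ | F | 0.01 | 15 | | | 0.00 | 14 | | |
| (2) Summary of the 50 potential EM that are noise covariates | | | | | | | | | |
| Min | | −0.01 | 7 | | | 0.00 | 5 | | |
| Q1 | | 0.00 | 12 | | | 0.00 | 11 | | |
| Median | | 0.00 | 14 | | | 0.00 | 13 | | |
| Q3 | | 0.00 | 16 | | | 0.00 | 15 | | |
| Max | | 0.01 | 23 | | | 0.00 | 22 | | |

  1. Estimates taken over 100 generated datasets. $\hat{\beta}_V$: coefficients of the MSM; %sel: percent selection of variables; %Cov: percent coverage of the selective confidence interval; FCR: false coverage rate; EM: T (variable is an effect modifier) and F (variable is not an effect modifier). The true values of the coefficients are $\beta_V = (0.5, 0, 1, 0, \ldots, 0)$.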

Table 6:

Baseline Characteristics of mothers in the cohort extraction (N = 4707).

Characteristics No ICS ICS
N (%) N (%)
Cohort size 2272 (100) 2435 (100)
Age
< 18 45 (1.9) 60 (2.4)
18–34 1958 (86.1) 2041 (83.8)
> 34 269 (11.8) 334 (13.7)
Sex of the newborn 1149 (51.0) 1271 (52.0)
Welfare recipient 1126 (50.0) 1429 (59.0)
Urban residence 476 (18.0) 407 (20.0)
Hypertension 61 (3.0) 83 (3.0)
Diabetes 73 (3.0) 81 (3.0)
COPD 28 (1.0) 56 (2.0)
Cyanotic heart disease 7 (0.0) 8 (0.0)
Antiphospholipid syndrome 12 (1.0) 13 (1.0)
Uterine disorder 264 (12.0) 331 (14.0)
Epilepsy 18 (1.0) 23 (1.0)
Obesity 87 (4.0) 127 (5.0)
Lupus 1 (0.0) 2 (0.0)
Collagenous vascular disease 6 (0.0) 6 (0.0)
Cushing’s syndrome 4 (0.0) 4 (0.0)
Oral corticosteroids one year before pregnancy 234 (10.0) 281 (12.0)
Oral SABA use one year before pregnancy 16 (1.0) 8 (0.0)
At least one dose of inhaled SABA taken per week 1523 (67.0) 1332 (55.0)
HIV 3 (0.0) 1 (0.0)
Cytomegalovirus infection 3 (0.0) 12 (0.0)
Leukotriene-receptor antagonists 33 (1.0) 30 (1.0)
Theophylline use one year before pregnancy 0 (0.0) 0 (0.0)
Intranasal corticosteroids 243 (11.0) 318 (13.0)
Folic acid one year before pregnancy 18 (1.0) 43 (2.0)
Teratogens taken one year before 0 (0.0) 0 (0.0)
Medication for epilepsy one year before pregnancy 29 (1.0) 48 (2.0)
Warfarin one year before pregnancy 7 (0.0) 10 (0.0)
Use of beta-blockers one year before pregnancy 19 (1.0) 26 (1.0)
Asthma exacerbation one year before pregnancy 377 (17.0) 411 (17.0)
Hospitalization for asthma 1079 (47.0) 809 (33.0)
Chromosomal anomalies 6 (0.0) 4 (0.0)
Cumulative dose of ICS in days (mean (SD)) 51.6 (72.8) 54.0 (85.8)
One year cumulative dose of ICS before pregnancy (mean (SD)) 151 (32.0) 101.5 (126.3)
At least one emergency department visit for asthma 260 (7.0) 265 (19.0)
At least one hospitalization for asthma 5 (0.0) 8 (1.0)

Table 7:

Estimates of the coefficients associated with interaction terms using the naive linear model (n = 4707).

Variables Estimate ( β ̂ j ) STD p-Value
Intercept 3.153
CS:At least one dose of inhaled SABA taken per week −0.002 0.039 0.940
CS:Leukotriene-receptor antagonists −0.365 0.142 0.010*
CS:Intranasal corticosteroids 0.063 0.051 0.214
CS:Folic acid one year before pregnancy −0.129 0.159 0.415
CS:Medication for epilepsy −0.136 0.135 0.313
CS:Warfarin −0.386 0.277 0.164
CS:Beta-blockers −0.287 0.173 0.097
CS:Asthma exacerbation 0.062 0.069 0.368
CS:At least one hospitalization for asthma 0.017 0.036 0.624
CS:At least one emergency department visit for asthma 0.067 0.055 0.223
CS:COPD 0.141 0.130 0.280
CS:Cyanotic heart disease −0.345 0.292 0.237
CS:Oral corticosteroids one year before −0.081 0.081 0.319
CS:Obesity 0.053 0.080 0.508
CS:Uterine disorder −0.036 0.050 0.460
CS:Oral SABA use one year before −0.025 0.244 0.918
CS:Antiphospholipid syndrome 0.394 0.227 0.083
CS:Sex of new born −0.031 0.032 0.335
CS:Welfare recipient −0.043 0.033 0.1871
CS:Rural/non-rural residence indicator 0.021 0.042 0.602
CS:Hypertension 0.028 0.098 0.774
CS:Diabetes −0.105 0.092 0.255
CS:Chromosomal anomalies −1.230 0.361 0.0006*
CS:Cytomegalovirus infection 0.146 0.360 0.683

Table 8:

Estimates of the selected MSM coefficients using adaptive lasso (n = 4707) with 95% post selection interval for the selected variables.

Variables Estimate ( β ̂ j ) CI Low CI up
Highly adaptive LASSO for Q & g
Intercept 0.018
Leukotriene-receptor antagonists* −0.177 −0.502 −0.031
Warfarin −0.146 −0.745 0.311
Chromosomal anomalies* −0.777 −1.420 −0.285

  1. *: the interval excludes the null value.

Table 9:

Computation time in seconds for the simulation (run on a single dataset) and application.

| Methods | n = 1000 | n = 4707 | n = 10,000 |
|---|---|---|---|
| Low-dimensional | | | |
| Parametric regression for Q & g | 0.16s | | 4.62s |
| Highly adaptive LASSO for Q & g | 1.09s | | 9.06s |
| High-dimensional | | | |
| Highly adaptive LASSO for Q & g | 49.75s | | 600s |
| Data analysis | | | |
| Highly adaptive LASSO for Q & g | | 115.2s | |

Proof of Lemma 1

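Denote by $\bar{Q}_n$ (respectively $g_n$) an estimator of $\bar{Q}$ (respectively $g$). We have:

$$
\begin{aligned}
E_{P_0}\left(D(\bar{Q}_n, g_n) \mid V\right)
&= E_{P_0}\left[\frac{2A-1}{g_n(A \mid W)}\left(Y - \bar{Q}_n(A, W)\right) + \bar{Q}_n(1, W) - \bar{Q}_n(0, W) \,\middle|\, V\right] \\
&= E_{P_0}\left[\frac{2A-1}{g_n(A \mid W)}\left(Y - \bar{Q}_n(A, W)\right) \,\middle|\, V\right] + E_{P_0}\left\{\bar{Q}_n(1, W) - \bar{Q}_n(0, W) \mid V\right\} + \psi_0(V) - \psi_0(V) \\
&= \psi_0(V) + E_{P_0}\left[\left\{\bar{Q}_n(1, W) - \bar{Q}_n(0, W)\right\} - \left\{\bar{Q}_0(1, W) - \bar{Q}_0(0, W)\right\} \,\middle|\, V\right] + E_{P_0}\left[\frac{2A-1}{g_n(A \mid W)}\left(Y - \bar{Q}_n(A, W)\right) \,\middle|\, V\right] \\
&= \psi_0(V) + \int_W \Big[\left\{\bar{Q}_n(1, W) - \bar{Q}_n(0, W)\right\} - \left\{\bar{Q}_0(1, W) - \bar{Q}_0(0, W)\right\} \\
&\qquad + \frac{P_0(1 \mid W)}{g_n(1 \mid W)}\left\{\bar{Q}_0(1, W) - \bar{Q}_n(1, W)\right\} - \frac{P_0(0 \mid W)}{g_n(0 \mid W)}\left\{\bar{Q}_0(0, W) - \bar{Q}_n(0, W)\right\}\Big]\, dP_0(W \mid V) \\
&= \psi_0(V) + \int_W \Big[\left(\frac{P_0(1 \mid W)}{g_n(1 \mid W)} - 1\right)\left\{\bar{Q}_0(1, W) - \bar{Q}_n(1, W)\right\} \\
&\qquad - \left(\frac{P_0(0 \mid W)}{g_n(0 \mid W)} - 1\right)\left\{\bar{Q}_0(0, W) - \bar{Q}_n(0, W)\right\}\Big]\, dP_0(W \mid V)
\end{aligned}
$$

Each term in the last integral is a product of an error in $g_n$ and an error in $\bar{Q}_n$, so it vanishes when either error does. Then $E_{P_0}\left(D(\bar{Q}_n, g_n) \mid V\right) \to \psi_0(V)$ if either $g_n(A \mid W)$ or $\bar{Q}_n(A, W)$ is consistently estimated.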
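As a numerical illustration of Lemma 1, the following Python sketch averages the pseudo-outcome $D(\bar{Q}, g)$ within levels of $V$ under three working-model choices. Everything about the toy setting is an assumption made for this example and is not taken from the paper's simulation study: a single binary covariate $W$ with $V = W$, the functions `g_true` and `q_true`, and the deliberately wrong plug-ins `q_wrong` and `g_wrong`. In this toy model the true conditional effect is $\psi_0(v) = 0.5 + v$.

```python
import random


def pseudo_outcome(a, w, y, q_bar, g):
    """D(Q, g) = (2A - 1) / g(A | W) * (Y - Q(A, W)) + Q(1, W) - Q(0, W)."""
    return (2 * a - 1) / g(a, w) * (y - q_bar(a, w)) + q_bar(1, w) - q_bar(0, w)


def g_true(a, w):
    p1 = 0.3 + 0.4 * w                      # assumed P(A = 1 | W)
    return p1 if a == 1 else 1.0 - p1


def q_true(a, w):
    return 1.0 + w + a * (0.5 + 1.0 * w)    # assumed E[Y | A, W]


def q_wrong(a, w):
    return 0.0                              # misspecified outcome model


def g_wrong(a, w):
    return 0.5                              # misspecified propensity score


def simulate(n, seed=0):
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        w = 1 if rng.random() < 0.5 else 0
        a = 1 if rng.random() < g_true(1, w) else 0
        y = q_true(a, w) + rng.gauss(0.0, 1.0)
        data.append((a, w, y))
    return data


def conditional_means(data, q_bar, g):
    """Average the pseudo-outcome within each level of V = W."""
    sums = {0: 0.0, 1: 0.0}
    counts = {0: 0, 1: 0}
    for a, w, y in data:
        sums[w] += pseudo_outcome(a, w, y, q_bar, g)
        counts[w] += 1
    return {v: sums[v] / counts[v] for v in (0, 1)}


data = simulate(200_000)
only_g_correct = conditional_means(data, q_wrong, g_true)
only_q_correct = conditional_means(data, q_true, g_wrong)
both_wrong = conditional_means(data, q_wrong, g_wrong)
print(only_g_correct, only_q_correct, both_wrong)
```

With one of the two nuisance functions correct, the within-level averages should land close to $0.5$ and $1.5$; with both wrong they need not, which is the behaviour the product structure of the last integral predicts.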

Proof of Lemma 2

Let $D_n = D(\bar{Q}_n, g_n)$ (respectively $D_0 = D(\bar{Q}_0, g_0)$) represent the estimated pseudo-outcome (respectively the true pseudo-outcome). Our method minimizes the penalized risk function below with respect to $\beta$:

$$\left\| D_n - \sum_j V^{(j)} \beta_j \right\|^2 + \lambda \sum_{j=1}^{p} \hat{w}_j |\beta_j|$$

where $\hat{w}_j = 1 / |\tilde{\beta}_j|^{\gamma}$, $j = 1, \ldots, p$, for some $\gamma > 0$.

Let $\epsilon_n = D_n - \sum_j V^{(j)} \beta_j$ be the residual of the penalized linear regression of $D_n$ on $V$. The proof essentially follows that of Zou ([24]). We have to show that $\epsilon_n^T V / \sqrt{n}$ converges in distribution to a normal distribution with mean zero and finite variance.

Indeed, one can write

$$\epsilon_n = (D_n - D_0) + D_0 - \sum_j V^{(j)} \beta_j.$$

$$\epsilon_n^T V / \sqrt{n} = \underbrace{\sqrt{n}\, P_n (D_n - D_0)^T V}_{R_1} + \underbrace{\sqrt{n}\, P_n \Big(D_0 - \sum_j V^{(j)} \beta_j\Big)^T V}_{R_2},$$

where $P_n$ denotes the empirical measure. Here $\epsilon_0 = D_0 - \sum_j V^{(j)} \beta_j$ is the residual of the penalized linear regression of the oracle pseudo-outcome $D_0$ on $V$. Therefore, if we assume $\frac{1}{n} V^T V \to C$ with $C$ a positive definite matrix, we have $R_2 \to_d N(0, \sigma^2 C)$.

One can write

$$\sqrt{n}\, P_n (D_n - D_0)^T V \le \sqrt{n}\, \left\| P_n (D_n - D_0)^T V \right\|.$$

Semenova and Chernozhukov ([20]) showed in their Lemma A.3 that, given their Assumption 3.5,

$$\sqrt{n}\, \left\| P_n (D_n - D_0)^T V \right\| = o(1).$$

Therefore, $\sqrt{n}\, P_n (D_n - D_0)^T V = o(1)$, which yields the result.
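The penalized criterion minimized in the proof of Lemma 2 can be sketched in a few lines of Python. This is a minimal coordinate-descent illustration under stated assumptions, not the authors' implementation: the initial estimate $\tilde{\beta}$ is taken to be ordinary least squares, $\gamma = 1$, the value of $\lambda$ is arbitrary, the toy pseudo-outcome has no intercept, and the helper names `weighted_lasso` and `adaptive_lasso` are hypothetical. It also stops at selection and does not perform the post-selection inference used in the paper.

```python
import random


def soft_threshold(z, t):
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X, y, lam, weights, n_sweeps=100):
    """Coordinate descent for ||y - X b||^2 + lam * sum_j weights[j] * |b_j|."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)                                   # y - X beta, with beta = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Column j against the partial residual (contribution of b_j added back).
            rho = sum(X[i][j] * resid[i] for i in range(n)) + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * weights[j] / 2.0) / col_sq[j]
            if new != beta[j]:
                delta = new - beta[j]
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def adaptive_lasso(X, y, lam, gamma=1.0):
    p = len(X[0])
    # Assumption: the initial estimate is least squares (lam = 0 removes the penalty).
    beta_init = weighted_lasso(X, y, 0.0, [0.0] * p)
    weights = [1.0 / max(abs(b), 1e-12) ** gamma for b in beta_init]
    return weighted_lasso(X, y, lam, weights)


# Toy data: four candidate effect modifiers, true coefficients (0.5, 0, 1, 0).
rng = random.Random(1)
n = 1000
V = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(n)]
D = [0.5 * v[0] + 1.0 * v[2] + rng.gauss(0.0, 1.0) for v in V]
beta_hat = adaptive_lasso(V, D, lam=40.0)
print(beta_hat)
```

Because the weights $\hat{w}_j$ grow as $|\tilde{\beta}_j|$ shrinks, coefficients whose initial estimates are near zero face a much larger threshold than those with large initial estimates, which is what drives the selection behaviour.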

References

1. Green, DP, Kern, HL. Modeling heterogeneous treatment effects in survey experiments with Bayesian additive regression trees. Publ Opin Q 2012;76:491–511. https://doi.org/10.1093/poq/nfs036.

2. Chipman, HA, George, EI, McCulloch, RE. BART: Bayesian additive regression trees. Ann Appl Stat 2010;4:266–98. https://doi.org/10.1214/09-aoas285.

3. Imai, K, Ratkovic, M. Estimating treatment effect heterogeneity in randomized program evaluation. Ann Appl Stat 2013;7:443–70. https://doi.org/10.1214/12-aoas593.

4. Nie, X, Wager, S. Quasi-oracle estimation of heterogeneous treatment effects. 2017. arXiv:1712.04912. https://doi.org/10.1093/biomet/asaa076.

5. Luo, W, Wu, W, Zhu, Y. Learning heterogeneity in causal inference using sufficient dimension reduction. J Causal Inference 2018;7:20180015. https://doi.org/10.1515/jci-2018-0015.

6. Wager, S, Athey, S. Estimation and inference of heterogeneous treatment effects using random forests. J Am Stat Assoc 2018;113:1228–42. https://doi.org/10.1080/01621459.2017.1319839.

7. Breiman, L. Random forests. Mach Learn 2001;45:5–32. https://doi.org/10.1023/A:1010933404324.

8. Powers, S, Qian, J, Jung, K, Schuler, A, Shah, N, Hastie, T, et al. Some methods for heterogeneous treatment effect estimation in high dimensions. Stat Med 2018;37:1767–87. https://doi.org/10.1002/sim.7623.

9. Friedman, J. Multivariate adaptive regression splines. Ann Stat 1991;19:1–67. https://doi.org/10.1214/aos/1176347963.

10. Zhao, Q, Small, DS, Ertefaie, A. Selective inference for effect modification via the lasso. 2018. arXiv:1705.08020. https://doi.org/10.1111/rssb.12483.

11. Robinson, PM. Root-N-consistent semiparametric regression. Econometrica 1988;56:931–54. https://doi.org/10.2307/1912705.

12. van der Laan, MJ, Rubin, D. Targeted maximum likelihood learning. Int J Biostat 2006;2. https://doi.org/10.2202/1557-4679.1043.

13. van der Laan, MJ, Rose, S. Targeted learning: causal inference for observational and experimental data. In: Springer Series in Statistics. Springer, New York, NY; 2011. https://doi.org/10.1007/978-1-4419-9782-1.

14. Scharfstein, DO, Rotnitzky, A, Robins, JM. Adjusting for nonignorable dropout using semiparametric nonresponse models (with discussion and rejoinder). J Am Stat Assoc 1999;94:1096–1120. https://doi.org/10.1080/01621459.1999.10473862.

15. Bang, H, Robins, JM. Doubly robust estimation in missing data and causal inference models. Biometrics 2005;61:962–72. https://doi.org/10.1111/j.1541-0420.2005.00377.x.

16. Benkeser, D, Carone, M, van der Laan, MJ, Gilbert, P. Doubly robust nonparametric inference on the average treatment effect. Biometrika 2017;104:863–80. https://doi.org/10.1093/biomet/asx053.

17. Lee, S, Okui, R, Whang, YJ. Doubly robust uniform confidence band for the conditional average treatment effect function. J Appl Econom 2017;32:1207–25. https://doi.org/10.1002/jae.2574.

18. Zheng, W, Luo, Z, van der Laan, MJ. Marginal structural models with counterfactual effect modifiers. Int J Biostat 2018;14:20180039. https://doi.org/10.1515/ijb-2018-0039.

19. Kennedy, EH. Optimal doubly robust estimation of heterogeneous causal effects. 2020. arXiv:2004.14497v1.

20. Semenova, V, Chernozhukov, V. Debiased machine learning of conditional average treatment effects and other causal functions. Econom J 2020;24:1–49. https://doi.org/10.1093/ectj/utaa027.

21. van der Laan, MJ. Targeted learning of an optimal dynamic treatment, and statistical inference for its mean outcome. In: U.C. Berkeley Division of Biostatistics Working Paper Series; 2013.

22. Zhao, Y, Laber, EB, Ning, Y, Saha, S, Sands, B. Efficient augmentation and relaxation learning for individualized treatment rules using observational data. 2019. arXiv:1901.00663.

23. Kennedy, EH, McHugh, MD, Small, DS. Non-parametric methods for doubly robust estimation of continuous treatment effects. J Roy Stat Soc B 2017;79:1229–45. https://doi.org/10.1111/rssb.12212.

24. Zou, H. The adaptive LASSO and its oracle properties. J Am Stat Assoc 2006;101:1418–29. https://doi.org/10.1198/016214506000000735.

25. Rubin, D. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol 1974;66:688–701. https://doi.org/10.1037/h0037350.

26. Cole, SR, Frangakis, CE. The consistency statement in causal inference: a definition or an assumption? Epidemiology 2009;20:3–5. https://doi.org/10.1097/ede.0b013e31818ef366.

27. Hernan, MA, Robins, JM. Causal inference: what if. FL: Chapman and Hall-CRC; 2019.

28. Zhao, Q, Hastie, T. Causal interpretations of black-box models. J Bus Econ Stat 2019;39:272–81. https://doi.org/10.1080/07350015.2019.1624293.

29. Tibshirani, R. Regression shrinkage and selection via the lasso. J Roy Stat Soc B 1996;58:267–88. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x.

30. Benkeser, D, van der Laan, MJ. The highly adaptive LASSO estimator. In: 2016 IEEE international conference on data science and advanced analytics. IEEE; 2016:689–96 pp. https://doi.org/10.1109/DSAA.2016.93.

31. Lee, JD, Sun, DL, Sun, Y, Taylor, JE. Exact post-selection inference, with application to the LASSO. Ann Stat 2016;44:907–27. https://doi.org/10.1214/15-aos1371.

32. Rubin, D, van der Laan, MJ. A doubly robust censoring unbiased transformation. Int J Biostat 2007;3. https://doi.org/10.2202/1557-4679.1052.

33. Rubin, D, van der Laan, MJ. Extending marginal structural models through local, penalized, and additive learning. In: U.C. Berkeley Division of Biostatistics Working Paper Series; 2006.

34. Yuan, M, Lin, Y. On the non-negative garrotte estimator. J Roy Stat Soc B 2007;69:143–61. https://doi.org/10.1111/j.1467-9868.2007.00581.x.

35. Chernozhukov, V, Chetverikov, D, Demirer, M, Duflo, E, Hansen, C, Newey, W, et al. Double/debiased machine learning for treatment and structural parameters. Econom J 2018;21:C1–C68. https://doi.org/10.1111/ectj.12097.

36. Tibshirani, R, Taylor, J, Loftus, J, Reid, S. Selective inference: tools for post-selection inference. 2019. Available from: https://CRAN.R-project.org/package=selectiveInference. 2017b.

37. Hejazi, NS, Coyle, JR, van der Laan, MJ. hal9001: the scalable highly adaptive lasso. 2020. Available from: https://github.com/tlverse/hal9001. https://doi.org/10.21105/joss.02526.

38. Firoozi, F, Lemire, C, Beauchesne, MF, Forget, A, Blais, L. Development and validation of database indexes of asthma severity and control. Thorax 2007;62:581–7. https://doi.org/10.1136/thx.2006.061572.

39. Cossette, B, Forget, A, Beauchesne, MF, Rey, E, Larivée, P, Battista, MC, et al. Impact of maternal use of asthma-controller therapy on perinatal outcomes. Thorax 2013;68:724–30. https://doi.org/10.1136/thoraxjnl-2012-203122.

40. Bahamyirou, A, Blais, L, Forget, A, Schnitzer, ME. Understanding and diagnosing the potential for bias when using machine learning methods with doubly robust causal estimators. Stat Methods Med Res 2018;28:1637–50. https://doi.org/10.1177/0962280218772065.

41. Javanmard, A, Montanari, A. Confidence intervals and hypothesis testing for high-dimensional regression. J Mach Learn Res 2014;15:2869–909.

42. Ju, C, Benkeser, D, van der Laan, MJ. Robust inference on the average treatment effect using the outcome highly adaptive lasso. Biometrics 2020;76:109–18. https://doi.org/10.1111/biom.13121.

43. VanderWeele, TJ, Knol, MJ. A tutorial on interaction. Epidemiology 2014;173:731–8. https://doi.org/10.1515/em-2013-0005.

Received: 2020-05-22
Revised: 2021-10-08
Accepted: 2021-12-09
Published Online: 2022-01-04

© 2021 Walter de Gruyter GmbH, Berlin/Boston