Published by De Gruyter, June 16, 2022

The optimal dynamic treatment rule superlearner: considerations, performance, and application to criminal justice interventions

  • Lina M. Montoya, Mark J. van der Laan, Alexander R. Luedtke, Jennifer L. Skeem, Jeremy R. Coyle and Maya L. Petersen


The optimal dynamic treatment rule (ODTR) framework offers an approach for understanding which kinds of patients respond best to specific treatments – in other words, treatment effect heterogeneity. Recently, there has been a proliferation of methods for estimating the ODTR. One such method is an extension of the SuperLearner algorithm – an ensemble method to optimally combine candidate algorithms extensively used in prediction problems – to ODTRs. Following the "causal roadmap," we causally and statistically define the ODTR and provide an introduction to estimating it using the ODTR SuperLearner. Additionally, we highlight practical choices when implementing the algorithm, including choice of candidate algorithms, metalearners to combine the candidates, and risk functions to select the best combination of algorithms. Using simulations, we illustrate how estimating the ODTR using this SuperLearner approach can uncover treatment effect heterogeneity more effectively than traditional approaches based on fitting a parametric regression of the outcome on the treatment, covariates and treatment-covariate interactions. We investigate the implications of choices in implementing an ODTR SuperLearner at various sample sizes. Our results show the advantages of: (1) including a combination of both flexible machine learning algorithms and simple parametric estimators in the library of candidate algorithms; (2) using an ensemble metalearner to combine candidates rather than selecting only the best-performing candidate; (3) using the mean outcome under the rule as a risk function. Finally, we apply the ODTR SuperLearner to the "Interventions" study, an ongoing randomized controlled trial, to identify which justice-involved adults with mental illness benefit most from cognitive behavioral therapy to reduce criminal re-offending.
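To make the "traditional approach" mentioned in the abstract concrete, the following is a minimal sketch, not the authors' implementation and not the ODTR SuperLearner itself. It assumes a simulated two-arm trial with a single covariate, a known randomization probability of 0.5, and a hypothetical data-generating process chosen purely for illustration. It fits a linear regression of the outcome on treatment, covariate, and their interaction, treats whenever the estimated treatment effect is positive, and then estimates the mean outcome under that rule by inverse probability weighting on held-out data. All function names and simulation parameters are illustrative assumptions.

```python
import numpy as np

def simulate_trial(n, rng):
    """Simulate a two-arm randomized trial (hypothetical data-generating process)."""
    w = rng.normal(size=n)              # single baseline covariate
    a = rng.integers(0, 2, size=n)      # treatment randomized with probability 0.5
    # Assumed truth: treatment helps when w > 0 and harms when w < 0.
    y = 0.5 * w + a * w + rng.normal(scale=0.5, size=n)
    return w, a, y

def fit_interaction_regression(w, a, y):
    """Least-squares fit of E[Y | A, W] = b0 + b1*W + b2*A + b3*A*W."""
    design = np.column_stack([np.ones_like(w), w, a, a * w])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def estimated_rule(beta, w):
    """Treat when the estimated treatment effect b2 + b3*W is positive."""
    return (beta[2] + beta[3] * w > 0).astype(int)

def ipw_value(rule, a, y, p_treat=0.5):
    """Inverse-probability-weighted mean outcome under a rule (known randomization)."""
    prob_observed = np.where(a == 1, p_treat, 1 - p_treat)
    follows_rule = (a == rule).astype(float)
    return np.mean(follows_rule * y / prob_observed)

rng = np.random.default_rng(0)
w_train, a_train, y_train = simulate_trial(5000, rng)
w_test, a_test, y_test = simulate_trial(5000, rng)

# Learn the rule on training data, then evaluate it on independent test data.
beta = fit_interaction_regression(w_train, a_train, y_train)
rule_test = estimated_rule(beta, w_test)

value_rule = ipw_value(rule_test, a_test, y_test)
value_treat_all = ipw_value(np.ones_like(a_test), a_test, y_test)
value_treat_none = ipw_value(np.zeros_like(a_test), a_test, y_test)

print(f"estimated rule: {value_rule:.3f}")
print(f"treat everyone: {value_treat_all:.3f}")
print(f"treat no one:   {value_treat_none:.3f}")
```

In this sketch the parametric model happens to be correctly specified, so the simple rule does well. The abstract's point is that when the true relationship is not captured by such a model, an ensemble that also includes flexible learners can recover heterogeneity that this baseline misses.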

Corresponding author: Lina M. Montoya, Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA, E-mail:

Award Identifier / Grant number: F31AI140962

Award Identifier / Grant number: R01AI074345

  1. Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: None declared.

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.



Supplementary Material

The online version of this article offers supplementary material (

Received: 2020-09-04
Accepted: 2022-05-06
Published Online: 2022-06-16

© 2022 Walter de Gruyter GmbH, Berlin/Boston
