Published online by De Gruyter April 5, 2021

More than one way: exploring the capabilities of different estimation approaches to joint models for longitudinal and time-to-event outcomes

Anja Rappl, Andreas Mayr and Elisabeth Waldmann

Abstract

The development of physical functioning after a caesura in an aged population is still widely unexplored. Analysis of this topic needs to model the longitudinal trajectories of physical functioning and simultaneously take terminal events (deaths) into account. Analysing the two outcomes separately results in biased estimates, since it neglects the inherent connection between them. Thus, this type of data-generating process is best modelled jointly. To facilitate this, several software applications have been made available. They differ in model formulation and estimation technique (likelihood-based, Bayesian inference, statistical boosting), so a comparison of the different approaches is necessary to identify their capabilities and limitations. Therefore, we compared the performance of the R packages JM, joineRML, JMbayes and JMboost with respect to estimation accuracy, variable selection properties and prediction precision. With these findings we then illustrate the topic of physical functioning after a caesura with data from the German Ageing Survey (DEAS). The results suggest that for smaller data sets and theory-driven modelling, likelihood-based methods (expectation maximisation; JM, joineRML) or Bayesian inference (JMbayes) are preferable, whereas statistical boosting (JMboost) is a better choice for high-dimensional data and data exploration settings.


Corresponding author: Anja Rappl, Friedrich-Alexander-Universität Erlangen-Nürnberg, Institut für Medizininformatik, Biometrie und Epidemiologie, Waldstraße 6, 91054 Erlangen, Germany, E-mail:

  1. Note: Anja Rappl performed the present work in partial fulfilment of the requirements for obtaining the degree ‘Dr. rer. biol. hum.’ at Friedrich-Alexander-Universität Erlangen-Nürnberg.

  2. Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  3. Research funding: The work on this article was supported by the DFG (project WA 4249/2-1) and the Volkswagen Foundation.

  4. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.

Appendix A: MSE-values for section estimation accuracy

Calculation of MSE

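$R$: number of simulated data sets
$\theta$: parameter of interest
$\hat{\theta}_r$: estimate for the parameter of interest in data set $r$

$$\mathrm{MSE} = \frac{1}{R}\sum_{r=1}^{R}\left(\theta - \hat{\theta}_r\right)^2$$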
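
As a minimal sketch of the MSE calculation above, written in Python rather than in the R environment used for the analyses, the function below averages the squared deviations of the per-data-set estimates from the true parameter value. The function name `mse` and the example numbers are illustrative assumptions and are not taken from the simulation study.

```python
def mse(theta, estimates):
    """Mean squared error of the estimates around the true value theta."""
    # One squared deviation per simulated data set r = 1, ..., R.
    squared_errors = [(theta - est) ** 2 for est in estimates]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical example: true value 1.0, estimates from R = 4 data sets.
theta_true = 1.0
theta_hat = [0.9, 1.1, 1.2, 0.8]
print(mse(theta_true, theta_hat))
```

Because the MSE is an unweighted average, a single diverging estimate among the R data sets can dominate the reported value.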

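Table 4: Average MSE-values by model and software.

| Model | Parameter | JM | joineRML | JMbayes | JMboost |
|---|---|---|---|---|---|
| AM1 | βl0 | 0.026 | 0.028 | 0.029 | 0.039 |
| | βls1 | 0.026 | 0.025 | 0.024 | 0.031 |
| | βls2 | 0.019 | 0.018 | 0.020 | 0.028 |
| | βls3 | 0.026 | 0.028 | 0.026 | 0.037 |
| | βls4 | 0.017 | 0.018 | 0.018 | 0.024 |
| | βls5 | 0.027 | 0.026 | 0.025 | 0.029 |
| | βls6 | 0.025 | 0.028 | 0.030 | 0.037 |
| | βls7 | 0.027 | 0.028 | 0.028 | 0.035 |
| | βls8 | 0.028 | 0.028 | 0.026 | 0.037 |
| | βls9 | 0.019 | 0.019 | 0.020 | 0.023 |
| | βls10 | 0.021 | 0.022 | 0.023 | 0.029 |
| | βt | 0.016 | 0.016 | 0.017 | 0.130 |
| | βs | 0.038 | 0.042 | 0.042 | 0.038 |
| | α | 0.007 | 0.031 | 0.007 | 0.011 |
| | σ² | 0.040 | 0.002 | 0.035 | 1.019 |
| AM2 | βl0 | 0.026 | 0.026 | 0.056 | 0.044 |
| | βl1 | 0.002 | 0.002 | 0.002 | 0.135 |
| | βl2 | 0.002 | 0.002 | 0.002 | 0.132 |
| | βl3 | 0.002 | 0.002 | 0.002 | 0.131 |
| | βl4 | 0.002 | 0.002 | 0.002 | 0.082 |
| | βl5 | 0.001 | 0.001 | 0.001 | 0.137 |
| | βl6 | 0.001 | 0.001 | 0.001 | 0.121 |
| | βt | 0.021 | 0.021 | 0.019 | 0 |
| | βs | 0.048 | 0.050 | 0.052 | 0.033 |
| | α | 0.013 | 0.029 | 0.014 | Inf |
| | σ² | 0.039 | 0.002 | 0.036 | 1.444 |
| AM3 | βl0 | 0.028 | 0.028 | 0.032 | 0.040 |
| | βl1 | 0.002 | 0.002 | 0.002 | 0.150 |
| | βl2 | 0.002 | 0.002 | 0.002 | 0.144 |
| | βl3 | 0.002 | 0.002 | 0.002 | 0.141 |
| | βl4 | 0.002 | 0.002 | 0.002 | 0.082 |
| | βl5 | 0.001 | 0.001 | 0.001 | 0.149 |
| | βls1 | 0.022 | 0.022 | 0.238 | 0.031 |
| | βls2 | 0.024 | 0.023 | 0.037 | 0.029 |
| | βls3 | 0.023 | 0.023 | 0.296 | 0.030 |
| | βls4 | 0.023 | 0.022 | 0.020 | 0.024 |
| | βls5 | 0.027 | 0.026 | 0.051 | 0.030 |
| | βt | 0.021 | 0.022 | 0.032 | 0.160 |
| | βs | 0.044 | 0.044 | 0.047 | 0.040 |
| | α | 0.008 | 0.026 | 0.008 | 0.022 |
| | σ² | 0.039 | 0.002 | 0.037 | 1.298 |

Entries of the random effects covariance matrix B (the source gives three values per row; they are listed in source order, and the package without a value cannot be identified from the text):

| Model | Parameter | Values in source order |
|---|---|---|
| AM1 | B00 | 0.169, 0.164, 0.141 |
| | B01/10 | 0.028, 0.028, 0.084 |
| | B11 | 0.029, 0.028, 337.920 |
| AM2 | B00 | 0.115, 0.113, 0.097 |
| | B01/10 | 0.042, 0.041, 0.099 |
| | B11 | 0.034, 0.033, 126.073 |
| AM3 | B00 | 0.125, 0.128, 0.093 |
| | B01/10 | 0.041, 0.040, 0.113 |
| | B11 | 0.035, 0.034, 620.771 |
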
Appendix B: Median false-positive and false-negative rates (min, max)

See Table 5.

Table 5: Median of false-positive and false-negative rates of packages by model and dimensionality.

| Model | Software | No. of available simulations | No. of failures | Median false-positive rate [min; max] | Median false-negative rate [min; max] |
|---|---|---|---|---|---|
| V1M1 | JM | 96 | 4 | 0.846 [0.538; 0.923] | 0.000 [0.000; 1.000] |
| | joineRML | 100 | 0 | 0.923 [0.692; 0.923] | 0.000 [0.000; 0.500] |
| | JMbayes | 88 | 12 | 0.923 [0.692; 0.923] | 0.000 [0.000; 0.500] |
| | JMboost | 100 | 0 | 0.538 [0.231; 0.769] | 0.188 [0.062; 0.375] |
| V1M2 | JM | 98 | 2 | 0.923 [0.538; 0.923] | 0.000 [0.000; 1.000] |
| | joineRML | 99 | 1 | 0.923 [0.769; 0.923] | 0.000 [0.000; 0.500] |
| | JMbayes | 83 | 17 | 0.923 [0.692; 0.923] | 0.000 [0.000; 1.000] |
| | JMboost | 100 | 0 | 0.385 [0.077; 0.538] | 0.375 [0.167; 0.583] |
| V1M3 | JM | 95 | 5 | 0.923 [0.615; 0.923] | 0.000 [0.000; 1.000] |
| | joineRML | 100 | 0 | 0.923 [0.615; 0.923] | 0.000 [0.000; 0.500] |
| | JMbayes | 77 | 23 | 0.923 [0.769; 0.923] | 0.000 [0.000; 1.000] |
| | JMboost | 97 | 3 | 0.462 [0.154; 0.692] | 0.312 [0.125; 0.562] |
| V2M1 | JM | 2 | 98 | 0.764 [0.750; 0.777] | 0.250 [0.000; 0.500] |
| | joineRML | 2 | 98 | 0.659 [0.635; 0.682] | 0.500 [0.500; 0.500] |
| | JMbayes | 0 | 100 | | |
| | JMboost | 100 | 0 | 0.128 [0.061; 0.223] | 0.438 [0.188; 0.625] |
| V2M2 | JM | 10 | 90 | 0.818 [0.730; 0.912] | 0.000 [0.000; 0.500] |
| | joineRML | 1 | 99 | 0.676 [0.676; 0.676] | 0.500 [0.500; 0.500] |
| | JMbayes | 0 | 100 | | |
| | JMboost | 100 | 0 | 0.068 [0.041; 0.270] | 0.500 [0.250; 0.667] |
| V2M3 | JM | 12 | 88 | 0.780 [0.676; 0.865] | 0.000 [0.000; 1.000] |
| | joineRML | 1 | 99 | 0.649 [0.649; 0.649] | 0.500 [0.500; 0.500] |
| | JMbayes | 0 | 100 | | |
| | JMboost | 100 | 0 | 0.088 [0.027; 0.331] | 0.438 [0.250; 0.688] |
| V3M1 | JM | 0 | 100 | | |
| | joineRML | 0 | 100 | | |
| | JMbayes | 0 | 100 | | |
| | JMboost | 100 | 0 | 0.028 [0.013; 0.056] | 0.562 [0.250; 0.875] |
| V3M2 | JM | 0 | 100 | | |
| | joineRML | 0 | 100 | | |
| | JMbayes | 0 | 100 | | |
| | JMboost | 100 | 0 | 0.020 [0.011; 0.057] | 0.500 [0.333; 0.667] |
| V3M3 | JM | 0 | 100 | | |
| | joineRML | 0 | 100 | | |
| | JMbayes | 0 | 100 | | |
| | JMboost | 100 | 0 | 0.019 [0.007; 0.123] | 0.562 [0.375; 0.750] |

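
The rates summarised in Table 5 can be illustrated with a small Python sketch. It assumes the usual definitions, which this appendix does not spell out: the false-positive rate is the share of non-informative covariates that were selected, and the false-negative rate is the share of informative covariates that were not selected. The function name, the covariate names and the three hypothetical simulation runs are illustrative only.

```python
from statistics import median

def selection_error_rates(selected, informative, candidates):
    """Return (false-positive rate, false-negative rate) for one model fit."""
    selected, informative = set(selected), set(informative)
    noise = set(candidates) - informative  # covariates without a true effect
    false_pos = len(selected & noise) / len(noise) if noise else 0.0
    false_neg = len(informative - selected) / len(informative) if informative else 0.0
    return false_pos, false_neg

# Hypothetical setting: six candidate covariates, two of them informative.
candidates = ["x1", "x2", "x3", "x4", "x5", "x6"]
informative = ["x1", "x2"]
# Covariates selected in three hypothetical simulation runs.
runs = [["x1", "x3", "x4"], ["x1", "x2", "x3"], ["x1", "x2"]]

rates = [selection_error_rates(s, informative, candidates) for s in runs]
fp_rates = [fp for fp, _ in rates]
fn_rates = [fn for _, fn in rates]
print("false-positive:", median(fp_rates), [min(fp_rates), max(fp_rates)])
print("false-negative:", median(fn_rates), [min(fn_rates), max(fn_rates)])
```

Runs in which a package failed would simply be left out of `runs`, which is why the number of available simulations is reported alongside the rates.
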
Appendix C: Average mean squared prediction error (MSPE)-values by model

Calculation of MSPE for longitudinal outcomes

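$\tilde{y}_i$: new observation of individual $i$
$\hat{y}_i$: predicted value for the new observation of individual $i$
$\tilde{n}$: number of new observations

$$\mathrm{MSPE} = \frac{1}{\tilde{n}}\sum_{i=1}^{\tilde{n}}\left(\tilde{y}_i - \hat{y}_i\right)^2$$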

Calculation of MSPE for survival outcomes

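$n$: number of observations
$T_i$: observed event time of individual $i$ (censored or actual event)
$u$: arbitrary future time point
$t_{n_i}$: last observed longitudinal time point for individual $i$
$\delta_i$: event indicator for individual $i$
$\hat{\eta}_i$: estimated (longitudinal, shared, survival) predictor given the data of individual $i$

$$
\begin{aligned}
\mathrm{MSPE} = \frac{1}{n}\sum_{i=1}^{n}\Big[\; & I(T_i \geq u)\,\big(1 - \Pr(T_i \geq u \mid T_i \geq t_{n_i}, \hat{\eta}_{li}(t), \hat{\eta}_{lsi}, \hat{\eta}_{si})\big)^2 \\
& + \delta_i\, I(T_i < u)\,\big(0 - \Pr(T_i \geq u \mid T_i \geq t_{n_i}, \hat{\eta}_{li}(t), \hat{\eta}_{lsi}, \hat{\eta}_{si})\big)^2 \\
& + (1 - \delta_i)\, I(T_i < u)\,\Big\{ \Pr(T_i \geq u \mid T_i \geq t_{n_i}, \hat{\eta}_{li}(t), \hat{\eta}_{lsi}, \hat{\eta}_{si})\,\big(1 - \Pr(T_i \geq u \mid T_i \geq t_{n_i}, \hat{\eta}_{li}(t), \hat{\eta}_{lsi}, \hat{\eta}_{si})\big)^2 \\
& \qquad + \big(1 - \Pr(T_i \geq u \mid T_i \geq t_{n_i}, \hat{\eta}_{li}(t), \hat{\eta}_{lsi}, \hat{\eta}_{si})\big)\,\big(0 - \Pr(T_i \geq u \mid T_i \geq t_{n_i}, \hat{\eta}_{li}(t), \hat{\eta}_{lsi}, \hat{\eta}_{si})\big)^2 \Big\} \Big]
\end{aligned}
$$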
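
The survival MSPE above distinguishes three cases per individual: still at risk at time u, event observed before u, and censored before u. The Python sketch below mirrors these cases, assuming the predicted conditional survival probabilities have already been obtained from a fitted joint model. The function name, the argument names and the example values are illustrative assumptions and not part of any of the compared packages.

```python
def survival_mspe(event_times, event_indicators, surv_probs, u):
    """Squared prediction error for survival probabilities at time u.

    surv_probs[i] stands for the predicted probability that individual i
    survives beyond u, given survival up to the last longitudinal measurement.
    """
    total = 0.0
    for t_i, delta_i, p_i in zip(event_times, event_indicators, surv_probs):
        if t_i >= u:
            # Known to be event-free at u: the true status is 1.
            total += (1.0 - p_i) ** 2
        elif delta_i == 1:
            # Event observed before u: the true status is 0.
            total += (0.0 - p_i) ** 2
        else:
            # Censored before u: both outcomes weighted by the prediction.
            total += p_i * (1.0 - p_i) ** 2 + (1.0 - p_i) * (0.0 - p_i) ** 2
    return total / len(event_times)

# Hypothetical example with three individuals and horizon u = 4.
times = [5.0, 2.0, 3.0]
status = [0, 1, 0]
probs = [0.8, 0.3, 0.5]
print(survival_mspe(times, status, probs, u=4.0))
```

In the censored case the two weighted terms add up to p(1 - p), so individuals censored before u contribute most when the prediction is close to 0.5.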

Table 6: MSPE-values for marginal prediction of longitudinal outcome by model, dimensionality and package.

| Dimension and model | JM | joineRML | JMbayes | JMboost |
|---|---|---|---|---|
| PAM1 | 2.76 [2.47; 3.12] | 2.77 [2.46; 3.32] | 2.77 [2.47; 3.23] | 2.82 [2.49; 3.44] |
| PAM2 | 2.53 [2.30; 2.83] | 2.53 [2.30; 2.83] | 2.54 [2.35; 3.02] | 3.22 [2.83; 3.83] |
| PAM3 | 2.63 [2.39; 3.19] | 2.63 [2.40; 3.18] | 3.21 [2.44; 4.04] | 3.30 [2.96; 4.04] |
| PV1M1 | 2.94 [2.56; 3.43] | 2.95 [2.56; 3.46] | 2.82 [2.50; 3.37] | 2.99 [2.51; 3.59] |
| PV1M2 | 2.63 [2.34; 3.07] | 2.64 [2.35; 3.08] | 2.57 [2.36; 3.18] | 3.25 [2.87; 3.83] |
| PV1M3 | 2.76 [2.47; 3.21] | 2.77 [2.47; 3.22] | 3.29 [2.66; 3.91] | 3.44 [2.99; 4.14] |
| PV2M1 | 6.28 [5.67; 6.89] | 6.65 [6.03; 7.27] | | 3.54 [2.79; 4.74] |
| PV2M2 | 4.70 [4.00; 6.24] | 4.64 [4.64; 4.64] | | 3.21 [2.86; 3.83] |
| PV2M3 | 5.40 [4.40; 6.24] | 5.89 [5.89; 5.89] | | 3.71 [3.18; 5.83] |
| PV3M1 | | | | 4.17 [3.29; 7.35] |
| PV3M2 | | | | 3.23 [2.87; 3.80] |
| PV3M3 | | | | 4.14 [3.32; 6.23] |

Table 7: MSPE-values for subject-specific prediction of the survival probability by model, dimensionality and package.

| Dimension and model | JM | joineRML | JMbayes | JMboost |
|---|---|---|---|---|
| PAM1 | 0.12 [0.08; 0.19] | 0.24 [0.12; 0.44] | 0.11 [0.08; 0.14] | 0.14 [0.11; 0.16] |
| PAM2 | 0.13 [0.09; 0.20] | 0.23 [0.13; 0.42] | 0.11 [0.09; 0.14] | |
| PAM3 | 0.13 [0.10; 0.21] | 0.24 [0.12; 0.43] | 0.11 [0.09; 0.14] | 0.14 [0.11; 0.17] |
| PV1M1 | 0.13 [0.08; 0.20] | 0.24 [0.12; 0.44] | 0.11 [0.09; 0.14] | 0.14 [0.11; 0.17] |
| PV1M2 | 0.13 [0.09; 0.20] | 0.23 [0.13; 0.42] | 0.11 [0.09; 0.14] | 0.14 [0.11; 0.17] |
| PV1M3 | 0.13 [0.10; 0.22] | 0.24 [0.12; 0.43] | 0.11 [0.09; 0.14] | 0.14 [0.11; 0.17] |
| PV2M1 | 0.13 [0.13; 0.13] | 0.31 [0.30; 0.33] | | 0.14 [0.11; 0.17] |
| PV2M2 | 0.14 [0.12; 0.15] | 0.14 [0.14; 0.14] | | 0.14 [0.11; 0.17] |
| PV2M3 | 0.12 [0.11; 0.16] | 0.16 [0.16; 0.16] | | 0.14 [0.12; 0.17] |
| PV3M1 | | | | 0.14 [0.11; 0.17] |
| PV3M2 | | | | 0.14 [0.11; 0.17] |
| PV3M3 | | | | 0.14 [0.11; 101293.98] |

Appendix D: Boosted coefficient paths of “healthy ageing”

The update mechanism of statistical boosting algorithms yields characteristic coefficient paths over the set number of iterations. They reflect that in each iteration only one coefficient per gradient is updated by a small proportion, that coefficients are selected at different times throughout the algorithm, and that coefficients selected earlier tend to be larger once the algorithm stops. An example of a standard image of coefficient paths is given in Figure 9.
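
The update mechanism described above can be sketched with component-wise gradient boosting for a plain linear model with squared-error loss. This is a deliberately simplified stand-in written in Python: it is not the JMboost algorithm, it has a single predictor rather than longitudinal, shared and survival parts, and the function name, step length and simulated data are illustrative assumptions.

```python
import random

def componentwise_boosting(X, y, n_iter=200, step=0.1):
    """Component-wise L2 boosting for a linear predictor (illustrative only).

    Returns the coefficient vector after each iteration (the coefficient paths).
    """
    p = len(X[0])
    coef = [0.0] * p
    paths = [coef[:]]
    for _ in range(n_iter):
        # Residuals are the negative gradient of the squared-error loss.
        resid = [yi - sum(c * x for c, x in zip(coef, row)) for row, yi in zip(X, y)]
        best = None  # (residual sum of squares, index, least-squares slope)
        for j in range(p):
            xj = [row[j] for row in X]
            sxx = sum(v * v for v in xj)
            if sxx == 0.0:
                continue
            beta = sum(v * r for v, r in zip(xj, resid)) / sxx
            rss = sum((r - beta * v) ** 2 for v, r in zip(xj, resid))
            if best is None or rss < best[0]:
                best = (rss, j, beta)
        if best is None:
            break
        # Only the best-fitting coefficient moves, and only by a fraction `step`.
        coef[best[1]] += step * best[2]
        paths.append(coef[:])
    return paths

# Hypothetical data: x0 has a strong effect, x1 a weak one, x2 none.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(100)]
y = [2.0 * row[0] + 0.5 * row[1] + rng.gauss(0, 0.5) for row in X]

paths = componentwise_boosting(X, y)
print("first selected coefficient:", [j for j, c in enumerate(paths[1]) if c != 0.0])
print("final coefficients:", [round(c, 2) for c in paths[-1]])
```

Plotting each column of `paths` against the iteration number gives coefficient paths of the kind the paragraph describes, with the strongest covariate entering first.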

Figure 9: Ideal behaviour of coefficient paths in statistical boosting.

In comparison, Figure 10 shows the coefficient paths for the data example of “healthy ageing”. The paths oscillate heavily, which is a sign of instability. This becomes particularly problematic for the effect of obstime, since it alternates between positive and negative values. The values for mstop_l, mstop_ls and mstop_s indicate the optimal iteration numbers, at which the predicted likelihood is at its minimum (cross-validation). A potential remedy for this behaviour is to force the algorithm to run longer, at the cost of less optimal prediction. In this case, however, the oscillation does not stop even when the iteration numbers are increased. Thus, boosting does not yield entirely satisfying results for this data problem.

Figure 10: Suboptimal trajectory of coefficient paths of the data model “healthy ageing” when boosted.

Supplementary Material

The online version of this article offers supplementary material (https://doi.org/10.1515/ijb-2020-0067).

Received: 2020-05-18
Revised: 2021-01-24
Accepted: 2021-03-12
Published Online: 2021-04-05

© 2021 Walter de Gruyter GmbH, Berlin/Boston