Published by De Gruyter, March 24, 2021

Regularized bidimensional estimation of the hazard rate

  • Vivien Goepp, Jean-Christophe Thalabard, Grégory Nuel and Olivier Bouaziz

Abstract

In epidemiological or demographic studies with variable age at onset, a typical quantity of interest is the incidence of a disease (for example, cancer incidence). In these studies, individuals are usually highly heterogeneous in terms of date of birth (the cohort) and calendar time (the period), so appropriate estimation methods are needed. This article presents a new estimation method that extends classical age-period-cohort analysis by allowing interactions between age, period and cohort effects. We introduce a bidimensional regularized estimate of the hazard rate, obtained by adding a penalty to the likelihood of the model. This penalty can be designed either to smooth the hazard rate or to force consecutive values of the hazard to be equal, leading to a parsimonious representation of the hazard rate. In the latter case, we use an iterative penalized likelihood scheme to approximate the L0 norm, which makes the computation tractable. The method is evaluated on simulated data and applied to breast cancer survival data from the SEER program.
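The iterative scheme mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a one-dimensional simplification (a single time axis instead of the age-cohort plane), it assumes a piecewise-constant hazard with a Poisson-type likelihood built from event counts and time at risk per interval, and the function name adaptive_ridge_hazard, the parameters kappa and eps, and the step clipping are all hypothetical choices made for illustration. Each outer iteration solves a weighted ridge problem on the differences of consecutive log-hazards, then resets the weights to 1 / (difference^2 + eps^2), so that the weighted quadratic penalty approximately counts the number of jumps.

```python
import numpy as np

def adaptive_ridge_hazard(events, exposure, kappa=2.0, eps=1e-4,
                          n_iter=50, n_newton=20):
    """Sketch: piecewise-constant hazard with an approximate L0 penalty
    on jumps between consecutive intervals (1D simplification)."""
    events = np.asarray(events, dtype=float)      # event count per interval
    exposure = np.asarray(exposure, dtype=float)  # time at risk per interval
    n = len(events)
    # First-difference matrix: (D @ eta)[j] = eta[j + 1] - eta[j]
    D = np.diff(np.eye(n), axis=0)
    # Start from the pooled (constant) log-hazard
    eta = np.full(n, np.log(events.sum() / exposure.sum()))
    w = np.ones(n - 1)
    for _ in range(n_iter):
        # Weighted ridge penalty matrix for the current weights
        P = kappa * D.T @ (w[:, None] * D)
        for _ in range(n_newton):
            mu = exposure * np.exp(eta)           # expected event counts
            grad = mu - events + P @ eta
            hess = np.diag(mu) + P
            step = np.linalg.solve(hess, grad)
            # Clipping is a crude safeguard against overshooting (assumption)
            eta = eta - np.clip(step, -2.0, 2.0)
            if np.max(np.abs(step)) < 1e-10:
                break
        # Reweighting: w * diff^2 is close to 1 for a jump, 0 for no jump
        w = 1.0 / ((D @ eta) ** 2 + eps ** 2)
    return np.exp(eta)

# Toy data: 10 intervals, 100 units of exposure each, one true change-point
hazard = adaptive_ridge_hazard([5] * 5 + [20] * 5, [100.0] * 10)
```

On this toy input the fitted hazard should be nearly constant within each half, with a single jump between the fifth and sixth intervals. The penalized values are slightly shrunk towards each other, so a sketch like this would typically be followed by an unpenalized refit on the selected segments.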


Corresponding author: Vivien Goepp, MAP5, CNRS UMR 8145, 45, rue des Saints-Pères, 75006, Paris, France; MINES ParisTech, CBIO–Centre for Computational Biology, PSL Research University, 75006, Paris, France; Institut Curie, PSL Research University, 75005, Paris, France; and Inserm, U900, Paris, France.

Acknowledgements

The authors are thankful to the National Cancer Institute for providing U.S. mortality data on cancer. We also thank an anonymous reviewer for comments on the adaptive ridge and its link to related methods which significantly improved the quality of this paper.

  1. Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: None declared.

  3. Conflict of Interest statement: The authors have declared no conflict of interest.


Supplementary Material

The online version of this article offers supplementary material (https://doi.org/10.1515/ijb-2019-0003).


Received: 2019-01-07
Revised: 2020-12-03
Accepted: 2021-02-26
Published Online: 2021-03-24

© 2021 Walter de Gruyter GmbH, Berlin/Boston
