Abstract
A classical problem in survival analysis is to estimate the failure time distribution from right-censored observations obtained from an incident cohort study. Frequently, however, failure time data comprise two independent samples, one from an incident cohort study and the other from a prevalent cohort study with follow-up, which is known to produce length-biased observed failure times. There are drawbacks to each of these two types of study when viewed separately. We address two main questions here: (i) Can our statistical inference be enhanced by combining data from an incident cohort study with data from a prevalent cohort study with follow-up? (ii) What statistical methods are appropriate for these combined data? The theory we develop to address these questions is based on a parametrically defined failure time distribution and is supported by simulations. We apply our methods to estimate the duration of hospital stays.
Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
Research funding: The first author was supported by a Natural Sciences and Engineering Research Council of Canada PGSD-3 award. David Stephens acknowledges the support of a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada (NSERC).
Conflict of interest statement: The authors declare no conflicts of interest regarding this article.
Supplementary material
The online version of this article offers supplementary material (https://doi.org/10.1515/ijb-2020-0042).
© 2020 Walter de Gruyter GmbH, Berlin/Boston