Abstract: In this article, we consider the problem of comparing the distribution of observations in a planar region to a pre-specified null distribution. Our motivation is a surveillance setting in which the locations of incident disease are mapped and monitored over time, in order to locate areas of potentially high or low incidence and so direct public health action.
We propose a non-parametric approach to distance-based disease risk mapping inspired by tomographic imaging. We consider several one-dimensional projections, each given by the observed distribution of distances to a chosen fixed point; we compare each such distribution to that expected under the null and average these comparisons across projections to compute a relative-risk-like score at each point in the region. The null distribution can be established from historical data. Scores are displayed on the map using a color scale.
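The projection-and-average idea can be illustrated with a minimal Python sketch. It is not the authors' definition of the score: the band comparison, the ratio form, the placement of the fixed points on the unit circle, and the names `band_fraction`, `distance_based_scores`, `half_width` and `eps` are all assumptions made for illustration. For each fixed point, the sketch compares the fraction of observed distances falling near a map location's own distance with the same fraction in a null sample, then averages the resulting ratios over the fixed points.

```python
import numpy as np

def sample_unit_disk(n, rng):
    """Draw n points uniformly from the unit disk."""
    r = np.sqrt(rng.uniform(size=n))
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    return np.column_stack((r * np.cos(theta), r * np.sin(theta)))

def band_fraction(distances, targets, half_width):
    """Fraction of `distances` lying within `half_width` of each target distance."""
    return (np.abs(distances[None, :] - targets[:, None]) <= half_width).mean(axis=1)

def distance_based_scores(grid, observed, null_sample, centers,
                          half_width=0.1, eps=1e-3):
    """Average, over fixed points, of an observed-to-null ratio of band fractions.

    Illustrative assumption: the band comparison and the ratio stand in for
    whatever comparison of distance distributions the method actually uses.
    """
    scores = np.zeros(len(grid))
    for c in centers:
        d_grid = np.linalg.norm(grid - c, axis=1)  # distance of each map location to c
        p_obs = band_fraction(np.linalg.norm(observed - c, axis=1), d_grid, half_width)
        p_null = band_fraction(np.linalg.norm(null_sample - c, axis=1), d_grid, half_width)
        scores += (p_obs + eps) / (p_null + eps)   # eps guards against empty bands
    return scores / len(centers)

rng = np.random.default_rng(0)
null_sample = sample_unit_disk(2000, rng)            # stands in for historical data
cluster = rng.normal(loc=(0.4, 0.0), scale=0.05, size=(200, 2))
observed = np.vstack((sample_unit_disk(500, rng), cluster))

# Eight fixed points on the unit circle (a hypothetical choice).
angles = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
centers = np.column_stack((np.cos(angles), np.sin(angles)))

# Score the simulated cluster location and the location opposite it.
grid = np.array([[0.4, 0.0], [-0.4, 0.0]])
scores = distance_based_scores(grid, observed, null_sample, centers)
print(scores)
```

A single fixed point cannot separate the cluster from other locations at the same distance from that point; in this sketch it is the averaging over several fixed points that concentrates the elevated score near the cluster, which is the sense in which the approach resembles tomographic reconstruction.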
We give a detailed description of the method along with some desirable theoretical properties. To assess its performance, we compare it to the widely used log ratio of kernel density estimates. As a performance metric, we evaluate how accurately each method locates simulated spatial clusters superimposed on a uniform distribution in the unit disk. Results suggest that both methods can adequately locate the increased risk, but each relies on an appropriate choice of parameters. Our proposed method, distance-based mapping (DBM), can also generalize to arbitrary metric spaces and/or high-dimensional data.
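For orientation, the comparison method can be sketched in a few lines of Python. This is a minimal illustration of a log ratio of kernel density estimates on a simulated cluster in the unit disk, not the simulation design used in the article: the isotropic Gaussian kernel, the common bandwidth `h`, the sample sizes, the cluster location, and the function names are all assumptions, and no edge correction is applied.

```python
import numpy as np

def sample_unit_disk(n, rng):
    """Draw n points uniformly from the unit disk."""
    r = np.sqrt(rng.uniform(size=n))
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    return np.column_stack((r * np.cos(theta), r * np.sin(theta)))

def kernel_density(points, grid, h):
    """Fixed-bandwidth isotropic Gaussian kernel density estimate at each grid location."""
    sq_dist = ((grid[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    kernel_sum = np.exp(-sq_dist / (2.0 * h ** 2)).sum(axis=1)
    return kernel_sum / (len(points) * 2.0 * np.pi * h ** 2)

def log_density_ratio(cases, reference, grid, h=0.1):
    """Log ratio of the case density to the reference (null) density."""
    return np.log(kernel_density(cases, grid, h)) - np.log(kernel_density(reference, grid, h))

rng = np.random.default_rng(1)
reference = sample_unit_disk(2000, rng)              # sample from the null distribution
cluster = rng.normal(loc=(0.4, 0.0), scale=0.05, size=(200, 2))
cases = np.vstack((sample_unit_disk(500, rng), cluster))

# Evaluate at the simulated cluster location and at the location opposite it.
grid = np.array([[0.4, 0.0], [-0.4, 0.0]])
log_ratio = log_density_ratio(cases, reference, grid)
print(log_ratio)
```

The bandwidth `h` plays the role of the tuning parameter here: a value that is too large smooths the cluster away, while one that is too small makes the ratio noisy where few points fall.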
Acknowledgements
The research in this paper was funded by National Institutes of Health grant R01 EB0006195 and CDC grant R01 PH000021–01.
©2013 by Walter de Gruyter Berlin / Boston