Dependence and dependence structures: estimation and visualization using the unifying concept of distance multivariance

Björn Böttcher 1
  • 1 TU Dresden, Fakultät Mathematik, Institut für Mathematische Stochastik, 01062, Dresden, Germany

Abstract

Distance multivariance is a multivariate dependence measure that can detect dependencies between an arbitrary number of random vectors, each of which can have a distinct dimension. Here we discuss several new aspects, present a concise overview, and use it as the basis for several new results and concepts. In particular, we show that distance multivariance unifies (and extends) distance covariance and the Hilbert-Schmidt independence criterion (HSIC); moreover, the classical linear dependence measures covariance, Pearson’s correlation and the RV coefficient appear as limiting cases. Based on distance multivariance, several new measures are defined: a multicorrelation, which satisfies a natural set of multivariate dependence measure axioms, and m-multivariance, a dependence measure yielding tests for pairwise independence and independence of higher order. These tests are computationally feasible, and under very mild moment conditions they are consistent against all alternatives. Moreover, a general visualization scheme for higher order dependencies is proposed, including consistent estimators (based on distance multivariance) for the dependence structure.

Many illustrative examples are provided. All functions for the use of distance multivariance in applications are published in the R-package multivariance.
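
To make the central object concrete, the following is a minimal sketch of a sample version of squared distance multivariance. It is an illustration under stated assumptions, not the implementation in the R-package multivariance. It assumes the estimator commonly given in the distance multivariance literature: for each sample, form the Euclidean distance matrix B, doubly center it as A = -CBC, and average the entrywise product of the centered matrices over all index pairs. The function names `centered_distance_matrix` and `sample_multivariance_squared` are hypothetical and chosen only for this example. For two samples the quantity coincides with the usual V-statistic form of squared sample distance covariance.

```python
import math
import random


def centered_distance_matrix(sample):
    """Return A = -C B C, the negated doubly centered Euclidean distance matrix.

    `sample` is a list of N observations, each a tuple of floats.
    (Assumed construction; Euclidean distance is one possible choice.)
    """
    n = len(sample)
    dist = [[math.dist(p, q) for q in sample] for p in sample]
    row_mean = [sum(row) / n for row in dist]
    grand_mean = sum(row_mean) / n
    # B is symmetric, so its column means equal its row means.
    return [
        [-(dist[j][k] - row_mean[j] - row_mean[k] + grand_mean) for k in range(n)]
        for j in range(n)
    ]


def sample_multivariance_squared(*samples):
    """Average over (j, k) of the product of the centered matrices' entries."""
    n = len(samples[0])
    mats = [centered_distance_matrix(s) for s in samples]
    total = 0.0
    for j in range(n):
        for k in range(n):
            prod = 1.0
            for a in mats:
                prod *= a[j][k]
            total += prod
    return total / n**2


# Illustrative data (made up for this sketch): y depends nonlinearly on x,
# while z is generated independently of both.
random.seed(0)
x = [(random.gauss(0.0, 1.0),) for _ in range(100)]
y = [(v[0] ** 2,) for v in x]
z = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(100)]

print("x, y   :", sample_multivariance_squared(x, y))
print("x, z   :", sample_multivariance_squared(x, z))
print("x, y, z:", sample_multivariance_squared(x, y, z))
```

The samples may have different dimensions (here `z` is two-dimensional), which mirrors the setting described in the abstract. Turning the statistic into a test requires a scaling and a reference distribution or resampling scheme, which this sketch deliberately leaves out.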

  • Aitkin, M. (1978), “The analysis of unbalanced cross-classifications”, Journal of the Royal Statistical Society: Series A (General), 141(2), 195–211.

  • Berg, C., and Forst, G. (1975), Potential Theory on Locally Compact Abelian Groups, Berlin: Springer.

  • Berschneider, G., and Böttcher, B. (2019), On complex Gaussian random fields, Gaussian quadratic forms and sample distance multivariance, arXiv:1808.07280v2.

  • Bilodeau, M., and Guetsop Nangue, A. (2017), “Tests of mutual or serial independence of random vectors with applications”, The Journal of Machine Learning Research, 18(1), 2518–2557.

  • Böttcher, B. (2019), multivariance: Measuring Multivariate Dependence Using Distance Multivariance. R package version 2.2.0.

  • Böttcher, B., Keller-Ressel, M., and Schilling, R. L. (2018), “Detecting independence of random vectors: Generalized distance covariance and Gaussian covariance”, Modern Stochastics: Theory and Applications, 5(3), 353–383.

  • Böttcher, B., Keller-Ressel, M., and Schilling, R. L. (2019), “Distance multivariance: New dependence measures for random vectors”, The Annals of Statistics, 47(5), 2757–2789.

  • Böttcher, B., Schilling, R. L., and Wang, J. (2013), Lévy-Type Processes: Construction, Approximation and Sample Path Properties, volume 2099 of Lecture Notes in Mathematics, Lévy Matters, Springer.

  • Chakraborty, S., and Zhang, X. (2019), “Distance metrics for measuring joint dependence with application to causal inference”, Journal of the American Statistical Association, 114(528), 1638–1650.

  • Cox, T. F., and Dunn, R. T. (2002), “An analysis of decathlon data”, Journal of the Royal Statistical Society. Series D (The Statistician), 51(2), 179–187.

  • Csardi, G., and Nepusz, T. (2006), “The igraph software package for complex network research”, InterJournal, Complex Systems, 1695.

  • Csörgő, S. (1985), “Testing for independence by the empirical characteristic function”, Journal of Multivariate Analysis, 16(3), 290–299.

  • Edelmann, D. (2015), Structures of Multivariate Dependence, PhD thesis, Universität Heidelberg.

  • Escoufier, Y. (1973), “Le traitement des variables vectorielles”, Biometrics, 29(4), 751–760.

  • Fan, Y., de Micheaux, P. L., Penev, S., and Salopek, D. (2017), “Multivariate nonparametric test of independence”, Journal of Multivariate Analysis, 153, 189–210.

  • Genest, C., and Rémillard, B. (2004), “Test of independence and randomness based on the empirical copula process”, Test, 13(2), 335–369.

  • Gretton, A., Fukumizu, K., Teo, C. H., Song, L., Schölkopf, B., and Smola, A. J. (2008), “A kernel statistical test of independence”, Advances in Neural Information Processing Systems, 20, 585–592.

  • Guetsop Nangue, A. (2017), Tests de permutation d’indépendance en analyse multivariée, PhD thesis, Université de Montréal.

  • Han, J., Pei, J., and Kamber, M. (2011), Data mining: concepts and techniques, Burlington: Morgan Kaufmann.

  • Hofert, M., Kojadinovic, I., Maechler, M., and Yan, J. (2018), copula: Multivariate Dependence with Copulas, R package version 0.999-19.

  • Jacob, N. (2001), Pseudo-Differential Operators and Markov Processes I. Fourier Analysis and Semigroups, London: Imperial College Press.

  • Jin, Z., and Matteson, D. S. (2018), “Generalizing distance covariance to measure and test multivariate mutual dependence via complete and incomplete V-statistics”, Journal of Multivariate Analysis, 168, 304–322.

  • Josse, J., and Holmes, S. (2016), “Measuring multivariate association and beyond”, Statistics Surveys, 10, 132–167.

  • Kallenberg, O. (1997), Foundations of Modern Probability, New York, Berlin, Heidelberg: Springer.

  • Kankainen, A. (1995), Consistent testing of total independence based on the empirical characteristic function, PhD thesis, University of Jyväskylä.

  • Korolyuk, V. S., and Borovskich, Y. V. (1994), Theory of U-statistics, volume 273, Dordrecht: Springer Science & Business Media.

  • Liu, Y., de la Pena, V., and Zheng, T. (2018), “Kernel-based measures of association”, Wiley Interdisciplinary Reviews: Computational Statistics, 10(2), e1422.

  • Lyons, R. (2013), “Distance covariance in metric spaces”, The Annals of Probability, 41(5), 3284–3305.

  • Matheron, G. (1963), “Principles of Geostatistics”, Economic Geology, 58(8), 1246–1266.

  • Móri, T. F., and Székely, G. J. (2018), “Four simple axioms of dependence measures”, Metrika, 82, 1–16.

  • Pfister, N., Bühlmann, P., Schölkopf, B., and Peters, J. (2017), “Kernel-based tests for joint independence”, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 80, 5–31.

  • Pfister, N., and Peters, J. (2019), dHSIC: Independence Testing via Hilbert Schmidt Independence Criterion, R package version 2.1.

  • Robert, P., and Escoufier, Y. (1976), “A Unifying Tool for Linear Multivariate Statistical Methods: The RV-Coefficient”, Journal of the Royal Statistical Society. Series C (Applied Statistics), 25(3), 257–265.

  • Rényi, A. (1959), “On measures of dependence”, Acta mathematica hungarica, 10(3-4), 441–451.

  • Sato, K. (1999), Lévy Processes and Infinitely Divisible Distributions, Cambridge: Cambridge University Press.

  • Sejdinovic, D., Gretton, A., and Bergsma, W. (2013), “A Kernel Test for Three-Variable Interactions”, in Advances in Neural Information Processing Systems (NeurIPS), volume 26, pp. 1124–1132.

  • Sejdinovic, D., Sriperumbudur, B., Gretton, A., and Fukumizu, K. (2013), “Equivalence of distance-based and RKHS-based statistics in hypothesis testing”, Annals of Statistics, 41(5), 2263–2291.

  • Shen, C., and Vogelstein, J. T. (2018), “The exact equivalence of distance and kernel methods for hypothesis testing”, CoRR, abs/1806.05514.

  • Székely, G. J., and Bakirov, N. K. (2003), “Extremal probabilities for Gaussian quadratic forms”, Probability Theory and Related Fields, 126(2), 184–202.

  • Székely, G. J., and Rizzo, M. L. (2009), “Brownian distance covariance”, Annals of Applied Statistics, 3(4), 1236–1265.

  • Székely, G. J., Rizzo, M. L., and Bakirov, N. K. (2007), “Measuring and testing dependence by correlation of distances”, The Annals of Statistics, 35(6), 2769–2794.

  • Tjøstheim, D., Otneim, H., and Støve, B. (2018), Statistical dependence: Beyond Pearson’s ρ, arXiv:1809.10455v1.

  • Unwin, A. (2015), GDAdata: Datasets for the Book Graphical Data Analysis with R, R package version 0.93.

  • Venables, W. N., and Ripley, B. D. (2002), Modern Applied Statistics with S. New York: Springer, fourth edition.

  • Woolf, A., Ansley, L., and Bidgood, P. (2007), “Grouping of decathlon disciplines”, Journal of Quantitative Analysis in Sports, 3(4).

  • Yao, S., Zhang, X., and Shao, X. (2017), “Testing mutual independence in high dimension via distance covariance”, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 80, 455–480.

  • Zinger, A., Kakosyan, A. V., and Klebanov, L. B. (1992), “A characterization of distributions by mean values of statistics and certain probabilistic metrics”, Journal of Mathematical Sciences, 59(4), 914–920.

OPEN ACCESS
