
Journal of Causal Inference

Ed. by Imai, Kosuke / Pearl, Judea / Petersen, Maya Liv / Sekhon, Jasjeet / van der Laan, Mark J.


Detecting Confounding in Multivariate Linear Models via Spectral Analysis

Dominik Janzing
  • Corresponding author
  • Department ‘Empirical Inference’, Max Planck Institute for Intelligent Systems, Spemannstr. 36, 70569 Tübingen, Germany
/ Bernhard Schölkopf
  • Department ‘Empirical Inference’, Max Planck Institute for Intelligent Systems, Tübingen, Germany
Published Online: 2017-10-28 | DOI: https://doi.org/10.1515/jci-2017-0013


We study a model in which one target variable Y is correlated with a vector X := (X_1, …, X_d) of predictor variables that are potential causes of Y. We describe a method that infers to what extent the statistical dependences between X and Y are due to the influence of X on Y and to what extent they are due to a hidden common cause (confounder) of X and Y. The method relies on concentration of measure results for large dimensions d and on an independence assumption stating that, in the absence of confounding, the vector of regression coefficients describing the influence of each X_j on Y typically has ‘generic orientation’ relative to the eigenspaces of the covariance matrix of X. For the special case of a scalar confounder, we show that confounding typically spoils this generic orientation in a characteristic way that can be used to quantitatively estimate the amount of confounding (subject to our idealized model assumptions).

Keywords: confounding; independence of mechanisms; spectral analysis
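
The notion of ‘generic orientation’ in the abstract can be made concrete with a small numerical sketch. The code below is an illustration only and is not the estimator developed in the paper. It assumes a hypothetical toy model X = E + b·Z, Y = a·X + c·Z + noise with a scalar confounder Z of unit variance, and it works with population covariances rather than samples. The function names, the chosen eigenvalue range, and the summary statistic (the eigenvalue average weighted by the squared overlaps of the regression vector with the eigenvectors, divided by the plain eigenvalue average) are all assumptions made for this example.

```python
import numpy as np


def spectral_weights(cov, vec):
    """Return the eigenvalues of cov and the normalized squared overlaps
    of vec with the corresponding eigenvectors (the weights sum to 1)."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    overlaps = (eigvecs.T @ vec) ** 2
    return eigvals, overlaps / overlaps.sum()


def orientation_ratio(cov, vec):
    """Weighted mean eigenvalue (weights from vec) over the plain mean.
    A vector in generic orientation gives a value close to 1 for large d."""
    eigvals, weights = spectral_weights(cov, vec)
    return float(weights @ eigvals) / float(eigvals.mean())


def population_regression_vector(d, c, rng):
    """Hypothetical toy model (an assumption of this sketch):
    X = E + b*Z, Y = a.X + c*Z + noise, Var(Z) = 1, Z independent of E."""
    # Covariance of E: fixed spectrum in a random orthonormal basis.
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    cov_e = q @ np.diag(np.linspace(0.1, 2.0, d)) @ q.T
    # Causal vector a and confounder loading b drawn in random directions.
    a = rng.standard_normal(d)
    a /= np.linalg.norm(a)
    b = rng.standard_normal(d)
    b *= np.sqrt(0.1) / np.linalg.norm(b)
    cov_xx = cov_e + np.outer(b, b)
    cov_xy = cov_xx @ a + c * b  # Cov(X, Y) under the toy model
    # Regression vector: equals a + c * inv(cov_xx) @ b.
    return cov_xx, np.linalg.solve(cov_xx, cov_xy)


rng = np.random.default_rng(0)
cov_u, a_hat_u = population_regression_vector(500, 0.0, rng)   # no confounding
cov_c, a_hat_c = population_regression_vector(500, 10.0, rng)  # strong confounding

ratio_unconfounded = orientation_ratio(cov_u, a_hat_u)
ratio_confounded = orientation_ratio(cov_c, a_hat_c)
print("unconfounded ratio:", ratio_unconfounded)
print("confounded ratio:  ", ratio_confounded)
```

In this toy setting the confounding term c·Σ_XX⁻¹·b concentrates the regression vector on eigenvectors with small eigenvalues, so the confounded ratio falls below the unconfounded one. Other parameter choices (for example a much larger loading b) can move this single number differently, which is one reason to look at the full distribution of weights rather than only its mean.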


About the article

Received: 2017-04-05

Revised: 2017-06-14

Accepted: 2017-09-21

Published Online: 2017-10-28

Published in Print: 2018-03-26

Citation Information: Journal of Causal Inference, Volume 6, Issue 1, 20170013, ISSN (Online) 2193-3685, DOI: https://doi.org/10.1515/jci-2017-0013.

© 2018 Walter de Gruyter GmbH, Berlin/Boston.
