# Journal of Causal Inference

Volume 6, Issue 1

# Detecting Confounding in Multivariate Linear Models via Spectral Analysis

Dominik Janzing
• Department ‘Empirical Inference’, Max Planck Institute for Intelligent Systems, Spemannstr. 36, 70569 Tübingen, Germany

Bernhard Schölkopf
• Department ‘Empirical Inference’, Max Planck Institute for Intelligent Systems, Tübingen, Germany
Published Online: 2017-10-28 | DOI: https://doi.org/10.1515/jci-2017-0013

## Abstract

We study a model where one target variable $Y$ is correlated with a vector $\mathbf{X}:=(X_1,\dots,X_d)$ of predictor variables that are potential causes of $Y$. We describe a method that infers to what extent the statistical dependences between $\mathbf{X}$ and $Y$ are due to the influence of $\mathbf{X}$ on $Y$ and to what extent they are due to a hidden common cause (confounder) of $\mathbf{X}$ and $Y$. The method relies on concentration-of-measure results for large dimensions $d$ and on an independence assumption stating that, in the absence of confounding, the vector of regression coefficients describing the influence of each $X_j$ on $Y$ typically has ‘generic orientation’ relative to the eigenspaces of the covariance matrix of $\mathbf{X}$. For the special case of a scalar confounder, we show that confounding typically spoils this generic orientation in a characteristic way that can be used to quantitatively estimate the amount of confounding (subject to our idealized model assumptions).
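To make the notion of ‘orientation relative to the eigenspaces’ concrete, the following is a minimal illustrative sketch, not the estimator developed in the paper. It computes the squared components of the least-squares regression vector along the eigenvectors of the sample covariance matrix of $\mathbf{X}$. The generative model used here (a scalar confounder `Z` entering $\mathbf{X}$ through a vector `b` and $Y$ through a strength `c`), the function names `spectral_weights` and `simulate`, and all numerical settings are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 20, 20_000  # hypothetical dimension and sample size


def spectral_weights(X, y):
    """Return covariance eigenvalues (ascending) and the normalized squared
    components of the OLS regression vector along the matching eigenvectors."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    cov_xx = Xc.T @ Xc / len(Xc)             # sample covariance of X
    cov_xy = Xc.T @ yc / len(Xc)             # sample covariance of X with Y
    a_hat = np.linalg.solve(cov_xx, cov_xy)  # regression vector
    eigvals, eigvecs = np.linalg.eigh(cov_xx)
    w = (eigvecs.T @ a_hat) ** 2             # squared projections
    return eigvals, w / w.sum()


# Assumed toy model: X = E + Z * b, Y = <a, X> + c * Z + small noise.
M = rng.normal(size=(d, d))
E = rng.normal(size=(n, d)) @ M              # unconfounded part of X
Z = rng.normal(size=n)                       # scalar hidden confounder
a = rng.normal(size=d)
a /= np.linalg.norm(a)                       # causal coefficients, random direction
b = rng.normal(size=d)
b /= np.linalg.norm(b)                       # effect of Z on X


def simulate(c):
    """Spectral weights when the confounder affects Y with strength c."""
    X = E + np.outer(Z, b)
    y = X @ a + c * Z + 0.1 * rng.normal(size=n)
    return spectral_weights(X, y)


eigvals, w_clean = simulate(c=0.0)           # no confounding of Y
_, w_confounded = simulate(c=5.0)            # strong confounding
```

Comparing `w_clean` with `w_confounded` across the eigenvalues shows how the regression vector distributes over the eigenspaces in each regime. Under this toy model one would expect the weights to shift when `c` is nonzero, but the sketch makes no claim about the size or exact form of that shift.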

