Instrumental Variables

Instrumental variables (IV) address the problem of estimating causal effects in the presence of unobserved confounders that affect both the treatment \(X\) and the outcome \(Y\). A set of variables \(Z\) is said to be a set of instrumental variables if, for every \(z\) in \(Z\):

  1. \(z\) has a causal effect on \(X\).

  2. The causal effect of \(z\) on \(Y\) is fully mediated by \(X\).

  3. There are no back-door paths from \(z\) to \(Y\).

In this case, we must first find the IV (which can be done with the CausalModel; see Identification). For instance, the variable \(Z\) in the following figure is a valid IV for estimating the causal effect of \(X\) on \(Y\) in the presence of the unobserved confounder \(U\).

[Figure: Causal graph with IV]

YLearn implements two IV methods: deepiv [Hartford], which applies deep learning models to IV estimation, and IV with nonparametric models [Newey2002].

The IV Framework and Problem Setting

The IV framework aims to predict the value of the outcome \(y\) when the treatment \(x\) is given. There are also covariate vectors \(v\) that affect both \(y\) and \(x\), and unobserved confounders \(e\) that may affect \(y\), \(x\), and \(v\). The core of the causal question is estimating the causal quantity

\[\mathbb{E}[y| do(x)]\]

in the following causal graph, where the causal relationships are determined by the set of functions

\[\begin{split}y & = f(x, v) + e\\ x & = h(v, z) + \eta\\ \mathbb{E}[e] & = 0.\end{split}\]

[Figure: Causal graph with IV and both observed and unobserved confounders]
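To make the problem setting concrete, the following sketch simulates one hypothetical linear instance of the structural equations above and shows that a naive regression of \(y\) on \((x, v)\) is biased when \(e\) and \(\eta\) share an unobserved component. The linear forms of \(f\) and \(h\), the coefficients, and all variable names are assumptions made for illustration; they are not part of YLearn.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical linear instance of the structural equations (illustrative only).
z = rng.normal(size=n)                # instrument
v = rng.normal(size=n)                # observed covariate
u = rng.normal(size=n)                # unobserved confounder
e = u + 0.5 * rng.normal(size=n)      # outcome noise, E[e] = 0
eta = u + 0.5 * rng.normal(size=n)    # treatment noise, correlated with e through u

x = 1.0 * z + 0.5 * v + eta           # x = h(v, z) + eta
y = 2.0 * x + 1.0 * v + e             # y = f(x, v) + e; true effect of x is 2.0

# Naive ordinary least squares of y on (1, x, v), ignoring the confounding.
design = np.column_stack([np.ones(n), x, v])
coef_ols, *_ = np.linalg.lstsq(design, y, rcond=None)
naive_effect = coef_ols[1]

print(f"true effect: 2.0, naive OLS estimate: {naive_effect:.3f}")
```

Because \(x\) is correlated with \(e\) through \(u\), the naive estimate is pushed away from the true coefficient, while the instrument \(z\) remains uncorrelated with \(e\) by construction.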

The IV framework solves this problem by doing a two-stage estimation:

  1. Estimate \(\hat{H}(z, v)\) that captures the relationship between \(x\) and the variables \((z, v)\).

  2. Replace \(x\) with the predicted result of \(\hat{H}(z, v)\) given \((v, z)\). Then estimate \(\hat{G}(x, v)\) to build the relationship between \(y\) and \((x, v)\).
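The two stages above can be sketched with ordinary least squares standing in for both \(\hat{H}\) and \(\hat{G}\). This is a minimal illustration under a linearity assumption, not the estimator used by the YLearn IV classes; the function name `two_stage_least_squares` and the simulated data are hypothetical.

```python
import numpy as np


def two_stage_least_squares(y, x, v, z):
    """Linear two-stage estimate; returns coefficients for (intercept, x, v)."""
    ones = np.ones(len(y))

    # Stage 1: fit H(z, v) by regressing the treatment x on (1, z, v).
    first_stage = np.column_stack([ones, z, v])
    beta_h, *_ = np.linalg.lstsq(first_stage, x, rcond=None)
    x_hat = first_stage @ beta_h

    # Stage 2: replace x with its prediction and regress y on (1, x_hat, v).
    second_stage = np.column_stack([ones, x_hat, v])
    beta_g, *_ = np.linalg.lstsq(second_stage, y, rcond=None)
    return beta_g


# Simulated data with an unobserved confounder u (illustrative assumption).
rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=n)
v = rng.normal(size=n)
u = rng.normal(size=n)
x = 1.0 * z + 0.5 * v + u + 0.5 * rng.normal(size=n)
y = 2.0 * x + 1.0 * v + u + 0.5 * rng.normal(size=n)  # true effect of x is 2.0

iv_effect = two_stage_least_squares(y, x, v, z)[1]
print(f"two-stage estimate of the effect of x: {iv_effect:.3f}")
```

The second stage only sees the part of \(x\) explained by \((z, v)\), which is uncorrelated with the unobserved confounder, so the estimate should land close to the true coefficient in this simulated setup.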

The final causal effects can then be calculated.

IV Classes