Some notes on Causal Inference

Ref: https://www.stat.cmu.edu/~larry/=stat700/Lecture22.pdf

Counterfactual

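Two identities are used throughout. By the law of total expectation, E[Y_1] = E[Y_1|X=1] Pr(X=1) + E[Y_1|X=0] Pr(X=0), and the observed outcome satisfies Y = X Y_1 + (1-X) Y_0, where X \in {0,1} is the treatment indicator.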

Y_1 is the potential outcome if an individual were treated. Y_0 is the potential outcome if an individual were not treated.

Y is the observed outcome:

    \[Y = X Y_1 + (1 - X) Y_0\]

which means:

  • If X=1, then Y = Y_1 (we observe the treated outcome)
  • If X=0, then Y = Y_0 (we observe the untreated outcome).

From Y=X Y_1 + (1-X) Y_0, we have E[Y_1|X=1]=E[Y|X=1]. Therefore

E[Y_1] - E[Y|X=1] = E[Y|X=1] Pr(X=1) + E[Y_1|X=0] Pr(X=0) - E[Y|X=1] = Pr(X=0) (E[Y_1|X=0] - E[Y_1|X=1]) \neq 0 in general.
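The identity above can be checked by hand on a tiny population. The sketch below is only an illustration: the five (X, Y_1, Y_0) rows are made-up numbers, each individual is assumed equally likely, and both potential outcomes are listed even though only one would be observed in practice.

```python
from fractions import Fraction

# Hypothetical population: each row is (X, Y_1, Y_0), all equally likely.
population = [
    (1, 10, 6),
    (1, 8, 5),
    (0, 6, 4),
    (0, 4, 3),
    (0, 5, 2),
]


def mean(values):
    """Exact average of an iterable of integers."""
    values = list(values)
    return Fraction(sum(values), len(values))


p_x0 = Fraction(sum(1 for x, _, _ in population if x == 0), len(population))
e_y1 = mean(y1 for _, y1, _ in population)
e_y1_given_x1 = mean(y1 for x, y1, _ in population if x == 1)
e_y1_given_x0 = mean(y1 for x, y1, _ in population if x == 0)

# Observed outcome Y = X * Y_1 + (1 - X) * Y_0, restricted to the treated group.
e_y_given_x1 = mean(x * y1 + (1 - x) * y0 for x, y1, y0 in population if x == 1)

lhs = e_y1 - e_y_given_x1                      # E[Y_1] - E[Y|X=1]
rhs = p_x0 * (e_y1_given_x0 - e_y1_given_x1)   # Pr(X=0)(E[Y_1|X=0] - E[Y_1|X=1])
print(lhs, rhs)
```

Both sides agree and are nonzero here because the made-up treated rows have larger Y_1 than the untreated rows.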

Note that E[Y_1 | X=0] is an unobserved counterfactual. For example, if X \in {0,1} represents whether an individual receives a treatment and Y is the health outcome, then E[Y_1 | X=0] represents the \textbf{expected outcome if the treatment were applied to individuals who did not receive it}. In contrast, E[Y_1] represents the \textbf{expected outcome if the treatment were applied universally to everyone}.

If individuals choose the treatment themselves, we may expect E[Y_1 | X=0] < E[Y_1 | X=1], since those who voluntarily choose the treatment may have characteristics that lead to better outcomes. By the identity above, this implies E[Y | X=1] > E[Y_1]: the observed treated group (X=1) may include individuals who are more health-conscious and proactive about their well-being, whereas E[Y_1] also averages over individuals who are less health-conscious but were hypothetically \textbf{forced} to take the treatment, which lowers the overall expected outcome.
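A small simulation can make this ordering concrete. Everything in the sketch below is an assumption chosen for illustration, not part of the notes: a latent "health-consciousness" score h, a constant treatment effect of 1, and a selection rule in which individuals with higher h are more likely to choose treatment.

```python
import random
from statistics import fmean


def simulate_self_selection(n=100_000, seed=0):
    """Return estimates of (E[Y|X=1], E[Y_1], E[Y_1|X=0]) under self-selection."""
    rng = random.Random(seed)
    treated_y, all_y1, untreated_y1 = [], [], []
    for _ in range(n):
        h = rng.gauss(0.0, 1.0)            # hypothetical health-consciousness
        y0 = h + rng.gauss(0.0, 1.0)       # potential outcome without treatment
        y1 = y0 + 1.0                      # potential outcome with treatment
        # Self-selection: higher h makes choosing the treatment more likely.
        x = 1 if h + rng.gauss(0.0, 1.0) > 0 else 0
        y = x * y1 + (1 - x) * y0          # observed outcome
        all_y1.append(y1)
        if x == 1:
            treated_y.append(y)
        else:
            untreated_y1.append(y1)        # counterfactual, visible only in simulation
    return fmean(treated_y), fmean(all_y1), fmean(untreated_y1)


e_y_x1, e_y1, e_y1_x0 = simulate_self_selection()
print(f"E[Y|X=1]   ~ {e_y_x1:.3f}")
print(f"E[Y_1]     ~ {e_y1:.3f}")
print(f"E[Y_1|X=0] ~ {e_y1_x0:.3f}")
```

Under these assumptions the three estimates should come out ordered as E[Y_1 | X=0] < E[Y_1] < E[Y | X=1], matching the argument above. The counterfactual average E[Y_1 | X=0] is computable only because the simulation generates both potential outcomes.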

Impossible to estimate treatment effect in general

Define the treatment effect as \theta = E[Y_1]-E[Y_0]. Without further assumptions, \theta cannot in general be estimated from observed data, because for each individual we observe only one of Y_1 and Y_0, never both.

Randomization to estimate treatment effect

\theta can be estimated if X \bot (Y_1,Y_0), which holds, for example, when treatment is assigned at random. When this independence holds, we have E[Y|X=1]\overset{(a)}{=}E[Y_1|X=1]\overset{(b)}{=}E[Y_1], where (a) always holds and (b) follows from Y_1 \bot X. Similarly, we have E[Y|X=0]=E[Y_0|X=0]=E[Y_0]. Therefore, E[Y_1]-E[Y_0] = E[Y|X=1]-E[Y|X=0], where the right-hand side involves only observable quantities.
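The effect of randomization can be illustrated with a short simulation. This is a sketch under assumed conditions, not a result from the reference: the latent score h, the constant effect \theta = 1, and the self-selection rule are all hypothetical, and the function name `difference_in_means` is made up for this example.

```python
import random
from statistics import fmean


def difference_in_means(n=100_000, randomized=True, seed=0):
    """Estimate E[Y|X=1] - E[Y|X=0] from simulated observed data."""
    rng = random.Random(seed)
    y_treated, y_control = [], []
    for _ in range(n):
        h = rng.gauss(0.0, 1.0)            # hypothetical latent trait
        y0 = h + rng.gauss(0.0, 1.0)       # potential outcome without treatment
        y1 = y0 + 1.0                      # true effect theta = 1 (assumed)
        if randomized:
            # Coin flip: X is independent of (Y_1, Y_0).
            x = 1 if rng.random() < 0.5 else 0
        else:
            # Self-selection: X depends on h, hence on (Y_1, Y_0).
            x = 1 if h + rng.gauss(0.0, 1.0) > 0 else 0
        y = x * y1 + (1 - x) * y0          # observed outcome
        (y_treated if x == 1 else y_control).append(y)
    return fmean(y_treated) - fmean(y_control)


print("randomized:   ", round(difference_in_means(randomized=True), 3))
print("self-selected:", round(difference_in_means(randomized=False), 3))
```

With random assignment the difference in observed means should land close to the assumed \theta = 1, while with self-selection it should overstate the effect, because the treated group differs from the control group in h as well as in treatment.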
