Ref: https://www.stat.cmu.edu/~larry/=stat700/Lecture22.pdf
Counterfactual
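Let $X \in \{0, 1\}$ be the treatment indicator, with potential outcomes $Y_1$ and $Y_0$.
$Y_1$ is the potential outcome if an individual were treated.
$Y_0$ is the potential outcome if an individual were not treated.
$Y$ is the observed outcome:
- If $X = 1$, then $Y = Y_1$ (we observe the treated outcome)
- If $X = 0$, then $Y = Y_0$ (we observe the untreated outcome).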
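From $Y = X Y_1 + (1 - X) Y_0$, we have $E[Y \mid X = 1] = E[Y_1 \mid X = 1]$, which conditions on the treated group only. Therefore $E[Y \mid X = 1] \neq E[Y_1]$ in general.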
Note that $E[Y_1 \mid X = 0]$ is an unobserved counterfactual. For example, if $X$ represents whether an individual receives a treatment and $Y$ is the health outcome, then $E[Y_1 \mid X = 0]$ represents the \textbf{expected outcome if the treatment were applied to individuals who did not receive it}. In contrast, $E[Y_1]$ represents the \textbf{expected outcome if the treatment were applied universally to everyone}.
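The relation between these quantities can be made explicit with the law of total expectation. The short derivation below is a sketch, with the notation assumed rather than taken from the reference: $X$ is the binary treatment indicator, $Y$ the observed outcome, and $Y_1$ the potential outcome under treatment.

$$
E[Y_1] = E[Y_1 \mid X = 1]\,P(X = 1) + E[Y_1 \mid X = 0]\,P(X = 0)
       = E[Y \mid X = 1]\,P(X = 1) + E[Y_1 \mid X = 0]\,P(X = 0).
$$

Only the first term is identified from observed data. The second term involves the treated outcome of untreated individuals, which is never observed.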
If the treatment is effective, we might expect $E[Y_1 \mid X = 1] > E[Y_1 \mid X = 0]$, as individuals who voluntarily choose the treatment may have characteristics that lead to better outcomes. As a result, we also expect $E[Y \mid X = 1] > E[Y_1]$, since the observed group receiving treatment ($X = 1$) may include individuals who are more health-conscious and proactive about their well-being. On the other hand, $E[Y_1]$ accounts for individuals who might not be health-conscious but were hypothetically \textbf{forced} to take the treatment, potentially lowering the overall expected outcome.
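The self-selection argument above can be checked with a small simulation. This is a minimal sketch under invented assumptions, not anything from the reference: a hypothetical "health-consciousness" score `h` that is uniform on [0, 1], an untreated outcome equal to `h` plus noise, a constant treatment effect of 1, and a probability of choosing treatment equal to `h`. The function name `simulate_self_selection` is likewise made up for illustration.

```python
import random


def simulate_self_selection(n=100_000, seed=0):
    """Return (mean of Y among the treated, mean of Y_1 over everyone)."""
    rng = random.Random(seed)
    sum_y_treated = 0.0
    n_treated = 0
    sum_y1_all = 0.0
    for _ in range(n):
        h = rng.random()                  # assumed health-consciousness score
        y0 = h + rng.gauss(0.0, 0.1)      # potential outcome without treatment
        y1 = y0 + 1.0                     # potential outcome with treatment (assumed effect = 1)
        x = 1 if rng.random() < h else 0  # health-conscious people opt in more often
        y = y1 if x == 1 else y0          # only one potential outcome is observed
        sum_y1_all += y1                  # available in a simulation only, never in real data
        if x == 1:
            sum_y_treated += y
            n_treated += 1
    return sum_y_treated / n_treated, sum_y1_all / n


if __name__ == "__main__":
    observed_treated_mean, population_y1_mean = simulate_self_selection()
    print(observed_treated_mean, population_y1_mean)
```

Under these assumptions the mean of the observed outcome among the treated is noticeably larger than the population mean of the treated potential outcome (analytically, about 1.67 versus 1.5), because the treated group over-represents individuals with a high `h`.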
Impossible to estimate treatment effect in general
Given the treatment effect $\theta = E[Y_1] - E[Y_0]$, it is generally impossible to estimate $\theta$ because, for each individual, we can only observe $Y_1$ or $Y_0$ at one time but not both.
Randomization to estimate treatment effect
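It would be possible to estimate $\theta$ if $X \perp (Y_0, Y_1)$. When the independence holds, we have $E[Y \mid X = 1] \overset{(a)}{=} E[Y_1 \mid X = 1] \overset{(b)}{=} E[Y_1]$, where (a) always holds and (b) is due to $X \perp (Y_0, Y_1)$. Similarly, we have $E[Y \mid X = 0] = E[Y_0]$. Therefore, we have $\theta = E[Y_1] - E[Y_0] = E[Y \mid X = 1] - E[Y \mid X = 0]$, where the latter can be readily estimated from observed data.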
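The effect of randomization can be illustrated with a short simulation. This is a sketch under assumed, invented parameters: the untreated outcome is a uniform score plus Gaussian noise, the treatment effect is a constant 1, and treatment is assigned by a fair coin flip that ignores the potential outcomes. The function name `simulate_randomized_trial` is hypothetical.

```python
import random


def simulate_randomized_trial(n=100_000, seed=0):
    """Return (true average effect, difference of observed group means)."""
    rng = random.Random(seed)
    sums = {0: 0.0, 1: 0.0}    # running sum of observed Y per treatment arm
    counts = {0: 0, 1: 0}      # number of individuals per treatment arm
    effect_sum = 0.0
    for _ in range(n):
        h = rng.random()
        y0 = h + rng.gauss(0.0, 0.1)  # potential outcome without treatment
        y1 = y0 + 1.0                 # potential outcome with treatment (assumed effect = 1)
        x = rng.randint(0, 1)         # coin flip, independent of (y0, y1)
        y = y1 if x == 1 else y0      # observed outcome
        sums[x] += y
        counts[x] += 1
        effect_sum += y1 - y0         # known in a simulation only
    true_effect = effect_sum / n
    estimate = sums[1] / counts[1] - sums[0] / counts[0]
    return true_effect, estimate


if __name__ == "__main__":
    true_effect, estimate = simulate_randomized_trial()
    print(true_effect, estimate)
```

Because the assignment is independent of the potential outcomes, the difference of observed group means should land close to the true average effect, up to sampling noise.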