Counterfactual explanation

Counterfactual explanations have been proposed and studied in recent years. In logic, a counterfactual refers to the scenario in which the condition of an if-statement is always false. Note that an if-statement is always true when its condition is always false (it is vacuously true), regardless of its conclusion. So the conclusion can be false even though the if-statement itself always holds.
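The logical point can be checked with a tiny truth-table sketch. The helper name `implies` is ours, and we assume the usual material reading of "if p then q", which is false only when p is true and q is false.

```python
def implies(p: bool, q: bool) -> bool:
    """Material implication: 'if p then q' is false only when p is true and q is false."""
    return (not p) or q

# With a false condition, the if-statement is true whether the conclusion
# is true or false.
vacuous = [implies(False, q) for q in (True, False)]
print(vacuous)

# For contrast: a true condition with a false conclusion makes it false.
print(implies(True, False))
```

Both entries of `vacuous` are true, which is why a false conclusion does not contradict an if-statement whose condition never holds.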

A counterfactual example in ML is an example that is supposed to belong to one class but is identified by the model as another class. Moreover, we emphasize that the modification made to the original image that leads to the misclassification is small. Often, we even intentionally look for the smallest change to the original data that leads to misclassification. Such a change can be used to identify precisely what leads the model to interpret the input as belonging to the target class. This required change is sometimes referred to as a counterfactual explanation.
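To make the idea of a "smallest change" concrete, here is a minimal sketch under strong assumptions: a hypothetical two-feature linear classifier stands in for an image model, and "small" is measured by Euclidean distance. The weights `W`, the bias `B`, and the function names are invented for illustration. For a linear model the smallest change has a closed form (move the input straight toward the decision boundary); for real image models it usually has to be found by an iterative search instead.

```python
import math

# Hypothetical linear classifier: class 1 if W.x + B > 0, else class 0.
W = [2.0, -1.0]
B = -1.0

def score(x):
    return sum(wi * xi for wi, xi in zip(W, x)) + B

def predict(x):
    return 1 if score(x) > 0 else 0

def counterfactual(x, overshoot=1e-3):
    """Return (x_cf, delta): x moved just across the decision boundary.

    The shortest path to the boundary is along W; we step slightly past it
    so the predicted class flips. Assumes x is not exactly on the boundary.
    """
    s = score(x)
    norm_sq = sum(wi * wi for wi in W)
    step = -(s / norm_sq) * (1.0 + overshoot)
    delta = [step * wi for wi in W]
    x_cf = [xi + di for xi, di in zip(x, delta)]
    return x_cf, delta

x = [0.0, 1.0]
x_cf, delta = counterfactual(x)
print("original class:      ", predict(x))
print("counterfactual class:", predict(x_cf))
print("change applied:      ", delta)
print("size of change:      ", math.hypot(*delta))
```

Here `delta` plays the role of the counterfactual explanation: it shows which features had to change, and by how much, for the model to assign the other class.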

 
