Interventions and Counterfactuals

In addition to the .sample() method, which draws samples from the observational distribution, the graph object provides methods for sampling from interventional distributions. Three types of intervention are supported.

Fixed-value Intervention

This method sets one or more nodes to fixed values. As this is the most basic type of intervention in causal inference theory, the method is simply named .do(). Below is an example of a triangle causal graph (C → A, C → Y, A → Y) with fixed-value interventions applied.

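import numpy as np  # used below for rounding the printed results

# Description and Graph are assumed to be imported from the library beforehand.
description = Description({'C': 'normal(mu_=0, sigma_=1)',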
                           'A': 'normal(mu_=2C-1, sigma_=1)',
                           'Y': 'normal(mu_=C+0.6A, sigma_=1)'},
                          infer_edges=True)
graph = Graph(description)
samples, _ = graph.do(size=3, interventions={'A': 2.5})
print(samples)
#           C    A         Y
# 0  1.651819  2.5  3.770674
# 1 -1.010956  2.5 -0.522014
# 2  1.108496  2.5  2.864730

# Average Treatment Effect:
do_1, _ = graph.do(size=1000, interventions={'A': 1})
do_0, _ = graph.do(size=1000, interventions={'A': 0})
ate = (do_1['Y']-do_0['Y']).mean()
print('ATE is:', np.round(ate, 2))
# ATE is: 0.61

# Intervening on two variables at once.
# Expected mean of Y: C + 0.6A = -1 + 0.6 * (1/0.6) = 0; expected variance: sigma_^2 = 1
samples, _ = graph.do(size=1000, interventions={'A': 1/0.6, 'C': -1})
y_fixed = samples['Y']
print(f'mean: {np.round(y_fixed.mean(), 2)}, variance: {np.round(y_fixed.var(), 2)}')
# mean: -0.08, variance: 1.03
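
For intuition, the fixed-value intervention can be reproduced by hand with NumPy alone. The sketch below is not the library's implementation: it assumes the triangle model above, and the helper name `simulate_triangle` and its `do` argument are hypothetical. Setting a node to a constant replaces that node's sampling equation and leaves the equations of all other nodes untouched.

```python
import numpy as np


def simulate_triangle(size, rng, do=None):
    """Sample the triangle C -> A, C -> Y, A -> Y.

    `do` (hypothetical argument) maps node names to fixed values.
    """
    do = do or {}
    # An intervened node ignores its own equation and takes the fixed value.
    c = np.full(size, float(do['C'])) if 'C' in do else rng.normal(0.0, 1.0, size)
    a = np.full(size, float(do['A'])) if 'A' in do else rng.normal(2 * c - 1, 1.0)
    y = np.full(size, float(do['Y'])) if 'Y' in do else rng.normal(c + 0.6 * a, 1.0)
    return {'C': c, 'A': a, 'Y': y}


rng = np.random.default_rng(0)
do_1 = simulate_triangle(100_000, rng, do={'A': 1})
do_0 = simulate_triangle(100_000, rng, do={'A': 0})

# The two arms are independent samples, so the ATE is a difference of means.
ate = do_1['Y'].mean() - do_0['Y'].mean()
print('ATE is:', np.round(ate, 2))
```

With this many samples per arm, the estimate should land close to 0.6, the coefficient of A in the equation for Y.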

Functional Intervention

In this type of intervention, performed via .do_functional(), a node’s value is set to a deterministic function of other nodes. The function is an ordinary Python function (defined with the def keyword or as a lambda). This method can reproduce the soft (parametric) intervention scenario, in which the inputs of the intervention function are the parents of the node in the original DAG. More generally, the inputs can be any subset of the nodes, excluding the descendants of the intervened node.

description = Description({'C': 'normal(mu_=0, sigma_=1)',
                           'A': 'normal(mu_=2C-1, sigma_=1)',
                           'Y': 'normal(mu_=C+0.6A, sigma_=1)'},
                          infer_edges=True)
graph = Graph(description)

samples, _ = graph.do_functional(
    size=3,
    intervene_on='Y', inputs=['A', 'C'],
    func=lambda a, c: (a+c)*10
)
print(samples)
#           C         A          Y
# 0  1.651819  1.983786  36.356056
# 1 -1.010956 -2.772037 -37.829930
# 2  1.108496 -0.354075   7.544213
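
The soft-intervention case mentioned above can be illustrated without the library. The following NumPy sketch assumes the same triangle model and a made-up treatment policy (treat only when C is positive). The policy is purely illustrative and not taken from the library. The mechanism of A is replaced by a deterministic function of its parent C, while Y, a descendant of A, is still sampled from its original equation.

```python
import numpy as np

rng = np.random.default_rng(0)
size = 100_000

# C has no parents and is sampled as usual.
c = rng.normal(0.0, 1.0, size)

# Soft intervention (hypothetical policy): A is now a deterministic
# function of its parent C instead of a draw from normal(2C - 1, 1).
a = np.where(c > 0, 1.0, 0.0)

# Y is downstream of A, so it keeps its original equation.
y = rng.normal(c + 0.6 * a, 1.0)

print('mean of Y under the policy:', np.round(y.mean(), 2))
```

Under this policy the expected value of Y is E[C] + 0.6 · P(C > 0) = 0.3, so the printed mean should be close to that value.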

Self Intervention

Imagine the following causal question: what if, for every patient, we administered one unit of the drug more than we normally do? The “do” term for this intervention would be \(\text{do}(A=f(A_\text{old}))\), here with \(f(a) = a + 1\). To simulate data for this type of intervention, we use the .do_self() method.

description = Description({'C': 'normal(mu_=0, sigma_=1)',
                           'A': 'normal(mu_=2C-1, sigma_=1)',
                           'Y': 'normal(mu_=C+0.6A, sigma_=1)'},
                          infer_edges=True)
graph = Graph(description)

samples, errors = graph.sample(3)
print(samples)
#           C         A         Y
# 0  1.651819  1.983786  3.460946
# 1 -1.010956 -2.772037 -3.685236
# 2  1.108496 -0.354075  1.152285

intrv_samples, _ = graph.do_self(
    func=lambda a: a+1, intervene_on='A',
    use_sampled_errors=True, sampled_errors=errors
)
print(intrv_samples)
#           C         A         Y
# 0  1.651819  2.983786  4.060946
# 1 -1.010956 -1.772037 -3.085236
# 2  1.108496  0.645925  1.752285

Note

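In the example above, the arguments use_sampled_errors=True and sampled_errors=errors tell the method to reuse the errors returned by the earlier .sample() call instead of drawing new ones. The same option is available for the previous types of intervention as well. Using this feature, we can simulate counterfactual scenarios: what would have happened had we acted differently? For a counterfactual, the intervention must be applied to the same sampled units, and reusing the sampled errors achieves exactly that. This is why C is identical in both tables above, while A and Y differ only through the intervention.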
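
The role of the reused errors can be seen in a small NumPy sketch. It is a simplification rather than the library's mechanism: it assumes additive Gaussian noise terms for the triangle model, and the function `generate` and its `a_func` argument are hypothetical names. The noise is drawn once and fed through both the original and the intervened equations.

```python
import numpy as np


def generate(e_c, e_a, e_y, a_func=None):
    """Push fixed noise terms through the triangle model.

    `a_func` (hypothetical argument) is applied to A's original value,
    mimicking a self intervention do(A = f(A_old)).
    """
    c = e_c
    a = 2 * c - 1 + e_a
    if a_func is not None:
        a = a_func(a)
    y = c + 0.6 * a + e_y
    return c, a, y


rng = np.random.default_rng(0)
size = 3

# Draw the noise terms once; both worlds share them.
e_c, e_a, e_y = (rng.normal(0.0, 1.0, size) for _ in range(3))

c_f, a_f, y_f = generate(e_c, e_a, e_y)                             # factual
c_cf, a_cf, y_cf = generate(e_c, e_a, e_y, a_func=lambda a: a + 1)  # counterfactual

# Per-unit effect of shifting A by one unit.
print(np.round(y_cf - y_f, 6))
```

Because the noise is shared, every unit's counterfactual Y differs from its factual Y by exactly the coefficient of A (0.6), which matches the difference between the two tables in the example above. With freshly drawn noise, only the averages would be comparable.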