Interventions and Counterfactuals

In addition to the .sample() method, which samples from the observational distribution, the graph object provides methods to sample from interventional distributions. There are three types of intervention.

Fixed-value Intervention

This method sets one or more nodes to fixed values. As this is the most basic type of intervention in causal inference theory, the method is simply named .do(). Below is an example of a causal triangle graph with fixed-value interventions applied.

import numpy as np
from pyparcs import Description, Graph  # import path assumed; adjust to your installation

description = Description({'C': 'normal(mu_=0, sigma_=1)',
                           'A': 'normal(mu_=2C-1, sigma_=1)',
                           'Y': 'normal(mu_=C+0.6A, sigma_=1)'},
                          infer_edges=True)
graph = Graph(description)
samples, _ = graph.do(size=3, interventions={'A': 2.5})
print(samples)
#           C    A         Y
# 0  1.651819  2.5  3.770674
# 1 -1.010956  2.5 -0.522014
# 2  1.108496  2.5  2.864730

# Average Treatment Effect: mean outcome under do(A=1) minus do(A=0)
do_1, _ = graph.do(size=1000, interventions={'A': 1})
do_0, _ = graph.do(size=1000, interventions={'A': 0})
ate = (do_1['Y']-do_0['Y']).mean()
print('ATE is:', np.round(ate, 2))
# ATE is: 0.61

# intervening on two variables
samples, _ = graph.do(size=1000, interventions={'A': 1/0.6, 'C': -1})
y_fixed = samples['Y']
print(f'mean: {np.round(y_fixed.mean(), 2)}, variance: {np.round(y_fixed.var(), 2)}')
# mean: -0.08, variance: 1.03
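
To make the semantics of a fixed-value intervention concrete, here is a minimal NumPy sketch of the same causal triangle, written independently of PARCS. The helper simulate_do and its arguments are hypothetical and not part of the library. The sketch assumes only the structural equations from the description above, with additive standard-normal errors. An intervened node is set to a constant, while every other equation stays unchanged.

```python
import numpy as np

def simulate_do(size, interventions=None, seed=0):
    """Sample C -> A -> Y, C -> Y under optional fixed-value interventions.

    Hypothetical helper for illustration only, not the PARCS API.
    """
    interventions = interventions or {}
    rng = np.random.default_rng(seed)
    # One standard-normal error term per node and per sample.
    e_c, e_a, e_y = rng.normal(size=(3, size))

    def assign(name, structural_value):
        # A fixed-value intervention overrides the structural equation.
        if name in interventions:
            return np.full(size, float(interventions[name]))
        return structural_value

    c = assign('C', e_c)
    a = assign('A', 2 * c - 1 + e_a)
    y = assign('Y', c + 0.6 * a + e_y)
    return {'C': c, 'A': a, 'Y': y}

do_1 = simulate_do(100_000, {'A': 1}, seed=1)
do_0 = simulate_do(100_000, {'A': 0}, seed=2)
ate = do_1['Y'].mean() - do_0['Y'].mean()
print('ATE estimate:', np.round(ate, 2))  # close to the coefficient 0.6
```

Because A enters the equation of Y linearly with coefficient 0.6, the estimate should approach 0.6 as the sample size grows, in line with the value reported in the example above.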

Functional Intervention

In this type of intervention, performed via .do_functional(), a node's value is set to a deterministic function of other nodes. The function is a regular Python function, defined either with the def keyword or as a lambda. This method can reproduce the soft (parametric) intervention scenario, in which the inputs of the intervention function are the parents of the node in the original DAG. The inputs are not restricted to the parents, however: they can be any subset of the nodes except the descendants of the intervened node.

description = Description({'C': 'normal(mu_=0, sigma_=1)',
                           'A': 'normal(mu_=2C-1, sigma_=1)',
                           'Y': 'normal(mu_=C+0.6A, sigma_=1)'},
                          infer_edges=True)
graph = Graph(description)

samples, _ = graph.do_functional(
    size=3,
    intervene_on='Y', inputs=['A', 'C'],
    func=lambda a, c: (a+c)*10
)
print(samples)
#           C         A          Y
# 0  1.651819  1.983786  36.356056
# 1 -1.010956 -2.772037 -37.829930
# 2  1.108496 -0.354075   7.544213
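
As a rough illustration of what a functional intervention does, the NumPy sketch below replaces the structural equation of Y with a deterministic function of A and C, while C and A keep their original equations. It is written independently of PARCS, and the name simulate_do_functional is hypothetical.

```python
import numpy as np

def simulate_do_functional(size, func, seed=0):
    """Replace Y's structural equation by func(a, c).

    Hypothetical helper for illustration only, not the PARCS API.
    """
    rng = np.random.default_rng(seed)
    c = rng.normal(0.0, 1.0, size)   # C is a root node and is unaffected
    a = rng.normal(2 * c - 1, 1.0)   # A keeps its dependence on C
    y = func(a, c)                   # Y is now deterministic given A and C
    return {'C': c, 'A': a, 'Y': y}

samples = simulate_do_functional(3, func=lambda a, c: (a + c) * 10)
for name, values in samples.items():
    print(name, np.round(values, 3))
```

Since the inputs A and C are exactly the parents of Y in the original graph, this corresponds to the soft intervention case. Y no longer carries its own noise term, which is why each Y value in the example output equals (A + C) * 10 for that row.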

Self Intervention

Imagine the following causal question: what if, for every patient, we administered 1 unit of drug more than we normally do? The “do” term for this intervention is \(\text{do}(A=f(A_\text{old}))\), here with \(f(a)=a+1\). To simulate data for this type of intervention, we use the method .do_self():

description = Description({'C': 'normal(mu_=0, sigma_=1)',
                           'A': 'normal(mu_=2C-1, sigma_=1)',
                           'Y': 'normal(mu_=C+0.6A, sigma_=1)'},
                          infer_edges=True)
graph = Graph(description)

samples, errors = graph.sample(3)
print(samples)
#           C         A         Y
# 0  1.651819  1.983786  3.460946
# 1 -1.010956 -2.772037 -3.685236
# 2  1.108496 -0.354075  1.152285

intrv_samples, _ = graph.do_self(
    func=lambda a: a+1, intervene_on='A',
    use_sampled_errors=True, sampled_errors=errors
)
print(intrv_samples)
#           C         A         Y
# 0  1.651819  2.983786  4.060946
# 1 -1.010956 -1.772037 -3.085236
# 2  1.108496  0.645925  1.752285

Note

In the example above, the arguments use_sampled_errors=True and sampled_errors=errors tell the method to reuse the errors sampled earlier instead of drawing new ones. The same option is available for the other two types of intervention. With it, we can simulate counterfactual scenarios: what would have happened had we acted differently? To answer this, the intervention must be applied to the same sampled units, which is achieved by reusing the sampled errors.
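
The role of the reused errors can be sketched in plain NumPy. If the error terms are drawn once and the structural equations are evaluated twice, once unchanged and once with A shifted, the two datasets describe the same units under different actions. The code below assumes additive standard-normal errors, which matches the example description. It does not reproduce how PARCS represents errors internally, and the name evaluate is hypothetical.

```python
import numpy as np

def evaluate(errors, self_intervention=None):
    """Evaluate the causal triangle for fixed error terms.

    Hypothetical helper for illustration only, not the PARCS API.
    """
    e_c, e_a, e_y = errors
    c = e_c
    a = 2 * c - 1 + e_a
    if self_intervention is not None:
        # Self intervention: the new A is a function of the old A.
        a = self_intervention(a)
    y = c + 0.6 * a + e_y
    return {'C': c, 'A': a, 'Y': y}

rng = np.random.default_rng(0)
errors = rng.normal(size=(3, 5))  # errors for C, A, Y over 5 units

factual = evaluate(errors)
counterfactual = evaluate(errors, self_intervention=lambda a: a + 1)

unit_effect = counterfactual['Y'] - factual['Y']
print(unit_effect)  # every entry equals 0.6 up to floating-point rounding
```

With shared errors, C is identical in both datasets and each unit's Y changes by 0.6, the same shift seen between the two printed tables in the example above. Drawing fresh errors for the second run would give an interventional sample, but not a unit-level counterfactual.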