
Fidelity metrics #6116

Merged: 14 commits merged into pyg-team:master on Dec 19, 2022

Conversation

@BlazStojanovic (Contributor) commented on Dec 1, 2022:

Introducing an interface for Explainability metrics [OUTDATED]

Due to the time constraints of the explainability sprint, this PR will just implement Fidelity metrics; the Explanation method interface will be left as future work.

The main goal of this PR is to unify the behaviour of explainability metrics. The explainability metrics can broadly be used in two ways:
1. To test the quality of explanation methods when developing new explainers. This requires measuring explanation metrics with respect to many different explanations (for many nodes, edges, subgraphs, etc.), usually at many different explanation configurations (e.g. many different explanation sparsities). In order to fulfill this need, we provide PyG users with the abstract class ExplanationMetric.
2. To test the quality of a single explanation, e.g. when using explanations at inference time. For this purpose, a single method suffices.

### The ExplanationMetric class

The proposed workflow of ExplanationMetric subclasses is as follows (a minimal sketch of the interface follows the list):
1. The `__init__` function specifies the range of explanations to test over, e.g. different mask sparsities, different entropies, etc. Additional functionality to achieve this will be provided in util.py.
2. Once instantiated, the object can be called as `explanation_metric(explainer)`, which returns the specific explanation metric computed over the explanation settings specified in (1). This is done in three steps:
   1. `ExplanationMetric.get_inputs()` generates all the different inputs over which the metrics will be computed (e.g. explanation and complement graphs for all explanation sparsities, for all nodes being explained).
   2. `ExplanationMetric.compute_metrics()` computes the metrics over the inputs.
   3. `ExplanationMetric.aggregate()` aggregates all the evaluated metrics.
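A minimal sketch of this interface, assuming the abstract-class design described above: the three method names follow the PR text, while the constructor argument and the orchestration in `__call__` are illustrative only, not the merged API.

```python
from abc import ABC, abstractmethod


class ExplanationMetric(ABC):
    """Sketch of the proposed abstract base class for explanation metrics."""
    def __init__(self, sparsities=(0.1, 0.3, 0.5)):
        # The range of explanation settings to sweep over, e.g. different
        # mask sparsities (illustrative; the PR defers such helpers to util.py).
        self.sparsities = sparsities

    @abstractmethod
    def get_inputs(self, explainer):
        """Generates all inputs the metric is computed over, e.g. the
        explanation and complement graphs for every sparsity and for every
        node being explained."""

    @abstractmethod
    def compute_metrics(self, explainer, inputs):
        """Computes the metric for every generated input."""

    @abstractmethod
    def aggregate(self, metrics):
        """Aggregates the per-input metric values into a single summary."""

    def __call__(self, explainer):
        inputs = self.get_inputs(explainer)                 # step (1)
        metrics = self.compute_metrics(explainer, inputs)   # step (2)
        return self.aggregate(metrics)                      # step (3)
```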

### The explanation_metric method
Each explanation metric should have a corresponding explanation_metric method. Its signature is e.g.:

explanation_metric(Explainer, Explanation, target, index)

It computes the metric for the given explainer (as configured), the explanation it produced (which explains index), and the given target (i.e. the target prediction on the full graph, or the ground-truth prediction).
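An illustrative version of such a signature; this is a sketch only, where the concrete name fidelity, the type hints, and the return type are assumptions rather than the merged API:

```python
from typing import Optional, Tuple, Union

from torch import Tensor

from torch_geometric.explain import Explainer, Explanation


def fidelity(
    explainer: Explainer,                        # the configured explainer
    explanation: Explanation,                    # the explanation it produced
    target: Optional[Tensor] = None,             # target prediction on the full input
    index: Optional[Union[int, Tensor]] = None,  # node(s)/edge(s) being explained
) -> Tuple[float, float]:
    ...
```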


### Fidelity

Both implementations will be provided for Fidelity, which will serve as a basis for implementing other explanation metrics. Thus, this also solves #5958.
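For reference, a hedged sketch of how Fidelity+/- could look for a classification task, comparing the model's predictions on the complement subgraph (fid+) and on the explanation subgraph (fid-) against its prediction on the full input; the helper name and tensor arguments are assumptions for illustration, not this PR's code.

```python
from torch import Tensor


def fidelity_scores(y_hat: Tensor, explain_y_hat: Tensor,
                    complement_y_hat: Tensor, target: Tensor):
    """All arguments are 1-D label tensors of the same length."""
    # Fidelity+: how much correctness drops once the explanation is removed,
    # i.e. when predicting on the complement subgraph.
    pos_fidelity = ((y_hat == target).float() -
                    (complement_y_hat == target).float()).mean().item()
    # Fidelity-: how much correctness drops when keeping only the
    # explanation subgraph.
    neg_fidelity = ((y_hat == target).float() -
                    (explain_y_hat == target).float()).mean().item()
    return pos_fidelity, neg_fidelity
```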

@codecov (codecov bot) commented on Dec 7, 2022:

Codecov Report

Merging #6116 (a986d59) into master (7a8ea61) will increase coverage by 0.01%.
The diff coverage is 91.30%.

@@            Coverage Diff             @@
##           master    #6116      +/-   ##
==========================================
+ Coverage   84.33%   84.34%   +0.01%     
==========================================
  Files         378      378              
  Lines       21010    21052      +42     
==========================================
+ Hits        17719    17757      +38     
- Misses       3291     3295       +4     
| Impacted Files | Coverage Δ |
| --- | --- |
| torch_geometric/explain/explainer.py | 88.50% <77.77%> (-3.17%) ⬇️ |
| torch_geometric/explain/metric/__init__.py | 100.00% <100.00%> (ø) |
| torch_geometric/explain/metric/fidelity.py | 100.00% <100.00%> (ø) |


@BlazStojanovic (Contributor, Author) left a comment:

I've left an initial review with some comments where I am stuck. Would greatly appreciate some feedback so we can finish this and unblock other people in the sprint! :)

complement_graph = explanation.get_complement_subgraph() # for fid+

# get target
target = explainer.get_target(x=explanation.x,
BlazStojanovic (Contributor, Author) commented:

@rusty1s and @dufourc1, any ideas on how to make sure the right input is passed to get_target here, as well as to the get_prediction() calls on lines 81 and 84? I am unsure how to do this, given that the get_prediction calls take subgraphs as inputs, while explanation.get_explanation_subgraph and explanation.get_complement_subgraph don't return the modified node and edge attributes needed to get the appropriate predictions. Am I missing a simple way to achieve this?

Member commented:

If I understand correctly, I think you have to pass the input that is explained with the model (i.e. the one that was used to generate the explanation). I'm not sure you can always make sure that an Explanation instance has an x or even an edge_index attribute.

You would need to apply the masks of the explanation to the original input to get the explanation_graph and complement_graph, but maybe I'm missing something in what you want to do?
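A minimal sketch of what applying the masks to the original input could look like, assuming soft node_mask/edge_mask values in [0, 1]; the hard 0.5 threshold on edges is an illustrative choice, not this PR's implementation.

```python
from torch import Tensor


def masked_input(x: Tensor, edge_index: Tensor, node_mask: Tensor,
                 edge_mask: Tensor, complement: bool = False):
    if complement:  # invert the masks to obtain the complement graph
        node_mask, edge_mask = 1.0 - node_mask, 1.0 - edge_mask
    x_masked = x * node_mask.view(-1, 1)                # scale node features
    edge_index_masked = edge_index[:, edge_mask > 0.5]  # keep "active" edges
    return x_masked, edge_index_masked
```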

Member commented:

@BlazStojanovic is this still blocking you?

    pass
elif task_level == 'edge':
    # get edge prediction
    pass  # TODO (blaz)
BlazStojanovic (Contributor, Author) commented:

Node-level and graph-level tasks make sense to me. For nodes we will have a clearly indexed output (each row of the prediction corresponding to a node), and for full graphs we only have one prediction anyway. This makes it clear what to compare to get fidelities (both in classification and regression tasks).

But I am not certain about edge-level tasks. Let's say, for example, that when obtaining the explanation for an edge-level task with the Explainer class, we use:

explanation = explainer(
        x,
        edge_index,
        target=target,
        index=index,
        edge_label_index=edge_label_index,
    )

What exactly does the index refer to? Are we explaining the prediction(s) for the edge(s) with indices in index, which correspond to edges in edge_index, or something else? If so, how can we easily compare the predictions, given that edge_index changes for the induced and complement graphs of an explanation?

@dufourc1 (Member) commented on Dec 7, 2022:

Maybe the tests can help?

edge_label_index = torch.tensor([[0, 1, 2], [3, 4, 5]])

I haven't taken a look at the edge-level explainers in detail, but it seems that for explaining the edge between nodes 0 and 1 you would have edge_label_index = torch.tensor([[0], [1]]).

Member commented:

@BlazStojanovic I think we should leave this as NotImplemented for now, because task_level=edge needs some more work:

  1. We don't have an example demonstrating task_level=edge on real-world data. We could add an example that explains this edge prediction task.
  2. GNNExplainer needs to be updated to support edge-level tasks; one line that needs to be updated is `if self.model_config.task_level == ModelTaskLevel.node:`.

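A minimal sketch of the stop-gap suggested here (an assumption about how it could look, not the merged code):

```python
def check_task_level(task_level: str) -> None:
    # Edge-level fidelity is deferred until an edge-level example exists and
    # GNNExplainer supports task_level='edge'.
    if task_level == 'edge':
        raise NotImplementedError(
            "Fidelity for edge-level tasks is not yet supported")
```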
@BlazStojanovic changed the title from "Base class for explanation metrics and fidelity +/-" to "Fidelity +/-" on Dec 14, 2022
Comment on lines 52 to 61
explainer :obj:`~torch_geometric.explain.Explainer`
The explainer to evaluate
explanation :obj:`~torch_teometric.explain.Explanation`
The explanation to evaluate
target (Tensor, optional): The target prediction, if not provided it
is inferred from obj:`explainer`, defaults to obj:`None`
index (Union[int, Tensor]): The explanation target index, for node-
and edge- level task it signifies the nodes/edges explained
respectively, for graph-level tasks it is assumed to be None,
defaults to obj:`None`
Member commented:

Suggested change (drop the explainer, target, and index entries, keeping only):

    explanation :obj:`~torch_teometric.explain.Explanation`
        The explanation to evaluate

Do we need anything other than explanation now, given that the other values are added into the Explanation by the Explainer?

@rusty1s linked an issue on Dec 19, 2022 that may be closed by this pull request
@rusty1s changed the title from "Fidelity +/-" to "Fidelity metrics" on Dec 19, 2022
@rusty1s marked this pull request as ready for review on December 19, 2022, 10:42
@rusty1s merged commit 540103d into pyg-team:master on Dec 19, 2022

Successfully merging this pull request may close these issues.

[Explainability Evaluation] - Fidelity +/- metrics