Conversation

rittik9
Collaborator

@rittik9 rittik9 commented Sep 1, 2024

What does this PR do?

Fixes #2441

Details
  • Was this discussed/agreed via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

Did you have fun?

Yes

Issue:

  • The root of the problem appears to be that the ignore_index information is not propagated to the final averaging step, i.e. the _adjust_weights_safe_divide function does not know which class should be ignored.

Solution:

  • To address this, I updated the code so that the ignore_index information is preserved throughout the entire computation and correctly passed through all intermediate steps up to the final averaging stage, i.e. the _adjust_weights_safe_divide function.
  • Updated _adjust_weights_safe_divide to accept an additional ignore_index parameter, which is passed through _precision_recall_reduce, called in the compute method of the MulticlassRecall class. With this change, _adjust_weights_safe_divide sets the weight of the ignored class to 0, as sketched below.
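
For illustration only, a minimal sketch of the adjusted helper. The signature is simplified and the internals are my assumption, not the exact torchmetrics implementation (the real helper also handles the multilabel case):

```python
from typing import Optional

import torch
from torch import Tensor


def _adjust_weights_safe_divide(
    score: Tensor,
    average: str,
    tp: Tensor,
    fp: Tensor,
    fn: Tensor,
    ignore_index: Optional[int] = None,  # newly threaded through from the metric
) -> Tensor:
    # build the per-class weights used for the final average
    if average == "weighted":
        weights = (tp + fn).float()
    else:  # "macro": every observed class counts equally
        weights = torch.ones_like(score)
        weights[tp + fp + fn == 0] = 0.0
    # the fix: a class marked as ignore_index must not contribute to the mean
    if ignore_index is not None:
        weights[..., ignore_index] = 0.0
    # safe divide: clamp avoids NaN if every class ends up masked out
    return (weights * score).sum(-1) / weights.sum(-1).clamp(min=1e-12)
```

With per-class recalls [0.0, 0.5, 1.0] and ignore_index=0, this yields (0.5 + 1.0) / 2 = 0.75 instead of averaging over all three classes.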

📚 Documentation preview 📚: https://torchmetrics--2710.org.readthedocs.build/en/2710/

Member

@Borda Borda left a comment


looks good, can we also add a test for this case...

@rittik9 rittik9 marked this pull request as draft September 2, 2024 19:12
@rittik9 rittik9 marked this pull request as ready for review September 2, 2024 22:15
@rittik9
Collaborator Author

rittik9 commented Sep 2, 2024

looks good, can we also add a test for this case...

Sure
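
For context, a minimal sketch of the kind of regression test being requested. The values are hypothetical, chosen so that the pre-fix code, which averaged over all three classes, would return 0.5 instead of the expected 0.75:

```python
import torch
from torchmetrics.classification import MulticlassRecall


def test_multiclass_recall_ignore_index():
    # class 0 is ignored, so only classes 1 and 2 may contribute to the macro mean
    target = torch.tensor([0, 1, 2, 1])
    preds = torch.tensor([0, 1, 2, 2])
    metric = MulticlassRecall(num_classes=3, average="macro", ignore_index=0)
    res = metric(preds, target)
    # recall_1 = 1/2, recall_2 = 1/1 -> macro over {1, 2} = 0.75
    assert torch.isclose(res, torch.tensor(0.75))
```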

@rittik9 rittik9 requested a review from Borda September 4, 2024 11:26
Borda
Borda previously approved these changes Sep 4, 2024
@Borda Borda self-requested a review September 6, 2024 09:10
@Borda Borda dismissed their stale review September 6, 2024 09:10

pending adding a test

@Borda Borda added the bug / fix Something isn't working label Sep 6, 2024
@rittik9
Collaborator Author

rittik9 commented Sep 6, 2024

@Borda What do I have to modify?

@Borda
Member

Borda commented Sep 11, 2024

@rittik9 mind checking the changed doctest values and whether they are correct?

@Borda Borda mentioned this pull request Oct 31, 2024
assert res == 1.0


def test_multiclass_recall_ignore_index():
Member


Seems we are already testing various ignore_index values against the reference metric, so if we had it wrong, this should not have passed already... is it possible that we also have a bug in the reference metric?
cc: @SkafteNicki

Member


Looking at the code, the ignore index is already applied in _multilabel_stat_scores_format, which reduces the preds/target size the same way as the reference metric, so calling it with null weights in fact ignores the additional index.
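
For reference, a minimal sketch of that formatting step (illustrative name and signature, not the exact torchmetrics internals):

```python
import torch


def _drop_ignored(preds: torch.Tensor, target: torch.Tensor, ignore_index: int):
    # keep only samples whose target is not the ignored index, shrinking
    # preds/target the same way the reference metric preprocessing does
    keep = target != ignore_index
    return preds[keep], target[keep]
```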

Collaborator Author

@rittik9 rittik9 Nov 3, 2024


The problem is that we use sklearn's recall_score as the reference for our unittests. Even though the _reference_sklearn_precision_recall_multiclass() function uses remove_ignore_index to drop the predictions whose true values belong to the ignore_index class before calling recall_score, it does not matter: whenever average='macro', sklearn's recall_score returns the mean over the total number of classes, because we pass all classes in its labels argument. That is why the unittests failed in the first place. I think we need to fix the unittests so they account for ignore_index via the labels argument of sklearn's recall_score(), as illustrated below. I've prepared a notebook for explanation. cc: @Borda.
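
To make this concrete, a small sketch with made-up values (not taken from the actual unittests):

```python
import numpy as np
from sklearn.metrics import recall_score

target = np.array([0, 1, 2, 1])
preds = np.array([0, 1, 2, 2])
ignore_index = 0

# mimic remove_ignore_index: drop samples whose target is the ignored class
keep = target != ignore_index
t, p = target[keep], preds[keep]

# listing every class in `labels` still divides the macro mean by 3,
# counting the now-absent class 0 as zero recall
recall_score(t, p, labels=[0, 1, 2], average="macro", zero_division=0)  # -> 0.5

# excluding the ignored class from `labels` gives the intended average
recall_score(t, p, labels=[1, 2], average="macro")  # -> 0.75
```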

@rittik9 rittik9 marked this pull request as draft November 1, 2024 20:31
auto-merge was automatically disabled November 1, 2024 20:31

Pull request was converted to draft

@rittik9 rittik9 marked this pull request as ready for review November 2, 2024 00:05
@rittik9 rittik9 marked this pull request as draft November 2, 2024 00:16
@rittik9 rittik9 marked this pull request as ready for review November 2, 2024 00:47
@rittik9 rittik9 marked this pull request as draft November 2, 2024 01:22
@DimitrisMantas

Just to chime in, I think this issue is present in pretty much all metrics that make use of _adjust_weights_safe_divide.

I see this PR fixes some of them, but others, such as JaccardIndex, are left as is.

… wrong answer when ignore_index is specified
@Borda Borda force-pushed the master branch 2 times, most recently from 4b07b79 to 9f4a001 Compare March 20, 2025 12:09
@Borda
Member

Borda commented Jul 2, 2025

Shall we then fix the unittest, or was it already resolved on the sklearn side in the meantime?

@rittik9
Collaborator Author

rittik9 commented Jul 2, 2025

I think we need to fix the unit tests first.

Labels
bug / fix Something isn't working · topic: Classif

Successfully merging this pull request may close these issues:

Incorrect result in computing MulticlassRecall macro average when ignore_index is specified