Reference metric in multiclass precision/recall unit tests provides wrong answer when ignore_index is specified with average='macro' #2828

Open
@rittik9

Description

🐛 Bug

In the unit tests, sklearn's recall_score and precision_score are used as the reference. Even though _reference_sklearn_precision_recall_multiclass() calls remove_ignore_index to drop the samples whose targets belong to the ignore_index class before passing the data to recall_score and precision_score, this makes no difference: with average='macro', sklearn always averages over the total number of classes, because every class (including the ignored one) is passed via the labels argument of recall_score() and precision_score().
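
For illustration, here is a minimal sketch (not the actual test code) of what the reference computation amounts to once the ignored samples have been dropped:

import numpy as np
from sklearn.metrics import precision_score

# Targets and predictions left after dropping the samples whose target equals
# ignore_index == 0 (what remove_ignore_index does in the test helper).
y_true = np.array([1, 1])
y_pred = np.array([0, 1])

# Passing all classes in `labels` (as the reference helper effectively does)
# keeps class 0 in the macro average, so its 0.0 precision dilutes the result.
print(precision_score(y_true, y_pred, labels=[0, 1], average="macro"))  # 0.5

# Restricting `labels` to the non-ignored classes gives the expected value.
print(precision_score(y_true, y_pred, labels=[1], average="macro"))  # 1.0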

To Reproduce

Issue #2441 already describes the wrong behaviour of MulticlassRecall's macro average when ignore_index is specified. Although ignore_index is exercised by the tests, the test cases pass because the reference implementation shares the same flaw.

The same error occurs for multiclass precision.

### Code Example for Multiclass Precision

First, the per-class result with average="none":

import torch
from torchmetrics.classification import MulticlassPrecision

metric = MulticlassPrecision(num_classes=2, ignore_index=0, average="none")

y_true = torch.tensor([0, 0, 1, 1])

# Predicted probabilities
y_pred = torch.tensor([
    [0.9, 0.1],  # Correctly predicted as class 0
    [0.9, 0.1],  # Correctly predicted as class 0
    [0.9, 0.1],  # Incorrectly predicted as class 0 (should be class 1)
    [0.1, 0.9],  # Correctly predicted as class 1
])

metric.update(y_pred, y_true)
precision_result = metric.compute()
print(precision_result)  # tensor([0., 1.])

With average="macro" and ignore_index=0, the same inputs give an unexpected result:

import torch
from torchmetrics.classification import MulticlassPrecision

metric = MulticlassPrecision(num_classes=2, ignore_index=0, average="macro")

y_true = torch.tensor([0, 0, 1, 1])

# Predicted probabilities
y_pred = torch.tensor([
    [0.9, 0.1],  # Correctly predicted as class 0
    [0.9, 0.1],  # Correctly predicted as class 0
    [0.9, 0.1],  # Incorrectly predicted as class 0 (should be class 1)
    [0.1, 0.9],  # Correctly predicted as class 1
])

metric.update(y_pred, y_true)
precision_result = metric.compute()
print(precision_result)  # tensor(0.5000), expected: tensor(1.0)

Expected behavior

import numpy as np
from sklearn.metrics import precision_score

y_true = np.array([0, 0, 1, 1])

# Predicted probabilities
y_pred_probs = np.array([
    [0.9, 0.1],  # Correctly predicted as class 0
    [0.9, 0.1],  # Correctly predicted as class 0
    [0.9, 0.1],  # Incorrectly predicted as class 0 (should be class 1)
    [0.1, 0.9],  # Correctly predicted as class 1
])

# Convert predicted probabilities to predicted classes
y_pred = np.argmax(y_pred_probs, axis=1)

precision = precision_score(y_true, y_pred, average='macro', labels=[1])  # only considering label 1, i.e. ignoring label 0
print(f"Multiclass Precision: {precision:.2f}") #1.00

Environment

  • TorchMetrics version: 1.5.1
  • Python version: 3.10.12
  • OS: Ubuntu
