predictions are all zeros #7

Open
ama454 opened this issue Feb 9, 2023 · 2 comments

ama454 commented Feb 9, 2023

When I run the code, every prediction comes out as all 0s or all 1s. I've tried re-balancing my dataset, but it didn't help.
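
For reference, one quick way to verify the balance is to count the labels directly. A minimal sketch, assuming the training labels live in an `emotion` column of a tab-separated `train.csv`, mirroring the `test.csv` used below:

```python
from collections import Counter
from datasets import load_dataset

# Sketch only: count how many examples each emotion label has.
# Assumes a tab-separated train.csv with an "emotion" column.
train = load_dataset('csv', data_files={'train': 'train.csv'}, delimiter='\t')['train']
print(Counter(train['emotion']))
```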

@ama454 ama454 changed the title predictions are all zeros during the training predictions are all zeros Feb 10, 2023
OmarMohammed88 (Owner) commented

Would you share your notebook so we can figure out what the problem is?


ama454 commented Feb 10, 2023

```python
import torch
import torchaudio
from datasets import load_dataset
from sklearn.metrics import classification_report

# `processor`, `model`, and `config` are loaded earlier in the notebook
# (the feature extractor, the fine-tuned model, and its config).

def speech_file_to_array_fn(batch):
    # Load the audio file and resample it to the 16 kHz rate the model expects.
    speech_array, sampling_rate = torchaudio.load(batch["name"])
    resampler = torchaudio.transforms.Resample(sampling_rate, 16_000)
    batch["speech"] = resampler(speech_array).squeeze().numpy()
    return batch

def predict(batch):
    features = processor(
        batch["speech"],
        sampling_rate=processor.feature_extractor.sampling_rate,
        return_tensors="pt",
        padding=True,
    )
    input_values = features.input_values.to(device)

    with torch.no_grad():
        logits = model(input_values).logits

    batch["predicted"] = torch.argmax(logits, dim=-1).detach().cpu().numpy()
    return batch

if __name__ == '__main__':
    data_files = {
        "test": 'test.csv'
    }
    test_dataset = load_dataset('csv', data_files=data_files, delimiter="\t")["test"]
    print(test_dataset)

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"Device: {device}")

    test_dataset = test_dataset.map(speech_file_to_array_fn)

    result = test_dataset.map(predict, batch_size=8)

    label_names = [config.id2label[i] for i in range(config.num_labels)]
    print(f'Labels: {label_names}')

    y_true = [config.label2id[name] for name in result["emotion"]]
    y_pred = result["predicted"]

    print(y_true)
    print(y_pred)
```
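
One thing worth noting about the script above: in 🤗 Datasets, `batch_size` only takes effect when `batched=True` is also passed to `.map`, so the call above still processes one example at a time. A minimal sketch of a batched version, keeping the same `predict` function (the processor already accepts a list of arrays, and `argmax` then yields one id per example):

```python
# Sketch only: with batched=True, batch["speech"] is a list of arrays and
# batch["predicted"] gets one scalar id per example instead of a 1-element list.
result = test_dataset.map(predict, batched=True, batch_size=8)
```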

```
Downloading and preparing dataset csv/default to /root/.cache/huggingface/datasets/csv/default-a94fa44ad00c5272/0.0.0/6b34fb8fcf56f7c8ba51dc895bfa2bfbe43546f190a60fcf74bb5e8afdcc2317...
Downloading data files: 100% 1/1 [00:00<00:00, 29.44it/s]
Extracting data files: 100% 1/1 [00:00<00:00, 33.98it/s]
Dataset csv downloaded and prepared to /root/.cache/huggingface/datasets/csv/default-a94fa44ad00c5272/0.0.0/6b34fb8fcf56f7c8ba51dc895bfa2bfbe43546f190a60fcf74bb5e8afdcc2317. Subsequent calls will reuse this data.
100% 1/1 [00:00<00:00, 27.50it/s]
Dataset({
    features: ['name', 'emotion'],
    num_rows: 30
})
Device: cpu
100% 30/30 [00:01<00:00, 27.23ex/s]
100% 30/30 [15:19<00:00, 66.05s/ex]
Labels: [0, 1]
[1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
[[1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1]]
```
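
A side observation on the printout above: because `predict` is mapped example-by-example, each `predicted` entry comes back as a one-element array, which is why `y_pred` is nested. A minimal sketch of flattening it before scoring (`flat_pred` is an illustrative name, not from the original code):

```python
# Each per-example map call stored a length-1 array, so take the single id out.
flat_pred = [int(p[0]) for p in result["predicted"]]
print(classification_report(y_true, flat_pred))
```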

```python
print(classification_report(y_true, y_pred))
```

```
              precision    recall  f1-score   support

           0       0.00      0.00      0.00        14
           1       0.53      1.00      0.70        16

    accuracy                           0.53        30
   macro avg       0.27      0.50      0.35        30
weighted avg       0.28      0.53      0.37        30

/usr/local/lib/python3.8/dist-packages/sklearn/metrics/_classification.py:1318: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use zero_division parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/usr/local/lib/python3.8/dist-packages/sklearn/metrics/_classification.py:1318: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use zero_division parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/usr/local/lib/python3.8/dist-packages/sklearn/metrics/_classification.py:1318: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use zero_division parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
```
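
The warnings themselves point at the real symptom: label 0 is never predicted, so its precision is undefined. If you only want to control how those undefined metrics are reported, `classification_report` takes a `zero_division` argument (scikit-learn 0.22+); for example:

```python
# Report undefined precision/F-score as 0.0 without emitting the warning.
print(classification_report(y_true, y_pred, zero_division=0))
```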
