
renaming "test" split to "dev" #59

Open
felixbur opened this issue Aug 30, 2023 · 4 comments

Comments

@felixbur (Owner)

Nkululeko only knows two splits: train and test.
But it would be more correct to name the "test" split "dev" (short for development), since we essentially always use it to optimize a model.
Any thoughts?

@bagustris (Collaborator) commented Aug 30, 2023

I think the current name, "test", is already correct, but consider the following two cases.

case 1: test has labels

The current behavior should work as expected, i.e., the performance score can be calculated directly from the test set.

case 2: test has no labels (unseen target)

If the test set has no labels, the model concatenates train and dev (split_strategy = train) and predicts the target for each audio file in the test set (maybe mark the test database with split_strategy = predict?). No performance score is computed on the test set; instead, the output is a prediction file (CSV) with the columns file and label. To obtain a score, the user should define another experiment that uses the "dev" split as the "test" split.
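
The proposal above could be sketched as an INI configuration. This is only a sketch assuming Nkululeko's usual [DATA] layout; the database names are placeholders, and the predict value is the hypothetical option suggested in this comment, not an existing setting:

```
[DATA]
; placeholder database names
databases = ['labeled_db', 'unlabeled_db']
; labeled data: folded entirely into training
labeled_db.split_strategy = train
; hypothetical value proposed above: no scoring, only a prediction CSV
unlabeled_db.split_strategy = predict
```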

@bagustris (Collaborator)

@felixbur
I think it makes sense. See my two cases above.
In case 1, the current 'test' split should be renamed to 'dev' split.
In case 2, we need to reintroduce the 'test' split again.

So, consider the following setup (it is common, e.g., in the ComParE challenge). The dataset authors provide three splits: train.csv, dev.csv, and test.csv. Train and dev have labels; test does not (its CSV contains only the file column, and the labels, e.g. emotion, are usually replaced by a question mark).

So there are several possibilities for building a model:

  • using train only for training, with the evaluation metric computed on train
  • using train + dev, with the evaluation metric computed on dev
  • using train + dev for training and predicting on test, with the evaluation metric computed on train

In the last option, the output is a CSV file containing the file names and predicted labels. This file is usually submitted to the organizers to obtain the score on the test set.
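
The three-split setup above could then look roughly like this; again, a sketch under the same assumptions, with hypothetical database names and the hypothetical predict value:

```
[DATA]
; hypothetical names for the three ComParE-style splits
databases = ['train', 'dev', 'test']
; fold both labeled splits into training for the final model
train.split_strategy = train
dev.split_strategy = train
; test is unlabeled: output a prediction CSV instead of a score
test.split_strategy = predict
```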

@bagustris (Collaborator) commented Sep 13, 2023

A simple workaround may be to keep the current test split as it is, but provide an additional option that distinguishes whether it is a dev set (has labels, the default) or a test set (unseen):

[DATA]
test.has_labels = False

By default, it is assumed that the test set has labels (test.has_labels = True); if it does not, the output is the model's prediction (a CSV file containing file and target).

@bagustris (Collaborator)

@felixbur
After realizing that Nkululeko already has .test and .demo modules, this proposal (renaming test to dev) absolutely makes sense.

One additional suggestion: after finding the best model, the user should be allowed to train on both train and dev data, so that the final model is built from more training data.
