
renaming "test" split to "dev" #59

Open
felixbur opened this issue Aug 30, 2023 · 4 comments

Comments

@felixbur (Owner)

Nkululeko only knows two splits: train and test.
But it would be more correct to name the "test" split "dev" (short for development), since we essentially always use it to optimize a model.
Any thoughts?

@bagustris (Collaborator) commented Aug 30, 2023

I think the current name, "test", is already correct, but consider the following two cases.

case 1: test has labels

The current behavior should work as expected, i.e., the performance score can be calculated directly from the test set.

case 2: test has no labels (unseen target)

If the test set has no labels, the model concatenates train and dev (split_strategy = train) and predicts the target for each audio file in the test set (maybe mark the test database with split_strategy = predict?). No performance score is computed on the test set; instead, the output is a prediction file (CSV) with the columns file and label. To obtain a score, the user should define another experiment that uses the "dev" split as the "test" split.
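
The proposal above could be sketched as an INI configuration. This is only a sketch assuming Nkululeko's usual [DATA] layout; the database names are placeholders, and the predict value is the hypothetical option suggested in this comment, not an existing setting:

```
[DATA]
; placeholder database names
databases = ['labeled_db', 'unlabeled_db']
; labeled data: folded entirely into training
labeled_db.split_strategy = train
; hypothetical value proposed above: no scoring, only a prediction CSV
unlabeled_db.split_strategy = predict
```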

@bagustris (Collaborator)

@felixbur
I think it makes sense. See my two cases above.
In case 1, the current 'test' split should be renamed to 'dev' split.
In case 2, we need to reintroduce the 'test' split again.

So, consider the following setup (it is common, e.g., in the ComParE challenge). The dataset authors provide three splits: train.csv, dev.csv, and test.csv. Train and dev have labels; test does not (its CSV contains only the file column, and the labels, e.g. emotion, are usually replaced by a question mark).

So there are several possibilities for building a model:

  • using train only for training, with the evaluation metric computed on train
  • using train + dev, with the evaluation metric computed on dev
  • using train + dev for training and predicting on test, with the evaluation metric computed on train

In the last option, the output is a CSV file containing the file names and predicted labels. This file is usually submitted to the organizers to obtain the score on the test set.
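
The three-split setup above could then look roughly like this; again, a sketch under the same assumptions, with hypothetical database names and the hypothetical predict value:

```
[DATA]
; hypothetical names for the three ComParE-style splits
databases = ['train', 'dev', 'test']
; fold both labeled splits into training for the final model
train.split_strategy = train
dev.split_strategy = train
; test is unlabeled: output a prediction CSV instead of a score
test.split_strategy = predict
```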

@bagustris (Collaborator) commented Sep 13, 2023

A simple workaround may be to keep the current test split as it is, but provide an additional option that distinguishes whether it is a dev set (has labels, the default) or a test set (unseen):

[DATA]
test.has_labels = False

By default, it is assumed that the test set has labels (test.has_labels = True); if it does not, the output is the model's prediction (a CSV file containing file and target).

@bagustris (Collaborator)

@felixbur
After realizing that Nkululeko already has .test and .demo modules, this proposal (renaming test to dev) absolutely makes sense.

One additional suggestion: after finding the best model, the user should be allowed to train on both train and dev data, so that the final model is built from more training data.
