Commit

fix: wrong dataset paths, was using non-tokenized data in pre-tokenized dataset tests

Signed-off-by: Harikrishnan Balagopal <harikrishmenon@gmail.com>
HarikrishnanBalagopal committed Sep 2, 2024
1 parent 654bbf1 commit b6fc949
Showing 1 changed file with 2 additions and 3 deletions.
5 changes: 2 additions & 3 deletions tests/test_sft_trainer.py
@@ -35,7 +35,6 @@
     EMPTY_DATA,
     MALFORMATTED_DATA,
     MODEL_NAME,
-    TWITTER_COMPLAINTS_DATA_INPUT_OUTPUT_JSON,
     TWITTER_COMPLAINTS_DATA_INPUT_OUTPUT_JSONL,
     TWITTER_COMPLAINTS_DATA_JSON,
     TWITTER_COMPLAINTS_DATA_JSONL,
@@ -850,8 +849,8 @@ def test_run_with_good_experimental_metadata():
 @pytest.mark.parametrize(
     "dataset_path",
     [
-        TWITTER_COMPLAINTS_DATA_INPUT_OUTPUT_JSONL,
-        TWITTER_COMPLAINTS_DATA_INPUT_OUTPUT_JSON,
+        TWITTER_COMPLAINTS_TOKENIZED_JSONL,
+        TWITTER_COMPLAINTS_TOKENIZED_JSON,
     ],
 )
 ### Tests for pretokenized data
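
Note: the bug fixed here is that the pre-tokenized dataset tests were parametrized with input/output text files rather than tokenized ones. A minimal sketch of a guard that would catch this kind of mix-up is shown below. It is not part of this commit. The key names `input_ids` and `labels`, the helper names, and the sample records are all assumptions made for illustration, and the real dataset schema may differ.

```python
import json

# Assumption: a pre-tokenized record carries these keys; adjust to the real schema.
REQUIRED_KEYS = {"input_ids", "labels"}


def is_pretokenized(record: dict) -> bool:
    """Return True if every required key holds a list of integer token ids."""
    return all(
        isinstance(record.get(key), list)
        and all(isinstance(tok, int) for tok in record[key])
        for key in REQUIRED_KEYS
    )


def all_pretokenized(jsonl_text: str) -> bool:
    """Check every non-empty line of a JSONL string; empty input counts as False."""
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    return bool(records) and all(is_pretokenized(r) for r in records)


# Hypothetical sample records, not taken from the repository's test data.
tokenized_sample = '{"input_ids": [1, 2, 3], "labels": [1, 2, 3]}\n'
raw_sample = '{"input": "some tweet text", "output": "no complaint"}\n'
```

A check like `all_pretokenized(...)` could be asserted at the top of a pre-tokenized test, so that a wrong dataset path fails immediately instead of silently exercising the non-tokenized code path.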