Closed
Description
The RL training README.md says that dataset_path can be set to load a local JSON dataset, for example:
bash scripts/train_reinforce_ray_rule_rm.sh # reinforcement++
...
# dataset_path="./data/acecode_87K/acecode_87K.json"
...
However, the dataset loading statement in the implementation appears to support only loading from the Hugging Face Hub. Is it missing the path="json" argument needed to load a local file?
To load a local JSON file properly, should the code look something like this?
# For local JSON files
if os.path.exists(args.dataset):
    self.dataset = datasets.load_dataset("json", data_files=args.dataset, split="train")
else:
    # Fallback to Hugging Face datasets
    self.dataset = datasets.load_dataset(args.dataset, split="train")
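For reference, here is a minimal sketch of the same idea with the local-versus-Hub decision pulled into one small function. The name resolve_load_args is hypothetical and not something in the repo, and the check on the .json/.jsonl extension is my own assumption about which local files should go through the "json" builder.

```python
import os


def resolve_load_args(dataset: str) -> dict:
    """Build keyword arguments for datasets.load_dataset.

    Hypothetical helper for illustration only; it is not part of the repo.
    """
    # An existing local file with a JSON extension is routed to the "json" builder.
    if os.path.isfile(dataset) and dataset.endswith((".json", ".jsonl")):
        return {"path": "json", "data_files": dataset}
    # Anything else is passed through unchanged, e.g. a Hugging Face Hub dataset name.
    return {"path": dataset}
```

The loader would then presumably call datasets.load_dataset(**resolve_load_args(args.dataset), split="train"). Keeping the decision separate from the call makes it possible to check the branching without downloading anything.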