Description
Hello,
I am trying to fine-tune the Llama-3.1-8B model with torchtune on our own local data, so I am setting up a custom config file. However, whenever I launch the run with
tune run full_finetune_single_device --config custom_config.yaml
I get the following error:
NotImplementedError: Loading a dataset cached in a LocalFileSystem is not supported.
This is how I updated the dataset section of our config file:
dataset:
  _component_: torchtune.datasets.instruct_dataset
  data_files: /gpfs/u/home/project_name/project_user/scratch/datasets/custom_instruct_data_file.json
  source: json
  split: train
From the logs, I can see that even though I specified the data file location, it still appears to download and prepare a dataset in a cache directory:
Downloading and preparing dataset json/default to file:///gpfs/u/home/project_name/project_user/scratch/Llama-3.1/huggingface_cache/datasets/json/default-04bb5956f8665e2d/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4...
Downloading data files: 100%|███████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 7244.05it/s]
Extracting data files: 100%|█████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 105.73it/s]
Dataset json downloaded and prepared to file:///gpfs/u/home/project_name/project_user/scratch/Llama-3.1/huggingface_cache/datasets/json/default-04bb5956f8665e2d/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4. Subsequent calls will reuse this data.
At first I thought the problem was the .cache location, so I changed the cache path using
export FT_ROOT=/gpfs/u/home/project_name/project_user/scratch/Llama-3.1
# inside it, make a folder for ALL HF caches
mkdir -p $FT_ROOT/huggingface_cache
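To confirm that the cache really points at the new folder and that I can write to it, I put together the small check below. It is only a sketch: I am assuming the Hugging Face libraries take their cache location from the HF_HOME and HF_DATASETS_CACHE environment variables, and FT_ROOT is just my own variable that no library reads. The function names are mine.

```python
import os
import tempfile
from pathlib import Path

# Variables to inspect. HF_HOME and HF_DATASETS_CACHE are assumed to be the
# ones the Hugging Face libraries consult; FT_ROOT is my own shell variable.
CACHE_VARS = ("FT_ROOT", "HF_HOME", "HF_DATASETS_CACHE")


def describe_cache_env(env=os.environ):
    """Return {variable name: value or None} for the cache-related variables."""
    return {name: env.get(name) for name in CACHE_VARS}


def is_writable_dir(path):
    """Return True if `path` is an existing directory we can create files in."""
    directory = Path(path)
    if not directory.is_dir():
        return False
    try:
        # Actually creating a temporary file tends to be a more direct test
        # than os.access on shared filesystems such as GPFS.
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, value in describe_cache_env().items():
        if value is None:
            print(f"{name}: unset")
        else:
            print(f"{name}: {value} (writable: {is_writable_dir(value)})")
```

If a variable shows up as unset in the same shell that launches tune run, the export did not reach that process.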
However, I still get the same "LocalFileSystem is not supported" error. Either way, I do not understand why torchtune prepares the dataset in a cache directory when the file location is already specified. Am I doing something wrong? Does "downloading" here just mean copying the local data file to a separate location for processing, or is something actually being fetched from the HF repo? Why are the files cached on the local filesystem at all instead of being read directly from the path I provided? Is there a way to load the file directly into RAM?
How do I resolve this error? Do we need to grant additional read/write permissions, or do we need to switch to a different version? I am currently on version 0.6.1.
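In case the versions matter, here is a short sketch that prints what is installed in the environment. Including datasets and fsspec alongside torchtune is my own guess, since the json loader and the LocalFileSystem name in the error seem to come from those packages; I have not confirmed that their versions are the cause.

```python
from importlib import metadata

# Packages that appear to be involved in the failing load. That the error
# depends on the datasets/fsspec pairing is an unconfirmed assumption.
PACKAGES = ("torchtune", "datasets", "fsspec")


def installed_versions(packages=PACKAGES):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```

I can paste the output of this here if it helps narrow things down.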