Failing to run library on Kaggle #317

Open
kittycattoys opened this issue Nov 4, 2024 · 0 comments

Hello,

I am trying to use this library to train a model and test the results. I started by trying to get the code to run without errors before adding my own data. So far, I have hit a basic error: torchrun_args and training_args are not accepted as run_training keyword arguments. I searched this repo and found no matches for either name. Should I try an older version?

Thanks for your assistance as I am very interested in using this library.

ERROR:


TypeError Traceback (most recent call last)
Cell In[7], line 46
43 os.makedirs(training_args.data_output_dir, exist_ok=True)
45 # Run the training
---> 46 run_training(
47 torchrun_args=TorchrunArgs(
48 nnodes=1,
49 nproc_per_node=1,
50 node_rank=0, # Node rank
51 rdzv_id=0, # Changed rdzv_id to an integer
52 rdzv_endpoint="localhost:29500", # Endpoint
53 ),
54 training_args=training_args
55 )
57 print("Training completed successfully.")

TypeError: run_training() got an unexpected keyword argument 'torchrun_args'
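One way to narrow this down is to ask Python which keyword names the installed function actually accepts. The sketch below uses only the standard `inspect` module. The helper names `accepted_keywords` and `unexpected_keywords` are hypothetical, and `run_training_stand_in` is a placeholder so the snippet runs without the library installed. Its parameter names (`torch_args`, `train_args`) are an assumption about what the installed version may expose, not something confirmed in this issue.

```python
import inspect


def accepted_keywords(func):
    """Return the parameter names that can be passed to func by keyword."""
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    params = inspect.signature(func).parameters
    return [name for name, p in params.items() if p.kind in keyword_kinds]


def unexpected_keywords(func, **kwargs):
    """Return the names in kwargs that func would reject with a TypeError."""
    params = inspect.signature(func).parameters
    # If func declares **kwargs, any keyword name is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    allowed = set(accepted_keywords(func))
    return [name for name in kwargs if name not in allowed]


# Placeholder for instructlab.training.run_training.
# ASSUMPTION: these parameter names are illustrative only.
def run_training_stand_in(torch_args, train_args):
    return None


print(accepted_keywords(run_training_stand_in))
print(unexpected_keywords(run_training_stand_in, torchrun_args=None, training_args=None))
```

On Kaggle, passing the real `run_training` to `accepted_keywords` in place of the stand-in would list the names the installed version expects, which can then be compared against the call in the traceback.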

PYTHON CODE ON KAGGLE

#!pip install instructlab-training
import json
import os
from instructlab.training import run_training, TrainingArgs, TorchrunArgs

# Step 1: Create a small hardcoded synthetic dataset in JSONL format
def create_synthetic_data(output_file="dataset.jsonl"):
    examples = [
        {"instruction": "Translate 'Hello' to Spanish.", "response": "Hola"},
        {"instruction": "What is the capital of France?", "response": "Paris"},
        {"instruction": "Solve 5 + 3.", "response": "8"},
        {"instruction": "Provide a synonym for 'happy'.", "response": "Joyful"},
        {"instruction": "List three primary colors.", "response": "Red, Blue, Yellow"}
    ]

    with open(output_file, 'w') as f:
        for example in examples:
            f.write(json.dumps(example) + '\n')
    print(f"Synthetic dataset created at {output_file}")

# Generate dataset
create_synthetic_data()

# Step 2: Define training arguments with all required fields
training_args = TrainingArgs(
    model_path="ibm-granite/granite-3.0-1b-a400m-instruct",
    data_path="dataset.jsonl",
    ckpt_output_dir="data/saved_checkpoints",
    data_output_dir="data/outputs",
    max_seq_len=512,
    max_batch_len=64,  # Added max_batch_len
    num_epochs=1,
    effective_batch_size=8,
    save_samples=1000,  # Added save_samples
    learning_rate=2e-6,
    warmup_steps=100,  # Added warmup_steps
    is_padding_free=True,  # Added is_padding_free
    random_seed=42,
)

# Ensure output directories exist
os.makedirs(training_args.ckpt_output_dir, exist_ok=True)
os.makedirs(training_args.data_output_dir, exist_ok=True)

# Run the training
run_training(
    torchrun_args=TorchrunArgs(
        nnodes=1,
        nproc_per_node=1,
        node_rank=0,  # Node rank
        rdzv_id=0,  # Changed rdzv_id to an integer
        rdzv_endpoint="localhost:29500",  # Endpoint
    ),
    training_args=training_args
)

print("Training completed successfully.")
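Separately from the keyword error, it may help to confirm that the JSONL file written in Step 1 parses cleanly before training is attempted. The following is a minimal sketch using only the standard library. `load_jsonl` and `missing_keys` are hypothetical helpers, and the required keys (`instruction`, `response`) simply mirror the script above. Whether the library accepts that record layout is not confirmed here.

```python
import json
import os
import tempfile


def load_jsonl(path):
    """Parse a JSONL file and return one object per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: invalid JSON ({exc})") from exc
    return records


def missing_keys(records, required=("instruction", "response")):
    """Return (record_index, key) pairs for each required key that is absent."""
    return [
        (index, key)
        for index, record in enumerate(records)
        for key in required
        if key not in record
    ]


# Write two of the examples from the script to a temporary file, then read them back.
examples = [
    {"instruction": "Translate 'Hello' to Spanish.", "response": "Hola"},
    {"instruction": "Solve 5 + 3.", "response": "8"},
]
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dataset.jsonl")
    with open(path, "w", encoding="utf-8") as f:
        for example in examples:
            f.write(json.dumps(example) + "\n")
    records = load_jsonl(path)

print(len(records), missing_keys(records))
```

Running `load_jsonl("dataset.jsonl")` after `create_synthetic_data()` would apply the same check to the real file.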
