Hello, I have a few questions about OctoCoder.
For this part in the paper:
For instruction tuning our models, we select 5,000 random samples from COMMITPACKFT across the 6 programming languages that we evaluate on.
Could you please provide the exact training data and the launch script to fine-tune StarCoder into OctoCoder?
Alternatively, could you share the seeds you used to select the 5,000 instructions from CommitPackFT?
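To make sure I understand the subsampling step correctly, here is a rough sketch of what I imagine it looks like with the `datasets` library; the language config names, the seed, and the fact that the 5,000 samples are drawn from a pooled shuffle are all my own guesses, not values from the paper:

```python
from datasets import load_dataset, concatenate_datasets

# Assumed language configs and seed -- placeholders, not the values actually
# used for OctoCoder.
LANGS = ["python", "javascript", "java", "go", "c++", "rust"]
SEED = 42

subsets = [load_dataset("bigcode/commitpackft", lang, split="train") for lang in LANGS]

# Merge the six languages, shuffle with a fixed seed, and keep 5,000 samples.
merged = concatenate_datasets(subsets).shuffle(seed=SEED)
sampled = merged.select(range(5_000))
print(sampled)
```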
As a second question, were OctoCoder and the results in the paper produced with finetuning/starcoder/finetune.py using LoRA/peft?
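For context, this is the kind of LoRA setup I have in mind when asking; the rank, alpha, dropout, and target modules below are guesses on my part, not the OctoCoder settings:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Assumed hyperparameters -- placeholders only, not the values used for OctoCoder.
model = AutoModelForCausalLM.from_pretrained("bigcode/starcoder")
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["c_attn", "c_proj", "q_attn"],  # attention projections in StarCoder (GPTBigCode)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # shows how few parameters LoRA actually trains
```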
Thanks!
Btw, fantastic results @Muennighoff and team :)