
Cannot Reproduce Fine-Tuning #30

Open
Concyclics opened this issue Oct 28, 2024 · 1 comment

Comments

@Concyclics
First and foremost, thank you for your outstanding work on this project. We'd like to build on this work and fine-tune a model from deepseek-coder 1.3B using your datasets, but we cannot achieve a promising result. Could you share the fine-tuning settings, such as batch size, learning rate, and other specifications?

@Anindyadeep
Member

If I remember correctly (I need to check, though), the batch size was set to 4, the learning rate to 1e-5, and we used some synthetic datasets to fine-tune the models. One observation I can share: these small models do not generalize very well. PremSQL-1B was heavily focused on BirdBench, so what we tried was generating synthetic samples similar to the BirdBench training data. Training on those gave a huge leap in the results.

As of now, the fine-tuning scripts in PremSQL might be a bit buggy, and I am working on them. However, the secret sauce was really the different datasets we used, together with continual fine-tuning. A rough sketch of the setup is below.
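
For anyone trying to reproduce this, here is a minimal sketch of a fine-tuning run with the two hyperparameters mentioned above (batch size 4, learning rate 1e-5), using the Hugging Face `Trainer`. The dataset file, the `text` column name, the epoch count, and the precision flag are assumptions for illustration, not the exact settings PremSQL used:

```python
# Minimal fine-tuning sketch. Only batch size (4) and learning rate (1e-5)
# come from the thread; everything else is an assumption.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "deepseek-ai/deepseek-coder-1.3b-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# The collator pads batches, so the tokenizer needs a pad token.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Hypothetical text-to-SQL training file with a "text" column
# containing prompt + SQL pairs.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True,
                        remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="premsql-finetune",
    per_device_train_batch_size=4,  # batch size mentioned in the comment
    learning_rate=1e-5,             # learning rate mentioned in the comment
    num_train_epochs=3,             # assumption; not stated in the thread
    bf16=True,                      # assumption; depends on hardware
    logging_steps=10,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    # Causal LM objective: labels are the (shifted) input tokens.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The continual fine-tuning mentioned above would amount to repeating this loop, resuming from the previous checkpoint each time a new synthetic dataset is generated, rather than training once from the base model.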
