First and foremost, thank you for your outstanding work on this project. We'd like to build on this work and fine-tune a model from deepseek-coder 1.3B on your datasets, but we cannot achieve promising results. Could you share the fine-tuning settings, such as batch size, learning rate, and other specifications?
If I remember correctly (I need to check, though), the batch size was set to 4, the learning rate to 1e-5, and we used a synthetic dataset to fine-tune the models. One observation I can share: these small models do not generalize very well. PremSQL-1B was very much focused on BirdBench, so what we did was generate synthetic samples similar to the BirdBench training data. Training on those gave a huge leap in the results.
As of now, the fine-tuning scripts in PremSQL might be a bit buggy, and I am working on them. However, the main ingredient was the different datasets we used, along with continual fine-tuning.
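To make that concrete, here is a rough sketch of what such an SFT run could look like with plain HuggingFace transformers, using the hyperparameters mentioned above. This is not the exact PremSQL training script; the dataset file, prompt column, sequence length, and epoch count are placeholders you would need to adapt.

```python
# Minimal SFT sketch, assuming a JSONL dataset with a "text" column that
# already holds the full prompt (schema + question) plus the target SQL.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "deepseek-ai/deepseek-coder-1.3b-instruct"  # base model mentioned in the question
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical synthetic text-to-SQL dataset (BirdBench-style samples).
dataset = load_dataset("json", data_files="train.jsonl", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="premsql-sft",
    per_device_train_batch_size=4,   # batch size mentioned above
    learning_rate=1e-5,              # learning rate mentioned above
    num_train_epochs=1,              # epochs were not specified; adjust as needed
    bf16=True,
    logging_steps=10,
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

For continual fine-tuning, the same loop would simply be rerun on each new synthetic dataset, starting from the previous checkpoint instead of the base model.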