Description
I am fine-tuning a model and running into a "forgetful" situation (it looks like catastrophic forgetting) that I wanted to bring to your attention.
The two changes we made to the fine-tuning Jupyter notebook are:
- Converted it to a standalone PyCharm Python script
- Changed the output to include prediction scores
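For reference, the score-reporting change amounts to formatting GLiNER's `predict_entities` output (a list of dicts with `text`, `label`, and `score` keys) into the `text > label > score` lines shown below. A minimal sketch of what our script does (the function name is ours, not from the notebook):

```python
def format_entities(entities):
    """Render GLiNER-style prediction dicts as 'text > label > score' lines."""
    return [f"{e['text']} > {e['label']} > {e['score']:.4f}" for e in entities]

# Example with a hand-made prediction dict (not real model output):
sample = [{"text": "Cristiano Ronaldo", "label": "Person", "score": 0.9846}]
print(format_entities(sample))  # ['Cristiano Ronaldo > Person > 0.9846']
```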
model: urchade/gliner_small
json: sample_data.json
num_steps = 500
batch_size = 8
data_size = 57
num_batches = 7
num_epochs = 7
Before training results:
Cristiano Ronaldo > Person > 0.9846
Ballon d'Or > Award > 0.9413
UEFA Men's Player of the Year Awards > Award > 0.8620
European Golden Shoes > Award > 0.9594
After training, using final model:
Cristiano Ronaldo dos Santos Aveiro > Person > 0.9472
Ballon d'Or awards > Award > 0.8051
UEFA Men's Player of the Year Awards > Award > 0.9852
European Golden Shoes > Award > 0.9863
outfield player > Person > 0.8722
The model retained the original entities (although the scores changed) and even predicted a new one, so the fine-tuning Jupyter notebook works fine on your sample data.
Our dataset consists of 72 records; after the 90% train/test split, there are 64 records in the training set and 8 in the test set. All records use a single label (shown as OurLabel below), and the annotated entity is EntC.
num_steps = 500
batch_size = 8
data_size = 64
num_batches = 8
num_epochs = 62
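For what it's worth, the derived numbers in this run match the following arithmetic (our reconstruction of how the notebook computes them, not the actual notebook code):

```python
def training_plan(num_records, batch_size, num_steps, train_frac=0.9):
    """Reconstruct the split and schedule numbers reported above."""
    train_size = int(num_records * train_frac)  # int(72 * 0.9) -> 64
    test_size = num_records - train_size        # 72 - 64      -> 8
    num_batches = train_size // batch_size      # 64 // 8      -> 8
    num_epochs = num_steps // num_batches       # 500 // 8     -> 62
    return train_size, test_size, num_batches, num_epochs

print(training_plan(72, 8, 500))  # (64, 8, 8, 62)
```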
Before training, results are:
EntA > OurLabel > 0.8799
EntA > OurLabel > 0.8288
EntB > OurLabel > 0.7210
EntA > OurLabel > 0.8052
EntA > OurLabel > 0.7026
EntC > OurLabel > 0.5243
EntA > OurLabel > 0.7475
After training, results are:
EntC > OurLabel > 1.0000
The model now finds EntC with a score of 1.0000, but it is as if the final model has completely forgotten every entity other than EntC.
Any thoughts as to why the forgetfulness could be happening?
While I cannot disclose the entity names or the label, I can say that all entities are three characters long.
Any suggestions are appreciated, thank you.