
Fine-tuning forgetfulness #163

@davidress-ILW

I am fine-tuning a model and have run into a "forgetting" problem that I wanted to bring to your attention.

We made two changes to the fine-tuning Jupyter notebook:

  1. Converted it to a Python script (run in PyCharm)
  2. Changed the output to also report scores
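For reference, the output change in item 2 amounts to something like the sketch below. It assumes each prediction is a dict with `text`, `label`, and `score` keys (the shape I get back from `predict_entities`); `format_entities` is a hypothetical helper name, and the example value is the first "before training" result from the sample data.

```python
def format_entities(entities):
    """Render predictions as 'text > label > score' lines, score to 4 decimals."""
    return [f"{e['text']} > {e['label']} > {e['score']:.4f}" for e in entities]


# One prediction in the assumed dict shape (values from the sample-data run)
example = [{"text": "Cristiano Ronaldo", "label": "Person", "score": 0.9846}]

for line in format_entities(example):
    print(line)
```

Only the printing was changed; the prediction call itself is the one from the notebook.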

model: urchade/gliner_small
json: sample_data.json
num_steps = 500
batch_size = 8
data_size = 57
num_batches = 7
num_epochs = 7

Before training results:
Cristiano Ronaldo > Person > 0.9846
Ballon d'Or > Award > 0.9413
UEFA Men's Player of the Year Awards > Award > 0.8620
European Golden Shoes > Award > 0.9594

After training, using final model:
Cristiano Ronaldo dos Santos Aveiro > Person > 0.9472
Ballon d'Or awards > Award > 0.8051
UEFA Men's Player of the Year Awards > Award > 0.9852
European Golden Shoes > Award > 0.9863
outfield player > Person > 0.8722

The model retained the original entities (although the scores changed) and even predicted a new one, so I think the fine-tuning Jupyter notebook works fine on your sample data.

Our data set has 72 records. After the 90% split, there are 64 records
in the training set and 8 in the test set. All records are annotated with
a single entity, EntC.

num_steps = 500
batch_size = 8
data_size = 64
num_batches = 8
num_epochs = 62
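The settings above follow from simple arithmetic, sketched here for clarity. This assumes the split size is truncated to an integer and that `num_batches` and `num_epochs` are obtained by integer division; `training_schedule` is a hypothetical helper, not a function from the notebook.

```python
def training_schedule(num_records, train_fraction, batch_size, num_steps):
    """Derive split sizes, batches per epoch, and epoch count (assumed formulas)."""
    train_size = int(num_records * train_fraction)  # truncate, e.g. 64.8 -> 64
    test_size = num_records - train_size
    num_batches = train_size // batch_size          # full batches per epoch
    if num_batches == 0:
        raise ValueError("training set is smaller than one batch")
    num_epochs = max(1, num_steps // num_batches)
    return {
        "train_size": train_size,
        "test_size": test_size,
        "num_batches": num_batches,
        "num_epochs": num_epochs,
    }


schedule = training_schedule(num_records=72, train_fraction=0.9,
                             batch_size=8, num_steps=500)
print(schedule)
```

With these numbers, each of the 64 training records is seen about 62 times over the run, which may be relevant when interpreting the results below.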

Before training, results are:
EntA > OurLabel > 0.8799
EntA > OurLabel > 0.8288
EntB > OurLabel > 0.7210
EntA > OurLabel > 0.8052
EntA > OurLabel > 0.7026
EntC > OurLabel > 0.5243
EntA > OurLabel > 0.7475

After training, results are:
EntC > OurLabel > 1.0000

The model now finds EntC with a score of 1.0000, but it is as if the final model has completely forgotten every entity other than EntC.
Any thoughts as to why this forgetting could be happening?
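To make the before/after comparison explicit, here is a small sketch that diffs the two result lists as sets of (entity, label) pairs. The tuples are copied from the results above; `compare_predictions` is a hypothetical helper written for this post.

```python
# (entity text, label, score) tuples copied from the results above
before = [
    ("EntA", "OurLabel", 0.8799),
    ("EntA", "OurLabel", 0.8288),
    ("EntB", "OurLabel", 0.7210),
    ("EntA", "OurLabel", 0.8052),
    ("EntA", "OurLabel", 0.7026),
    ("EntC", "OurLabel", 0.5243),
    ("EntA", "OurLabel", 0.7475),
]
after = [("EntC", "OurLabel", 1.0000)]


def compare_predictions(before, after):
    """Report which distinct (entity, label) pairs were retained, lost, or new."""
    b = {(text, label) for text, label, _ in before}
    a = {(text, label) for text, label, _ in after}
    return {
        "retained": sorted(b & a),
        "lost": sorted(b - a),
        "new": sorted(a - b),
    }


diff = compare_predictions(before, after)
print(diff)
```

On this data, only EntC is retained, while EntA and EntB are lost and nothing new is predicted.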

While I cannot disclose the entity names or label, I can say that all entities are three characters long.

Any suggestions are appreciated, thank you.
