Hello again!
If you don't mind, I have one more question, about the detailed procedure for fine-tuning on the CLEVR-Humans dataset.
I was able to reproduce the 12-step MAC accuracy (98.9%) using PyTorch, but failed to reproduce the CLEVR-Humans result after fine-tuning (I got 76.6%, lower than the paper's 81.5%).
My fine-tuning procedure was: (1) load the model fully trained on CLEVR, (2) initialize the embedding vectors of the new words the same way as the original words, and (3) retrain the model on the CLEVR-Humans training set ONLY, following the original model's learning schedule.
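To make step (2) concrete, here is a minimal, framework-free sketch of the idea, not my actual training code. The function name `extend_embeddings`, the toy vocabulary, and the uniform initializer in [-1, 1] are all assumptions for illustration; the real initializer should be whatever the original model used for its word embeddings.

```python
import random


def extend_embeddings(old_vocab, old_vectors, new_words, dim, init_fn):
    """Keep trained rows as they are and append freshly initialized rows for unseen words.

    All names here are hypothetical; this only illustrates the bookkeeping.
    """
    vocab = dict(old_vocab)
    vectors = [list(row) for row in old_vectors]
    for word in new_words:
        if word in vocab:
            continue  # word already has a trained vector, keep it
        vocab[word] = len(vectors)
        vectors.append([init_fn() for _ in range(dim)])
    return vocab, vectors


rng = random.Random(0)


def uniform_init():
    # Assumption: new rows use the same initializer as the original embedding table.
    return rng.uniform(-1.0, 1.0)


# Toy stand-ins for the CLEVR vocabulary and its trained embedding rows.
clevr_vocab = {"<pad>": 0, "cube": 1, "red": 2}
clevr_vectors = [[0.0] * 4, [0.1] * 4, [0.2] * 4]

vocab, vectors = extend_embeddings(
    clevr_vocab, clevr_vectors, ["shiny", "cube", "biggest"], dim=4, init_fn=uniform_init
)
```

In PyTorch the same idea amounts to creating a larger `nn.Embedding` and copying the trained rows into its first rows before fine-tuning.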
It seems that your fine-tuning code trains the model on a mixture of the CLEVR and CLEVR-Humans training sets rather than on the CLEVR-Humans training set alone (sorry if I misread again 😢), so I'm guessing this difference might be the reason.
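For reference, this is roughly what I understand by "mixture" training, as a small sketch. The function name `mixed_batches`, the 50/50 ratio, and the toy data are my own assumptions; I don't know the actual ratio or sampling scheme your code uses.

```python
import random


def mixed_batches(clevr, humans, batch_size, humans_fraction, num_batches, seed=0):
    """Yield batches drawing a fixed fraction of examples from each dataset.

    Hypothetical helper: the fraction is an assumption, not taken from the repo.
    """
    rng = random.Random(seed)
    n_humans = round(batch_size * humans_fraction)
    n_clevr = batch_size - n_humans
    for _ in range(num_batches):
        # Sample without replacement inside a batch, then shuffle the two sources together.
        batch = rng.sample(humans, n_humans) + rng.sample(clevr, n_clevr)
        rng.shuffle(batch)
        yield batch


# Toy stand-ins: each example is tagged with the dataset it came from.
clevr_examples = [("clevr", i) for i in range(100)]
humans_examples = [("humans", i) for i in range(20)]

batches = list(
    mixed_batches(clevr_examples, humans_examples, batch_size=8, humans_fraction=0.5, num_batches=3)
)
```

Setting `humans_fraction=1.0` reduces this to what I did (CLEVR-Humans only), so the two setups differ only in that one knob.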
Since training on the mixture of both datasets will take longer than training on CLEVR-Humans alone, I'm opening this issue in case you have encountered the same problem and could help me out.
Thanks!