Why is it necessary to train before testing even in zero-shot learning? #3
Comments
The zero-shot version of UniTS shares the prompt, cls, and mask tokens across all tasks, which differs from the other settings. So a separate model has to be pre-trained for this version.
Training such a pre-trained model seems to take a lot of time. Could you provide pre-trained models for this setting?
Our code is still under internal administrative review, and we are not allowed to release new checkpoints for now. Training is fairly fast: it takes about 1-2 days on one GPU.
We spent two days pre-training with UniTS_zeroshot_newdata.sh. However, when we ran the second command in UniTS_zeroshot_newdata.sh for testing, it reported the following error:
/home/deeprob/UniTS/auto
no ckpt found!
Our pre-trained model and logs are attached:
Therefore, how should we correctly reproduce the experimental results in Section 5.3 of the paper? @gasvn
I read exp_sup.py. It seems that "auto" is only recognized in the train function, not in the test function.
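To illustrate the kind of handling that seems to be missing on the test side, here is a minimal sketch of resolving an "auto" checkpoint argument to a real file. This is not the repository's code: `resolve_checkpoint`, the `run_dir` argument, and the `*.pth` pattern are hypothetical names chosen for illustration, and picking the most recently modified file is an assumption about what "auto" should mean.

```python
import glob
import os


def resolve_checkpoint(path_arg, run_dir, pattern="*.pth"):
    """Hypothetical helper: turn a checkpoint argument into a file path.

    Returns None when no usable checkpoint exists, so the caller can
    decide how to report a missing checkpoint.
    """
    if path_arg != "auto":
        # An explicit path is used as given, provided the file exists.
        return path_arg if os.path.isfile(path_arg) else None

    # Assumption: "auto" means the newest checkpoint file in run_dir.
    candidates = glob.glob(os.path.join(run_dir, pattern))
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)
```

Calling the same helper from both the train and test paths would keep the two from interpreting "auto" differently.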
The values of the loss function during training are as follows:
The results in the paper are for only one sample (the first sample of the dataset), because we need to compare against the previous zero-shot method, which is very slow (its authors also use only one example in their paper for comparison). The results you get with the current repo are the performance on the whole dataset.
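The difference between the two evaluation protocols can be sketched as follows. This is a toy illustration in plain Python with made-up numbers, not the repository's evaluation code; `mse`, `evaluate`, and the `first_only` flag are hypothetical names.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def evaluate(preds, targets, first_only=False):
    """Average per-sample MSE over the first sample or the whole dataset."""
    if first_only:
        # Single-sample protocol: keep only the first sample.
        preds, targets = preds[:1], targets[:1]
    scores = [mse(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


# Toy data (invented for illustration): two samples, two time steps each.
preds = [[1.0, 2.0], [0.0, 0.0]]
targets = [[1.0, 2.0], [2.0, 2.0]]

first_sample_score = evaluate(preds, targets, first_only=True)
whole_dataset_score = evaluate(preds, targets)
```

On this toy data the first sample is predicted perfectly while the second is not, so the two protocols give different numbers. A mismatch with the paper's reported figures is therefore expected when evaluating on the full dataset.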
From the loss curve, training seems to be working well. We will look into the bug you mentioned. Thank you for the feedback.
In UniTS_zeroshot_newdata.sh, we found that we need to pre-train before we can run the zero-shot test. Is there a pre-trained model available for direct testing?