data preprocessing for VQA #56
When will you release the data preprocessing code for VQA-v2?
Hi, if you just want to run inference on your own QA samples and try open-domain VQA (not restricted to the 3,129 candidate answers in the VQA-v2 dataset), I recommend the open-domain VQA Colab we provided (link), which shows how to pre-process raw samples for inference. The open-domain VQA Colab uses pretrained OFA (rather than a VQA-finetuned checkpoint) and can answer open-domain visual questions beyond the 3,129 VQA-v2 candidate answers. Our VQA demo on Hugging Face Spaces is also based on pretrained OFA.

If you would like to work on VQA-v2 or other VQA-formulated competition datasets, please refer to the section "1. Prepare the Dataset & Checkpoints" of the VQA finetuning instructions in the readme. It shows how to organize the TSV datafile for VQA finetuning and for inference with the finetuned VQA-v2 checkpoint. Specifically:

- The question-id, image-id, and question text (lowercased) are taken directly from your original dataset.
- For training samples, the answer text (also lowercased) comes from the original dataset, prefixed with its confidence using "|!+" (for example, 0.6|!+no).
- For inference samples without ground-truth answers, just use a fake answer string as a placeholder (like 1.0|!+no).
- The object labels are not required, and you can leave that field blank (or refer to the VinVL repo to see how to obtain labels on custom data; we simply used its released labels on the COCO & VG images).

To transform images to base64 strings, please use the following code:
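A minimal sketch of the conversion described, assuming Pillow is installed; the helper name image_to_base64 is illustrative, not necessarily the repo's exact snippet:

```python
import base64
from io import BytesIO

from PIL import Image

def image_to_base64(file_name: str) -> str:
    """Re-encode an image file and return its base64 string for the TSV datafile."""
    img = Image.open(file_name)
    img_buffer = BytesIO()
    # Preserve the original format (e.g. JPEG) when re-encoding into memory.
    img.save(img_buffer, format=img.format)
    byte_data = img_buffer.getvalue()
    return base64.b64encode(byte_data).decode("utf-8")
```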
For one sample, the columns mentioned above are concatenated with '\t' into a single line; you can refer to the example in the readme. The already pre-processed VQA-v2 dataset can be downloaded directly via datasets.md. If you have further questions, please do not hesitate to ask.
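As a hypothetical illustration of assembling one such line (the field values and file names are placeholders, and the column order should be checked against the readme example):

```python
# Hypothetical example: build one training line of the TSV datafile.
# The columns follow the fields described above: question-id, image-id,
# question text, answer with confidence, object labels, base64 image.
question_id = "79459"
image_id = "79459"
question = "is this person wearing a hat?"   # lowercased question text
answer = "0.6|!+no"                          # confidence|!+answer (training)
object_labels = ""                           # optional; may be left blank
image_b64 = image_to_base64("example.jpg")   # helper from the snippet above

line = "\t".join([question_id, image_id, question, answer, object_labels, image_b64])
with open("vqa_train.tsv", "a", encoding="utf-8") as f:
    f.write(line + "\n")
```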
Hi, the Colab link seems broken. Is the notebook still up? Thanks.