
data preprocessing for VQA #56

Closed
runzeer opened this issue Mar 25, 2022 · 2 comments

runzeer commented Mar 25, 2022

When will you release the data preprocessing code for VQA-v2?

yangapku self-assigned this Mar 26, 2022

yangapku commented Mar 26, 2022

Hi, if you just want to run inference on your custom QA samples and try open-domain VQA (not restricted to the 3,129 candidate answers in the VQA-v2 dataset), I recommend referring to the open-domain VQA Colab we provided (link) and following how it pre-processes raw samples for inference. The open-domain VQA Colab uses the pretrained OFA (rather than a VQA-finetuned checkpoint) and is able to answer more open-domain visual questions beyond the 3,129 VQA-v2 candidate answers. Our VQA demo on Hugging Face Spaces is also based on the pretrained OFA.

If you would like to work on VQA-v2 or other VQA-style competition datasets, please refer to the section "1. Prepare the Dataset & Checkpoints" of finetuning on VQA in the readme. It shows how to organize the TSV data file used for VQA finetuning and for inference with the finetuned VQA-v2 checkpoint. Specifically, the question-id, image-id and the question text (lowercased) are taken directly from your original dataset. For training samples, the answer text (also lowercased) is taken from the original dataset and concatenated with its confidence using "|!+" (for example, 0.6|!+no). For inference samples that do not have ground-truth answers, just put a fake answer string as a placeholder (like 1.0|!+no). The object labels are not necessary and you can leave that column blank (otherwise, refer to the VinVL repo to see how to obtain labels on custom data; we just employed its released labels on COCO & VG images). To transform images into base64 strings, please use the following code:

from PIL import Image
from io import BytesIO
import base64

img = Image.open(fn)  # fn is the path to your image file
img_buffer = BytesIO()
img.save(img_buffer, format=img.format)  # re-encode the image into an in-memory buffer
byte_data = img_buffer.getvalue()  # raw image bytes
base64_str = base64.b64encode(byte_data)  # bytes
base64_str = base64_str.decode("utf-8")  # str

For each sample, the columns mentioned above are concatenated with '\t' into one line. You can refer to the example in the readme. For the already pre-processed VQA-v2 dataset, you can download it directly from datasets.md.
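To make this concrete, below is a minimal sketch of assembling one TSV line for a custom sample. The field values, the output filename vqa_custom.tsv and the image_to_base64 helper are hypothetical, and the column order simply follows the description above; please check the example in the readme for the exact order expected by the finetuning scripts.

from PIL import Image
from io import BytesIO
import base64

def image_to_base64(fn):
    # re-encode the image file into an in-memory buffer and return it as a base64 string
    img = Image.open(fn)
    buf = BytesIO()
    img.save(buf, format=img.format)
    return base64.b64encode(buf.getvalue()).decode("utf-8")

# hypothetical sample values; replace them with fields from your own dataset
question_id = "79459"
image_id = "79459"
question = "is this person wearing shorts?"  # lowercased question text
answer = "0.6|!+no"                          # confidence|!+answer; use a placeholder like 1.0|!+no for inference-only samples
object_labels = ""                           # optional, may be left blank
image_base64 = image_to_base64("path/to/image.jpg")

# one sample per line, columns joined by tabs
line = "\t".join([question_id, image_id, question, answer, object_labels, image_base64])
with open("vqa_custom.tsv", "a", encoding="utf-8") as f:
    f.write(line + "\n")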

If you have further questions, please do not hesitate to ask me.

@ricvolpi

> Hi, if you just want to run inference on your custom QA samples and try open-domain VQA (not restricted to the 3,129 candidate answers in the VQA-v2 dataset), I recommend referring to the open-domain VQA Colab we provided (link)

Hi, the Colab link seems broken. Is the notebook still up? Thanks.
