Scripts for running inference with Oscar on image captioning and VQA tasks

Alcoholrithm/Oscar_Scripts

Oscar

Scripts for running inference with Oscar on image captioning and VQA tasks

Requirements

For inference on a single image:
Minimum VRAM: 6 GB. You must run 'torch.cuda.empty_cache()' after each inference to flush the GPU cache.
Recommended VRAM: 7 GB or more.
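
As a minimal sketch of the cache flush described above, the wrapper below runs one inference call and then releases PyTorch's cached GPU memory. The names `run_inference_and_flush` and `infer_fn` are hypothetical and not part of this repo; `infer_fn` stands in for whatever captioning or VQA call you take from the notebooks.

```python
import gc

try:
    import torch
except ImportError:
    # Assumption: lets the sketch load on a machine without PyTorch installed.
    torch = None


def run_inference_and_flush(infer_fn, *args, **kwargs):
    """Call a (hypothetical) inference function, then free cached GPU memory."""
    try:
        return infer_fn(*args, **kwargs)
    finally:
        # Drop unreachable Python objects first so their tensors can be freed.
        gc.collect()
        if torch is not None and torch.cuda.is_available():
            # Return unused cached blocks held by PyTorch's allocator to the GPU.
            torch.cuda.empty_cache()
```

Wrapping each call this way keeps the flush from being forgotten, including when the inference call raises an exception.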

Demo

To learn how to use the Oscar models, see the *.ipynb notebooks.

Results

Image Captioning

GQA

Notice

Because of a limitation in the scene_graph_benchmark repo, the VinVL encoder only supports the default CUDA device.
To use a different CUDA device, you must change the default device.
Insert the code below into your code.

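import os
# Expose only the GPU with index 1; set this before torch (or any CUDA library) initializes CUDA.
os.environ['CUDA_VISIBLE_DEVICES'] = '1'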
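
The snippet above can be wrapped in a small helper, sketched here under the assumption that it runs before PyTorch is imported. The name `select_gpu` is hypothetical. Once only one GPU is visible, that GPU is addressed as device 0 inside the process, which is why it becomes the default device.

```python
import os


def select_gpu(index: int) -> None:
    """Restrict this process to a single GPU, chosen by its index."""
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)


# Must run before torch initializes CUDA, so call it before importing torch.
select_gpu(1)

try:
    import torch

    if torch.cuda.is_available():
        # Only the selected GPU is visible, so the count is 1 and it is cuda:0.
        print(torch.cuda.device_count(), torch.cuda.get_device_name(0))
except ImportError:
    # Assumption: PyTorch may be absent where this sketch is read or tested.
    pass
```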

MODEL ZOO

Image Captioning

Model          BLEU-1  BLEU-2  BLEU-3  BLEU-4  CIDEr
Ours+B(XE)     72.7    54.6    36.9    23.0    118.0
Ours+L(XE)     72.9    54.92   37.4    23.7    118.0
Ours+B(CIDEr)  76.9    59.7    41.6    25.6    128.6
Ours+L(CIDEr)  76.8    59.7    41.8    25.9    128.6
Oscar+         -       -       -       41.0    140.9

GQA

Model   Accuracy
Ours+   58.1
Oscar+  64.7
