Strong baseline for visual question answering

This is a PyTorch re-implementation of Vahid Kazemi and Ali Elqursh's paper "Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering".

The paper shows that a relatively simple model, built only from common deep learning building blocks, can achieve higher accuracy than the majority of previously published work on the popular VQA v1 dataset.
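For context on what "accuracy" means here: results on VQA are usually reported with the dataset's consensus metric, where a predicted answer counts as fully correct if at least three of the ten human annotators gave it. The snippet below is a minimal sketch of that commonly cited simplified form, not the evaluation code used by this repository. The function name `vqa_accuracy` and the example answers are hypothetical, and the official evaluation script additionally normalizes answer strings and averages over annotator subsets, which this sketch omits.

```python
from collections import Counter
from typing import Sequence


def vqa_accuracy(predicted: str, human_answers: Sequence[str]) -> float:
    """Simplified VQA accuracy: min(number of matching human answers / 3, 1).

    Assumption: only lowercasing and whitespace stripping are applied;
    the official script performs more thorough answer normalization.
    """
    counts = Counter(a.strip().lower() for a in human_answers)
    matches = counts[predicted.strip().lower()]
    return min(matches / 3.0, 1.0)


# Hypothetical example: ten annotator answers for a single question.
answers = ["red"] * 2 + ["dark red"] * 5 + ["maroon"] * 3

print(vqa_accuracy("red", answers))       # 2 of 10 annotators agree: partial credit
print(vqa_accuracy("dark red", answers))  # 5 of 10 agree: full credit
print(vqa_accuracy("blue", answers))      # no annotator agrees: no credit
```

Averaging this per-question score over the evaluation split gives a single accuracy figure of the kind compared across papers.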

A fully trained model (convergence shown below) is available for download.

Graph of convergence of implementation versus paper results

Note that the model in my other VQA repo performs better than the model implemented here.

This project uses the code provided here