This repository contains the code and implementation details for the publication "Language-based colonoscopy image analysis with pretrained neural networks", our solution to the Medical Visual Question Answering for GI Task (MEDVQA-GI) at the ImageCLEF 2023 Lab.
Title: Language-based colonoscopy image analysis with pretrained neural networks
Authors: Patrycja Cieplicka, Julia Kłos, Maciej Morawski, Jarosław Opała
Please follow the instructions in data/README.md to download the required data.
Create and activate a new Conda environment:
```bash
conda env create -f environment.yml
conda activate image-clef
```
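After activation, a quick smoke test can confirm the environment resolved correctly. The snippet below is purely illustrative: the package names are assumptions about a typical pretrained-VQA stack, not the repository's actual pinned dependencies, and environment.yml remains the source of truth.

```python
# Hypothetical smoke test for the Conda environment. The package names
# are assumptions (a typical PyTorch/Transformers VQA stack); adjust
# them to whatever environment.yml actually pins.
import importlib.util
import sys

ASSUMED_PACKAGES = ["torch", "transformers", "yaml"]

missing = [name for name in ASSUMED_PACKAGES
           if importlib.util.find_spec(name) is None]

if missing:
    sys.exit(f"Missing packages: {', '.join(missing)}. Re-check the environment.")
print("All assumed packages are importable.")
```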
Run the following command to execute the VQA/VQG pipeline:
```bash
python3 src/pipeline_vqga.py DATA_PATH MODELS_PATH TRAIN_FLAG INPUT_CONFIG INFERENCE_FLAG INFERENCE_DATA_PATH INFERENCE_TEXTS_PATH INFERENCE_OUTPUT_PATH
```
Example:
```bash
python3 src/pipeline_vqga.py \
    "data/" \
    "models/" \
    "true" \
    "src/template/vqg_05_dense_8k.yaml" \
    "true" \
    "data/ImageCLEFmed-MEDVQA-GI-2023-Testing-Dataset/images/" \
    "data/inference_answers.txt" \
    "vqg_05_dense_8k.json"
```
- Training: `src/train_test.py`
- Simple inference: `src/predict_test.py`
- Evaluation: `src/eval_test.py` (see the sketch after this list)
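For context, evaluation of a VQA system typically reduces to comparing predicted answers against references. The snippet below is a minimal, hypothetical illustration of such a comparison (exact-match accuracy over id-keyed JSON files); the file layout and metric are assumptions, and the actual evaluation logic lives in `src/eval_test.py`.

```python
# Hypothetical exact-match accuracy between predicted and reference
# answers, keyed by question or image id. The JSON layout
# ({"id": "answer", ...}) is an assumption; see src/eval_test.py for
# the repository's actual evaluation code.
import json
import sys

def exact_match_accuracy(pred_path: str, ref_path: str) -> float:
    with open(pred_path) as f:
        predictions = json.load(f)
    with open(ref_path) as f:
        references = json.load(f)
    shared = predictions.keys() & references.keys()
    if not shared:
        raise ValueError("No overlapping ids between predictions and references.")
    hits = sum(predictions[k].strip().lower() == references[k].strip().lower()
               for k in shared)
    return hits / len(shared)

if __name__ == "__main__":
    print(f"Exact-match accuracy: {exact_match_accuracy(sys.argv[1], sys.argv[2]):.3f}")
```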