ataffe/VQDatasetGAN

Generating a synthetic dataset for surgical instrument segmentation with VQDatasetGAN

This project aims to improve YOLO's performance at segmenting surgical instruments in real-time surgical video. It is an implementation of the VQGAN-based variant of BigDatasetGAN: I rearranged some of the code from Taming Transformers and implemented a segmentation head for VQGAN modeled on the segmentation head from BigDatasetGAN.
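
To make the architecture concrete, here is a minimal sketch of a BigDatasetGAN-style segmentation head. All names, channel counts, and layer choices are my own illustrative assumptions, not the repository's actual code: the idea is simply that intermediate VQGAN decoder feature maps are upsampled to a common resolution, concatenated, and classified per pixel.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SegmentationHead(nn.Module):
    """Hypothetical BigDatasetGAN-style head: fuses multi-scale VQGAN
    decoder features and predicts a per-pixel class logit map."""

    def __init__(self, feature_channels, num_classes=2, out_size=256):
        super().__init__()
        self.out_size = out_size
        total = sum(feature_channels)
        self.classifier = nn.Sequential(
            nn.Conv2d(total, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, num_classes, kernel_size=1),
        )

    def forward(self, features):
        # Upsample every intermediate feature map to the output
        # resolution, concatenate along channels, classify each pixel.
        upsampled = [
            F.interpolate(f, size=(self.out_size, self.out_size),
                          mode="bilinear", align_corners=False)
            for f in features
        ]
        return self.classifier(torch.cat(upsampled, dim=1))
```

In BigDatasetGAN the head is trained on a handful of manually annotated generated images, after which generator samples and predicted masks together form the synthetic dataset.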

Status

I am currently working on improving image and segmentation-mask quality through better training data and transfer learning. The model is first trained on a large subset of the SurgVu dataset (~900k images of surgical instruments used on porcine tissue), then fine-tuned on the medium-sized SARAS-MEAD dataset (~23k images), and finally fine-tuned on a small private Transorbital Robotic Surgery (TORS) dataset (~2k images); the latter two datasets show instruments operating on human tissue.
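
The staged pipeline above can be sketched as a sequence of fine-tuning calls. This is a generic hedged sketch, not the repository's training script: the loss, dataset paths, and hyperparameters are placeholders, and VQGAN in practice trains with its own reconstruction/codebook/adversarial losses.

```python
import torch

def fine_tune(model, loader, lr, epochs, checkpoint=None):
    """One fine-tuning stage: optionally resume from the previous
    stage's weights, then train with a (typically lower) learning rate.
    Placeholder MSE loss stands in for the real VQGAN objective."""
    if checkpoint is not None:
        model.load_state_dict(torch.load(checkpoint))
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
    return model

# Hypothetical staging: SurgVu -> SARAS-MEAD -> TORS, with the
# learning rate lowered at each stage to preserve earlier features.
# fine_tune(model, surgvu_loader, lr=1e-4, epochs=50)
# fine_tune(model, saras_loader, lr=1e-5, epochs=20, checkpoint="surgvu.pt")
# fine_tune(model, tors_loader, lr=1e-6, epochs=10, checkpoint="saras.pt")
```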

Example synthetic images, overlaid with segmentation masks produced by the segmentation model

SurgVu Images

The VQDatasetGAN model generated these images at 256 × 256 resolution; they were then upsampled to 512 × 512.
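
When upsampling an image/mask pair like this, the mask must keep discrete label values, so nearest-neighbour resizing is the safe choice for it (a smooth bilinear or bicubic resize would blend label IDs). A minimal numpy sketch, assuming HWC images and integer label masks; the exact interpolation used in this project may differ:

```python
import numpy as np

def upsample_2x(image: np.ndarray, mask: np.ndarray):
    """Double spatial resolution (e.g. 256x256 -> 512x512).
    Nearest-neighbour repetition is used here for both arrays so the
    demo stays dependency-free; in practice the RGB image would get a
    bilinear/bicubic resize while the mask stays nearest-neighbour."""
    img_up = image.repeat(2, axis=0).repeat(2, axis=1)
    mask_up = mask.repeat(2, axis=0).repeat(2, axis=1)  # labels stay integer
    return img_up, mask_up
```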
