AIM

This document collects links to selected datasets, models, papers, and PyTorch resources for AI in medical image and video analysis.

The links also cover general-purpose foundation models, essential PyTorch models, and datasets. All links were verified in December 2025.

🔴 Important: Click on the Outline button (upper-right button in GitHub) for a table of contents and to jump to a particular topic.
🔴 Important: Right-click on each link to open in a new browser window.

Please reference:

A. S. Panayides et al., "Position Paper: Artificial Intelligence in Medical Image Analysis: Advances, Clinical Translation, and Emerging Frontiers," IEEE J. Biomed. Health Inform., vol. 10, no. 2, pp. 1187–1202, Feb. 2026, doi: 10.1109/JBHI.2025.3649496.

@article{AIinMedicalImaging,
  title   = {Position Paper: Artificial Intelligence in Medical Image Analysis: Advances, Clinical Translation, and Emerging Frontiers},
  author  = {Panayides, A. S. and Chen, H. and Filipovic, N. D. and Geroski, T. and Hou, K. and Lekadir, K. and
  Marias, K. and Matsopoulos, G. and Papanastasiou, G. and Sarder, P. and Tourassi, G. and
  Tsaftaris, S. A. and Amini, A. and Fu, H. and Kyriacou, E. and Loizou, C. P. and Zervakis, M. and
  Saltz, J. H. and Shamout, F. E. and Wong, K. C. L. and Yao, J. and Fotiadis, D. I. and
  Pattichis, C. S. and Pattichis, M. S.},
  journal = {IEEE Journal of Biomedical and Health Informatics},
  volume  = {10},
  number  = {2},
  pages   = {1187--1202},
  month   = feb,
  year    = {2026},
  doi     = {10.1109/JBHI.2025.3649496}
}

For updates, email Prof. Marios S. Pattichis at pattichi@unm.edu.

Open Models for Digital Image Analysis

A generalist vision–language foundation model for diverse biomedical tasks

PyTorch Image encoders/backbones

pytorch-image-models: The largest collection of PyTorch image encoders/backbones. Including train, eval, inference, export scripts, and pretrained weights

Vision Transformer Implementations

Python libraries for Pathology image analysis

Cancer imaging

Open datasets focused on pathology

Echocardiography

Echonet datasets and models

Foundation Models

Foundation model-related general libraries

Foundation models for pathology image analysis

Vision language foundation models for pathology

CONCH: A Vision-Language Foundation Model for Computational Pathology

Foundation Model for Endoscopy Video Analysis

Contains links to 10 different endoscopy video datasets.
A large-scale endoscopic video dataset with over 33K video clips.
Supports 3 types of downstream tasks, including classification, segmentation, and detection.

SAM foundation models for image and video segmentation, and 3D reconstruction

Instructional Medical Videos

A dataset for medical instructional video classification and question answering

How Well Can General Vision-Language Models Learn Medicine By Watching Public Educational Videos?

Main website with model: OpenBiomedVid
OpenBiomedVid dataset
SurgeryVideoQA
MIMIC-IV-ECHO: Echocardiogram Matched Subset
Related OpenAI o3 and o4-mini System
OpenAI models

Generative AI Image Models

Deterministic Medical Image Translation via High-fidelity Brownian Bridges (CVPR 2025 (preprint), paper only). General cross-modality translation on simulated MRI/CT datasets (no fixed subject count). Image-to-image method. Uses deterministic diffusion along Brownian bridge paths that connect the source and target modalities, improving realism and consistency without stochastic sampling.
GDM-VE: Geodesic Diffusion Models for Medical Image-to-Image Generation (2025; GitHub and paper links). MRI and CT (brain, thoracic; open datasets). Image-to-image method. Introduces a geodesic metric in latent space for efficient and stable sampling in medical image-to-image synthesis.
Cross-conditioned Diffusion Model for Medical Image to Image Translation (2024; paper only). Multi-modal MRI (T1, T2, FLAIR; public datasets). Image-to-image method. Uses cross-modality conditioning in which the source MRI guides target-modality diffusion; modality-specific encoders enhance structural and contrast fidelity.
Cascaded diffusion models for medical image translation (GitHub and paper links). Brain and cardiac (general datasets). Image-to-image method. Combines a coarse GAN prior with a diffusion refinement stage; shortcut paths reduce the number of steps while preserving fidelity and uncertainty quantification.
Fast-DDPM: Fast Denoising Diffusion Probabilistic Models for Medical Image-to-Image Generation (JBHI 2025; GitHub and paper links). General (denoising, super-resolution, modality transfer). MRI and CT (brain, thorax; open datasets). Image-to-image method. Efficient DDPM variant that uses only 10 diffusion steps; reports state-of-the-art results on denoising, super-resolution, and modality translation tasks.

Generative AI Video Models

TensorFlow models for video and multimodal risk assessment

Open Models for Explainability

PyTorch Video Models, Datasets, and Optimization Resources

PyTorch video resources

Select PyTorch image and video classification models

PyTorch video documentation

Main optimization link in PyTorch

PyTorch: Adjusting the learning rate
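
As a rough illustration of what learning-rate adjustment means in practice, the dependency-free sketch below computes two common decay rules in closed form: step decay and cosine annealing. PyTorch ships schedulers for both (`torch.optim.lr_scheduler.StepLR` and `torch.optim.lr_scheduler.CosineAnnealingLR`); the function names `step_lr` and `cosine_annealing_lr` and the example values here are illustrative assumptions, not part of the PyTorch API.

```python
import math


def step_lr(base_lr, epoch, step_size, gamma=0.1):
    """Step decay: multiply the rate by gamma once every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


def cosine_annealing_lr(base_lr, epoch, t_max, eta_min=0.0):
    """Cosine annealing: decay from base_lr at epoch 0 to eta_min at epoch t_max."""
    cosine = math.cos(math.pi * epoch / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + cosine)


if __name__ == "__main__":
    # Print both schedules side by side for a hypothetical 30-epoch run.
    for epoch in (0, 10, 20, 30):
        print(
            epoch,
            step_lr(0.1, epoch, step_size=10),
            cosine_annealing_lr(0.1, epoch, t_max=30),
        )
```

Step decay changes the rate abruptly at fixed intervals, while cosine annealing lowers it smoothly; which one suits a given medical imaging model is an empirical question.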

Model Evaluation Notes

For evaluating your models, consider Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning by Sebastian Raschka.
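
One standard way to report uncertainty alongside a test-set accuracy is a normal-approximation confidence interval. The sketch below is a minimal, dependency-free illustration; the function name `accuracy_confidence_interval` and the example counts are assumptions made for this example, and the approximation is only reasonable when the test set is fairly large and the accuracy is not extremely close to 0 or 1.

```python
import math


def accuracy_confidence_interval(n_correct, n_total, z=1.96):
    """Normal-approximation interval for accuracy (z=1.96 gives about 95%)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_correct <= n_total:
        raise ValueError("n_correct must be between 0 and n_total")
    acc = n_correct / n_total
    # Standard error of a binomial proportion.
    half_width = z * math.sqrt(acc * (1.0 - acc) / n_total)
    # Clip to the valid range of an accuracy.
    return max(0.0, acc - half_width), min(1.0, acc + half_width)


if __name__ == "__main__":
    # Hypothetical result: 90 of 100 test images classified correctly.
    low, high = accuracy_confidence_interval(90, 100)
    print(f"accuracy = 0.90, 95% CI = [{low:.4f}, {high:.4f}]")
```

With small clinical test sets the interval can be wide, which is worth stating when comparing models.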

How to find other datasets and models from general purpose websites

Search for datasets on Google Dataset Search.
Search Papers with Code; look separately under Methods and Datasets.
Search for datasets, models, and dataset competitions on Kaggle.
Search for computer vision datasets on the PyTorch vision datasets website.
Search for pretrained PyTorch models on the PyTorch models website.
