diff --git a/READING_LIST.md b/READING_LIST.md
new file mode 100644
index 0000000..7950d52
--- /dev/null
+++ b/READING_LIST.md
@@ -0,0 +1,124 @@
+# Suggested reading list
+This document contains a suggested reading list of papers on diffusion models.
+## Fundamental papers
+
+1. **Auto-Encoding Variational Bayes**
+ - https://arxiv.org/pdf/1312.6114
+
+2. **Denoising Diffusion Probabilistic Models**
+ - https://arxiv.org/abs/2006.11239
+
+3. **Improved Denoising Diffusion Probabilistic Models**
+ - https://arxiv.org/abs/2102.09672
+
+4. **Generative Modeling by Estimating Gradients of the Data Distribution**
+ - https://arxiv.org/abs/1907.05600
+
+5. **Score-Based Generative Modeling through Stochastic Differential Equations**
+ - https://arxiv.org/abs/2011.13456
+
+6. **Denoising Diffusion Implicit Models**
+ - https://arxiv.org/abs/2010.02502
+
+7. **Diffusion Models Beat GANs on Image Synthesis**
+ - https://arxiv.org/abs/2105.05233
+
+8. **Elucidating the Design Space of Diffusion-Based Generative Models**
+ - https://arxiv.org/abs/2206.00364
+
+9. **Classifier-Free Diffusion Guidance**
+ - https://arxiv.org/abs/2207.12598
+
+10. **High-Resolution Image Synthesis with Latent Diffusion Models**
+ - https://arxiv.org/abs/2112.10752
+
+11. **SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis**
+ - https://www.youtube.com/watch?v=kkYaikeLJd
+
+
+## Inversion
+1. Null-text Inversion for Editing Real Images using Guided Diffusion Models
+ - https://arxiv.org/abs/2211.09794
+
+
+## Text-based image editing
+1. Prompt-to-Prompt Image Editing with Cross Attention Control
+ - https://arxiv.org/abs/2208.01626
+
+2. Adding Conditional Control to Text-to-Image Diffusion Models
+ - https://arxiv.org/abs/2302.05543
+
+3. **InstructPix2Pix: Learning to Follow Image Editing Instructions**
+ - https://arxiv.org/abs/2211.09800
+
+
+## SD finetuning and controlled generation
+1. **An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion**
+ - https://arxiv.org/abs/2208.01618
+
+2. **DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation**
+ - https://arxiv.org/abs/2208.12242
+
+3. **LoRA: Low-Rank Adaptation of Large Language Models**
+ - https://arxiv.org/abs/2106.09685
+
+4. Key-Locked Rank One Editing for Text-to-Image Personalization
+ - https://arxiv.org/abs/2305.01644
+
+5. **T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models**
+ - https://arxiv.org/abs/2302.08453
+
+
+## Image-based editing
+1. **SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations**
+ - https://arxiv.org/abs/2108.01073
+
+2. **Palette: Image-to-Image Diffusion Models**
+ - https://arxiv.org/abs/2111.05826
+
+
+## SD-based video synthesis
+1. **DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion**
+ - https://arxiv.org/abs/2304.06025
+
+
+## Super-resolution
+1. **Image Super-Resolution via Iterative Refinement**
+ - https://arxiv.org/abs/2104.07636
+
+
+## Garments Try-on
+1. **TryOnDiffusion: A Tale of Two UNets**
+ - https://arxiv.org/abs/2306.08276
+
+
+## Subject-swapping
+1. **Photoswap: Personalized Subject Swapping in Images**
+ - https://arxiv.org/abs/2305.18286
+
+
+## Fast sampling
+1. **Progressive Distillation for Fast Sampling of Diffusion Models**
+ - https://arxiv.org/abs/2202.00512
+
+2. On Distillation of Guided Diffusion Models
+ - https://arxiv.org/abs/2210.03142
+
+## Video Synthesis
+1. **Video Diffusion Models**
+ - https://arxiv.org/abs/2204.03458
+2. **DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion**
+ - https://arxiv.org/abs/2304.06025
+3. **DisCo: Disentangled Control for Referring Human Dance Generation in Real World**
+ - https://arxiv.org/abs/2307.00040
+4. **Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation**
+ - https://arxiv.org/abs/2212.11565
+
+
+>Note:
+> 1. Papers highlighted in bold are must-reads for a solid understanding of diffusion models and their possible applications.
+> 2. This list is suggested reading, not an exhaustive one. If you find an interesting paper or have any suggestions, they are more than welcome!
+
+
+
+