Update Reading list, Diffusion papers

garg-aayush · Sep 13, 2023 · 0f272c1
1 parent cf63ecb
Showing 1 changed file with 59 additions and 99 deletions.
diff --git a/READING_LIST.md b/READING_LIST.md
@@ -1,123 +1,83 @@
 # Suggested reading list
 This document contains the suggested reading list of papers pertaining to Diffusion Models
-## Fundamental papers
 
-1. **Auto-Encoding Variational Bayes**
-    - https://arxiv.org/pdf/1312.6114
-
-2. **Denoising Diffusion Probabilistic Models**
-    - https://arxiv.org/abs/2006.11239
-
-3. **Improved Denoising Diffusion Probabilistic Models**
-   - https://arxiv.org/abs/2102.09672
-
-4. **Generative Modeling by Estimating Gradients of the Data Distribution**
-   - https://arxiv.org/abs/1907.05600
-
-5. **Score-Based Generative Modeling through Stochastic Differential Equations**
-   - https://arxiv.org/abs/2011.13456
-
-6. **Denoising Diffusion Implicit Models**
-   - https://arxiv.org/abs/2010.02502
-
-7. **Diffusion Models Beat GANs on Image Synthesis**
-   - https://arxiv.org/abs/2105.05233
-
-8. **Elucidating the Design Space of Diffusion-Based Generative Models**
-    - https://arxiv.org/abs/2206.00364
-
-9. **Classifier-Free Diffusion Guidance**
-    - https://arxiv.org/abs/2207.12598
-
-10. **High-Resolution Image Synthesis with Latent Diffusion Models**
-    - https://arxiv.org/abs/2112.10752
-
-11. **SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis**
-    - https://www.youtube.com/watch?v=kkYaikeLJd
-
-
-## Inversion
-1. Null-text Inversion for Editing Real Images using Guided Diffusion Models
-   - https://arxiv.org/abs/2211.09794
-
-
-## Text-based image editing
-1. Prompt-to-Prompt Image Editing with Cross Attention Control
-   - https://arxiv.org/abs/2208.01626
-
-2. Adding Conditional Control to Text-to-Image Diffusion Models
-   - https://arxiv.org/abs/2302.05543
-
-3. **InstructPix2Pix: Learning to Follow Image Editing Instructions**
-   - https://arxiv.org/abs/2211.09800
-
-
-## SD finetuning and controlled generation
-1. **An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion**
-    - https://arxiv.org/abs/2208.01618
+## Diffusion Papers
+### Fundamental
+| Papers | Archive Link | Read? | Notes|
+|------|:-------:|:------:|:----------:|
+| Auto-Encoding Variational Bayes    |  [Link](https://arxiv.org/pdf/1312.6114)    |   :heavy_check_mark: | &cross;    |      
+| Denoising Diffusion Probabilistic Models   |  [Link](https://arxiv.org/abs/2006.11239)    |   :heavy_check_mark: | :heavy_check_mark:    |      
+| Denoising Diffusion Implicit Models   |  [Link](https://arxiv.org/abs/2010.02502)    |   :heavy_check_mark: | :heavy_check_mark:    |      
+| Improved Denoising Diffusion Probabilistic Models   |  [Link](https://arxiv.org/abs/2102.09672)    |   :heavy_check_mark: | :heavy_check_mark:    |      
+| Generative Modeling by Estimating Gradients of the Data Distribution   |  [Link](https://arxiv.org/abs/1907.05600)    |    &cross;  |  &cross; |
+| Score-Based Generative Modeling through Stochastic Differential Equations | [Link](https://arxiv.org/abs/2011.13456) |    &cross;  |  &cross; |
+| Diffusion Models Beat GANs on Image Synthesis | [Link](https://arxiv.org/abs/2105.05233) |     &cross; | &cross; |
+| Elucidating the Design Space of Diffusion-Based Generative Models | [Link](https://arxiv.org/abs/2206.00364) |   &cross;  |  &cross; |
+| Classifier-Free Diffusion Guidance | [Link](https://arxiv.org/abs/2207.12598) |   :heavy_check_mark:  |  &cross; |
+| High-Resolution Image Synthesis with Latent Diffusion Models | [Link](https://arxiv.org/abs/2112.10752) |   :heavy_check_mark:  |  &cross; |
+| SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis | [Link](https://arxiv.org/abs/2307.01952) |   :heavy_check_mark:  |  :heavy_check_mark: |
+
+### Inversion
+| Papers | Archive Link | Read? | Notes|
+|------|:-------:|:------:|:----------:|
+| Null-text Inversion for Editing Real Images using Guided Diffusion Models    |  [Link](https://arxiv.org/abs/2211.09794)    |   :heavy_check_mark: | :heavy_check_mark:  |      
 
-2.  **DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation**
-    - https://arxiv.org/abs/2208.12242
 
-3. **LoRA: Low-Rank Adaptation of Large Language Models**
-    - https://arxiv.org/abs/2106.09685
-  
-4. Key-Locked Rank One Editing for Text-to-Image Personalization
-    - https://arxiv.org/abs/2305.01644
+## Fast sampling
+| Papers | Archive Link | Read? | Notes|
+|------|:-------:|:------:|:----------:|
+| Progressive Distillation for Fast Sampling of Diffusion Models    |  [Link](https://arxiv.org/abs/2202.00512)    |    &cross; |  &cross;  |
+| On Distillation of Guided Diffusion Models    |  [Link](https://arxiv.org/abs/2210.03142)|    &cross; |  &cross;  |   
 
-5. **T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models**
-    - https://arxiv.org/abs/2302.08453
+### Text-based image editing
+| Papers | Archive Link | Read? | Notes|
+|------|:-------:|:------:|:----------:|
+| Prompt-to-Prompt Image Editing with Cross Attention Control    |  [Link](https://arxiv.org/abs/2208.01626)    |   :heavy_check_mark: | :heavy_check_mark:  |      
+| InstructPix2Pix: Learning to Follow Image Editing Instructions    |  [Link](https://arxiv.org/abs/2211.09800)   |    &cross; |  &cross;  |
 
 
 ## Image-based editing
-1. **SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations**
-    - https://arxiv.org/abs/2108.01073
+| Papers | Archive Link | Read? | Notes|
+|------|:-------:|:------:|:----------:|
+| SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations    |  [Link](https://arxiv.org/abs/2108.01073)    |    &cross; |  &cross;   |
+| Palette: Image-to-Image Diffusion Models    |  [Link](https://arxiv.org/abs/2111.05826)  |    :heavy_check_mark: |  &cross;  |   
 
-2. **Palette: Image-to-Image Diffusion Models**
-   - https://arxiv.org/abs/2111.05826
 
+### SD finetuning and controlled generation
+| Papers | Archive Link | Read? | Notes|
+|------|:-------:|:------:|:----------:|
+| An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion    |  [Link](https://arxiv.org/abs/2208.01618)    |   :heavy_check_mark: | &cross;   |  
+| DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation    |  [Link](https://arxiv.org/abs/2208.12242)   |   :heavy_check_mark: | &cross;   |      
+| Adding Conditional Control to Text-to-Image Diffusion Models    |  [Link](https://arxiv.org/abs/2302.05543)   |   :heavy_check_mark: | &cross;   |      
+| LoRA: Low-Rank Adaptation of Large Language Models   |  [Link](https://arxiv.org/abs/2106.09685)    |  &cross;  | &cross;   |  
+| Key-Locked Rank One Editing for Text-to-Image Personalization    |  [Link](https://arxiv.org/abs/2305.01644)   |   &cross;  | &cross;   |      
+|  T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models  |  [Link](https://arxiv.org/abs/2302.08453) |  &cross;  | &cross;   |      
 
-## SD-based video synthesis
-1. **DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion**
-   - https://arxiv.org/abs/2304.06025
 
 
 ## Super-resolution
-1. **Image Super-Resolution via Iterative Refinement**
-    - https://arxiv.org/abs/2104.07636
-
-
-## Garments Try-on
-1. **TryOnDiffusion: A Tale of Two UNets**
-    - https://arxiv.org/abs/2306.08276
-
+| Papers | Archive Link | Read? | Notes|
+|------|:-------:|:------:|:----------:|
+| Image Super-Resolution via Iterative Refinement    |  [Link](https://arxiv.org/abs/2104.07636)    |   :heavy_check_mark: |  &cross; |      
 
 ## Subject-swapping
-1. **Photoswap: Personalized Subject Swapping in Images**
-    - https://arxiv.org/abs/2305.18286
+| Papers | Archive Link | Read? | Notes|
+|------|:-------:|:------:|:----------:|
+| Photoswap: Personalized Subject Swapping in Images    |  [Link](https://arxiv.org/abs/2305.18286)    |   :heavy_check_mark: | &cross;   |      
 
+## Garments Try-on
+| Papers | Archive Link | Read? | Notes|
+|------|:-------:|:------:|:----------:|
+| TryOnDiffusion: A Tale of Two UNets    |  [Link](https://arxiv.org/abs/2306.08276)    |   :heavy_check_mark: |  &cross;   |      
 
-## Fast sampling
-1.  **Progressive Distillation for Fast Sampling of Diffusion Models**
-    - https://arxiv.org/abs/2202.00512
-
-2.  On Distillation of Guided Diffusion Models
-    - https://arxiv.org/abs/2210.03142
 
 ## Video Synthesis
-1. **Video Diffusion Models**
-    - https://arxiv.org/abs/2204.03458
-2. **DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion**
-    - https://arxiv.org/abs/2304.06025
-3. **DisCo: Disentangled Control for Referring Human Dance Generation in Real World**
-    - https://arxiv.org/abs/2307.00040
-4. **Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation**
-    - https://arxiv.org/abs/2212.11565
-
-
->Note: 
-> 1. Bold highlighted papers are must read for solid understanding of diffusion models and its possible applications.
-> 2. Moreover, this is not exhaustive but suggested list. If any one of you find an interesting paper or has any suggestions. They are more than welcome!
+| Papers | Archive Link | Read? | Notes|
+|------|:-------:|:------:|:----------:|
+| Video Diffusion Models |  [Link](https://arxiv.org/abs/2204.03458)    |   &cross; |  &cross;  |      
+| DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion |  [Link](https://arxiv.org/abs/2304.06025)    |   :heavy_check_mark: |  :heavy_check_mark:  |      
+| DisCo: Disentangled Control for Referring Human Dance Generation in Real World |  [Link](https://arxiv.org/abs/2307.00040)    |   &cross; |  &cross;  |      
+| Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation |  [Link](https://arxiv.org/abs/2212.11565)    |   &cross; |  &cross;  |