Update Reading list, Diffusion papers
garg-aayush committed Sep 13, 2023
1 parent cf63ecb commit 0f272c1
Showing 1 changed file with 59 additions and 99 deletions.
158 changes: 59 additions & 99 deletions READING_LIST.md
@@ -1,123 +1,83 @@
# Suggested reading list
This document contains a suggested reading list of papers on diffusion models.
## Fundamental papers

1. **Auto-Encoding Variational Bayes**
- https://arxiv.org/pdf/1312.6114

2. **Denoising Diffusion Probabilistic Models**
- https://arxiv.org/abs/2006.11239

3. **Improved Denoising Diffusion Probabilistic Models**
- https://arxiv.org/abs/2102.09672

4. **Generative Modeling by Estimating Gradients of the Data Distribution**
- https://arxiv.org/abs/1907.05600

5. **Score-Based Generative Modeling through Stochastic Differential Equations**
- https://arxiv.org/abs/2011.13456

6. **Denoising Diffusion Implicit Models**
- https://arxiv.org/abs/2010.02502

7. **Diffusion Models Beat GANs on Image Synthesis**
- https://arxiv.org/abs/2105.05233

8. **Elucidating the Design Space of Diffusion-Based Generative Models**
- https://arxiv.org/abs/2206.00364

9. **Classifier-Free Diffusion Guidance**
- https://arxiv.org/abs/2207.12598

10. **High-Resolution Image Synthesis with Latent Diffusion Models**
- https://arxiv.org/abs/2112.10752

11. **SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis**
- https://www.youtube.com/watch?v=kkYaikeLJd


## Inversion
1. Null-text Inversion for Editing Real Images using Guided Diffusion Models
- https://arxiv.org/abs/2211.09794


## Text-based image editing
1. Prompt-to-Prompt Image Editing with Cross Attention Control
- https://arxiv.org/abs/2208.01626

2. Adding Conditional Control to Text-to-Image Diffusion Models
- https://arxiv.org/abs/2302.05543

3. **InstructPix2Pix: Learning to Follow Image Editing Instructions**
- https://arxiv.org/abs/2211.09800


## SD finetuning and controlled generation
1. **An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion**
- https://arxiv.org/abs/2208.01618
## Diffusion Papers
### Fundamental
| Papers | Archive Link | Read? | Notes|
|------|:-------:|:------:|:----------:|
| Auto-Encoding Variational Bayes | [Link](https://arxiv.org/pdf/1312.6114) | :heavy_check_mark: | ✗ |
| Denoising Diffusion Probabilistic Models | [Link](https://arxiv.org/abs/2006.11239) | :heavy_check_mark: | :heavy_check_mark: |
| Denoising Diffusion Implicit Models | [Link](https://arxiv.org/abs/2010.02502) | :heavy_check_mark: | :heavy_check_mark: |
| Improved Denoising Diffusion Probabilistic Models | [Link](https://arxiv.org/abs/2102.09672) | :heavy_check_mark: | :heavy_check_mark: |
| Generative Modeling by Estimating Gradients of the Data Distribution | [Link](https://arxiv.org/abs/1907.05600) | ✗ | ✗ |
| Score-Based Generative Modeling through Stochastic Differential Equations | [Link](https://arxiv.org/abs/2011.13456) | ✗ | ✗ |
| Diffusion Models Beat GANs on Image Synthesis | [Link](https://arxiv.org/abs/2105.05233) | ✗ | ✗ |
| Elucidating the Design Space of Diffusion-Based Generative Models | [Link](https://arxiv.org/abs/2206.00364) | ✗ | ✗ |
| Classifier-Free Diffusion Guidance | [Link](https://arxiv.org/abs/2207.12598) | :heavy_check_mark: | ✗ |
| High-Resolution Image Synthesis with Latent Diffusion Models | [Link](https://arxiv.org/abs/2112.10752) | :heavy_check_mark: | ✗ |
| SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis | [Link](https://arxiv.org/abs/2307.01952) | :heavy_check_mark: | :heavy_check_mark: |

### Inversion
| Papers | Archive Link | Read? | Notes|
|------|:-------:|:------:|:----------:|
| Null-text Inversion for Editing Real Images using Guided Diffusion Models | [Link](https://arxiv.org/abs/2211.09794) | :heavy_check_mark: | :heavy_check_mark: |

2. **DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation**
- https://arxiv.org/abs/2208.12242

3. **LoRA: Low-Rank Adaptation of Large Language Models**
- https://arxiv.org/abs/2106.09685
4. Key-Locked Rank One Editing for Text-to-Image Personalization
- https://arxiv.org/abs/2305.01644
## Fast sampling
| Papers | Archive Link | Read? | Notes|
|------|:-------:|:------:|:----------:|
| Progressive Distillation for Fast Sampling of Diffusion Models | [Link](https://arxiv.org/abs/2202.00512) | ✗ | ✗ |
| On Distillation of Guided Diffusion Models | [Link](https://arxiv.org/abs/2210.03142) | ✗ | ✗ |

5. **T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models**
- https://arxiv.org/abs/2302.08453
### Text-based image editing
| Papers | Archive Link | Read? | Notes|
|------|:-------:|:------:|:----------:|
| Prompt-to-Prompt Image Editing with Cross Attention Control | [Link](https://arxiv.org/abs/2208.01626) | :heavy_check_mark: | :heavy_check_mark: |
| InstructPix2Pix: Learning to Follow Image Editing Instructions | [Link](https://arxiv.org/abs/2211.09800) | ✗ | ✗ |


## Image-based editing
1. **SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations**
- https://arxiv.org/abs/2108.01073
| Papers | Archive Link | Read? | Notes|
|------|:-------:|:------:|:----------:|
| SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations | [Link](https://arxiv.org/abs/2108.01073) | ✗ | ✗ |
| Palette: Image-to-Image Diffusion Models | [Link](https://arxiv.org/abs/2111.05826) | :heavy_check_mark: | ✗ |

2. **Palette: Image-to-Image Diffusion Models**
- https://arxiv.org/abs/2111.05826

### SD finetuning and controlled generation
| Papers | Archive Link | Read? | Notes|
|------|:-------:|:------:|:----------:|
| An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion | [Link](https://arxiv.org/abs/2208.01618) | :heavy_check_mark: | ✗ |
| DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation | [Link](https://arxiv.org/abs/2208.12242) | :heavy_check_mark: | ✗ |
| Adding Conditional Control to Text-to-Image Diffusion Models | [Link](https://arxiv.org/abs/2302.05543) | :heavy_check_mark: | ✗ |
| LoRA: Low-Rank Adaptation of Large Language Models | [Link](https://arxiv.org/abs/2106.09685) | ✗ | ✗ |
| Key-Locked Rank One Editing for Text-to-Image Personalization | [Link](https://arxiv.org/abs/2305.01644) | ✗ | ✗ |
| T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models | [Link](https://arxiv.org/abs/2302.08453) | ✗ | ✗ |

## SD-based video synthesis
1. **DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion**
- https://arxiv.org/abs/2304.06025


## Super-resolution
1. **Image Super-Resolution via Iterative Refinement**
- https://arxiv.org/abs/2104.07636


## Garments Try-on
1. **TryOnDiffusion: A Tale of Two UNets**
- https://arxiv.org/abs/2306.08276

| Papers | Archive Link | Read? | Notes|
|------|:-------:|:------:|:----------:|
| Image Super-Resolution via Iterative Refinement | [Link](https://arxiv.org/abs/2104.07636) | :heavy_check_mark: | ✗ |

## Subject-swapping
1. **Photoswap: Personalized Subject Swapping in Images**
- https://arxiv.org/abs/2305.18286
| Papers | Archive Link | Read? | Notes|
|------|:-------:|:------:|:----------:|
| Photoswap: Personalized Subject Swapping in Images | [Link](https://arxiv.org/abs/2305.18286) | :heavy_check_mark: | ✗ |

## Garments Try-on
| Papers | Archive Link | Read? | Notes|
|------|:-------:|:------:|:----------:|
| TryOnDiffusion: A Tale of Two UNets | [Link](https://arxiv.org/abs/2306.08276) | :heavy_check_mark: | ✗ |

## Fast sampling
1. **Progressive Distillation for Fast Sampling of Diffusion Models**
- https://arxiv.org/abs/2202.00512

2. On Distillation of Guided Diffusion Models
- https://arxiv.org/abs/2210.03142

## Video Synthesis
1. **Video Diffusion Models**
- https://arxiv.org/abs/2204.03458
2. **DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion**
- https://arxiv.org/abs/2304.06025
3. **DisCo: Disentangled Control for Referring Human Dance Generation in Real World**
- https://arxiv.org/abs/2307.00040
4. **Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation**
- https://arxiv.org/abs/2212.11565


>Note:
> 1. Papers in bold are must-reads for a solid understanding of diffusion models and their possible applications.
> 2. This is a suggested list, not an exhaustive one. If you find an interesting paper or have any suggestions, they are more than welcome!
| Papers | Archive Link | Read? | Notes|
|------|:-------:|:------:|:----------:|
| Video Diffusion Models | [Link](https://arxiv.org/abs/2204.03458) | ✗ | ✗ |
| DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion | [Link](https://arxiv.org/abs/2304.06025) | :heavy_check_mark: | :heavy_check_mark: |
| DisCo: Disentangled Control for Referring Human Dance Generation in Real World | [Link](https://arxiv.org/abs/2307.00040) | ✗ | ✗ |
| Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation | [Link](https://arxiv.org/abs/2212.11565) | ✗ | ✗ |


