
Making PLKSR stable for real-world SISR #4

Open
neosr-project opened this issue May 10, 2024 · 3 comments

@neosr-project

neosr-project commented May 10, 2024

Hi. First of all, thanks to everyone who participated in this research. The analysis in the paper is very thorough.

As reported by others in issue 3, PLKSR seems to be unstable for real-world SISR. GAN training is notoriously unstable and causes issues even at lower learning rates.
So, in an attempt to make it more stable, I have released a simple modification of PLKSR, named RealPLKSR. The changes are listed below, followed by a rough sketch of how they fit together:

  • Normalization was missing, as pointed out by @dslisleedh. From my understanding, layer norm was avoided because of its impact on inference latency. I tested multiple methods, including Instance Norm, Layer Norm, Batch Norm, Group Norm, and RMSNorm. Because we usually train at low batch sizes (<16), Group Normalization performed best in my experiments out of those tested, and its impact on inference latency was minimal (~5% at most). The number of groups was also tested: increasing it leads to better regularization but slows convergence, and a value of 4 offered a good balance in all tests.
  • GELU was replaced with Mish in the channel mixer. Mish showed better and more stable convergence than GELU.
  • nn.Dropout2d was added at the last conv, as proposed in "Reflash Dropout in Image Super-Resolution". Although not ideal, dropout is a simple way to increase generalization on real-world SISR.
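
To make the list above concrete, here is a minimal PyTorch sketch of where each change sits. This is not the actual RealPLKSR code: the module names, block layout, and default values are assumptions for illustration only, and the PLKSR block itself is simplified to a plain depthwise large-kernel conv instead of the partial large-kernel design from the paper.

```python
# Minimal sketch only (not the RealPLKSR source); names and defaults are assumptions.
import torch
import torch.nn as nn


class ChannelMixer(nn.Module):
    """Pointwise channel mixer, with Mish replacing GELU."""

    def __init__(self, dim: int, expansion: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(dim, dim * expansion, kernel_size=1),
            nn.Mish(),  # change 2: Mish instead of nn.GELU()
            nn.Conv2d(dim * expansion, dim, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class RealPLKBlock(nn.Module):
    """Simplified PLKSR-style block with GroupNorm added for stability."""

    def __init__(self, dim: int, kernel_size: int = 17):
        super().__init__()
        self.norm = nn.GroupNorm(num_groups=4, num_channels=dim)  # change 1: 4 groups
        self.lk_conv = nn.Conv2d(
            dim, dim, kernel_size, padding=kernel_size // 2, groups=dim
        )  # stand-in for the partial large-kernel conv of PLKSR
        self.mixer = ChannelMixer(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.mixer(self.lk_conv(self.norm(x)))


class RealPLKSRSketch(nn.Module):
    """Toy end-to-end model: head conv, blocks, dropout, pixel-shuffle tail."""

    def __init__(self, dim: int = 64, n_blocks: int = 12, scale: int = 4, dropout: float = 0.1):
        super().__init__()
        self.head = nn.Conv2d(3, dim, 3, padding=1)
        self.body = nn.Sequential(*[RealPLKBlock(dim) for _ in range(n_blocks)])
        self.dropout = nn.Dropout2d(dropout)  # change 3: dropout at the last conv
        self.tail = nn.Sequential(
            nn.Conv2d(dim, 3 * scale**2, 3, padding=1),
            nn.PixelShuffle(scale),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.tail(self.dropout(self.body(self.head(x))))


# Quick shape check: a 64x64 input upscaled 4x becomes 256x256.
# RealPLKSRSketch()(torch.rand(1, 3, 64, 64)).shape == torch.Size([1, 3, 256, 256])
```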

Pretrained models:

| scale | download |
| --- | --- |
| 4x GAN | GDrive |
| 4x | GDrive |
| 2x | GDrive |

Training can be done on neosr using the following configurations: a paired dataset or the realesrgan degradation pipeline.
Credits are acknowledged inside the code, which is released under the same license as PLKSR (MIT). I hope this makes PLKSR more widely used under real-world degradations. It's a really impressive network. Thanks again for your research 👍

@dslisleedh
Owner

Thank you for your interest in this work. We are impressed with RealPLKSR's ability to learn real-world SISR tasks stably while maintaining low latency. We will add this issue and implementation to the README so that many people can utilize your work!

@dslisleedh dslisleedh pinned this issue May 10, 2024
@neosr-project
Author

Thanks @dslisleedh!

@Phhofm

Phhofm commented May 29, 2024

Just to add to this thread: I trained and released a RealPLKSR model for photography on a dataset I degraded with a bit of lens blur, a bit of realistic noise, and a bit of JPEG and WebP (re)compression; a rough sketch of that kind of degradation is included below.

The model and all the info about it can be found in its GitHub release here.
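
For anyone curious, here is a rough Pillow/NumPy sketch of that kind of degradation (mild blur, mild noise, then JPEG and WebP recompression). It is not the exact pipeline or parameters used for the released model; the blur radius, noise level, and quality values below are only illustrative, and Pillow needs WebP support for the last step.

```python
# Illustrative degradation sketch only; parameters are not the ones used for the released model.
import io

import numpy as np
from PIL import Image, ImageFilter


def degrade(img: Image.Image) -> Image.Image:
    # A bit of blur, standing in for lens blur.
    img = img.filter(ImageFilter.GaussianBlur(radius=1.0))

    # A bit of Gaussian noise, standing in for realistic sensor noise.
    arr = np.asarray(img).astype(np.float32)
    arr += np.random.normal(scale=4.0, size=arr.shape)
    img = Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))

    # JPEG then WebP (re)compression through an in-memory buffer.
    for fmt, quality in (("JPEG", 80), ("WEBP", 85)):
        buf = io.BytesIO()
        img.save(buf, format=fmt, quality=quality)
        buf.seek(0)
        img = Image.open(buf).convert("RGB")
    return img
```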

Some examples of my RealPLKSR model for visualization: [six example comparison images]
