This is a fork of victorca25's BasicSR branch. Most of the documentation is there if you need any information regarding BasicSR; this README focuses specifically on the differences in this fork.
- Minimum PyTorch 1.5; PyTorch 1.6 (preview) is required if you want to use AMP for training.
- All BasicSR dependencies as documented in victorca25's branch.
- ImageMagick as the image manipulation library. Manually install the IM development library, or get it via `apt-get install libmagickwand-dev`. Then install the Python package with `pip install wand` to access IM from Python.
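A quick way to confirm Wand can see the ImageMagick library (a minimal check, not part of this repo):

```python
from wand.version import MAGICK_VERSION, VERSION

print('Wand', VERSION)   # Python binding version
print(MAGICK_VERSION)    # linked ImageMagick version string
```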
Implemented AMP (Automatic Mixed Precision), which automatically casts tensors to 16-bit floating point depending on usage. This allows the newer Volta/Turing cards to take advantage of their Tensor Cores. Testing this feature shows a speed-up of about 50% during training. You can read more about AMP at NVIDIA's developer site.
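For reference, this is the general `torch.cuda.amp` pattern that PyTorch 1.6 introduced. The tiny model and synthetic data below are stand-ins for illustration, not this fork's actual training loop:

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 3, 3, padding=1).cuda()  # stand-in for the SR network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.L1Loss()
scaler = torch.cuda.amp.GradScaler()

for _ in range(10):
    lr_batch = torch.rand(4, 3, 32, 32, device='cuda')  # fake LR tiles
    hr_batch = torch.rand(4, 3, 32, 32, device='cuda')  # fake HR tiles
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():    # eligible ops run in float16
        loss = criterion(model(lr_batch), hr_batch)
    scaler.scale(loss).backward()      # scale loss to avoid float16 underflow
    scaler.step(optimizer)
    scaler.update()
```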
These features are configured in the training `.yml` file. Because of the nature of the changes, set the training mode to LRHROTF beforehand; any other mode behaves like the original branch.
- Lower-end graphics cards with little VRAM may have difficulty resuming from a saved state. If you get an out-of-memory error when continuing a training session, set `load2CPU: true` so that the state is loaded into system RAM instead.
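An illustrative `.yml` fragment with these options (a sketch: the key placement is an assumption based on upstream BasicSR configs):

```yaml
# Hypothetical excerpt of a training .yml; exact nesting may differ.
datasets:
  train:
    mode: LRHROTF   # required for the features described in this README
load2CPU: true      # load the resume state into system RAM instead of VRAM
```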
- Random flipping, 90-degree rotation, and HR rotation are all independent of each other and can be applied together. Currently usable only with the LRHROTF mode.
- When training with no LR data source set, transformations are applied only to the HR tile, and the LR tile is auto-generated at the last step.
- If `hr_downscale: true` is set, large dataset images are randomly downscaled before being cropped to the training tile size. This also applies to the LR dataset if same-scale training is used.
- If a dataset image is smaller than the training tile size, it is automatically padded to the proper size with a random colour. This differs from the original branch, which scales the tile up and thus potentially compromises image quality.
- If `hr_rrot: true` is set, a different image-rotation function is used that does not scale up the result. This function works in conjunction with cropping, so the image tile is built directly from the dataset image.
- Tile cropping happens at the same time as rotation, and is applied to the downscaled input image if HR downscale is also used. Using all transformations in tandem results in a highly robust model with a low chance of over-fitting; the downside is that it takes longer for the model to take shape. A combined configuration example follows this list.
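A hypothetical `datasets.train` excerpt combining the options above. `use_flip` and `use_rot` follow the upstream BasicSR option names; treat the exact keys and nesting as assumptions:

```yaml
datasets:
  train:
    mode: LRHROTF       # the only mode these transformations support
    use_flip: true      # random flips (upstream BasicSR option name)
    use_rot: true       # random 90-degree rotations (upstream option name)
    hr_rrot: true       # rotation without upscaling, combined with cropping
    hr_downscale: true  # randomly downscale large images before cropping
```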
`imdither` uses ImageMagick's dither engine to create more colour-accurate ordered dithering. Unlike the default `ordereddither` noise, this produces more randomly varying levels of colour depth that may help represent the original image's colours more accurately. A noticeable trend when using `dither` to train models was that the colour contrast slowly declined over time, because the extreme colours in the generated image were mapped to less vibrant colours. Even at low colour depth, `imdither` has slightly better colour assignment. This approach emulates how the Fatality model's undithering training is done. As a bonus, it requires less processing time than the normal dithering method. By default, the higher colour depths are clamped out; you can re-enable them by increasing the colour depth in `scripts/augmentations.py` if required.
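For illustration, ordered dithering through Wand looks roughly like this. This assumes Wand ≥ 0.5.7; the threshold map and filenames are placeholders, and it is not this repo's exact augmentation code:

```python
from wand.image import Image

with Image(filename='tile.png') as img:
    # 'o4x4,4' = 4x4 ordered-dither map with 4 intensity levels per channel
    img.ordered_dither('o4x4,4')
    img.save(filename='tile_imdither.png')
```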
`imrandither` uses ImageMagick's dither engine to create mapped, scattered dithering. The main difference from the implementation in the original `dither` noise is that it allows a higher colour depth and adds Riemersma dithering in addition to Floyd-Steinberg. Take care when enabling this noise, because the result can look very similar to how some pixel art portrays detail. Use it only if you need extra denoising and blending strength.
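A comparable effect can be sketched with Wand's `posterize`, which exposes both dither types mentioned above (filenames and parameters are placeholders, not the repo's exact code):

```python
from wand.image import Image

with Image(filename='tile.png') as img:
    # dither accepts 'no', 'riemersma', or 'floyd_steinberg' in Wand
    img.posterize(levels=6, dither='riemersma')
    img.save(filename='tile_imrandither.png')
```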
`imquantize` is similar to the standard `quantize` noise, except it has better colour accuracy at higher colour depths. Use it to train your model to blend; it also helps with antialiasing of sharp diagonals.
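A rough Wand equivalent of colour quantization (a sketch with placeholder values; all five arguments are passed positionally for compatibility with older Wand releases):

```python
from wand.image import Image

with Image(filename='tile.png') as img:
    # reduce to a 64-colour palette in sRGB, no tree-depth limit, no dither
    img.quantize(64, 'srgb', 0, False, False)
    img.save(filename='tile_imquantize.png')
```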
`imkuwahara` uses ImageMagick's Kuwahara filter, which reduces fine detail in the image by simplifying regions to a general shape. This can in theory help train inpainting, though it should be used sparingly because of its tendency to create artifacts.
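A minimal sketch of the Kuwahara filter through Wand (requires Wand ≥ 0.5.5 and ImageMagick ≥ 7.0.8-41; the radius and sigma here are arbitrary examples):

```python
from wand.image import Image

with Image(filename='tile.png') as img:
    img.kuwahara(radius=2, sigma=1.5)   # flatten fine detail into regions
    img.save(filename='tile_imkuwahara.png')
```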
`imtonephoto` uses ImageMagick's dither engine to simulate the screen-angle tones used in printing. Use it if you want to train a de-toner model.
`imtonecomic` is the same as above, except the black channel is not screentoned. Use it to emulate how old comics were printed via colour separation.
`123` uses ImageMagick's RGB scale, which supposedly maintains contrast when downscaling. `420` uses ImageMagick's liquid scale, which in theory has no use whatsoever. In practice, however, it forces the model to keep certain details while blurring out all others. Use only if one needs to get high.
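A sketch of both scalers through Wand. Reading "RGB scale" as downscaling in linear RGB is an assumption, and `liquid_rescale` requires ImageMagick built with the liblqr delegate; filenames are placeholders:

```python
from wand.image import Image

with Image(filename='hr.png') as img:
    w, h = img.width // 4, img.height // 4
    with img.clone() as rgb:
        rgb.transform_colorspace('rgb')    # downscale in linear RGB...
        rgb.resize(w, h)
        rgb.transform_colorspace('srgb')   # ...then convert back to sRGB
        rgb.save(filename='lr_123.png')
    with img.clone() as liquid:
        liquid.liquid_rescale(w, h)        # seam-carving "liquid" rescale
        liquid.save(filename='lr_420.png')
```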
- Migrate to BasicSR Lite fork. DONE.
- Update PyTorch 1.6 compatibility and create the BasicSR-AMP branch. DONE.
If you have any questions, we have a Discord server where you can ask them, and a wiki with more information.
- Big thanks to victorca25 for encouraging the creation of this fork.
- Thanks to Twittman for sharing how Fatality's training tiles and screentones were generated.
- Thanks to BlueAmulet for his various bug fixes to BasicSR.
- Thanks to JoeyBallentine for his fixes as well.
- Code architecture is inspired by pytorch-cyclegan.
- Thanks to Wai Ho Kwok, who contributed to the initial version.
@InProceedings{wang2018esrgan,
author = {Wang, Xintao and Yu, Ke and Wu, Shixiang and Gu, Jinjin and Liu, Yihao and Dong, Chao and Qiao, Yu and Loy, Chen Change},
title = {ESRGAN: Enhanced super-resolution generative adversarial networks},
booktitle = {The European Conference on Computer Vision Workshops (ECCVW)},
month = {September},
year = {2018}
}
@InProceedings{wang2018sftgan,
author = {Wang, Xintao and Yu, Ke and Dong, Chao and Loy, Chen Change},
title = {Recovering realistic texture in image super-resolution by deep spatial feature transform},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}
@article{Hui-PPON-2019,
title={Progressive Perception-Oriented Network for Single Image Super-Resolution},
author={Hui, Zheng and Li, Jie and Gao, Xinbo and Wang, Xiumei},
journal={arXiv preprint arXiv:1907.10399},
year={2019}
}
@InProceedings{Liu2019abpn,
author = {Liu, Zhi-Song and Wang, Li-Wen and Li, Chu-Tak and Siu, Wan-Chi},
title = {Image Super-Resolution via Attention based Back Projection Networks},
booktitle = {IEEE International Conference on Computer Vision Workshop (ICCVW)},
month = {October},
year = {2019}
}






