Super-Resolution-CNN

The entire super-resolution process can be represented as a deep convolutional neural network. State-of-the-art super-resolution models are based on GANs. This repository contains a CNN-based implementation that learns an end-to-end mapping between low- and high-resolution images. It takes a 64x64 image as input and outputs a 128x128 image.

Dataset

The Linnaeus 5 dataset, which contains 6000 training images and 2000 test images, is used here. All images have a resolution of 256x256. For this model, the images were resized to 64x64 (the input data) and 128x128 (the ground truth for the corresponding inputs).
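A minimal sketch of how such input/ground-truth pairs could be built from one 256x256 image. The resizing method used in the notebooks is not stated here, so the block averaging below is an assumption, and the function names `downsample_by_averaging` and `make_training_pair` are hypothetical.

```python
import numpy as np


def downsample_by_averaging(image: np.ndarray, factor: int) -> np.ndarray:
    """Shrink an (H, W, C) image by averaging non-overlapping factor x factor blocks.

    Assumption: block averaging stands in for whatever resize method was actually used.
    """
    h, w, c = image.shape
    if h % factor or w % factor:
        raise ValueError("image height and width must be divisible by factor")
    # Split rows and columns into (block index, offset inside block) and average the offsets.
    blocks = image.reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))


def make_training_pair(image_256: np.ndarray):
    """Return (64x64 input, 128x128 ground truth) for one 256x256 image."""
    low_res = downsample_by_averaging(image_256, 4)   # 256 -> 64
    high_res = downsample_by_averaging(image_256, 2)  # 256 -> 128
    return low_res, high_res


# Random stand-in for one 256x256 RGB image from the dataset.
rng = np.random.default_rng(0)
image = rng.random((256, 256, 3))
x, y = make_training_pair(image)
print(x.shape, y.shape)  # (64, 64, 3) (128, 128, 3)
```

With this scheme, the 64x64 input is exactly the 128x128 ground truth downsampled once more by a factor of 2, so every pair is consistent by construction.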

Model Architecture

The CNN architecture is similar to the one described in 'Reconstructing Obfuscated Human Faces'.

Click here to view the model architecture.

Loss Function

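In the un-optimized version, mean squared error (MeanSquaredError) is used as the loss function. This corresponds to the pixel loss: the mean of the squared differences between corresponding pixels of the predicted and ground-truth images.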
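As an illustration, the pixel loss can be sketched in NumPy as a plain mean squared error over all pixels. This is not the notebook's training code; the function name `pixel_loss` is hypothetical.

```python
import numpy as np


def pixel_loss(y_true, y_pred) -> float:
    """Mean squared error over all pixels (and channels) of two same-shaped images."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    return float(np.mean((y_true - y_pred) ** 2))


# Tiny 2x2 example: squared errors are 1, 1, 1 and 9, so the mean is 12 / 4.
ground_truth = [[0, 0], [0, 0]]
prediction = [[1, 1], [1, 3]]
print(pixel_loss(ground_truth, prediction))  # 3.0
```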

In the optimized version, a linear combination of pixel loss and perceptual loss is used. Perceptual loss estimates the difference between the feature maps that a pre-trained network, such as VGGNet, produces for the predicted image and for the ground-truth image.

Here Φ denotes the activation of the 6th layer of a pre-trained VGGNet16 model.
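The idea can be sketched without a real network by passing the feature extractor in as a function. Two assumptions are made here: the loss is taken as the mean squared difference between feature maps (one common form, not confirmed as the exact one used), and `toy_features` is a hypothetical stand-in for Φ, not a VGG layer.

```python
import numpy as np


def perceptual_loss(y_true, y_pred, feature_fn) -> float:
    """Mean squared difference between feature maps of two images.

    feature_fn plays the role of Φ; in the real model it would be a
    pre-trained VGG layer, here it is any callable returning an array.
    """
    feats_true = np.asarray(feature_fn(y_true), dtype=np.float64)
    feats_pred = np.asarray(feature_fn(y_pred), dtype=np.float64)
    return float(np.mean((feats_true - feats_pred) ** 2))


def toy_features(image):
    """Hypothetical stand-in for Φ: differences between horizontally adjacent pixels."""
    image = np.asarray(image, dtype=np.float64)
    return image[:, 1:] - image[:, :-1]


flat = np.zeros((4, 4))
brighter = flat + 5.0                      # same structure, uniformly brighter
striped = np.tile([0.0, 1.0, 0.0, 1.0], (4, 1))  # different structure

print(perceptual_loss(flat, brighter, toy_features))  # 0.0
print(perceptual_loss(flat, striped, toy_features))   # 1.0
```

The toy example shows why a feature-based loss complements the pixel loss: a uniform brightness shift changes every pixel but leaves these features untouched, while a change in structure is penalised.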

To view the architecture of the custom VGG model, click here.

The final loss function is a weighted sum of the pixel loss and the perceptual loss.
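A rough sketch of such a linear combination of pixel loss and perceptual loss follows. The weights used in the optimized notebook are not given here, so `pixel_weight` and `perceptual_weight` are hypothetical parameters with placeholder defaults, and `feature_fn` is again a stand-in for Φ.

```python
import numpy as np


def _mse(a, b) -> float:
    """Mean squared difference between two same-shaped arrays."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))


def combined_loss(y_true, y_pred, feature_fn,
                  pixel_weight=1.0, perceptual_weight=1.0) -> float:
    """Weighted sum of pixel loss and perceptual loss.

    The default weights are placeholders, not values taken from the notebooks.
    """
    pixel = _mse(y_true, y_pred)
    perceptual = _mse(feature_fn(y_true), feature_fn(y_pred))
    return pixel_weight * pixel + perceptual_weight * perceptual


def double(image):
    """Hypothetical stand-in for Φ that simply doubles every value."""
    return 2.0 * np.asarray(image, dtype=np.float64)


target = np.zeros((2, 2))
output = np.ones((2, 2))
# Pixel loss is 1.0 and perceptual loss is 4.0, so the result is 1.0 * 1.0 + 0.5 * 4.0.
print(combined_loss(target, output, double, pixel_weight=1.0, perceptual_weight=0.5))  # 3.0
```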

Results

1. Un-Optimized Model

[Four example image rows, each showing: Input (64x64), Ground Truth, Predicted]

These images show that the model performs well when it comes to smoothing out curves and edges. However, the images are blurry and miss intricate details. This can be resolved by adding the perceptual loss to the pixel loss function, which forces the model to focus more on the detailed structures of the objects in the image.

2. Optimized Model

[Four example image rows, each showing: Input (64x64), Ground Truth, Predicted]

After taking the perceptual loss into account, the model performs considerably better. There is one drawback, though: the images have a checkerboard-like pattern, which is solely due to the perceptual loss. For a few images, this model also reaches a PSNR (Peak Signal-to-Noise Ratio) of around 35-36 dB.
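For reference, PSNR can be computed from the mean squared error using the standard definition, 10 * log10(MAX^2 / MSE). The sketch below is not the notebook's evaluation code; the function name `psnr` is hypothetical, and the pixel range passed as `max_value` (1.0 for images scaled to [0, 1], 255 for 8-bit images) is an assumption that must match the data.

```python
import numpy as np


def psnr(y_true, y_pred, max_value: float) -> float:
    """Peak Signal-to-Noise Ratio in dB between two same-shaped images.

    max_value is the largest possible pixel value (e.g. 1.0 or 255.0).
    """
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    mse = np.mean((y_true - y_pred) ** 2)
    if mse == 0:
        return float("inf")  # identical images: no noise at all
    return float(10.0 * np.log10(max_value ** 2 / mse))


# Images in [0, 1] that differ by 0.01 everywhere: MSE is 1e-4, giving about 40 dB.
reference = np.zeros((4, 4))
estimate = np.full((4, 4), 0.01)
print(round(psnr(reference, estimate, max_value=1.0), 2))  # 40.0
```

Higher values mean the prediction is closer to the ground truth, and identical images give an infinite PSNR.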
