
Super-Resolution-CNN

The entire super-resolution process can be represented as a deep convolutional neural network. While the state-of-the-art models for super-resolution are based on GANs, this repository contains a CNN-based implementation that learns an end-to-end mapping between low- and high-resolution images. It takes a 64x64 image as input and outputs a 128x128 image.

Dataset

The Linnaeus 5 dataset, which contains 6,000 training images and 2,000 test images, is used here. All images have a resolution of 256x256. For this model they are resized to 64x64 (which serve as the input data) and 128x128 (which serve as the ground truth for the respective images).

Model Architecture

The CNN architecture is similar to the one described in 'Reconstructing Obfuscated Human Faces'.

Click here to view the model architecture.

Loss Function

In the un-optimized version, MeanSquaredError is used as the loss function. This corresponds to the pixel loss, which is given as:
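The original equation image is missing here; a standard formulation of the per-pixel MSE loss (symbol names are assumed, with $I^{HR}$ the ground truth and $I^{SR}$ the prediction over an $H \times W$ image):

```latex
L_{\text{pixel}} = \frac{1}{HW} \sum_{i=1}^{H} \sum_{j=1}^{W} \left( I^{HR}_{i,j} - I^{SR}_{i,j} \right)^2
```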

In the optimized version, a linear combination of pixel loss and perceptual loss is used. Perceptual loss estimates the difference between the feature maps of the predicted and ground-truth images, extracted by a pre-trained network such as VGGNet. The perceptual loss is given as:
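The original equation image is missing here as well; a common formulation consistent with the description above (with $\Phi$ the feature extractor defined below, producing feature maps of shape $H' \times W' \times C'$):

```latex
L_{\text{perceptual}} = \frac{1}{H' W' C'} \left\lVert \Phi\!\left(I^{HR}\right) - \Phi\!\left(I^{SR}\right) \right\rVert_2^2
```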

Here Φ denotes the activation of the 6th layer of a pre-trained VGG16 model.
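As a hedged sketch (not the repository's actual code), the perceptual loss can be computed by freezing a pre-trained VGG16 and comparing feature maps in Keras. The layer name `block2_conv2` is an assumption standing in for the "6th layer" mentioned above:

```python
# Sketch: perceptual loss via a frozen VGG16 feature extractor.
# "block2_conv2" is an assumed stand-in for the 6th layer; the
# repository may use a different layer or custom VGG variant.
import tensorflow as tf
from tensorflow.keras.applications import VGG16

def build_feature_extractor(layer_name="block2_conv2", weights="imagenet"):
    """Return a frozen model mapping an image to the chosen VGG16 feature map."""
    vgg = VGG16(include_top=False, weights=weights, input_shape=(128, 128, 3))
    vgg.trainable = False
    return tf.keras.Model(vgg.input, vgg.get_layer(layer_name).output)

def perceptual_loss(extractor, y_true, y_pred):
    # Mean squared error between feature maps of ground truth and prediction
    return tf.reduce_mean(tf.square(extractor(y_true) - extractor(y_pred)))
```

Freezing the extractor (`trainable = False`) matters: the VGG weights serve only as a fixed measure of perceptual similarity and must not be updated during training.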

To view the architecture of custom VGG model click here

The final loss is a linear combination of the pixel loss and the perceptual loss, with a weighting coefficient on the perceptual term.
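The original equation image is missing; a plausible reconstruction, assuming λ denotes the perceptual-loss weight:

```latex
L_{\text{total}} = L_{\text{pixel}} + \lambda \, L_{\text{perceptual}}
```

Here λ balances pixel-level fidelity against perceptual detail.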

Results

1. Un-Optimized Model

(Image grid omitted: each row shows Input (64x64) | Ground Truth | Predicted)

These images clearly show that the model performs well when it comes to smoothing out curves and edges. However, the outputs are blurry and miss intricate details. This can be resolved by adding the perceptual loss to the pixel loss, which forces the model to focus more on the detailed structure of the objects in the image.

2. Optimized Model

(Image grid omitted: each row shows Input (64x64) | Ground Truth | Predicted)

After taking the perceptual loss into account, the model performs much better. There is one drawback, though: the images exhibit a checkerboard-like pattern, which is due to the perceptual loss. The model achieves a PSNR (Peak Signal-to-Noise Ratio) of around 35-36 dB on a few images.
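PSNR follows directly from the pixel-wise MSE; a minimal NumPy sketch (assuming 8-bit images with a peak value of 255, separate from any evaluation code the repository may contain):

```python
import numpy as np

def psnr(ground_truth, predicted, max_val=255.0):
    """Peak Signal-to-Noise Ratio in decibels between two images."""
    mse = np.mean((ground_truth.astype("float64") - predicted.astype("float64")) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

For intuition, an RMS error of about 4 gray levels (MSE = 16) corresponds to roughly 36 dB, in line with the figures reported above.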
