A backward-mode warp operation involves two functions: a backward coordinate map ϕ and a value estimator τ. The warp is applied pointwise: Y[p] = τ(ϕ(p), X).
Depending on whether we make them trainable, there are four options:
[Option 1] ϕ and τ are not trainable
[Option 2] ϕ is trainable and τ is not trainable
[Option 3] ϕ is not trainable and τ is trainable
[Option 4] ϕ and τ are trainable
Option 1 is the typical, classical case: ϕ is built heuristically and τ is a fixed interpolation/extrapolation scheme.
To make options 2, 3, and 4 work, we need to backpropagate three gradients: ∂L/∂Wϕ, ∂L/∂Wτ, and ∂L/∂X.
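The pointwise definition above can be sketched directly in plain Julia. This is option 1 with a fixed translation for ϕ and nearest-neighbor lookup for τ; the names `ϕ`, `τ`, and `warp_backward` are illustrative, not an existing API:

```julia
# Backward coordinate map ϕ: output pixel p ↦ source coordinate in X.
# Here it is a fixed translation by (1, 1), chosen only for illustration.
ϕ(p::Tuple{Int,Int}) = (p[1] + 1, p[2] + 1)

# Value estimator τ: nearest-neighbor lookup with zero fill outside X.
function τ(q, X)
    i, j = round(Int, q[1]), round(Int, q[2])
    checkbounds(Bool, X, i, j) ? X[i, j] : zero(eltype(X))
end

# The warp is applied pointwise: Y[p] = τ(ϕ(p), X).
warp_backward(X) = [τ(ϕ((i, j)), X) for i in axes(X, 1), j in axes(X, 2)]

X = reshape(1.0:16.0, 4, 4)
Y = warp_backward(X)
```

Swapping in a different ϕ or τ leaves the pointwise structure unchanged, which is what makes each piece independently replaceable by a trainable function.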
Applications
These are just simple, naive ideas; there are more possibilities, and more issues will surface that need to be solved.
Registration and mapping
This is a direct application of option 2.
Idea: instead of using a predefined transformation ϕ, we can train a small network to do the job.
Issues to identify and address:
Real distortions are usually non-linear and elastic, so a single linear transformation can't do the job. How do we design the forward pass of ϕ so that the model can fit real-world images?
How do we train it on batches?
How many parameters do we need, and what is the performance in terms of both loss and running time?
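As a minimal sketch of how gradients reach ϕ's parameters, here is a pipeline with a learnable affine ϕ (a stand-in for the small network) and a bilinear τ, which keeps the output differentiable in the coordinates. The finite-difference gradient stands in for AD, and all names here are illustrative assumptions, not project code:

```julia
# τ: bilinear interpolation with zero fill outside X.
function τ_bilinear(q, X)
    i0, j0 = floor(Int, q[1]), floor(Int, q[2])
    di, dj = q[1] - i0, q[2] - j0
    val = 0.0
    for (i, wi) in ((i0, 1 - di), (i0 + 1, di)), (j, wj) in ((j0, 1 - dj), (j0 + 1, dj))
        val += checkbounds(Bool, X, i, j) ? wi * wj * X[i, j] : 0.0
    end
    return val
end

# Trainable ϕ: a learnable affine map; W and b play the role of Wϕ.
ϕ_affine(p, W, b) = (W[1,1]*p[1] + W[1,2]*p[2] + b[1],
                     W[2,1]*p[1] + W[2,2]*p[2] + b[2])

warp(X, W, b) = [τ_bilinear(ϕ_affine((i, j), W, b), X) for i in axes(X, 1), j in axes(X, 2)]

loss(X, Yref, W, b) = sum(abs2, warp(X, W, b) .- Yref)

# ∂L/∂b via central finite differences (an AD stand-in for this sketch).
function grad_b(X, Yref, W, b; h = 1e-6)
    g = zeros(2)
    for k in 1:2
        bp = collect(b); bp[k] += h
        bm = collect(b); bm[k] -= h
        g[k] = (loss(X, Yref, W, bp) - loss(X, Yref, W, bm)) / (2h)
    end
    return g
end

X = Float64[1 2; 3 4]
Yref = warp(X, [1.0 0.0; 0.0 1.0], [0.0, 0.0])   # identity warp as target
g = grad_b(X, Yref, [1.0 0.0; 0.0 1.0], [0.3, 0.0])
```

Because τ is bilinear, the loss is (piecewise) smooth in the coordinates, so the gradient of a mis-set translation points back toward the target; a small coordinate network would slot in exactly where `ϕ_affine` sits.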
Super resolution
This is a direct application of option 3.
Idea: instead of using a fixed-weight interpolation function τ, we can train a very small network to do the job.
Issues to identify and address:
Performance: unlike a typical CNN, the interpolation network τ requires length(Y) forward passes to generate the entire image, which is enormous.
How do we train it on batches?
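A hedged sketch of the option 3 shape: τ becomes a tiny MLP over the 2×2 neighborhood around ϕ(p) plus the fractional offsets, and ϕ is a fixed 2× upscaling map. The struct, feature choice, and weights below are all illustrative assumptions, not a fixed design; note how the comprehension makes the length(Y)-forward-passes cost explicit:

```julia
# Tiny trainable τ: one hidden layer over 6 features
# (4 neighboring pixels + 2 fractional offsets). Weights are Wτ.
struct TinyTau
    W1::Matrix{Float64}  # hidden × 6
    b1::Vector{Float64}
    W2::Matrix{Float64}  # 1 × hidden
    b2::Vector{Float64}
end

relu(x) = max(x, 0.0)

function (τ::TinyTau)(q, X)
    i0, j0 = floor(Int, q[1]), floor(Int, q[2])
    pix(i, j) = checkbounds(Bool, X, i, j) ? X[i, j] : 0.0
    feats = [pix(i0, j0), pix(i0 + 1, j0), pix(i0, j0 + 1), pix(i0 + 1, j0 + 1),
             q[1] - i0, q[2] - j0]
    h = relu.(τ.W1 * feats .+ τ.b1)
    return (τ.W2 * h .+ τ.b2)[1]
end

# Fixed ϕ for 2× super resolution: output pixel ↦ source coordinate.
ϕ_sr(p) = ((p[1] + 1) / 2, (p[2] + 1) / 2)

# The performance issue, made concrete: one τ forward pass per output
# pixel, i.e. length(Y) passes in total.
upscale(X, τ) = [τ(ϕ_sr((i, j)), X) for i in 1:2*size(X, 1), j in 1:2*size(X, 2)]

τθ = TinyTau(0.1 .* ones(4, 6), zeros(4), 0.25 .* ones(1, 4), zeros(1))
Y = upscale(Float64[1 2; 3 4], τθ)
```

Batching over many query points at once (stacking the feature vectors into a matrix) is the obvious first mitigation for the per-pixel cost, and ties directly into the "train on batches" question above.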
Roadmap
This roadmap is organized very coarsely; how we approach each task belongs in other, more specific issues.
Framework support:
Add AD support for warp in our playground DiffImages. This is a must before we approach concrete applications.
Move the associated adjoints to upstream packages: ImageCore, ImageTransformations, Interpolations, and others. This is for maintenance purposes, so we can leave it as post-cleanup work.
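To make the ∂L/∂X adjoint concrete, here is a hand-rolled forward/pullback pair for a nearest-neighbor warp: the pullback scatters each output cotangent back to the source pixel it read. This shows only the data flow a real `rrule` would encode; `ϕ_shift` and `warp_nn` are illustrative names, not an actual package API:

```julia
ϕ_shift(p) = (p[1] + 1, p[2])   # an illustrative fixed coordinate map

function warp_nn(X)
    Y = zeros(size(X))
    src = Dict{Tuple{Int,Int},Tuple{Int,Int}}()  # output pixel ↦ source pixel read
    for j in axes(X, 2), i in axes(X, 1)
        q = ϕ_shift((i, j))
        si, sj = round(Int, q[1]), round(Int, q[2])
        if checkbounds(Bool, X, si, sj)
            Y[i, j] = X[si, sj]
            src[(i, j)] = (si, sj)
        end
    end
    # Pullback: accumulate ∂L/∂X from ∂L/∂Y by reversing the gather.
    pullback(Ȳ) = begin
        X̄ = zeros(size(X))
        for ((i, j), (si, sj)) in src
            X̄[si, sj] += Ȳ[i, j]
        end
        X̄
    end
    return Y, pullback
end

X = Float64[1 2; 3 4]
Y, back = warp_nn(X)
X̄ = back(ones(size(Y)))
```

The accumulation (`+=`) matters: several output pixels can read the same source pixel, and their cotangents must sum.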
Applications:
image registration, or a similar project, to apply our option 2 idea
super resolution, or a similar project, to apply our option 3 idea
We should treat our applications as separate projects, so every application idea should be managed in its own repo with an associated Project.toml and Manifest.toml.
Timeline and evaluation
Our progress is currently delayed quite a bit due to unfamiliarity with AD and warp mechanisms. It's hard to set a strict timeline for research projects, so this is a loose one for reference purposes:
Make our warp AD work before Aug 7. This is the main focus of the mid-term evaluation.
Build one small demo for each application idea before the final evaluation.
Making it truly competitive (i.e., beating the existing heavy CNN-based models) is left to future work; that is out of scope for this JSoC project evaluation.
johnnychen94 changed the title from "JSoC 2021: warp AD and its application" to "JSoC 2021: warp AD and its application -- the roadmap" on Jul 28, 2021.
Abstract: build a trainable warp pipeline and apply it to image processing and computer vision.
Author: @SomTambe
Mentors: @johnnychen94 and @DhairyaLGandhi