This repository contains the implementation of the Active Shift Layer (ASL).
Please see the paper Constructing Fast Network through Deconstruction of Convolution.
The paper was accepted to NIPS 2018 as a spotlight (slides, poster).
The code is based on Caffe.
A TensorFlow implementation is also available at ASL-TF.
A naive spatial convolution can be deconstructed into a shift layer followed by a 1x1 convolution.
This figure shows the basic concept of the deconstruction.
For an efficient shift, we propose the active shift layer:
- Uses a depthwise shift
- Introduces new shift parameters for each channel
- The new shift parameters (alpha, beta) are learnable
ASL has two parameters per channel: the shift amounts (alpha, beta).
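The deconstruction can be sketched in NumPy. This is a conceptual reference only, not the Caffe implementation: the zero padding at the borders and the shift direction (output sampling the input at `i + alpha`, `j + beta`) are assumptions for illustration. A fractional shift is realized as a bilinear blend of the four neighboring integer shifts, which is what makes (alpha, beta) learnable by gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

def shift_int(plane, di, dj):
    """Integer shift: out[i, j] = plane[i + di, j + dj], zero padded."""
    H, W = plane.shape
    out = np.zeros_like(plane)
    out[max(0, -di):H - max(0, di), max(0, -dj):W - max(0, dj)] = \
        plane[max(0, di):H - max(0, -di), max(0, dj):W - max(0, -dj)]
    return out

def active_shift(x, alpha, beta):
    """Depthwise shift of x (C, H, W) by fractional (alpha[c], beta[c]),
    computed as a bilinear blend of the four nearest integer shifts."""
    C, H, W = x.shape
    out = np.empty_like(x)
    for c in range(C):
        ia, fa = int(np.floor(alpha[c])), alpha[c] - np.floor(alpha[c])
        ib, fb = int(np.floor(beta[c])), beta[c] - np.floor(beta[c])
        out[c] = ((1 - fa) * (1 - fb) * shift_int(x[c], ia, ib)
                  + (1 - fa) * fb * shift_int(x[c], ia, ib + 1)
                  + fa * (1 - fb) * shift_int(x[c], ia + 1, ib)
                  + fa * fb * shift_int(x[c], ia + 1, ib + 1))
    return out

def pointwise_conv(x, w):
    """1x1 convolution: a per-pixel linear map across channels."""
    return np.einsum('oc,chw->ohw', w, x)

# shift + 1x1 conv together replace a KxK spatial convolution
x = rng.standard_normal((8, 16, 16))
alpha = rng.uniform(-1, 1, 8)    # learnable vertical shifts, one per channel
beta = rng.uniform(-1, 1, 8)     # learnable horizontal shifts, one per channel
w = rng.standard_normal((4, 8))  # 1x1 conv weights (4 output, 8 input channels)

y = pointwise_conv(active_shift(x, alpha, beta), w)
print(y.shape)  # (4, 16, 16)
```

Because the shift is depthwise and the mixing is a 1x1 convolution, the parameter and FLOP cost of the KxK spatial kernel is replaced by just two scalars per channel plus the pointwise weights.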
You can control the hyper-parameters of ASL through asl_param. Please see caffe.proto for details.
Here is a usage example. Please refer to the CIFAR10 prototxt for more details.
```
layer {
  name: "shift0"
  type: "ActiveShift"
  bottom: "conv0"
  top: "shift0"
  param {
    lr_mult: 0.001
    decay_mult: 0.0
  }
  param {
    lr_mult: 0.001
    decay_mult: 0.0
  }
  asl_param {
    normalize: true
  }
}
```
You can validate the backpropagation with the provided test code. Because the shift operation is not differentiable at lattice (integer) points, do not use integer shift positions when testing. To run the tests:

- Define the TEST_ASHIFT_ENV macro in active_shift_layer.hpp
- `make test`
- `./build/test/test_active_shift_layer.testbin`

All tests should pass. Afterwards, don't forget to undefine the TEST_ASHIFT_ENV macro and rebuild with make.
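The reason integer positions break the gradient check can be shown with a self-contained sketch (illustrative only, independent of the Caffe test code): linear interpolation has a kink at each lattice point, so a central finite difference there matches neither one-sided slope.

```python
import numpy as np

def sample(signal, pos):
    """Linearly interpolated read of a 1-D signal at fractional pos."""
    i = int(np.floor(pos))
    f = pos - i
    return (1 - f) * signal[i] + f * signal[i + 1]

def num_grad(signal, pos, eps=1e-5):
    """Central finite difference w.r.t. the shift position."""
    return (sample(signal, pos + eps) - sample(signal, pos - eps)) / (2 * eps)

sig = np.array([0.0, 1.0, 4.0, 9.0])

# Away from lattice points the analytic gradient is signal[i+1] - signal[i],
# and the numeric check agrees with it:
print(num_grad(sig, 1.3), sig[2] - sig[1])  # both 3.0

# Exactly at pos = 1.0 the left slope is 1.0 and the right slope is 3.0;
# the central difference lands between them and matches neither,
# so a gradient check at an integer position spuriously fails.
print(num_grad(sig, 1.0))  # 2.0
```

This is why the tests must use non-integer shift positions.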
You can download the trained ImageNet model here.
- Update Readme
- Upload trained ImageNet model
- Upload CIFAR10 prototxt