This repository contains the code for training and testing Stochastic Quantization, as described in the paper "Learning Accurate Low-bit Deep Neural Networks with Stochastic Quantization" (BMVC 2017, Oral).
Our code is built on the Caffe framework and can be used to train BWN (Binary-Weight Networks), TWN (Ternary Weight Networks), SQ-BWN, and SQ-TWN.
Please follow the standard installation of Caffe:

```bash
cd caffe/
make
cd ..
```
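Note that a standard Caffe build expects a `Makefile.config`; if one is not already provided, the usual upstream workflow (not specific to this repository) is to copy and adapt the example:

```bash
cp Makefile.config.example Makefile.config   # inside caffe/; adjust CUDA, BLAS, and cuDNN paths for your system
```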
For CIFAR-10 and CIFAR-100, we provide two network architectures, VGG-9 and ResNet-56 (see the paper for details). For example, use the following commands to train ResNet-56 (FWN denotes the full-precision baseline):
- FWN: `./CIFAR/ResNet-56/FWN/train.sh`
- BWN: `./CIFAR/ResNet-56/BWN/train.sh`
- TWN: `./CIFAR/ResNet-56/TWN/train.sh`
- SQ-BWN: `./CIFAR/ResNet-56/SQ-BWN/train.sh`
- SQ-TWN: `./CIFAR/ResNet-56/SQ-TWN/train.sh`
For ImageNet, we provide the AlexNet-BN and ResNet-18 network architectures. For example, use the following commands to train ResNet-18:
- FWN: `./ImageNet/ResNet-18/FWN/train.sh`
- BWN: `./ImageNet/ResNet-18/BWN/train.sh`
- TWN: `./ImageNet/ResNet-18/TWN/train.sh`
- SQ-BWN: `./ImageNet/ResNet-18/SQ-BWN/train.sh`
- SQ-TWN: `./ImageNet/ResNet-18/SQ-TWN/train.sh`
We add `BinaryConvolution`, `BinaryInnerProduct`, `TernaryConvolution`, and `TernaryInnerProduct` layers to train binary and ternary networks. We also collect useful functions for low-bit DNNs in `lowbit-functions`.
We add two parameters to `convolution_param` and `inner_product_param`: `sq` and `ratio`. `sq` indicates whether to use stochastic quantization (default: `false`), and `ratio` is the SQ ratio (default: 100, i.e., full quantization).
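As a sketch, a binary convolution layer with stochastic quantization enabled might be declared in a network prototxt as follows. The layer name, blob names, and the `ratio` value of 50 are hypothetical illustrations; the `BinaryConvolution` type and the `sq` and `ratio` fields are the ones described above:

```
layer {
  name: "conv1"               # hypothetical layer name
  type: "BinaryConvolution"   # custom layer from this repository
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 64            # standard Caffe convolution fields
    kernel_size: 3
    pad: 1
    stride: 1
    sq: true                  # enable stochastic quantization
    ratio: 50                 # SQ ratio; 50 is an illustrative value
  }
}
```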
Our code currently runs only on GPU; a CPU implementation is left as future work.
Have fun deploying your own low-bit DNNs!