Ben Graham edited this page May 26, 2015 · 10 revisions

SparseConvNet

A spatially-sparse convolutional neural network

Ben Graham, University of Warwick, 2013-2015, GPLv3

SparseConvNet is a convolutional neural network for processing sparse data on a variety of lattices, including
(i) the square lattice, (ii) the triangular lattice, (iii) the cubic lattice, (iv) the tetrahedral lattice,
and the hyper-cubic and hyper-tetrahedral 4D lattices.
[Figure: lattice]

Data is sparse if most sites take the value zero. For example, if a loop of string has a knot in it, and you trace the shape of the string in a 3D lattice, most sites will not form part of the knot (left). After applying a 2x2x2 convolution (middle) and a pooling operation (right), the set of non-zero sites stays fairly small.
[Figure: lattice]
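To make the idea of tracking non-zero sites concrete, here is a minimal Python sketch. It is an illustration only, not SparseConvNet's own implementation: the function names `conv_active_sites` and `pool_active_sites`, the 16x16x16 grid and the straight-line "string" are all assumptions made for this example. It only records which sites can be non-zero after a "valid" 2x2x2 convolution and a non-overlapping 2x2x2 pooling step; it does not compute any feature values.

```python
from itertools import product


def conv_active_sites(active, size, filter_size=2):
    """Return (active output sites, output grid side) for a 'valid' convolution.

    An output site is active if at least one input site under the filter
    window is active. `active` is a set of integer coordinate tuples.
    """
    out_size = size - filter_size + 1
    out = set()
    for site in active:
        # Every filter window covering `site` starts at site - offset.
        for offset in product(range(filter_size), repeat=len(site)):
            target = tuple(s - o for s, o in zip(site, offset))
            if all(0 <= t < out_size for t in target):
                out.add(target)
    return out, out_size


def pool_active_sites(active, size, pool_size=2):
    """Return (active output sites, output grid side) for non-overlapping pooling."""
    out = {tuple(s // pool_size for s in site) for site in active}
    out_size = -(-size // pool_size)  # ceiling division keeps a partial last cell
    return out, out_size


if __name__ == "__main__":
    size = 16
    # Hypothetical input: a straight piece of "string" through a 16x16x16 grid.
    string_sites = {(i, 8, 8) for i in range(size)}

    conv_sites, conv_size = conv_active_sites(string_sites, size)
    pooled_sites, pooled_size = pool_active_sites(conv_sites, conv_size)

    for name, sites, side in [
        ("input", string_sites, size),
        ("after 2x2x2 convolution", conv_sites, conv_size),
        ("after 2x2x2 pooling", pooled_sites, pooled_size),
    ]:
        print(f"{name}: {len(sites)} active of {side ** 3} sites")
```

In this toy setting the active set grows slightly under convolution and shrinks again under pooling, but at every stage it remains a small fraction of the full grid, which is the property a sparse network can exploit.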

This can be used to analyse 3D models, or space-time paths. Here are some examples from a 3D object dataset. The insides are hollow, so the data is fairly sparse. The computational complexity of processing the models is related to the fractal dimension of the underlying objects.

[Figure: lattice] Top row: four exemplars of snakes. Bottom row: an ant, an elephant, a robot and a tortoise.

References

  1. Spatially-sparse convolutional neural networks
  2. Sparse 3D convolutional neural networks
  3. ICDAR 2013 Chinese Handwriting Recognition Competition (In Task 3, a sparse CNN won with a test error of 2.61%. Human performance on the test set was 4.81%.)
  4. Kaggle CIFAR-10 competition (Training a sparse CNN with affine and color-space distortions produced a Kaggle CIFAR-10 test error of 4.47%.)
  5. Fractional max-pooling (Using fractional max-pooling, the error rate for CIFAR-10 is reduced to 3.47%. Human performance on the CIFAR-10 test set is approximately 6%.)