
Python net specification #2086

Merged 3 commits into BVLC:master from python-net-spec on Jun 30, 2015

Conversation

@longjon (Contributor) commented Mar 10, 2015

master edition of #1733. Still rough, but should be usable by the intrepid.

  • Now including an AlexNet (CaffeNet variant) generation example.
  • Fixed an error in uncamel which broke acronym names (e.g., LRN) (thanks @sontran).
  • Now supports fillers (thanks @sontran).

longjon added a commit to longjon/caffe that referenced this pull request Mar 10, 2015
@longjon (Contributor, Author) commented Mar 17, 2015

This should now support repeated Messages as lists of dicts (like param or dummy_data_param's shape; @erictzeng, you asked about this earlier).

I think that means you can now specify any NetParameter as Python code. Once layer naming has been cleaned up a bit (for which I have an idea in mind), I think this will have reached mergeability as a thin wrapper around prototxt.
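For example (a sketch only, using the imports that appear later in this thread; the layer choice and numbers are arbitrary), a repeated message field such as param becomes a list of dicts, each dict filling one ParamSpec, and dummy_data_param's repeated shape works the same way:

from caffe import layers as L, to_proto

# repeated `shape` given as a list of dicts, one per BlobShape
data = L.DummyData(dummy_data_param=dict(shape=[dict(dim=[10, 3, 28, 28])]))
# repeated `param`: one ParamSpec for the weights, one for the bias
conv = L.Convolution(data, kernel_size=3, num_output=16,
                     param=[dict(lr_mult=1), dict(lr_mult=2)])
print(to_proto(conv))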

weiliu89 added a commit to weiliu89/caffe that referenced this pull request Apr 1, 2015
@muupan commented Apr 5, 2015

uncamel('HDF5Data') wrongly returns 'hd_f5_data'.

@muupan commented Apr 5, 2015

And uncamel('PReLU') wrongly returns 'p_relu'.

I think 'HDF5Data' -> 'hdf5_data' is normal uncamelling, but 'PReLU' -> 'prelu' is an exception. Maybe we have to find some other way.

@seanbell

@muupan The layers that break the rule could just be hardcoded in the "uncamel" function, e.g. with a dictionary. Something like this:

import re

_UNCAMEL_EXCEPTIONS = {
    'HDF5Data': 'hdf5_data',
    'PReLU': 'prelu',
}

def uncamel(s):
    """Convert CamelCase to underscore_case, with hardcoded exceptions."""
    return _UNCAMEL_EXCEPTIONS.get(
        s, re.sub('(?!^)([A-Z])(?=[^A-Z])', r'_\1', s).lower())
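For example (a quick doctest-style check of the sketch above; the expected outputs follow directly from the code as written):

>>> uncamel('HDF5Data')
'hdf5_data'
>>> uncamel('PReLU')
'prelu'
>>> uncamel('SoftmaxWithLoss')
'softmax_with_loss'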

@Shaunakde

@longjon I am having an issue using this PR. Running the example: http://nbviewer.ipython.org/github/BVLC/caffe/blob/tutorial/examples/01-learning-lenet.ipynb gives me the following error:

 File "/home/shaunak/caffe-pr2086/examples/wine/classify.py", line 18, in lenet
    n = caffe.NetSpec()

  File "../../python/caffe/layers.py", line 84, in __init__
    super(NetSpec, self).__setattr__('tops', OrderedDict())

TypeError: must be type, not None

Update

I tried adding an import statement for caffe as well and the following happened:

import numpy as np
import matplotlib.pyplot as plt

# Make sure that caffe is on the python path:
caffe_root = '../../'  # this file is expected to be in {caffe_root}/examples/wine
import sys
sys.path.insert(0, caffe_root + 'python')

from pylab import *

import caffe 

from caffe import layers as L
from caffe import params as P

def logreg(hdf5, batch_size):
    # logistic regression: data, matrix multiplication, and 2-class softmax loss
    n = caffe.NetSpec()
    n.data, n.label = L.HDF5Data(batch_size=batch_size, source=hdf5, ntop=2)
    n.ip1 = L.InnerProduct(n.data, num_output=2, weight_filler=dict(type='xavier'))
    n.accuracy = L.Accuracy(n.ip1, n.label)
    n.loss = L.SoftmaxWithLoss(n.ip1, n.label)
    return n.to_proto()

with open('../../examples/hdf5_classification/logreg_auto_train.prototxt', 'w') as f:
    f.write(str(logreg('examples/hdf5_classification/data/train.txt', 10)))

with open('../../examples/hdf5_classification/logreg_auto_test.prototxt', 'w') as f:
    f.write(str(logreg('examples/hdf5_classification/data/test.txt', 10)))

causes this error:

runfile('/home/shaunak/caffe-pr2086/examples/wine/classify.py', wdir='/home/shaunak/caffe-pr2086/examples/wine')
Reloaded modules: caffe, caffe.proto, caffe._caffe, caffe.pycaffe, caffe.detector, caffe.proto.caffe_pb2, caffe.io, caffe.classifier, caffe.layers
Traceback (most recent call last):

  File "<ipython-input-9-694741de221d>", line 1, in <module>
    runfile('/home/shaunak/caffe-pr2086/examples/wine/classify.py', wdir='/home/shaunak/caffe-pr2086/examples/wine')

  File "/home/shaunak/anaconda/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 682, in runfile
    execfile(filename, namespace)

  File "/home/shaunak/anaconda/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 78, in execfile
    builtins.execfile(filename, *where)

  File "/home/shaunak/caffe-pr2086/examples/wine/classify.py", line 26, in <module>
    f.write(str(logreg('examples/hdf5_classification/data/train.txt', 10)))

  File "/home/shaunak/caffe-pr2086/examples/wine/classify.py", line 23, in logreg
    return n.to_proto()

  File "../../python/caffe/layers.py", line 97, in to_proto
    top.fn._to_proto(layers, names, autonames)

  File "../../python/caffe/layers.py", line 78, in _to_proto
    assign_proto(layer, k, v)

  File "../../python/caffe/layers.py", line 25, in assign_proto
    setattr(proto, name, val)

AttributeError: 'LayerParameter' object has no attribute 'source'

Discussion: http://stackoverflow.com/questions/29774793/typeerror-python-class

elleryrussell pushed a commit to elleryrussell/caffe that referenced this pull request May 1, 2015
@escorciav

Hi @Shaunakde, you saved me hours of reading code, so I felt that I should help you (sorry if you already noticed this). You should apply seanbell's suggestion to your example. I guess the reason is that HDF5Data is one of the tricky CamelCase layer names.

Thank you @longjon for this tool. I have spent hours debugging prototxt files without noticing minor differences such as layers instead of layer.
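As an illustration (a sketch only, not tested against this revision of the PR): besides patching uncamel, the parameter message can also be named explicitly, following the transform_param=dict(...) pattern used elsewhere in this thread, which sidesteps the name-derivation problem entirely:

# name the parameter message explicitly instead of relying on uncamel
n.data, n.label = L.HDF5Data(ntop=2,
                             hdf5_data_param=dict(source=hdf5, batch_size=batch_size))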

@@ -0,0 +1,54 @@
from caffe import layers as L, params as P, to_proto

I was looking over this and from what I can tell the to_proto function from PR #1733 has moved
into NetSpec here.

In this case NetSpec should be imported and all layer variables should be assigned to it. This would then also print the layers in their correct order.

The alexnet function would look as follows:

def alexnet(lmdb, batch_size=256, include_acc=False):
    # conv_relu, fc_relu, and max_pool are the helpers defined in this caffenet example file
    n = NetSpec()

    n.data, n.label = L.Data(source=lmdb, backend=P.Data.LMDB, batch_size=batch_size, ntop=2,
        transform_param=dict(crop_size=227, mean_value=[104, 117, 123], mirror=True))

    # the net itself
    n.conv1, n.relu1 = conv_relu(n.data, 11, 96, stride=4)
    n.pool1 = max_pool(n.relu1, 3, stride=2)
    n.norm1 = L.LRN(n.pool1, local_size=5, alpha=1e-4, beta=0.75)
    n.conv2, n.relu2 = conv_relu(n.norm1, 5, 256, pad=2, group=2)
    n.pool2 = max_pool(n.relu2, 3, stride=2)
    n.norm2 = L.LRN(n.pool2, local_size=5, alpha=1e-4, beta=0.75)
    n.conv3, n.relu3 = conv_relu(n.norm2, 3, 384, pad=1)
    n.conv4, n.relu4 = conv_relu(n.relu3, 3, 384, pad=1, group=2)
    n.conv5, n.relu5 = conv_relu(n.relu4, 3, 256, pad=1, group=2)
    n.pool5 = max_pool(n.relu5, 3, stride=2)
    n.fc6, n.relu6 = fc_relu(n.pool5, 4096)
    n.drop6 = L.Dropout(n.relu6, in_place=True)
    n.fc7, n.relu7 = fc_relu(n.drop6, 4096)
    n.drop7 = L.Dropout(n.relu7, in_place=True)
    n.fc8 = L.InnerProduct(n.drop7, num_output=1000)
    n.loss = L.SoftmaxWithLoss(n.fc8, n.label)

    if include_acc:
        n.acc = L.Accuracy(n.fc8, n.label)

    return n.to_proto()

BR, Max

@BlGene (Contributor) commented Jun 15, 2015

Hi,

I've been using this PR for a few days and I have been able to write all the models I wanted with it. I am very pleased with it, and thank you for writing it. 👍 @longjon

The PR mentions that it supports fillers, but I don't see how these can be accessed from Python; was this omitted from the PR?

I was wondering if there is an easy way to specify parameters for Python layers; a few ideas came to mind:

  1. Just generating the python layer code with the variables in place.
  2. Extending PythonParameter in order to smuggle a dict through to the python layer:

message PythonParameter {
  optional string module = 1;
  optional string layer = 2;
  repeated string param_keys = 3;
  repeated string param_values = 4;
}

  3. Extending PythonParameter with param_string, so people can put pickled parameters in it or do whatever they want (see the sketch below).

So for this I was wondering if there is an easier way.

Also, in case it was overlooked, it's worthwhile looking at the Theano version of this, called Mariana.
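To illustrate idea 3, a minimal sketch: it assumes a param_string field were added to PythonParameter and exposed to the layer instance (here as a hypothetical self.param_str attribute), neither of which this PR provides. A JSON (or pickled) string would keep the protobuf change down to a single optional field.

import json
import caffe

class ScaleLayer(caffe.Layer):
    """Hypothetical Python layer that reads its options from the smuggled string."""
    def setup(self, bottom, top):
        # e.g. python_param { module: 'scale_layer' layer: 'ScaleLayer'
        #                     param_string: '{"scale": 0.5}' }
        opts = json.loads(self.param_str) if self.param_str else {}
        self.scale = opts.get('scale', 1.0)

    def reshape(self, bottom, top):
        top[0].reshape(*bottom[0].data.shape)

    def forward(self, bottom, top):
        top[0].data[...] = self.scale * bottom[0].data

    def backward(self, top, propagate_down, bottom):
        if propagate_down[0]:
            bottom[0].diff[...] = self.scale * top[0].diff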

@bhack (Contributor) commented Jun 15, 2015

See also Lasagne on Theano.

@BlGene BlGene mentioned this pull request Jun 15, 2015
@longjon (Contributor, Author) commented Jun 16, 2015

Getting ready to update this; here's a list of TODOs:

Cleanups and fixes before first mergeable version, which should be considered basically a "protobuf wrapper/generator":

  • Try a better solution to the implicit layer <-> parameter correspondence (currently the messy uncamel function); the fact that the current protobuf system leaves this unspecified means there is no clean answer, but I have an idea for a hack that's a little more robust.
  • Finish the transition to the NetSpec class (basically a namespace to avoid the previous atrocious hack of abusing locals); this was left as a WIP commit.
  • Minimal tests.
  • Minimal docs/examples (as begun).

Desiderata and future work to come after merge (PRs welcome!):

  • Operator overloading for arithmetic.
  • Update in accord with the solution to "Treat bottoms and params more uniformly?" (#1474).
  • Make Python layer integration more natural (possibly also by updating the way Python layers are specified in protobuf, in accord with "Switch LayerType to string for extensibility", #1685).
  • Provide a direct path from spec -> Net without serializing through a file.
  • Check things that can be checked at specification time (e.g., layer name validity, unused tops).
  • Support silence layers.

@longjon longjon force-pushed the python-net-spec branch 3 times, most recently from 9409178 to 1d9546e Compare June 18, 2015 21:50
@longjon (Contributor, Author) commented Jun 18, 2015

TODOs all done for now; marking this ready for review!

@muupan @seanbell and others, uncamel is gone; the layer-parameter correspondence is now determined through inspection of the caffe_pb2 module. The only assumption is that the parameter type of a layer named X is XParameter, which is true for all existing layers and should remain true.

@Shaunakde, note that we generally aren't able to provide support for PRs, especially in-progress ones. You're welcome to contribute to the development discussion, but otherwise please use caffe-users or other venues.

@BlGene, see the tests for an example of specifying fillers. Parameters for Python layers are a different (though related) issue; see, e.g., #2001.
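For reference, the inspection can be done roughly like this (a sketch of the idea, not necessarily the exact code merged here): enumerate the *_param fields of LayerParameter, read off each field's message type, and strip the 'Parameter' suffix to recover the layer type name.

from caffe.proto import caffe_pb2

def param_name_dict():
    """Map layer type names (e.g. 'Convolution') to parameter field names (e.g. 'convolution_param')."""
    layer = caffe_pb2.LayerParameter()
    # all fields of LayerParameter that hold a layer-specific parameter message
    param_fields = [f.name for f in layer.DESCRIPTOR.fields if f.name.endswith('_param')]
    # the message type of each such field, e.g. 'ConvolutionParameter'
    type_names = [type(getattr(layer, name)).__name__ for name in param_fields]
    # assume the parameter type of a layer named X is XParameter
    return {t[:-len('Parameter')]: name for t, name in zip(type_names, param_fields)}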

def max_pool(bottom, ks, stride=1):
    return L.Pooling(bottom, pool=P.Pooling.MAX, kernel_size=ks, stride=stride)

def alexnet(lmdb, batch_size=256, include_acc=False):

Trivial, but this should be called caffenet since it has our usual inversion.

@shelhamer (Member)

@longjon amended names and the (AttributeError, KeyError) exception handling and pushed. Merging.

shelhamer added a commit that referenced this pull request Jun 30, 2015
@shelhamer shelhamer merged commit 1d6cac2 into BVLC:master Jun 30, 2015
@kjmonaghan

Very helpful, thank you!

How do I go about changing the decay_mult parameter?

@jeffdonahue (Contributor)

e.g., L.Convolution(data, kernel_size=5, num_output=20, param=[dict(decay_mult=0.5)]). Any proto-generated object (in this case a ParamSpec) can be specified using a dict. The brackets [] are needed to make a list of them, since param is a repeated field (to support multiple parameters). You could also explicitly use a ParamSpec object (rather than a dict that gets converted into one) by doing:

from caffe.proto import caffe_pb2

filter_spec = caffe_pb2.ParamSpec()
filter_spec.decay_mult = 0.5
L.Convolution(data, kernel_size=5, num_output=20, param=[filter_spec])

(untested, could be slightly wrong)

twerdster pushed a commit to twerdster/caffe that referenced this pull request Jul 19, 2015
@jmerkow commented Jul 30, 2015

Is it possible to specify dummy data with this?
I've tried about 1000 things to get it to work and can't seem to get it...

Based on caffenet.py, I would expect that it would be something like this:

from __future__ import print_function
import os

import caffe
from caffe import layers as L, params as P, to_proto
from caffe.proto import caffe_pb2

def gen_inputs():
    data = L.DummyData(name="data",ntop=1,shape=dict(dim=[1,2,3]))
    label = L.DummyData(name="label",ntop=1,shape=dict(dim=[1,2,3]))
    return data,label

def caffenet(include_acc=False):
    data,label=gen_inputs()

    loss = L.SoftmaxWithLoss(data, label)
    return to_proto(loss)

def make_net(output_dir='./',net_name='train'):
    fname = os.path.join(output_dir,net_name+'.prototxt')
    with open(fname, 'w') as f:
        print(caffenet(), file=f)

make_net()

this produces:

layer {
  name: "data"
  type: "DummyData"
  top: "DummyData1"
}
layer {
  name: "label"
  type: "DummyData"
  top: "DummyData2"
}
layer {
  name: "SoftmaxWithLoss1"
  type: "SoftmaxWithLoss"
  bottom: "DummyData1"
  bottom: "DummyData2"
  top: "SoftmaxWithLoss1"
}

I've tried various other things using caffe_pb2 and caffe.params.
And I tried assigning dummy_data_param explicitly and with dictionaries:

def gen_inputs():
    data = L.DummyData(name="data",ntop=1,dummy_data_param=dict(shape=dict(dim=[1,2,3])))
    label = L.DummyData(name="label",ntop=1,dummy_data_param=dict(shape=dict(dim=[1,2,3])))
    return data,label
def gen_inputs():
    data_shape = caffe_pb2.BlobShape()
    data_shape.dim = [1,2,3]
    data_param = caffe_pb2.DummyDataParameter()
    data_param.shape = data_shape
    data = L.DummyData(name="data",ntop=1,dummy_data_param=data_param)
    label = L.DummyData(name="label",ntop=1,dummy_data_param=data_param)
    return data,label

Any thoughts?
--Jameson

@BlGene (Contributor) commented Jul 31, 2015

Maybe this? (shape is a repeated field in DummyDataParameter, so it should be given as a list):

from __future__ import print_function
import caffe
from caffe import layers as L, params as P, to_proto
from caffe.proto import caffe_pb2
import os

def gen_inputs():
    data  = L.DummyData(name="data", ntop=1,dummy_data_param=dict(shape=[dict(dim=[1,2,3])]))
    label = L.DummyData(name="label",ntop=1,dummy_data_param=dict(shape=[dict(dim=[1,2,3])]))
    return data,label

def caffenet(include_acc=False):
    data,label=gen_inputs()

    loss = L.SoftmaxWithLoss(data, label)
    return to_proto(loss)

def make_net(output_dir='./',net_name='train'):
    fname = os.path.join(output_dir,net_name+'.prototxt')
    with open(fname, 'w') as f:
        print(caffenet(), file=f)

make_net()

BR, Max

@longjon (Contributor, Author) commented Jul 31, 2015

@BlGene is right. If bad parameters are being silently ignored, however (in current master), that's a bug and you're welcome to open an issue.

@jmerkow commented Jul 31, 2015

I can open it with this as a test case. It may be the name 'shape', which has meaning in some contexts?

@dfagnan commented Oct 9, 2015

@longjon Does this caffenet example get all the weight_filler and bias_filler settings correct? I'm not able to easily see that the default weight_filler is somehow gaussian with std = 0.01 here. Is this true?
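For what it's worth, fillers can be written out explicitly in the spec rather than relying on defaults, reusing the weight_filler=dict(...) pattern shown earlier in this thread (a sketch only; gaussian std=0.01 is the value asked about above, not a claim about the defaults):

n.fc8 = L.InnerProduct(n.drop7, num_output=1000,
                       weight_filler=dict(type='gaussian', std=0.01),
                       bias_filler=dict(type='constant', value=0))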
