
Improve multi-GPU performance #3241

Merged
merged 13 commits into apache:master
Sep 13, 2016

Conversation

@mli (Contributor) commented Sep 7, 2016

This is a major refactor of src/kvstore; we should obtain confirmation from multiple users before merging.

Performance improvement

This PR tries to use GPU peer-to-peer communication, when available, for kvstore=device. Together with PR #3238, it potentially improves performance when training with >=4 GPUs or when running multiple training jobs at the same time. For example, on 8 M40s, 152-layer ResNet training improves from 300 img/sec to 353 img/sec. The improvement for distributed training is even larger.

We also provide tools in tools/bandwidth to measure the GPU bandwidth for various neural networks and hardware.

Changes to the interface

  1. New kvstore types dist_sync_device and dist_async_device use GPU p2p communication in distributed training.
  2. Which kvstore type will be created? (No change to the current naming convention; see the sketch after this list.)
    • if "device" is in the name, push and pull use GPU p2p; otherwise all data goes directly to CPU memory.
    • if "dist" is in the name, a distributed kvstore is used.
      • if "async" is in the name, communication is asynchronous; otherwise it is synchronous.
  3. The optimizer added to the kvstore runs on the GPUs when kvstore=device. Previously we copied data to the CPU first, so the new way may accelerate things, but it requires the optimizer to be able to run on GPUs.
  4. The function _create_kvstore in model.py now always returns update_on_kvstore=True to simplify things; we can further remove all logic related to update_on_kvstore.
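
As an illustration of the naming convention above, a minimal sketch using the public mx.kvstore.create API (the key, shapes, and GPU count here are made up for illustration):

import mxnet as mx

# "device" in the name => push/pull use GPU p2p where available
# "dist"   in the name => distributed kvstore
# "async"  in the name => asynchronous updates
kv = mx.kvstore.create('device')              # single machine, GPU p2p
# kv = mx.kvstore.create('dist_sync_device')  # distributed, synced, GPU p2p
# kv = mx.kvstore.create('dist_async_device') # distributed, async, GPU p2p

shape = (2, 3)
kv.init(3, mx.nd.ones(shape))

# push one gradient per GPU; the kvstore aggregates them
grads = [mx.nd.ones(shape, ctx=mx.gpu(i)) for i in range(2)]
kv.push(3, grads)

# pull the aggregated value back onto every GPU
outs = [mx.nd.zeros(shape, ctx=mx.gpu(i)) for i in range(2)]
kv.pull(3, out=outs)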

Further things

We can use unified memory to solve #2919. We only need to:

  • remove some context checks
  • remove the GPU buffer and do the elementwise sum on ndarrays from different GPUs directly

However, I observed decreased performance when using unified memory, so we should probably enable it only for very large arrays, e.g. the fully connected weights in VGG.
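
To make the buffered-versus-direct distinction concrete, a rough sketch (illustrative only, not the actual src/kvstore code; shapes and GPU count are made up):

import mxnet as mx

shape = (1024, 1024)
grads = [mx.nd.ones(shape, ctx=mx.gpu(i)) for i in range(4)]

# current scheme: stage every gradient into a buffer on one GPU, then
# run the elementwise sum over the staged copies
buf = [g.copyto(mx.gpu(0)) for g in grads]
reduced = buf[0]
for b in buf[1:]:
    reduced = reduced + b  # elementwise sum on gpu(0)

# with unified memory, the staging copies into buf could be dropped and
# the sum could read the per-GPU ndarrays directly, once the context
# checks that currently forbid mixing contexts are removed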

@piiswrong (Contributor)

is update on kvstore always a good thing?

@mli (Contributor, Author) commented Sep 7, 2016

I haven't seen any disadvantage yet. Updating on device should also be a little bit faster.

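For context, the difference in question as a hedged sketch (kv, grads, and weights are placeholders, and the key 3 is arbitrary): with update_on_kvstore=True the optimizer is attached to the kvstore, so the update runs wherever the aggregated gradient lives, which for kvstore=device means on a GPU.

# update_on_kvstore=True: the kvstore itself applies the optimizer
kv.set_optimizer(mx.optimizer.SGD(learning_rate=0.1))
kv.push(3, grads)        # aggregate, then update inside the kvstore
kv.pull(3, out=weights)  # pull back the updated weights

# update_on_kvstore=False: the kvstore only aggregates; each worker
# pulls the summed gradient and runs the optimizer locally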

@tqchen (Member) commented Sep 8, 2016

For a single machine, update on device might be a bit better, so we might still want to keep the original device aggregation option.

@tqchen (Member) commented Sep 8, 2016

Is the PCI-E topology being exploited in the reduction in the current PR?

@tqchen (Member) commented Sep 8, 2016

@mli Please see if you can fix the issue by bringing device aggregation back, and then we can merge this in.

@piiswrong (Contributor)

Was the kvstore=local slowdown problem reported by @tornadomeet fixed?

@mli (Contributor, Author) commented Sep 9, 2016

Rolled back to the previous strategy for setting update_on_kvstore for local, but we need to confirm that it is fixed.


@tqchen (Member) commented Sep 10, 2016

Please merge it in if it is ready.

@tornadomeet (Contributor)

I will test kv_store=local today.

@tornadomeet (Contributor)

@mli @piiswrong I just tried the newest update again: with kv_store='local', this PR is slower than before (so kv_store='local' may behave somewhat differently than it did before), but with kv_store='device', this PR is faster.
Setup: K80 + cuDNN v5.1 + CUDA 7.5

@mli (Contributor, Author) commented Sep 12, 2016

@tornadomeet How many K80s are you using? I tested the performance on both M40 and K80; local should keep the same performance.

@tornadomeet (Contributor)

@mli I tested 1 K80 and 3 K80s last week with ResNet-50; the kv-store=local speed of this PR is only (<) 70% of this commit: 73a0f6e

@mli (Contributor, Author) commented Sep 13, 2016

@tornadomeet Can you double check? This PR should have the same performance as the current master for kvstore=local; I rechecked on ResNet-50. The major difference compared to 73a0f6e is due to #3238, and the current master should be faster. At least that is true on my machine.

@tornadomeet (Contributor)

@mli OK, I'll test the current master this afternoon~

@tornadomeet (Contributor) commented Sep 13, 2016

@mli Hello, I just tested the current master; this PR is almost the same as the current master when kv-store=local.

4 GPUs, ResNet-50:
current master:

INFO:root:Namespace(aug_level=2, batch_size=256, bn_mom=0.9, data_dir='/home/work/data/ImageClassify/imagenet/CLS-LOC/limu', data_type='imagenet', depth=50, gpus='4,5,6,7', kv_store='local', list_dir='./', lr=0.1, model_load_epoch=0, mom=0.9, num_examples=1281167, retrain=False, wd=0.0001)
[12:06:17] src/io/iter_image_recordio.cc:209: ImageRecordIOParser: /home/work/data/ImageClassify/imagenet/CLS-LOC/limu/train_480_q90.rec, use 4 threads for decoding..
[12:06:18] src/io/iter_image_recordio.cc:209: ImageRecordIOParser: /home/work/data/ImageClassify/imagenet/CLS-LOC/limu/val_256_q90.rec, use 4 threads for decoding..
INFO:root:Start training with [gpu(4), gpu(5), gpu(6), gpu(7)]
INFO:root:Epoch[0] Batch [10]    Speed: 70.35 samples/sec    Train-accuracy=0.002344
INFO:root:Epoch[0] Batch [10]    Speed: 70.35 samples/sec    Train-top_k_accuracy_5=0.006641
INFO:root:Epoch[0] Batch [20]    Speed: 62.73 samples/sec    Train-accuracy=0.000391
INFO:root:Epoch[0] Batch [20]    Speed: 62.73 samples/sec    Train-top_k_accuracy_5=0.004687
INFO:root:Epoch[0] Batch [30]    Speed: 62.19 samples/sec    Train-accuracy=0.002344
INFO:root:Epoch[0] Batch [30]    Speed: 62.19 samples/sec    Train-top_k_accuracy_5=0.006250

this PR:

INFO:root:Namespace(aug_level=2, batch_size=256, bn_mom=0.9, data_dir='/home/work/data/ImageClassify/imagenet/CLS-LOC/limu', data_type='imagenet', depth=50, gpus='4,5,6,7', kv_store='local', list_dir='./', lr=0.1, model_load_epoch=0, mom=0.9, num_examples=1281167, retrain=False, wd=0.0001)
[12:08:49] src/io/iter_image_recordio.cc:209: ImageRecordIOParser: /home/work/data/ImageClassify/imagenet/CLS-LOC/limu/train_480_q90.rec, use 4 threads for decoding..
[12:08:50] src/io/iter_image_recordio.cc:209: ImageRecordIOParser: /home/work/data/ImageClassify/imagenet/CLS-LOC/limu/val_256_q90.rec, use 4 threads for decoding..
INFO:root:Start training with [gpu(4), gpu(5), gpu(6), gpu(7)]
INFO:root:Epoch[0] Batch [10]    Speed: 65.96 samples/sec    Train-accuracy=0.001953
INFO:root:Epoch[0] Batch [10]    Speed: 65.96 samples/sec    Train-top_k_accuracy_5=0.005078
INFO:root:Epoch[0] Batch [20]    Speed: 59.16 samples/sec    Train-accuracy=0.001563
INFO:root:Epoch[0] Batch [20]    Speed: 59.16 samples/sec    Train-top_k_accuracy_5=0.005469
INFO:root:Epoch[0] Batch [30]    Speed: 58.33 samples/sec    Train-accuracy=0.000391
INFO:root:Epoch[0] Batch [30]    Speed: 58.33 samples/sec    Train-top_k_accuracy_5=0.005469

The branch was before or near 73a0f6e; I'm not sure of the exact revision, I installed it on 2016.08.13:

INFO:root:Epoch[0] Batch [10]    Speed: 99.71 samples/sec    Train-accuracy=0.001953
INFO:root:Epoch[0] Batch [10]    Speed: 99.71 samples/sec    Train-top_k_accuracy_5=0.006641
INFO:root:Epoch[0] Batch [20]    Speed: 88.18 samples/sec    Train-accuracy=0.001953
INFO:root:Epoch[0] Batch [20]    Speed: 88.18 samples/sec    Train-top_k_accuracy_5=0.005078
INFO:root:Epoch[0] Batch [30]    Speed: 88.21 samples/sec    Train-accuracy=0.001563
INFO:root:Epoch[0] Batch [30]    Speed: 88.21 samples/sec    Train-top_k_accuracy_5=0.005859

The speed gap comes from somewhere else, not from this PR, so this PR is OK.

@mli (Contributor, Author) commented Sep 13, 2016

Can you check it with git reset 73a0f6e --hard? (73a0f6e: https://github.com/dmlc/mxnet/commit/73a0f6eb7f5570c3a8aa93f9e1fa6bf257a7bdd8)


@tornadomeet (Contributor) commented Sep 13, 2016

The log of 73a0f6e:

INFO:root:Namespace(aug_level=2, batch_size=256, bn_mom=0.9, data_dir='/home/work/data/ImageClassify/imagenet/CLS-LOC/limu', data_type='imagenet', depth=50, gpus='4,5,6,7', kv_store='local', list_dir='./', lr=0.1, model_load_epoch=0, mom=0.9, num_examples=1281167, retrain=False, wd=0.0001)
[12:51:52] src/io/iter_image_recordio.cc:209: ImageRecordIOParser: /home/work/data/ImageClassify/imagenet/CLS-LOC/limu/train_480_q90.rec, use 4 threads for decoding..
[12:51:53] src/io/iter_image_recordio.cc:209: ImageRecordIOParser: /home/work/data/ImageClassify/imagenet/CLS-LOC/limu/val_256_q90.rec, use 4 threads for decoding..
INFO:root:Start training with [gpu(4), gpu(5), gpu(6), gpu(7)]
INFO:root:Epoch[0] Batch [10]   Speed: 62.31 samples/sec    Train-accuracy=0.002344
INFO:root:Epoch[0] Batch [10]   Speed: 62.31 samples/sec    Train-top_k_accuracy_5=0.006641
INFO:root:Epoch[0] Batch [20]   Speed: 55.83 samples/sec    Train-accuracy=0.000391
INFO:root:Epoch[0] Batch [20]   Speed: 55.83 samples/sec    Train-top_k_accuracy_5=0.004687
INFO:root:Epoch[0] Batch [30]   Speed: 54.36 samples/sec    Train-accuracy=0.002344
INFO:root:Epoch[0] Batch [30]   Speed: 54.36 samples/sec    Train-top_k_accuracy_5=0.006250

But it is still slow.

And the log of d54a2e6 (2016.08.10) is:

INFO:root:Namespace(aug_level=2, batch_size=256, bn_mom=0.9, data_dir='/home/work/data/ImageClassify/imagenet/CLS-LOC/limu', data_type='imagenet', depth=50, gpus='4,5,6,7', kv_store='local', list_dir='./', lr=0.1, model_load_epoch=0, mom=0.9, num_examples=1281167, retrain=False, wd=0.0001)
[13:17:30] src/io/iter_image_recordio.cc:211: ImageRecordIOParser: /home/work/data/ImageClassify/imagenet/CLS-LOC/limu/train_480_q90.rec, use 4 threads for decoding..
[13:17:31] src/io/iter_image_recordio.cc:211: ImageRecordIOParser: /home/work/data/ImageClassify/imagenet/CLS-LOC/limu/val_256_q90.rec, use 4 threads for decoding..
INFO:root:Start training with [gpu(4), gpu(5), gpu(6), gpu(7)]
INFO:root:Epoch[0] Batch [10]   Speed: 82.22 samples/sec    Train-accuracy=0.000781
INFO:root:Epoch[0] Batch [10]   Speed: 82.22 samples/sec    Train-top_k_accuracy_5=0.006250
INFO:root:Epoch[0] Batch [20]   Speed: 72.60 samples/sec    Train-accuracy=0.001953
INFO:root:Epoch[0] Batch [20]   Speed: 72.60 samples/sec    Train-top_k_accuracy_5=0.005859
INFO:root:Epoch[0] Batch [30]   Speed: 71.60 samples/sec    Train-accuracy=0.001953
INFO:root:Epoch[0] Batch [30]   Speed: 71.60 samples/sec    Train-top_k_accuracy_5=0.006641

A little better, but not as good as my installed version.
I'm confused about which version I installed at that time! I tested some versions near 2016.08.13, but they were the same as above, i.e. better than the current master but worse than my installed version.

@tornadomeet (Contributor) commented Sep 13, 2016

@mli @tqchen @piiswrong I think we can merge this PR now, and I'll dig deeper into what's wrong if needed (when my GPUs are idle :))

@mli mli merged commit 9dfb354 into apache:master Sep 13, 2016
@taoari (Contributor) commented Sep 14, 2016

@tornadomeet You may want to check commit 2196588 (20160726); on Windows, with 4 K40 GPUs and CUDA 7.0 + cuDNN 4.0, I am able to achieve 87 samples/s for ResNet-50.

@tornadomeet (Contributor)

Thanks, I will check it after the holiday.

piiswrong pushed a commit that referenced this pull request Sep 22, 2016
* Add channel_ to Shape2D calculation

* scalapkg, add example multitask (#3186)

* RNN cell demo with ptb LSTM language model (#3197)

* rnn-cell demo (push to server for testing)

* a running example with cuDNN RNN cell

* Bulk lint fix (#3211)

* [TENSOR] Add FlatTo1D for all elementwise ops (#3238)

* Fix little bug on context (#3202)

* add PennTreeBank Language Model using lstm model in R (#2659)

* Add function 'print_summary' and some revise (#3161)

* Add function 'print_summary' and some revise

Add function 'print_summary' for print detail information of network, and format argument was add in 'plot_network'.
You can use 'print_summary' like:
"""
net = get_symbol(1000)
shape = {'softmax_label': (64, 12), 'data': (64, 3, 224, 224)}
mx.viz.print_summary(net, shape=shape)
"""
If without shape, the number of arguments would be nonsense currently.

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Added my CmakeLists.txt for caffe plugin, etc.

* Revert "fix travis scala test config" (#3246)

This reverts parts of commit 3e15f62.
Reenables testing the Julia bindings

* [Scala] Code generation for Symbol (#3217)


[scala] auto-generate Symbol functions

* fix spelling errors (#3258)

Also align grammar and punctuation in short descriptions of features

* fix typo in run_test.sh (#3260)

* Copy slice along arbitrary axis (#3259)

* rnn-cell demo (push to server for testing)

* a running example with cuDNN RNN cell

* add copyslice along arbitrary axis for NDArray

* copy_slice_to as an ndarray operator

* Python interface to the _copy_slice_to operator

* fix lint error

* Enable concatenation for dim-1 vectors (#3264)

* fix PReLU backward computing (#3277)

* Add `reverse` option in Reshape (#3280)

* add scala example, end2end neural-style (#3267)

add scala example, end2end neural-style

* Improve multi-GPU performance (#3241)

* update kvstore

* update model.py

* bandwith tool

* update readme

* tiny

* fix lint

* fix batch size of dist_device_sync

* fix

* fix perf problem of kvstore when only using a single device

* roll back to previous strategy how to choose update_on_kvsotre

* add an optionl MXNET_ENABLE_GPU_P2P to control whether or not use p2p

* update dmlccore (#3293)

* Fix newer version of gtest and cpptest (#3294)

* when set use_global_stats then do not use cudnn (#3289)

* when set use_global_stats then do not use cudnn

* fix batch norm with use_global_stats

* Fix req+reserve_space in cudnn_rnn (#3274)

Fix req

Fix reserve_space

Allocate reserve_space using Storage

* add cudnn off option in Convolution (#3270)

* add support for building on power (#3302)

* add recent examples, collect some missing tutorials (#3340)

* CMake for caffe plugin
piiswrong pushed a commit to piiswrong/mxnet that referenced this pull request Sep 23, 2016
piiswrong pushed a commit that referenced this pull request Sep 23, 2016
piiswrong pushed a commit that referenced this pull request Oct 9, 2016
piiswrong pushed a commit to piiswrong/mxnet that referenced this pull request Oct 19, 2016
piiswrong pushed a commit to piiswrong/mxnet that referenced this pull request Oct 19, 2016
tqchen pushed a commit that referenced this pull request Oct 31, 2016
piiswrong pushed a commit to piiswrong/mxnet that referenced this pull request Nov 17, 2016
piiswrong pushed a commit that referenced this pull request Nov 18, 2016
piiswrong pushed a commit that referenced this pull request Nov 30, 2016
piiswrong pushed a commit that referenced this pull request Dec 11, 2016
piiswrong pushed a commit to piiswrong/mxnet that referenced this pull request Dec 24, 2016
piiswrong pushed a commit to piiswrong/mxnet that referenced this pull request Dec 29, 2016
piiswrong pushed a commit to piiswrong/mxnet that referenced this pull request Dec 29, 2016
piiswrong pushed a commit that referenced this pull request Dec 29, 2016
piiswrong pushed a commit that referenced this pull request Jan 12, 2017
#4641)

* Add channel_ to Shape2D calculation

* scalapkg, add example multitask (#3186)

* RNN cell demo with ptb LSTM language model (#3197)

* rnn-cell demo (push to server for testing)

* a running example with cuDNN RNN cell

* Bulk lint fix (#3211)

* [TENSOR] Add FlatTo1D for all elementwise ops (#3238)

* Fix little bug on context (#3202)

* add PennTreeBank Language Model using lstm model in R (#2659)

* Add function 'print_summary' and some revise (#3161)

* Add function 'print_summary' and some revise

Add function 'print_summary' for print detail information of network, and format argument was add in 'plot_network'.
You can use 'print_summary' like:
"""
net = get_symbol(1000)
shape = {'softmax_label': (64, 12), 'data': (64, 3, 224, 224)}
mx.viz.print_summary(net, shape=shape)
"""
If without shape, the number of arguments would be nonsense currently.

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Update visualization.py

* Added my CmakeLists.txt for caffe plugin, etc.

* Revert "fix travis scala test config" (#3246)

This reverts parts of commit 3e15f62.
Re-enables testing of the Julia bindings

* [Scala] Code generation for Symbol (#3217)


[scala] auto-generate Symbol functions

* fix spelling errors (#3258)

Also align grammar and punctuation in short descriptions of features

* fix typo in run_test.sh (#3260)

* Copy slice along arbitrary axis (#3259); a usage sketch follows the list:

* rnn-cell demo (push to server for testing)

* a running example with cuDNN RNN cell

* add copyslice along arbitrary axis for NDArray

* copy_slice_to as an ndarray operator

* Python interface to the _copy_slice_to operator

* fix lint error
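
An illustrative sketch only: the commits name the operator _copy_slice_to but its Python signature is not shown in this log, so the call below is commented out and the axis/begin/end keywords are assumptions; the numpy line shows the intended result:
"""
import numpy as np
import mxnet as mx

src = mx.nd.array(np.arange(24).reshape((2, 3, 4)))
# hypothetical call, signature assumed:
# out = mx.nd._copy_slice_to(src, axis=1, begin=1, end=3)
expected = src.asnumpy()[:, 1:3, :]  # the slice the op would copy, shape (2, 2, 4)
print(expected.shape)
"""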

* Enable concatenation for dim-1 vectors (#3264)

* fix PReLU backward computing (#3277)

* Add `reverse` option in Reshape (#3280)
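
A short sketch of the flag's effect, following the operator's convention that 0 copies the corresponding input dimension and -1 is inferred; with reverse=True the special values are matched from right to left:
"""
import mxnet as mx

x = mx.nd.zeros((10, 5, 4))                        # 200 elements in total
y = mx.nd.reshape(x, shape=(-1, 0))                # matched left to right -> (40, 5)
z = mx.nd.reshape(x, shape=(-1, 0), reverse=True)  # matched right to left -> (50, 4)
print(y.shape, z.shape)
"""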

* add scala example, end2end neural-style (#3267)

add scala example, end2end neural-style

* Improve multi-GPU performance (#3241)

* update kvstore

* update model.py

* bandwidth tool

* update readme

* tiny

* fix lint

* fix batch size of dist_device_sync

* fix

* fix perf problem of kvstore when only using a single device

* roll back to previous strategy for how to choose update_on_kvstore

* add an optional MXNET_ENABLE_GPU_P2P to control whether or not to use p2p

* update dmlc-core (#3293)

* Fix newer version of gtest and cpptest (#3294)

* when use_global_stats is set, do not use cudnn (#3289)

* when use_global_stats is set, do not use cudnn

* fix batch norm with use_global_stats

* Fix req+reserve_space in cudnn_rnn (#3274)

Fix req

Fix reserve_space

Allocate reserve_space using Storage

* add cudnn off option in Convolution (#3270)

* add support for building on power (#3302)

* add recent examples, collect some missing tutorials (#3340)

* CMake for caffe plugin

* CMake python deployment changes

* CMake python deployment changes

* CMake python deployment changes

* CMake python deployment changes
rravu3 pushed a commit to rravu3/mxnet that referenced this pull request Jan 21, 2017
apache#4641)