spelling/typo fixes (#3815)
andremoeller authored and piiswrong committed Nov 14, 2016
1 parent b341bfb commit 5c255f2
Showing 49 changed files with 95 additions and 95 deletions.
2 changes: 1 addition & 1 deletion CONTRIBUTORS.md
@@ -46,7 +46,7 @@ who are willing to help maintaining and lead the project. Committers comes from
* Made substantial contribution to the project.
* Willing to actively spent time on maintaining and lead the project.

- New committers will be proposed by current comitter memembers, with support from more than two of current comitters.
+ New committers will be proposed by current committers, with support from more than two of current committers.

List of Contributors
--------------------
2 changes: 1 addition & 1 deletion NEWS.md
@@ -26,7 +26,7 @@ MXNet Change Log
- Support bucketing API for variable length input by @pluskid
- Support CuDNN v5 by @antinucleon
- More applications
- - Speech recoginition by @yzhang87
+ - Speech recognition by @yzhang87
- [Neural art](https://github.com/dmlc/mxnet/tree/master/example/neural-style) by @antinucleon
- [Detection](https://github.com/dmlc/mxnet/tree/master/example/rcnn), RCNN bt @precedenceguo
- [Segmentation](https://github.com/dmlc/mxnet/tree/master/example/fcn-xs), FCN by @tornadomeet
10 changes: 5 additions & 5 deletions R-package/vignettes/classifyRealImageWithPretrainedModel.Rmd
@@ -1,11 +1,11 @@
Classify Real-World Images with Pre-trained Model
=================================================

- MXNet is a flexible and efficient deep learning framework. One of the cool thing that a deep learning
+ MXNet is a flexible and efficient deep learning framework. One of the cool things that a deep learning
algorithm can do is to classify real world images.

In this example we will show how to use a pretrained Inception-BatchNorm Network to predict the class of
- real world image. The network architecture is decribed in [1].
+ real world image. The network architecture is described in [1].

The pre-trained Inception-BatchNorm network is able to be downloaded from [this link](http://data.dmlc.ml/mxnet/data/Inception.zip)
This model gives the recent state-of-art prediction accuracy on image net dataset.
@@ -16,7 +16,7 @@ This tutorial is written in Rmarkdown.
- You can directly view the hosted version of the tutorial from [MXNet R Document](http://mxnet.readthedocs.io/en/latest/packages/r/classifyRealImageWithPretrainedModel.html)
- You can find the download the Rmarkdown source from [here](https://github.com/dmlc/mxnet/blob/master/R-package/vignettes/classifyRealImageWithPretrainedModel.Rmd)

- Pacakge Loading
+ Package Loading
---------------
To get started, we load the mxnet package by require mxnet.
```{r}
@@ -58,7 +58,7 @@ plot(im)

Before feeding the image to the deep net, we need to do some preprocessing
to make the image fit the input requirement of deepnet. The preprocessing
- include cropping, and substraction of the mean.
+ include cropping, and subtraction of the mean.
Because mxnet is deeply integerated with R, we can do all the processing in R function.

The preprocessing function:
@@ -76,7 +76,7 @@ preproc.image <- function(im, mean.image) {
# convert to array (x, y, channel)
arr <- as.array(resized) * 255
dim(arr) <- c(224, 224, 3)
- # substract the mean
+ # subtract the mean
normed <- arr - mean.img
# Reshape to format needed by mxnet (width, height, channel, num)
dim(normed) <- c(224, 224, 3, 1)
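The hunk above shows only the middle of the R `preproc.image` helper. As a rough companion sketch (not part of this commit), the same center-crop, resize-to-224x224, subtract-the-mean pipeline can be written in Python with NumPy and Pillow; the function name `preproc_image` and the use of Pillow instead of EBImage are illustrative choices, not details taken from the changed file.

```python
import numpy as np
from PIL import Image

def preproc_image(path, mean_img):
    """Center-crop to a square, resize to 224x224, and subtract the mean image."""
    im = Image.open(path).convert('RGB')
    w, h = im.size
    side = min(w, h)                                    # largest centered square
    left, top = (w - side) // 2, (h - side) // 2
    im = im.crop((left, top, left + side, top + side)).resize((224, 224))
    arr = np.asarray(im, dtype=np.float32)              # (224, 224, 3), values 0..255
    normed = arr - mean_img                             # subtract the dataset mean
    return normed.reshape((224, 224, 3, 1))             # (width, height, channel, num)
```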
4 changes: 2 additions & 2 deletions R-package/vignettes/ndarrayAndSymbolTutorial.Rmd
@@ -67,7 +67,7 @@ d <- c / a - 5
as.array(d)
```

- If two `NDArray`s sit on different divices, we need to explicitly move them
+ If two `NDArray`s sit on different devices, we need to explicitly move them
into the same one. For instance:

```{r, eval=FALSE}
@@ -227,7 +227,7 @@ which provides a detailed explanation of concepts in pictures.

### How Efficient is Symbolic API

- In short, they design to be very efficienct in both memory and runtime.
+ In short, they are designed to be very efficient in both memory and runtime.

The major reason for us to introduce Symbolic API, is to bring the efficient C++
operations in powerful toolkits such as cxxnet and caffe together with the
2 changes: 1 addition & 1 deletion amalgamation/README.md
@@ -70,7 +70,7 @@ Add
#include <Accelerate/Accelerate.h>
```

- Comment all occurences of
+ Comment all occurrences of
```
#include <emmintrin.h>
```
4 changes: 2 additions & 2 deletions docs/api/python/kvstore.md
@@ -12,7 +12,7 @@ Basic operation over multiple devices (gpus) on a single machine.
### Initialization

Let's first consider a simple example. It initializes
- a (`int`, `NDAarray`) pair into the store, and then pull the value out.
+ a (`int`, `NDarray`) pair into the store, and then pulls the value out.

```python
>>> kv = mx.kv.create('local') # create a local kv store.
@@ -137,4 +137,4 @@ update on key: 9
```

# Recommended Next Steps
- * [Python Tutorials](http://mxnet.io/tutorials/index.html#Python-Tutorials)
+ * [Python Tutorials](http://mxnet.io/tutorials/index.html#Python-Tutorials)
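The kvstore hunk above is cut off right after `mx.kv.create`, so for context here is a minimal sketch of the init-and-pull pattern the surrounding sentence describes, using the classic MXNet Python KVStore calls; it is an illustration of the API being documented, not text from the changed file.

```python
import mxnet as mx

shape = (2, 3)
kv = mx.kv.create('local')            # create a local kv store
kv.init(3, mx.nd.ones(shape) * 2)     # initialize key 3 with an NDArray value
a = mx.nd.zeros(shape)
kv.pull(3, out=a)                     # pull the value back out into `a`
print(a.asnumpy())                    # a 2x3 array filled with 2.0

kv.push(3, mx.nd.ones(shape) * 8)     # push a new value for the same key
kv.pull(3, out=a)
print(a.asnumpy())                    # a 2x3 array filled with 8.0
```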
4 changes: 2 additions & 2 deletions docs/architecture/note_data_loading.md
@@ -34,7 +34,7 @@ Data loading is to load the packed data into RAM. One ultimate goal is to load a

Since the training of deep neural network always involves huge amount of data, the format we choose should works efficient and convenient in such scenario.

- To achieve the goals described in insight, we need to pack binary data into a splitable format. In MXNet, we use binary recordIO format implemented in dmlc-core as our basic data saving format.
+ To achieve the goals described in insight, we need to pack binary data into a splittable format. In MXNet, we use binary recordIO format implemented in dmlc-core as our basic data saving format.

### Binary Record

@@ -160,4 +160,4 @@ for deep learning libraries. You are more welcomed to contribute to this Note, b

# Recommended next steps

- * [Survey of RNN Interface](http://mxnet.io/architecture/rnn_interface.html)
+ * [Survey of RNN Interface](http://mxnet.io/architecture/rnn_interface.html)
4 changes: 2 additions & 2 deletions docs/architecture/note_engine.md
@@ -20,7 +20,7 @@ Dependency Scheduling Problem
While most of the users want to take advantage of parallel computation,
most of us are more used to serial programs. So it is interesting to ask
if we can write serial programs, and build a library to automatically parallelize
- operations for you in an asynchronized way.
+ operations for you asynchronously.

For example, in the following code snippet. We can actually run ```B = A + 1```
and ```C = A + 2``` in any order, or in parallel.
@@ -322,4 +322,4 @@ You can find more descriptions in the [here](engine.md). You are also welcome to

* [Squeeze the Memory Consumption of Deep Learning](http://mxnet.io/architecture/note_memory.html)
* [Efficient Data Loading Module for Deep Learning](http://mxnet.io/architecture/note_data_loading.html)
- * [Survey of RNN Interface](http://mxnet.io/architecture/rnn_interface.html)
+ * [Survey of RNN Interface](http://mxnet.io/architecture/rnn_interface.html)
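The dependency-scheduling hunk above argues that `B = A + 1` and `C = A + 2` can run in any order or in parallel. A tiny sketch of how that looks from the MXNet Python front end, assuming the standard asynchronous `mx.nd` behavior (calls are queued on the engine and return immediately; `wait_to_read` blocks until a result is ready):

```python
import mxnet as mx

A = mx.nd.ones((1000, 1000))
B = A + 1           # queued on the dependency engine; the call returns immediately
C = A + 2           # independent of B, so the engine may run it in parallel
D = B * C           # depends on both B and C; scheduled only after they finish

D.wait_to_read()    # block the front end until D has actually been computed
print(D.asnumpy()[0, 0])   # (1 + 1) * (1 + 2) = 6.0
```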
2 changes: 1 addition & 1 deletion docs/architecture/note_memory.md
@@ -74,7 +74,7 @@ Hopefully you are convinced that the computation graph is a good way to discuss
As you can see some memory saving can already been bought by using explicit backward graph. Let us discuss more about
what optimization we can do, and what is the baseline.

- Asumme we want to build a neural net with ```n``` layers. A typical implementation of neural net will
+ Assume we want to build a neural net with ```n``` layers. A typical implementation of neural net will
need to allocate node space for output of each layer, as well as gradient values for back-propagation.
This means we need roughly ```2 n``` memory cells. This is the same in the explicit backward graph case, as
the number of nodes in backward pass in roughly the same as forward pass.
6 changes: 3 additions & 3 deletions docs/architecture/overview.md
@@ -194,7 +194,7 @@ It is possible that one convolution has several implementations and users want t
struct ResourceRequest {
enum Type {
kRandom, // get an mshadow::Random<xpu> object
- kTempSpace, // request temporay space
+ kTempSpace, // request temporary space
};
Type type;
};
@@ -276,7 +276,7 @@ It is possible that one convolution has several implementations and users want t
### Create Operator from Operator Property
- As mentioned above `OperatorProperty` includes all *semantical* attributes of an operation. It is also in charge of creating `Operator` pointer for actual computation.
+ As mentioned above, `OperatorProperty` includes all *semantic* attributes of an operation. It is also in charge of creating `Operator` pointer for actual computation.
#### Create Operator
Implement following interface in `OperatorProperty`:
@@ -487,7 +487,7 @@ void SmoothL1Forward_(const TBlob& src,
After obtaining `mshadow::Stream` from `RunContext`, we get `mshadow::Tensor` from `mshadow::TBlob`.
`mshadow::F` is a shortcut to initiate a `mshadow` expression. The macro `MSHADOW_TYPE_SWITCH(type, DType, ...)`
handles details on different types and the macro `ASSIGN_DISPATCH(out, req, exp)` checks `OpReqType` and
- performs actions accordingly. `sigma2` is a special parameter in this loss, which we will cover in addtional usages.
+ performs actions accordingly. `sigma2` is a special parameter in this loss, which we will cover in additional usages.
### Define Gradients (optional)
Create a gradient function with various types of inputs.
2 changes: 1 addition & 1 deletion docs/get_started/overview_zh.md
@@ -5,7 +5,7 @@ refer to our [NIPS LearningSys paper](http://learningsys.org/papers/LearningSys_

神经网络本质上是一种语言,我们通过它来表达对应用问题的理解。例如我们用卷积层来表达空间相关性,RNN来表达时间连续性。根据问题的复杂性和信息如何从输入到输出一步步提取,我们将不同大小的层按一定原则连接起来。近年来随着数据的激增和计算能力的大幅提升,神经网络也变得越来越深和大。例如最近几次imagnet竞赛的冠军都使用有数十至百层的网络。对于这一类神经网络我们通常称之为深度学习。从应用的角度而言,对深度学习最重要的是如何方便地表述神经网络,以及如何快速训练得到模型。

- 对于一个优秀的深度学习系统,或者更广来说优秀的科学计算系统,最重要的是编程接口的设计。他们都采用将一个*领域特定语言(domain specific language)*嵌入到一个主语言中。例如numpy将矩阵运算嵌入到python中。这类嵌入一般分为两种,其中一种嵌入的较浅,其中每个语句都按原来的意思执行,且通常采用*命令式编程(imperative programming)*,其中numpy和Torch就是属于这种。而另一种则用一种深的嵌入方式,提供一整套针对具体应用的迷你语言。这一种通常使用*声明式语言(declarative programing)*,既用户只需要声明要做什么,而具体执行则由系统完成。这类系统包括Caffe,theano和刚公布的TensorFlow。
+ 对于一个优秀的深度学习系统,或者更广来说优秀的科学计算系统,最重要的是编程接口的设计。他们都采用将一个*领域特定语言(domain specific language)*嵌入到一个主语言中。例如numpy将矩阵运算嵌入到python中。这类嵌入一般分为两种,其中一种嵌入的较浅,其中每个语句都按原来的意思执行,且通常采用*命令式编程(imperative programming)*,其中numpy和Torch就是属于这种。而另一种则用一种深的嵌入方式,提供一整套针对具体应用的迷你语言。这一种通常使用*声明式语言(declarative programming)*,既用户只需要声明要做什么,而具体执行则由系统完成。这类系统包括Caffe,theano和刚公布的TensorFlow。

这两种方式各有利弊,总结如下

2 changes: 1 addition & 1 deletion docs/how_to/perf.md
@@ -2,7 +2,7 @@

The following factors may significantly affect the performance:

- 1. Use a fast back-end. A fast BLAS library, e.g. openblas, altas,
+ 1. Use a fast back-end. A fast BLAS library, e.g. openblas, atlas,
and mkl, is necessary if only using CPU. While for Nvidia GPUs, we strongly
recommend to use CUDNN.
2. Three important things for the input data:
8 changes: 4 additions & 4 deletions docs/tutorials/computer_vision/image_classification.md
@@ -60,7 +60,7 @@ We can train a model using multiple machines.
../../tools/launch.py -n 2 python train_mnist.py --kv-store dist_sync
```

- here we can either use synchronized SGD `dist_sync` or use asynchronized SGD
+ here we can either use synchronous SGD `dist_sync` or use asynchronous SGD
`dist_async`

- assume there are several ssh-able machines, and this mxnet folder is
@@ -149,7 +149,7 @@ model.fit(X = train_iter, eval_data = val_iter)

The following factors may significant affect the performance:

- 1. Use a fast backend. A fast BLAS library, e.g. openblas, altas,
+ 1. Use a fast backend. A fast BLAS library, e.g. openblas, atlas,
and mkl, is necessary if only using CPU. While for Nvidia GPUs, we strongly
recommend to use CUDNN.
2. Three important things for the input data:
@@ -279,11 +279,11 @@ python train_cifar10.py --batch-size 128 --lr 0.1 --lr-factor .94 --num-epoch 50
```

*Note: S3 is unstable sometimes, if your training hangs or getting error
- freqently, you cant download data to `/mnt` first*
+ frequently, you cant download data to `/mnt` first*

Accuracy vs epoch ([the interactive figure](https://docs.google.com/spreadsheets/d/1AEesHjWUZOzCN0Gp_PYI1Cw4U1kZMKot360p9Fowmjw/pubchart?oid=1740787404&format=interactive)):

<img src=https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/image/inception-with-bn-imagnet1k.png width=600px/>

# Recommended Next Steps
- * [MXNet tutorials index](http://mxnet.io/tutorials/index.html)
+ * [MXNet tutorials index](http://mxnet.io/tutorials/index.html)
4 changes: 2 additions & 2 deletions docs/tutorials/computer_vision/imagenet_full.md
@@ -42,7 +42,7 @@ After packing, together with threaded buffer iterator, we can simply achieve an
Now we have data. We need to consider which network structure to use. We use Inception-BN [3] style model, compared to other models such as VGG, it has fewer parameters, less parameters simplified sync problem. Considering our problem is much more challenging than 1k classes problem, we add suitable capacity into original Inception-BN structure, by increasing the size of filter by factor of 1.5 in bottom layers of original Inception-BN network.
This however, creates a challenge for GPU memory. As GTX980 only have 4G of GPU RAM. We really need to minimize the memory consumption to fit larger batch-size into the training. To solve this problem we use the techniques such as node memory reuse, and inplace optimization, which reduces the memory consumption by half, more details can be found in [memory optimization note](http://mxnet.io/architecture/note_memory.html)

- Finally, we cannot train the model using a single GPU because this is a really large net, and a lot of data. We use data parallelism on four GPUs to train this model, which involves smart synchronization of parameters between different GPUs, and overlap the communication and computation. A [runtime denpdency engine](http://mxnet.io/architecture/note_engine.html) is used to simplify this task, allowing us to run the training at around 170 images/sec.
+ Finally, we cannot train the model using a single GPU because this is a really large net, and a lot of data. We use data parallelism on four GPUs to train this model, which involves smart synchronization of parameters between different GPUs, and overlap the communication and computation. A [runtime dependency engine](http://mxnet.io/architecture/note_engine.html) is used to simplify this task, allowing us to run the training at around 170 images/sec.

Here is a learning curve of the training process:
![alt text](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/imagenet_full/curve.png "Learning Curve")
@@ -103,4 +103,4 @@ There is no doubt that directly use probability over 21k classes loss diversity
[3] Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." *arXiv preprint arXiv:1502.03167* (2015).

# Recommended Next Steps
- * [MXNet tutorials index](http://mxnet.io/tutorials/index.html)
+ * [MXNet tutorials index](http://mxnet.io/tutorials/index.html)
12 changes: 6 additions & 6 deletions docs/tutorials/nlp/cnn.md
@@ -2,14 +2,14 @@

You can get the source code for below example [here](https://github.com/dmlc/mxnet/tree/master/example/cnn_text_classification)

- It is slightly simplified implementation of Kim's [Convolutional Neural Networks for Sentence Classification](http://arxiv.org/abs/1408.5882) paper in MXNet.
+ It is a slightly simplified implementation of Kim's [Convolutional Neural Networks for Sentence Classification](http://arxiv.org/abs/1408.5882) paper in MXNet.

- Recently, I have been learning mxnet for Natural Language Processing (NLP). I followed this nice blog ["Implementing a CNN for Text Classification in Tensorflow" blog post.](http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/) to reimplement it by mxnet framwork.
- Data preprocessing code and courpus are directly borrowed from original author [cnn-text-classification-tf](https://github.com/dennybritz/cnn-text-classification-tf).
+ Recently, I have been learning MXNet for Natural Language Processing (NLP). I followed this nice blog ["Implementing a CNN for Text Classification in Tensorflow" blog post.](http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/) to reimplement it in the MXNet framework.
+ Data preprocessing code and corpus are directly borrowed from original author [cnn-text-classification-tf](https://github.com/dennybritz/cnn-text-classification-tf).

## Performance compared to original paper
I use the same pretrained word2vec [GoogleNews-vectors-negative300.bin](https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing) in Kim's paper. However, I don't implement L2-normalization of weight on penultimate layer, but provide a L2-normalization of gradients.
- Finally, I got a best dev accuracy 80.1%, close to 81% that reported in the orginal paper.
+ Finally, I got a best dev accuracy 80.1%, close to 81% that is reported in the original paper.

## Data
Please download the corpus from this repository [cnn-text-classification-tf](https://github.com/dennybritz/cnn-text-classification-tf), :)
@@ -20,11 +20,11 @@ this corpus is small (contains about 10K sentences).
When using GoogleNews word2vec, this code loads it with gensim tools [gensim](https://github.com/piskvorky/gensim/tree/develop/gensim/models).

## Remark
- If I were wrong in CNN implementation via mxnet, please correct me.
+ If I am wrong in CNN implementation via MXNet, please correct me.

## References
- ["Implementing a CNN for Text Classification in Tensorflow" blog post.](http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/)
- [Convolutional Neural Networks for Sentence Classification](http://arxiv.org/abs/1408.5882)

# Recommended Next Steps
- * [MXNet tutorials index](http://mxnet.io/tutorials/index.html)
+ * [MXNet tutorials index](http://mxnet.io/tutorials/index.html)
4 changes: 2 additions & 2 deletions docs/tutorials/python/kvstore.md
@@ -7,7 +7,7 @@ and pull data out.
## Initialization

Let's first consider a simple example: initialize
- a (`int`, `NDAarray`) pair into the store, and then pull the value out.
+ a (`int`, `NDarray`) pair into the store, and then pulls the value out.

```python
>>> kv = mx.kv.create('local') # create a local kv store.
@@ -133,4 +133,4 @@ This section will be updated when the distributed version is ready.
<!-- flexibly as your choice. To mix is to maximize the performance and flexibility. -->

# Recommended Next Steps
- * [MXNet tutorials index](http://mxnet.io/tutorials/index.html)
+ * [MXNet tutorials index](http://mxnet.io/tutorials/index.html)
2 changes: 1 addition & 1 deletion docs/tutorials/r/classifyRealImageWithPretrainedModel.md
@@ -119,7 +119,7 @@ preproc.image <- function(im, mean.image) {
# convert to array (x, y, channel)
arr <- as.array(resized) * 255
dim(arr) <- c(224, 224, 3)
- # substract the mean
+ # subtract the mean
normed <- arr - mean.img
# Reshape to format needed by mxnet (width, height, channel, num)
dim(normed) <- c(224, 224, 3, 1)
