Merge pull request #116 from vchuravy/vc/doc
Cleanup `` and :class:
pluskid authored Aug 18, 2016
2 parents 63772b8 + 71eefbb commit 8c22211
Showing 14 changed files with 236 additions and 234 deletions.
2 changes: 1 addition & 1 deletion docs/src/tutorial/mnist.md
@@ -28,7 +28,7 @@ data = mx.Variable(:data)

and then cascading fully-connected layers and activation functions:

``` {.sourceCode .julia}
```julia
fc1 = mx.FullyConnected(data = data, name=:fc1, num_hidden=128)
act1 = mx.Activation(data = fc1, name=:relu1, act_type=:relu)
fc2 = mx.FullyConnected(data = act1, name=:fc2, num_hidden=64)
2 changes: 1 addition & 1 deletion docs/src/user-guide/overview.md
@@ -59,7 +59,7 @@ The followings are common ways to create NDArray objects:

- `mx.empty(shape[, context])`: create on uninitialized array of a
given shape on a specific device. For example,
`` mx.empty(2,3)`, `mx.((2,3), mx.gpu(2)) ``.
`mx.empty(2,3)`, `mx.empty((2,3), mx.gpu(2))`.
- `mx.zeros(shape[, context])` and `mx.ones(shape[, context])`:
similar to the Julia's built-in `zeros` and `ones`.
- `mx.copy(jl_arr, context)`: copy the contents of a Julia `Array` to
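For illustration, a minimal sketch of the constructors listed above, assuming the `MXNet` package is loaded; the shapes and the GPU id are only examples:

```julia
using MXNet

a = mx.empty(2, 3)              # uninitialized 2x3 array on the default device
b = mx.zeros(2, 3)              # like Julia's built-in zeros
c = mx.ones((2, 3), mx.cpu())   # like Julia's built-in ones, on an explicit context

jl_arr = rand(2, 3)
d = mx.copy(jl_arr, mx.cpu())   # copy a Julia Array onto a device
```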
8 changes: 4 additions & 4 deletions src/callback.jl
@@ -28,7 +28,7 @@ end
"""
every_n_batch(callback :: Function, n :: Int; call_on_0 = false)
A convenient function to construct a callback that runs every ``n`` mini-batches.
A convenient function to construct a callback that runs every `n` mini-batches.
# Arguments
* `call_on_0::Bool`: keyword argument, default false. Unless set, the callback
@@ -64,7 +64,7 @@ end
"""
speedometer(; frequency=50)
Create an :class:`AbstractBatchCallback` that measure the training speed
Create an `AbstractBatchCallback` that measures the training speed
(number of samples processed per second) every k mini-batches.
# Arguments
@@ -95,7 +95,7 @@ end
"""
every_n_epoch(callback :: Function, n :: Int; call_on_0 = false)
A convenient function to construct a callback that runs every ``n`` full data-passes.
A convenient function to construct a callback that runs every `n` full data-passes.
* Int call_on_0: keyword argument, default false. Unless set, the callback
will **not** be run on epoch 0. Epoch 0 means no training has been performed
@@ -120,7 +120,7 @@ end
"""
do_checkpoint(prefix; frequency=1, save_epoch_0=false)
Create an :class:`AbstractEpochCallback` that save checkpoints of the model to disk.
Create an `AbstractEpochCallback` that saves checkpoints of the model to disk.
The checkpoints can be loaded back later on.
# Arguments
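For illustration, a usage sketch of the callbacks documented above. The `model`, `optimizer`, and `train_data` names are placeholders, and the `callbacks` keyword of `fit` is documented in `src/model.jl` further down:

```julia
# Run a custom callback every 10 mini-batches.
batch_cb = mx.every_n_batch(10) do state
    println("mini-batch callback fired")    # `state` carries the training progress
end

speed_cb = mx.speedometer(frequency = 50)        # report samples/sec every 50 batches
ckpt_cb  = mx.do_checkpoint("checkpoints/net")   # save a checkpoint every epoch

# Callbacks are attached to training via the `callbacks` keyword of `fit`:
# mx.fit(model, optimizer, train_data, callbacks = [batch_cb, speed_cb, ckpt_cb])
```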
2 changes: 1 addition & 1 deletion src/context.jl
@@ -19,7 +19,7 @@ end
"""
cpu(dev_id)
Get a CPU context with a specific id. ``cpu()`` is usually the default context for many
Get a CPU context with a specific id. `cpu()` is usually the default context for many
operations when no context is specified.
# Arguments
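A small sketch of context selection based on the docstring above; the GPU id is only an example:

```julia
ctx_default = mx.cpu()    # the usual default context
ctx_gpu     = mx.gpu(2)   # a specific GPU, if one is available

x = mx.zeros((2, 3), ctx_default)   # place an NDArray on an explicit context
```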
6 changes: 3 additions & 3 deletions src/executor.jl
@@ -1,7 +1,7 @@
"""
Executor
An executor is a realization of a symbolic architecture defined by a :class:`SymbolicNode`.
An executor is a realization of a symbolic architecture defined by a `SymbolicNode`.
The actual forward and backward computation specified by the network architecture can
be carried out with an executor.
"""
@@ -68,12 +68,12 @@ end
"""
bind(sym, ctx, args; args_grad=Dict(), aux_states=Dict(), grad_req=GRAD_WRITE)
Create an :class:`Executor` by binding a :class:`SymbolicNode` to concrete :class:`NDArray`.
Create an `Executor` by binding a `SymbolicNode` to concrete `NDArray`.
# Arguments
* `sym::SymbolicNode`: the network architecture describing the computation graph.
* `ctx::Context`: the context on which the computation should run.
* `args`: either a list of :class:`NDArray` or a dictionary of name-array pairs. Concrete
* `args`: either a list of `NDArray` or a dictionary of name-array pairs. Concrete
arrays for all the inputs in the network architecture. The inputs typically include
network parameters (weights, bias, filters, etc.), data and labels. See :func:`list_arguments`
and :func:`infer_shape`.
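A rough sketch of `bind` following the signature above. The tiny network and all shapes are illustrative; in practice `list_arguments` and `infer_shape` (referenced in the docstring) give the exact argument names and shapes:

```julia
data = mx.Variable(:data)
net  = mx.FullyConnected(data = data, name = :fc1, num_hidden = 28)

# One concrete NDArray per network input (28 features, batch size 10).
args = Dict(:data       => mx.zeros((28, 10)),
            :fc1_weight => mx.zeros((28, 28)),
            :fc1_bias   => mx.zeros((28,)))

exec = mx.bind(net, mx.cpu(), args)
mx.forward(exec)   # carry out the forward computation described by `net`
```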
10 changes: 5 additions & 5 deletions src/initializer.jl
@@ -64,7 +64,7 @@ end
"""
UniformInitializer(scale=0.07)
Construct a :class:`UniformInitializer` with the specified scale.
Construct a `UniformInitializer` with the specified scale.
"""
UniformInitializer() = UniformInitializer(0.07)

@@ -84,7 +84,7 @@ end
"""
NormalIninitializer(; mu=0, sigma=0.01)
Construct a :class:`NormalInitializer` with mean ``mu`` and variance ``sigma``.
Construct a `NormalInitializer` with mean `mu` and variance `sigma`.
"""
NormalInitializer(; mu=0, sigma=0.01) = NormalInitializer(mu, sigma)

@@ -106,9 +106,9 @@ a normal distribution with μ = 0 and σ² or a uniform distribution from -σ to
Several different ways of calculating the variance are given in the literature or are
used by various libraries.
* [Bengio and Glorot 2010]: ``mx.XavierInitializer(distribution = mx.xv_uniform, regularization = mx.xv_avg, magnitude = 1)``
* [K. He, X. Zhang, S. Ren, and J. Sun 2015]: ``mx.XavierInitializer(distribution = mx.xv_gaussian, regularization = mx.xv_in, magnitude = 2)``
* caffe_avg: ``mx.XavierInitializer(distribution = mx.xv_uniform, regularization = mx.xv_avg, magnitude = 3)``
* [Bengio and Glorot 2010]: `mx.XavierInitializer(distribution = mx.xv_uniform, regularization = mx.xv_avg, magnitude = 1)`
* [K. He, X. Zhang, S. Ren, and J. Sun 2015]: `mx.XavierInitializer(distribution = mx.xv_gaussian, regularization = mx.xv_in, magnitude = 2)`
* caffe_avg: `mx.XavierInitializer(distribution = mx.xv_uniform, regularization = mx.xv_avg, magnitude = 3)`
"""

@enum XavierDistribution xv_uniform xv_normal
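For reference, the three initializers documented above, constructed with the values given in their docstrings; the Xavier settings correspond to the "caffe_avg" variant listed above:

```julia
init_uniform = mx.UniformInitializer(0.07)                  # the documented default scale
init_normal  = mx.NormalInitializer(mu = 0, sigma = 0.01)
init_xavier  = mx.XavierInitializer(distribution = mx.xv_uniform,
                                    regularization = mx.xv_avg,
                                    magnitude = 3)
```

Any of these can be passed as the `initializer` argument of `init_model` or `fit` (see `src/model.jl` below).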
42 changes: 21 additions & 21 deletions src/io.jl
@@ -25,7 +25,7 @@ The root type for all data provider. A data provider should implement the follow
training stage, both *data* and *label* will be feeded into the model, while during
prediction stage, only *data* is loaded. Otherwise, they could be anything, with any names, and
of any shapes. The provided data and label names here should match the input names in a target
:class:`SymbolicNode`.
`SymbolicNode`.
A data provider should also implement the Julia iteration interface, in order to allow iterating
through the data set. The provider will be called in the following way:
@@ -48,7 +48,7 @@ The root type for all data provider. A data provider should implement the follow
By default, :func:`eachbatch` simply returns the provider itself, so the iterator interface
is implemented on the provider type itself. But the extra layer of abstraction allows us to
implement a data provider easily via a Julia ``Task`` coroutine. See the
implement a data provider easily via a Julia `Task` coroutine. See the
data provider defined in :doc:`the char-lstm example
</tutorial/char-lstm>` for an example of using coroutine to define data
providers.
@@ -58,7 +58,7 @@ The detailed interface functions for the iterator API is listed below:
Base.eltype(provider) -> AbstractDataBatch
:param AbstractDataProvider provider: the data provider.
:return: the specific subtype representing a data batch. See :class:`AbstractDataBatch`.
:return: the specific subtype representing a data batch. See `AbstractDataBatch`.
Base.start(provider) -> AbstractDataProviderState
@@ -91,7 +91,7 @@ case, you can safely assume that
not be called.
With those assumptions, it will be relatively easy to adapt any existing iterator. See the implementation
of the built-in :class:`MXDataProvider` for example.
of the built-in `MXDataProvider` for example.
.. caution::
@@ -137,7 +137,7 @@ abstract AbstractDataProviderState
:return: a vector of data in this batch, should be in the same order as declared in
:func:`provide_data() <AbstractDataProvider.provide_data>`.
The last dimension of each :class:`NDArray` should always match the batch_size, even when
The last dimension of each `NDArray` should always match the batch_size, even when
:func:`count_samples` returns a value less than the batch size. In this case,
the data provider is free to pad the remaining contents with any value.
@@ -167,7 +167,7 @@ abstract AbstractDataProviderState
:type targets: Vector{Vector{SlicedNDArray}}
The targets is a list of the same length as number of data provided by this provider.
Each element in the list is a list of :class:`SlicedNDArray`. This list described a
Each element in the list is a list of `SlicedNDArray`. This list described a
spliting scheme of this data batch into different slices, each slice is specified by
a slice-ndarray pair, where *slice* specify the range of samples in the mini-batch
that should be loaded into the corresponding *ndarray*.
@@ -189,7 +189,7 @@ abstract AbstractDataBatch
"""
DataBatch
A basic subclass of :class:`AbstractDataBatch`, that implement the interface by
A basic subclass of `AbstractDataBatch` that implements the interface by
accessing member fields.
"""
type DataBatch <: AbstractDataBatch
@@ -204,7 +204,7 @@ get_label{Provider<:AbstractDataProvider}(::Provider, batch :: DataBatch) = batc
"""
SlicedNDArray
A alias type of ``Tuple{UnitRange{Int},NDArray}``.
An alias type of `Tuple{UnitRange{Int},NDArray}`.
"""
typealias SlicedNDArray Tuple{UnitRange{Int},NDArray}

@@ -257,7 +257,7 @@ eachbatch(provider :: AbstractDataProvider) = provider
"""
ArrayDataProvider
A convenient tool to iterate :class:`NDArray` or Julia ``Array``.
A convenient tool to iterate `NDArray` or Julia `Array`.
"""
type ArrayDataProvider <: AbstractDataProvider
data_arrays :: Vector{Array{MX_float}}
@@ -277,26 +277,26 @@ end
"""
ArrayDataProvider(data[, label]; batch_size, shuffle, data_padding, label_padding)
Construct a data provider from :class:`NDArray` or Julia Arrays.
Construct a data provider from `NDArray` or Julia Arrays.
:param data: the data, could be
- a :class:`NDArray`, or a Julia Array. This is equivalent to ``:data => data``.
- a name-data pair, like ``:mydata => array``, where ``:mydata`` is the name of the data
and ``array`` is an :class:`NDArray` or a Julia Array.
- a `NDArray`, or a Julia Array. This is equivalent to `:data => data`.
- a name-data pair, like `:mydata => array`, where `:mydata` is the name of the data
and `array` is an `NDArray` or a Julia Array.
- a list of name-data pairs.
:param label: the same as the ``data`` parameter. When this argument is omitted, the constructed
:param label: the same as the `data` parameter. When this argument is omitted, the constructed
provider will provide no labels.
:param Int batch_size: the batch size, default is 0, which means treating the whole array as a
single mini-batch.
:param Bool shuffle: turn on if the data should be shuffled at every epoch.
:param Real data_padding: when the mini-batch goes beyond the dataset boundary, there might
be less samples to include than a mini-batch. This value specify a scalar to pad the
contents of all the missing data points.
:param Real label_padding: the same as ``data_padding``, except for the labels.
:param Real label_padding: the same as `data_padding`, except for the labels.
TODO: remove ``data_padding`` and ``label_padding``, and implement rollover that copies
TODO: remove `data_padding` and `label_padding`, and implement rollover that copies
the last or first several training samples to feed the padding.
"""
# Julia's type system is sometimes very frustrating. You cannot specify a function
@@ -563,16 +563,16 @@ function _define_data_iter_creator(hdr :: MX_handle; gen_docs::Bool=false)

if gen_docs
if endswith(string(iter_name), "Iter")
f_desc = "Can also be called with the alias ``$(string(iter_name)[1:end-4] * "Provider")``.\n"
f_desc = "Can also be called with the alias `$(string(iter_name)[1:end-4] * "Provider")`.\n"
else
f_desc = ""
end
f_desc *= unsafe_string(ref_desc[]) * "\n\n"
f_desc *= ":param Base.Symbol data_name: keyword argument, default ``:data``. The name of the data.\n"
f_desc *= ":param Base.Symbol label_name: keyword argument, default ``:softmax_label``. " *
"The name of the label. Could be ``nothing`` if no label is presented in this dataset.\n\n"
f_desc *= ":param Base.Symbol data_name: keyword argument, default `:data`. The name of the data.\n"
f_desc *= ":param Base.Symbol label_name: keyword argument, default `:softmax_label`. " *
"The name of the label. Could be `nothing` if no label is presented in this dataset.\n\n"
f_desc *= _format_docstring(Int(ref_narg[]), ref_arg_names, ref_arg_types, ref_arg_descs)
f_desc *= ":return: the constructed :class:`MXDataProvider`."
f_desc *= ":return: the constructed `MXDataProvider`."
return (iter_name, f_desc)
end

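A sketch of the `ArrayDataProvider` described above, wrapping plain Julia arrays; the data is made up, and the last dimension is the sample dimension:

```julia
x = rand(Float32, 2, 6)          # 2 features x 6 samples
y = Float32[0, 1, 0, 1, 0, 1]    # one label per sample

provider = mx.ArrayDataProvider(:data => x, :label => y,
                                batch_size = 2, shuffle = true)

for batch in mx.eachbatch(provider)
    data  = mx.get_data(provider, batch)    # NDArrays in the order of provide_data
    label = mx.get_label(provider, batch)   # NDArrays in the order of provide_label
end
```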
4 changes: 2 additions & 2 deletions src/metric.jl
@@ -22,8 +22,8 @@ interfaces.
Get the accumulated metrics.
:return: ``Vector{Tuple{Base.Symbol, Real}}``, a list of name-value pairs. For
example, ``[(:accuracy, 0.9)]``.
:return: `Vector{Tuple{Base.Symbol, Real}}`, a list of name-value pairs. For
example, `[(:accuracy, 0.9)]`.
"""
abstract AbstractEvalMetric

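A small illustration of the return format documented above, assuming the accessor is called as `get(metric)` on a concrete `AbstractEvalMetric` (for example the `Accuracy()` default mentioned in `src/model.jl` below):

```julia
# get(metric) returns the accumulated name/value pairs, e.g. [(:accuracy, 0.9)].
for (name, value) in get(metric)
    println(name, " = ", value)
end
```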
44 changes: 22 additions & 22 deletions src/model.jl
@@ -48,9 +48,9 @@ end
FeedForward(arch :: SymbolicNode, ctx)
* arch: the architecture of the network constructed using the symbolic API.
* ctx: the devices on which this model should do computation. It could be a single :class:`Context`
or a list of :class:`Context` objects. In the latter case, data parallelization will be used
for training. If no context is provided, the default context ``cpu()`` will be used.
* ctx: the devices on which this model should do computation. It could be a single `Context`
or a list of `Context` objects. In the latter case, data parallelization will be used
for training. If no context is provided, the default context `cpu()` will be used.
"""
function FeedForward(arch :: SymbolicNode; context :: Union{Context, Vector{Context}, Void} = nothing)
if isa(context, Void)
@@ -74,7 +74,7 @@ end
* AbstractInitializer initializer: an initializer describing how the weights should be initialized.
* Bool overwrite: keyword argument, force initialization even when weights already exists.
* input_shapes: the shape of all data and label inputs to this model, given as keyword arguments.
For example, ``data=(28,28,1,100), label=(100,)``.
For example, `data=(28,28,1,100), label=(100,)`.
"""
function init_model(self :: FeedForward, initializer :: AbstractInitializer; overwrite::Bool=false, input_shapes...)
# all arg names, including data, label, and parameters
@@ -177,12 +177,12 @@ end
* FeedForward self: the model.
* AbstractDataProvider data: the data to perform prediction on.
* Bool overwrite: an :class:`Executor` is initialized the first time predict is called. The memory
allocation of the :class:`Executor` depends on the mini-batch size of the test
* Bool overwrite: an `Executor` is initialized the first time predict is called. The memory
allocation of the `Executor` depends on the mini-batch size of the test
data provider. If you call predict twice with data provider of the same batch-size,
then the executor can be potentially be re-used. So, if ``overwrite`` is false,
we will try to re-use, and raise an error if batch-size changed. If ``overwrite``
is true (the default), a new :class:`Executor` will be created to replace the old one.
then the executor can be potentially be re-used. So, if `overwrite` is false,
we will try to re-use, and raise an error if batch-size changed. If `overwrite`
is true (the default), a new `Executor` will be created to replace the old one.
.. note::
@@ -196,9 +196,9 @@ end
.. note::
If you perform further after prediction. The weights are not automatically synchronized if ``overwrite``
If you perform further after prediction. The weights are not automatically synchronized if `overwrite`
is set to false and the old predictor is re-used. In this case
setting ``overwrite`` to true (the default) will re-initialize the predictor the next time you call
setting `overwrite` to true (the default) will re-initialize the predictor the next time you call
predict and synchronize the weights again.
:seealso: :func:`train`, :func:`fit`, :func:`init_model`, :func:`load_checkpoint`
@@ -319,28 +319,28 @@ end
"""
fit(model :: FeedForward, optimizer, data; kwargs...)
Train the ``model`` on ``data`` with the ``optimizer``.
Train the `model` on `data` with the `optimizer`.
* FeedForward model: the model to be trained.
* AbstractOptimizer optimizer: the optimization algorithm to use.
* AbstractDataProvider data: the training data provider.
* Int n_epoch: default 10, the number of full data-passes to run.
* AbstractDataProvider eval_data: keyword argument, default ``nothing``. The data provider for
* AbstractDataProvider eval_data: keyword argument, default `nothing`. The data provider for
the validation set.
* AbstractEvalMetric eval_metric: keyword argument, default ``Accuracy()``. The metric used
to evaluate the training performance. If ``eval_data`` is provided, the same metric is also
* AbstractEvalMetric eval_metric: keyword argument, default `Accuracy()`. The metric used
to evaluate the training performance. If `eval_data` is provided, the same metric is also
calculated on the validation set.
* kvstore: keyword argument, default ``:local``. The key-value store used to synchronize gradients
* kvstore: keyword argument, default `:local`. The key-value store used to synchronize gradients
and parameters when multiple devices are used for training.
:type kvstore: :class:`KVStore` or ``Base.Symbol``
* AbstractInitializer initializer: keyword argument, default ``UniformInitializer(0.01)``.
:type kvstore: `KVStore` or `Base.Symbol`
* AbstractInitializer initializer: keyword argument, default `UniformInitializer(0.01)`.
* Bool force_init: keyword argument, default false. By default, the random initialization using the
provided ``initializer`` will be skipped if the model weights already exists, maybe from a previous
provided `initializer` will be skipped if the model weights already exists, maybe from a previous
call to :func:`train` or an explicit call to :func:`init_model` or :func:`load_checkpoint`. When
this option is set, it will always do random initialization at the begining of training.
* callbacks: keyword argument, default ``[]``. Callbacks to be invoked at each epoch or mini-batch,
see :class:`AbstractCallback`.
:type callbacks: ``Vector{AbstractCallback}``
* callbacks: keyword argument, default `[]`. Callbacks to be invoked at each epoch or mini-batch,
see `AbstractCallback`.
:type callbacks: `Vector{AbstractCallback}`
"""
function fit(self :: FeedForward, optimizer :: AbstractOptimizer, data :: AbstractDataProvider; kwargs...)
opts = TrainingOptions(; kwargs...)
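Putting the documented pieces together, a hedged end-to-end sketch. `mx.SoftmaxOutput` and `mx.SGD` are assumed names for the output layer and a concrete optimizer, and the toy data is made up:

```julia
# Architecture, as in docs/src/tutorial/mnist.md above.
data = mx.Variable(:data)
fc   = mx.FullyConnected(data = data, name = :fc1, num_hidden = 10)
net  = mx.SoftmaxOutput(data = fc, name = :softmax)

model = mx.FeedForward(net, context = mx.cpu())

# Toy data: 20 features x 60 samples, labels in 0:9.
x = rand(Float32, 20, 60)
y = Float32[rand(0:9) for i in 1:60]
train_data = mx.ArrayDataProvider(:data => x, :softmax_label => y, batch_size = 10)

optimizer = mx.SGD(lr = 0.1, momentum = 0.9)   # assumed optimizer constructor

mx.fit(model, optimizer, train_data,
       n_epoch   = 2,
       callbacks = [mx.speedometer(), mx.do_checkpoint("checkpoints/net")])

probs = mx.predict(model, train_data)   # one column of predictions per sample
```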