DataBatch and NDArrayIter doc modified #6091
python/mxnet/io.py
Outdated
@@ -80,7 +80,22 @@ def get_list(shapes, types):
return [DataDesc(x[0], x[1]) for x in shapes]

class DataBatch(object):
"""A data batch.
"""Returns a batch of data.
Since this is a class, you are not describing what it does but what it encapsulates.
python/mxnet/io.py
Outdated
"""A data batch.
"""Returns a batch of data.

MXNet's data iterator returns a batch of data in each `next` call.
in -> for
python/mxnet/io.py
Outdated
If not provided, the order of arg_names of the executor is assumed.
When working with Module this is the order of the data_names argument.
The *i*-th element describes the name and shape of ``data[i]``.
If not provided, by default the order of `arg_names` of the executor is assumed.
comma after default.
python/mxnet/io.py
Outdated
"""Returns a batch of data.

MXNet's data iterator returns a batch of data in each `next` call.
This data often contains `batch_size` number of examples.
Can this ever be different from batch_size?
python/mxnet/io.py
Outdated
examples read is less than the batch size.
The number of examples padded at the end of a batch. It is used when the
total number of examples read is not divisible by the `batch_size`.
These are ignored in the result.
"These extra padded examples are ignored during processing."
Is the above a correct statement to make?
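To make the padding question concrete, here is a minimal pure-Python sketch of how padded examples could be dropped downstream. It is not MXNet's implementation: `iter_padded_batches` and `valid_examples` are hypothetical helpers, and filling the last batch by wrapping around to the start of the data is an assumption made for illustration.

```python
def iter_padded_batches(examples, batch_size):
    """Yield (batch, pad) pairs; hypothetical helper, not an MXNet API.

    The last batch is filled up to batch_size by wrapping around to the
    start of the data (an assumption), and pad reports how many filler
    examples the batch holds.
    """
    n = len(examples)
    for start in range(0, n, batch_size):
        batch = list(examples[start:start + batch_size])
        pad = batch_size - len(batch)
        # Fill a short final batch with examples from the beginning.
        batch.extend(examples[i % n] for i in range(pad))
        yield batch, pad


def valid_examples(batch, pad):
    """Drop the trailing filler examples so they do not affect results."""
    return batch[:len(batch) - pad]


batches = list(iter_padded_batches(list(range(10)), batch_size=3))
last_batch, last_pad = batches[-1]
```

Under these assumptions every batch has exactly `batch_size` entries, and only the first `batch_size - pad` of them are real examples, which is the sense in which the padded ones are "ignored during processing".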
python/mxnet/io.py
Outdated
MXNet's data iterator returns a batch of data for each `next` call.
This data contains `batch_size` number of examples.

If the input data consists of images then, these images should be stored in a
comma issue -> " If the input data consists of images, then these images..."
python/mxnet/io.py
Outdated
>>> labels = np.ones([10, 1])
>>> dataiter = mx.io.NDArrayIter(datas, labels, 3, True, last_batch_handle='discard')
>>> dataiter
<mxnet.io.NDArrayIter object at 0x10bb2fd90>
Is it descriptive here to have an example where every component of every example has value 1?
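One way to see the concern: with distinct values, the examples that land in each batch become identifiable. A small sketch in plain Python, where `discard_batches` is a hypothetical stand-in for iterating with `last_batch_handle='discard'` rather than an MXNet call:

```python
def discard_batches(examples, batch_size):
    """Split examples into full batches and drop an incomplete final one.

    Hypothetical stand-in for illustration; not an MXNet API.
    """
    full = len(examples) // batch_size
    return [examples[i * batch_size:(i + 1) * batch_size] for i in range(full)]


ones = [1] * 10             # every example looks the same
distinct = list(range(10))  # every example is identifiable

ones_batches = discard_batches(ones, 3)
distinct_batches = discard_batches(distinct, 3)
```

With the distinct data it is visible that the tenth example was discarded, which the all-ones data cannot show.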
python/mxnet/io.py
Outdated
>>> for batch in dataiter:
... batchidx += 1
...
>>> batchidx # Padding added after the examples read are over
can we make this more clear? "Padding added after the examples read are over"
python/mxnet/io.py
Outdated
This data contains `batch_size` number of examples.

If the input data consists of images then, these images should be stored in a
4-D matrix of shape ``(batch_size, num_channel, height, width)``.
This depends on the layout. If provide_data gives DataDesc(layout='NHWC'), then it's (batch_size, height, width, num_channel).
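The dependence on layout can be sketched as a mapping from the layout string to the order of the shape tuple. `image_batch_shape` below is a hypothetical helper written for this illustration, not part of MXNet, and the sizes are arbitrary:

```python
def image_batch_shape(layout, batch_size, num_channel, height, width):
    """Return the 4-D shape tuple implied by a layout string such as 'NCHW'.

    Hypothetical helper for illustration; not an MXNet API.
    """
    sizes = {'N': batch_size, 'C': num_channel, 'H': height, 'W': width}
    # The layout must name each of the four axes exactly once.
    if sorted(layout) != sorted(sizes):
        raise ValueError("layout must be a permutation of 'NCHW': %r" % (layout,))
    return tuple(sizes[axis] for axis in layout)


nchw = image_batch_shape('NCHW', 32, 3, 224, 224)
nhwc = image_batch_shape('NHWC', 32, 3, 224, 224)
```

Here `nchw` has the channel axis second and `nhwc` has it last, matching the two shapes discussed in this thread.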
python/mxnet/io.py
Outdated

Example usage:
----------
>>> class CustomBatch(object):
No we don't want users to do this. Use mx.io.DataBatch
python/mxnet/io.py
Outdated
index : numpy.array, optional
The example indices in this batch.
bucket_key : int, optional
The key of the bucket, used for bucket IO.
The bucket key, used for bucketing module.
provide_data : list of (name, shape), optional
This is deprecated.
It should be a list of DataDesc now.
Addressed all comments.
python/mxnet/io.py
Outdated
When working with Module this is the order of the label_names argument.
The bucket key, used for bucketing module.
provide_data : list of `DataDesc`, optional
A list of `DataDesc` objects having attributes
explain what the DataDescs are for
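As a rough illustration of what such descriptions are for, the sketch below pairs the i-th description with the i-th array of a batch. `Desc` is a hypothetical namedtuple stand-in for `mx.io.DataDesc` that carries the name, shape, type and layout fields mentioned in this PR; `describe` and the nested lists standing in for NDArrays are likewise assumptions.

```python
from collections import namedtuple

# Hypothetical stand-in for illustration; the real class is mx.io.DataDesc.
Desc = namedtuple('Desc', ['name', 'shape', 'dtype', 'layout'])

batch_size = 3
# Plain nested lists stand in for the NDArrays held in DataBatch.data.
data = [
    [[0.0] * 4 for _ in range(batch_size)],  # data[0]: shape (3, 4)
]
provide_data = [Desc('data', (batch_size, 4), 'float32', 'NC')]


def describe(descs, arrays):
    """Pair the i-th description with the i-th array, checking the batch axis."""
    summary = []
    for desc, array in zip(descs, arrays):
        batch_axis = desc.layout.find('N')
        # Only the leading-batch-axis case is checked in this sketch.
        if batch_axis == 0 and len(array) != desc.shape[0]:
            raise ValueError('%s: batch size does not match shape' % desc.name)
        summary.append('%s %s %s' % (desc.name, desc.shape, desc.dtype))
    return summary


summary = describe(provide_data, data)
```

The point of the sketch is that a consumer can learn each input's name, shape, type and layout from the descriptions without inspecting the arrays themselves.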
python/mxnet/io.py
Outdated
>>> labels = np.ones([10, 1])
>>> dataiter = mx.io.NDArrayIter(data, labels, 3, True, last_batch_handle='discard')
>>> dataiter
<mxnet.io.NDArrayIter object at 0x10bb2fd90>
we generally don't need to show object pointers like this. Users know what they are because they are just constructed in previous line.
python/mxnet/io.py
Outdated
name, shape, type and layout information of the data.
provide_label : list of `DataDesc`, optional
A list of `DataDesc` objects. `DataDesc` is used to store
name, shape, type and layout information of the data.
data -> label
Keep this sentence: "The i-th element describes the ... of data[i]."
python/mxnet/io.py
Outdated
If `layout` is set to 'NHWC' then, images should be stored in a 4-D matrix
of shape ``(batch_size, height, width, num_channel)``.
The channels are often in RGB order.

Parameters
----------
data : list of NDArray
list of NDArray, each array containing batch_size examples.
python/mxnet/io.py
Outdated
If `layout` is set to 'NHWC' then, images should be stored in a 4-D matrix
of shape ``(batch_size, height, width, num_channel)``.
The channels are often in RGB order.

Parameters
----------
data : list of NDArray
A list of input data.
label : list of NDArray, optional
list of NDArray, each array often containing a 1-dimensional array.
* DataBatch and NDArrayIter doc modified * fixes after review * fixes after review * wording changed * some more fixes * improvement * desc fix * Datadesc info added * minor addition * fix * fix * fix after review
* updated docstring for set_lr_mult and set_wd_mult * updated docstring per review * Fixed imdecode crash bug when flag=0 (#6134) * Fix (#6131) * Docs for MXRecordIO, MXIndexedRecordIO modified (#6013) * docs for MXIndexedRecordIO modified * changes after review * recordIO doc modified * changes after review * lint error * minor change * minor change after review * empty commit to retrigger build * changes after review * Update documentation for mx.callback.Speedometer. (#6058) * Update documentation for mx.callback.Speedometer. * Minor doc changes. * Use module instead of model in example code. * update doc for Load (#6092) * Installation instructions for MacOS and Cloud (#6012) * Fix NDArray bool checking (#6130) * fix shape order bug (#6136) * TOC click unfold (#6133) * [doc] new sphnix plugin (#6105) * update doc * rm * update * update ndarray * update mds * update * update * update * update * update * update * update image.md and others * update * [doc] use debug mode to build (#6151) * move ctc loss to contrib (#6154) * Fix for invalid numpy float indexing (#6144) * Fix python3 compatibilities (#6143) * [doc] small changes to tutorials (#6164) * [doc] Fix left toc link (#6162) * [example]ADD practical functions and options for speech_recognition example (#6141) * ADD practical functions and options for speech_recognition example * add missing stt_bi_graphemes_util.py and deepspeech.cfg template * Added reflection padding (#6123) * Added reflection padding * Lint fix * Added 5d reflection padding * Added failure in forward/backward for input dimensions other than 4 of 5 * Improved sanity check readability * Fixing LICENSE file and adding NOTICE (#6172) * Creating NOTICE. When code moves to Apache, it will need adjusting to the Apache format. 
* Replacing source header with full license text * doc improvement - softmax, metrics, and initializer (#5945) * doc improvement, softmaxoutput, initializer-constant, minor fixes * doc improvement, metrics * fix softmax doc, fix metric lint * softmax more fixes * add doc change in initializer.py. some minor fix in softmax_cross_entropy * doc change in initializer.py * fix grammer * fix * fix * fix * minor fix * fix * minor fix * DataBatch and NDArrayIter doc modified (#6091) * DataBatch and NDArrayIter doc modified * fixes after review * fixes after review * wording changed * some more fixes * improvement * desc fix * Datadesc info added * minor addition * fix * fix * fix after review * [Scala] Change version to 0.9.5-SNAPSHOT (#6173) * [scala] change version to 0.9.5-SNAPSHOT * API doc improvement Dropout and SoftmaxActivation (#6088) * doc improve for dropout oper * doc improve for SoftmaxActivation oper * fix * fix * Update documentation for mx.callback.do_checkpoint (#6059) * Update documentation for mx.callback.do_checkpoint * Use module instead of model for example code. * Update documentation for plot_graph. (#6098) * Update documentation for plot_graph. * Minor doc fix. * Restruct get started (#6167) * Change get started page * Small fix * Improve * Update documentation of Initializer.dumps() (#6128) * Doc Improvement - RMSProp and RMSPropAlex (#6107) * rmsprop * rmsprop alex * add link in optimizer.py * fix * fix * missed fix.. * Docforcs,fft,ifft (#6145) * fft.cc * add all * changed the description of set_lr_mult and set_wd_mult * Explicitly specify quiet in R install_version (#6171)
@mli @zackchase @madjam @nswamy @indhub @jiajiechen Please take a look.