This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-1209] Tutorial transpose reshape #13208

Merged
merged 14 commits into from
Dec 14, 2018

NRauschmayr
Contributor

@NRauschmayr NRauschmayr commented Nov 9, 2018

Description

Adding a tutorial that explains the difference between reshape and transpose operators
@ThomasDelteil @Ishitori can you please have a look? Thanks!

http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-13208/3/tutorials/basic/reshape_transpose.html

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)

@NRauschmayr NRauschmayr requested a review from szha as a code owner November 9, 2018 22:47
@anirudhacharya
Member

@mxnet-label-bot add [pr-awaiting-review]

@marcoabreu marcoabreu added the pr-awaiting-review PR is waiting for code review label Nov 12, 2018

As we can see, width and height changed by rotating the pixel values by 90 degrees. Transpose does the following:

<img src="https://raw.githubusercontent.com/NRauschmayr/web-data/tutorial_transpose_reshape/mxnet/doc/tutorials/basic/transpose_reshape/transpose.png" style="width:700px;height:300px;">
Member

can you please reverify this image. Transposing this

[[ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 9. 10. 11. 12.]]

returns the following -

[[ 1.  5.  9.]
 [ 2.  6. 10.]
 [ 3.  7. 11.]
 [ 4.  8. 12.]]

But your diagram does not reflect that.
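
For reference, the point above can be checked with a minimal sketch. It uses NumPy as a stand-in for MXNet's NDArray (an assumption made so the snippet runs without MXNet installed); the variable names are illustrative only. It contrasts the transpose of the 3x4 matrix with a reshape to the same `(4, 3)` shape.

```python
import numpy as np

# The 3x4 matrix from the comment above.
a = np.arange(1, 13, dtype=np.float32).reshape(3, 4)

# Transpose swaps the axes: row i becomes column i.
transposed = a.T

# Reshape keeps the row-major element order and only regroups it.
reshaped = a.reshape(4, 3)

print(transposed)
print(reshaped)
```

Both results have shape `(4, 3)`, but only the transpose has `1, 5, 9` in its first row; the reshape has `1, 2, 3` there.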

import mxnet as mx

batch_size = 100
input_data = mx.random.uniform(shape=(20,100,batch_size))
reshaped = input_data.reshape(-1,batch_size)
print(input_data.shape, reshaped.shape)
Member

can you show the result of the print statement.
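
A rough sketch of what the output would look like, again assuming NumPy as a stand-in for MXNet's NDArray (`np.random.uniform` with `size=` replaces the MXNet call with `shape=`):

```python
import numpy as np

batch_size = 100
# 20 * 100 * 100 = 200000 elements in total.
input_data = np.random.uniform(size=(20, 100, batch_size))

# -1 asks reshape to infer that dimension: 200000 / 100 = 2000.
reshaped = input_data.reshape(-1, batch_size)

print(input_data.shape, reshaped.shape)  # (20, 100, 100) (2000, 100)
```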

reshaped = input_data.reshape(-1,batch_size)
print(input_data.shape, reshaped.shape)
```
The reshape function of [MXNet's NDArray API](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html?highlight=reshape#mxnet.ndarray.NDArray.reshape) allows even more advanced transformations. For instance, `-2` copies all remaining input dimensions to the output shape, and `-3` uses the product of two consecutive input dimensions as a single output dimension.
Member

The documentation does not describe what the values -2 and -3 do. Can this tutorial describe where and how those values are used with reshape? For example, is it with video data where we have 5-d arrays?
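
To make the special values concrete, here is a minimal pure-Python sketch of the shape inference. `infer_reshape` is a hypothetical helper written for illustration, not MXNet's implementation; it covers only the codes `0`, `-1`, `-2` and `-3` and assumes each plain positive entry consumes one input dimension.

```python
from functools import reduce
from operator import mul


def infer_reshape(in_shape, spec):
    """Hypothetical helper: compute the output shape for a reshape spec.

    Supported codes (a subset, for illustration only):
      > 0  use the value as given
        0  copy the matching input dimension
       -1  infer this dimension from the remaining element count
       -2  copy all remaining input dimensions
       -3  merge two consecutive input dimensions into their product
    """
    out = []
    src = 0           # position in in_shape consumed so far
    infer_pos = None  # where the -1 placeholder sits in out
    for code in spec:
        if code > 0:
            out.append(code)
            src += 1
        elif code == 0:
            out.append(in_shape[src])
            src += 1
        elif code == -1:
            if infer_pos is not None:
                raise ValueError("only one dimension can be inferred")
            infer_pos = len(out)
            out.append(1)  # placeholder, filled in below
            src += 1
        elif code == -2:
            out.extend(in_shape[src:])
            src = len(in_shape)
        elif code == -3:
            out.append(in_shape[src] * in_shape[src + 1])
            src += 2
        else:
            raise ValueError("unsupported code: %d" % code)

    total = reduce(mul, in_shape, 1)
    if infer_pos is not None:
        out[infer_pos] = total // reduce(mul, out, 1)
    if reduce(mul, out, 1) != total:
        raise ValueError("element count does not match")
    return tuple(out)


print(infer_reshape((20, 100, 100), (-1, 100)))  # (2000, 100)
print(infer_reshape((2, 3, 4), (-2,)))           # (2, 3, 4)
print(infer_reshape((2, 3, 4), (-3, 4)))         # (6, 4)
print(infer_reshape((2, 3, 4, 5), (-3, -3)))     # (6, 20)
```

With a 5-d input such as `(batch, frames, channels, height, width)`, a spec of `(-3, -2)` would fold batch and frames into one dimension and keep the rest unchanged.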

This was just a toy example, but such transformations are used, for instance, in image super-resolution, where you increase the width and height of the input image and ```x``` would be the output of a CNN that computes an upscaled feature vector.
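
As an illustration of that use case, here is a minimal sketch of a pixel-shuffle style upscaling step that combines reshape and transpose. It is not the tutorial's own example: it uses NumPy rather than MXNet, and `pixel_shuffle`, the upscale factor `r` and the input shape are all assumptions chosen for the sketch.

```python
import numpy as np


def pixel_shuffle(x, r):
    """Rearrange (N, C*r*r, H, W) into (N, C, H*r, W*r).

    Hypothetical helper for illustration only.
    """
    n, c_r2, h, w = x.shape
    c = c_r2 // (r * r)
    # Split the channel axis into (C, r, r).
    x = x.reshape(n, c, r, r, h, w)
    # Move each r next to the spatial axis it will be merged into.
    x = x.transpose(0, 1, 4, 2, 5, 3)  # (N, C, H, r, W, r)
    # Merge (H, r) and (W, r) into the upscaled height and width.
    return x.reshape(n, c, h * r, w * r)


x = np.arange(1 * 4 * 2 * 3).reshape(1, 4, 2, 3)
y = pixel_shuffle(x, 2)
print(x.shape, y.shape)  # (1, 4, 2, 3) (1, 1, 4, 6)
```

The transpose is what places the values correctly; reshaping `x` directly to `(1, 1, 4, 6)` would produce the right shape with the wrong pixel arrangement.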

#### Check out the MXNet documentation for more details
http://mxnet.incubator.apache.org/test/api/python/ndarray.html#mxnet.ndarray.NDArray.reshape
Member

This shows as just text, it should be a clickable link.

But a bigger point is that I don't think we need this section at all. Can these two documentation links be added as hyperlinks to the first occurrence of the terms "Reshape" and "Transpose" at the beginning of the tutorial, and this section removed?

You are saying, "go to the documentation for more details", but I think this tutorial contains more details than the documentation. :)

can you also please update your PR description with this link - http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-13208/3/tutorials/basic/reshape_transpose.html. It will help reviewers, especially when reviewing tutorials.

@anirudhacharya
Member

There are a bunch of links pointing to a personal repo instead of dmlc/web-data. Please replace those.

The rest LGTM.

@simoncorstonoliver
Contributor

simoncorstonoliver commented Nov 14, 2018

It would be helpful for users to include some discussion of the common errors that they see e.g. when there's a tensor shape mismatch between layers of a NN. If we included the actual error message then people would find this tutorial when they Googled for the error message. This would segue nicely to the example where you show how you can't just perform these operations to make the error go away; you have to actually know what you're doing.

For users who already know what tensor reshaping is, the information they need (the "how" rather than the "why") is in a few non-obvious places. Maybe add something near the top with pointers to ndarray docs for more comprehensive documentation: https://mxnet.incubator.apache.org/tutorials/basic/ndarray.html

Some of the cases of users asking about mismatch errors on the discussion forum might give good background for the confusion of people who are new to such operations.

@NRauschmayr
Contributor Author

I added more examples e.g. common pitfalls and errors
I also changed a couple of minor things.

@@ -0,0 +1,190 @@

## Difference between reshape and transpose operators
Modyfing the shape of tensors is a very common operation in Deep Learning. For instance, when using pretrained neural networks it is often required to adjust input data dimensions to correspond to what the network has been trained on, e.g. tensors of shape `[batch_size, channels, width, height]`. This notebook discusses briefly the difference between the operators [Reshape](http://mxnet.incubator.apache.org/test/api/python/ndarray.html#mxnet.ndarray.NDArray.reshape) and [Transpose](http://mxnet.incubator.apache.org/test/api/python/ndarray.html#mxnet.ndarray.transpose). Both allow to change the shape, however they are not the same and are commonly mistaken.
Contributor

Modyfing -> Modifying

often required to adjust input data dimension --> often necessary to adjust the input data dimension

Both allow to --> Both allow you to

* height: 200 pixels
* colors: 3 (RGB)

Now let's reshape the image in order to exchange width and height dimension.
Contributor

width and height dimensions

![png](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/tutorials/basic/transpose_reshape/reshaped_image.png) <!--notebook-skip-line-->


As we can see the first and second dimensions have changed. However the image can't be identified as cat anylonger. In order to understand what happened, let's have a look at the image below.
Contributor

any longer


<img src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/tutorials/basic/transpose_reshape/transpose.png" style="width:700px;height:300px;">

As shown in the diagram, the axis have been flipped: pixel values that have been in the first row are now in the first column.
Contributor

@simoncorstonoliver simoncorstonoliver Nov 21, 2018

the axis have --> the axes have

that have been --> that were

## When to transpose/reshape with MXNet
In this chapter we discuss when transpose and reshape are used in MXNet.
#### Channel first for images
Images are usually stored in the format height, width, channel. When working with [convolutional](https://mxnet.incubator.apache.org/api/python/gluon/nn.html#mxnet.gluon.nn.Conv1D) layers, MXNet expects the layout to be `NCHW` (batch, channel, height, width). MXNet uses this layout for performance reasons on the GPU. Consequently, images need to be transposed to have the right format. For instance, you may have a function like the following:
Contributor

worth calling out that this channel ordering is different from TF?
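
A minimal sketch of the layout change under discussion, using NumPy as a stand-in for MXNet's NDArray. The image width of 300 and all variable names are assumptions for illustration. It converts a height-width-channel image (the channels-last ordering that TensorFlow uses by default) to `NCHW`, and shows that a reshape to the same shape is not equivalent.

```python
import numpy as np

# Hypothetical image size; the width of 300 is an assumption.
height, width, channels = 200, 300, 3
hwc = np.arange(height * width * channels).reshape(height, width, channels)

# Transpose moves the channel axis to the front: HWC -> CHW.
chw = hwc.transpose(2, 0, 1)

# Add a leading batch axis to obtain NCHW.
nchw = chw[np.newaxis, ...]

# Reshape yields the same shape but scrambles the pixel values.
wrong = hwc.reshape(channels, height, width)

print(nchw.shape)                   # (1, 3, 200, 300)
print(np.array_equal(chw, wrong))   # False
```

After the transpose, `chw[c, y, x]` holds the same value as `hwc[y, x, c]`, which is the property a convolution layer relies on.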

```
(1, 999, 128) <!--notebook-skip-line-->
#### Advanced reshaping with MXNet ndarrays
It is sometimes useful to automatically infer the shape of tensors. Especially when you deal with very deep neural networks, it may not always be clear what the shape of a tensor is after a specific layer. For instance you may want the tensor to be two-dimensional where one dimension is the known batch_size. With ```reshape(-1, batch_size)``` the first dimension will be automatically inferred. Here a simplified example:
Contributor

Here a simplified --> Here is a simplified

[...]

```
This is happening when your data does not have the shape ```[batch_size, channel, width, height]```, e.g. your data may be a one-dimensional vector, or the color channel may be the last dimension instead of the second one.
Contributor

This is happening --> This happens

@vandanavk
Contributor

LGTM.

@NRauschmayr Could you trigger the CI again?

@simoncorstonoliver @ThomasDelteil for another round of review

@@ -1,7 +1,6 @@
## Difference between reshape and transpose operators

What does it mean if MXNet gives you an error like this?

Contributor

Hi @NRauschmayr , There's also another way of pushing an empty commit in case you need to re-trigger the CI.
You can try git commit --allow-empty -m "<commit-message>" :)

@NRauschmayr
Contributor Author

Can we merge?

@ThomasDelteil
Contributor

@NRauschmayr checking with @marcoabreu why the build status was not propagated back to the PR status check, which is necessary for merging

@NRauschmayr
Contributor Author

Any update on this?

@ThomasDelteil ThomasDelteil merged commit 77fe96e into apache:master Dec 14, 2018
7 participants