
[Relay][Training] Add more missing gradients #6767

Merged: 5 commits into apache:main, Oct 28, 2020
Conversation

@altanh (Contributor) commented Oct 26, 2020

Added the following gradients:

  • take
  • reverse_reshape
  • stack
  • squeeze
  • expand_dims
  • arange

Also fixed a typo in type solver diagnostics. I had to use a Relay loop in take_grad to support Any size in indices.
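
For reference, the shape-only gradients in this list (squeeze, expand_dims) reduce to reshaping the incoming gradient back to the input's shape. A minimal sketch of how such a registration can look in python/tvm/relay/op/_tensor_grad.py; this is illustrative only (the exact bodies in the PR may differ, the import paths are my assumption, and these ops already have gradients registered after this change):

# Sketch, not the PR's exact code: squeeze/expand_dims only drop or insert
# size-1 axes, so their data gradient is the upstream gradient reshaped
# back to the original input's shape.
from tvm.relay.op import register_gradient
from tvm.relay import reshape_like

@register_gradient("squeeze")
def squeeze_grad(orig, grad):
    # orig.args[0] is the input tensor of squeeze
    return [reshape_like(grad, orig.args[0])]

@register_gradient("expand_dims")
def expand_dims_grad(orig, grad):
    return [reshape_like(grad, orig.args[0])]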

cc @t-vi @SWu @MarisaKirisame

@jroesch (Member) commented Oct 26, 2020

@antinucleon

@antinucleon (Contributor):

Please try to pass lint locally first~



@register_gradient("take")
def take_grad(orig, grad):

Contributor:
You can get by by defining a 'put' operator that puts a scalar into an index of a tensor and leaves the other places unchanged. put and take have some classic properties which I assume will be better for the optimizer. It also allows other optimizations (e.g. put and reduce_sum): grad + (put valA at idxA in 0_array) + (put valB at idxB in 0_array) can be collapsed into a long chain of puts on grad, allowing copy-on-write to kick in, so all of the take gradient updates become in-place mutations (instead of creating another tensor).
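
To make the put/take relationship concrete, here is a NumPy-only sketch (my illustration, not Relay code and not the PR's implementation) of what the data gradient of take along axis 0 computes: a scatter-add of the upstream gradient into a zero tensor, which is exactly a chain of puts.

import numpy as np

def take_grad_axis0(x_shape, indices, grad):
    # Gradient of y = np.take(x, indices, axis=0) with respect to x:
    # scatter-add each slice of `grad` back to the row it was taken from.
    out = np.zeros(x_shape, dtype=grad.dtype)
    np.add.at(out, indices, grad)  # repeated indices must accumulate
    return out

x = np.arange(12.0).reshape(4, 3)
idx = np.array([0, 2, 0])      # row 0 is taken twice
g = np.ones((3, 3))            # upstream gradient of np.take(x, idx, axis=0)
print(take_grad_axis0(x.shape, idx, g))
# rows 1 and 3 stay zero; row 2 gets 1.0; row 0 accumulates 2.0 per element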

Contributor:
@jroesch please look and comment as well.

@altanh (Contributor, author):

This is a good point that I was wondering about. The loop is basically just implementing a put operation (like I described in the comment), so it would make sense to have it be a separate op since I imagine it will be useful in general. Should I remove this gradient for now, or keep it and replace it with put once I implement it?
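
For context, a rough sketch of what take_grad could look like if such an op existed. Here put(data, indices, values, axis) is a hypothetical name and signature, not an existing Relay op, and the gradient actually added in this PR uses the Relay loop instead:

# Hypothetical sketch: assumes a Relay `put(data, indices, values, axis)` op
# that accumulates `values` into `data` at `indices` along `axis`.
from tvm.relay import zeros_like
from tvm.relay.op import register_gradient

@register_gradient("take")
def take_grad(orig, grad):
    data, indices = orig.args
    axis = orig.attrs.axis
    # One gradient per input of take (data, indices); the integer indices
    # get a zero gradient.
    return [put(zeros_like(data), indices, grad, axis), zeros_like(indices)]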

Contributor:
Both are fine.

@tqchen merged commit 39cd612 into apache:main on Oct 28, 2020
@tqchen (Member) commented Oct 28, 2020

Merging per reviews; feel free to send follow-up improvements per @MarisaKirisame's comment.

Thanks @jroesch @altanh @antinucleon @MarisaKirisame

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Dec 2, 2020
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Dec 4, 2020
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Dec 4, 2020
5 participants