Make learning rate tensor (Backend) #3287


Closed

Conversation

@spcyppt (Contributor) commented Oct 29, 2024

Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/386

Context (problem from Microve):
PT2 adds a guard on float inputs; whenever the value changes, the function is recompiled. Compilation itself is expensive, so each recompilation can take several minutes to more than 20 minutes.
In end-to-end training there is a warm-up stage in which the learning rate is gradually increased to a pre-defined value.
For example, if the final learning rate is 0.02 and the warm-up runs for 10k steps, the learning rate increases from 0 to 0.02 in increments of 0.000002 (each iteration adds 0.000002). If we let PT2 recompile on every change, it will recompile 10k times.
For a tensor, the guard is only on its shape; as long as the shape stays the same, updating the value does not trigger recompilation.
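
The minimal sketch below (not part of this PR; it assumes a generic `torch.compile`d function) illustrates the guard behavior described above: passing the learning rate as a Python float re-triggers compilation whenever the value changes, while passing it as a 0-d tensor that is updated in place reuses the compiled graph.

```python
import torch

@torch.compile
def apply_lr(grad, lr):
    # Scale a gradient by the learning rate.
    return grad * lr

grad = torch.randn(8)

# Python float: PT2 guards on the scalar value, so every warm-up step
# that changes the learning rate forces a recompilation.
for step in range(1, 4):
    apply_lr(grad, 0.000002 * step)

# 0-d tensor: the guard covers the tensor's shape, not its value, so
# updating it in place each step reuses the already-compiled graph.
lr = torch.zeros(())
for step in range(1, 4):
    lr.fill_(0.000002 * step)
    apply_lr(grad, lr)
```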


To prevent this recompilation, we change the learning rate from a float to a tensor.
This, however, affects the existing TBE frontend and backend.

We will enable the learning rate as a tensor through the new unified interface (D50481991).
For backward compatibility, the old interface (V1), i.e., `split_embedding_codegen_lookup_{{ optimizer }}_function` and `split_embedding_codegen_lookup_{{ optimizer }}_function_cpu`, will continue to take the learning rate as a `float`.

This diff:

  • makes the learning rate a tensor in codegen
  • keeps the learning rate as a float for kernel arguments
  • adds an optional argument to OptimizerArgs for the v1 signature
  • makes the old interface take the learning rate as a float and convert it to a tensor before passing it to autograd
  • converts the learning rate back to a float before passing it to the kernels (see the sketch after the type-flow diagrams below)

Old interface:

```
          python -> C++ lookup -> autograd -> backend -> kernel
lr type:  (float)     (float)     (tensor)   (tensor)   (float)
```

PT2 unified interface (D50481991):

```
          python -> C++ lookup -> autograd -> backend -> kernel
lr type:  (tensor)   (tensor)     (tensor)   (tensor)   (float)
```
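
As a rough illustration of the V1 compatibility path (a hypothetical Python sketch, not the generated FBGEMM code; `lookup_v1`, `autograd_backward`, and `kernel_update` are made-up stand-ins for the lookup, autograd, and kernel layers), the old interface wraps the float into a tensor before autograd and unwraps it again at the kernel boundary:

```python
import torch

def kernel_update(weights, grad, lr: float):
    # Kernel layer: still receives the learning rate as a plain float.
    weights.add_(grad, alpha=-lr)

def autograd_backward(weights, grad, lr_tensor: torch.Tensor):
    # Autograd/backend layers: carry the learning rate as a 0-d tensor,
    # so PT2 guards on its shape rather than its value.
    kernel_update(weights, grad, lr_tensor.item())

def lookup_v1(weights, grad, learning_rate: float):
    # Old (V1) interface: keeps the float signature for backward
    # compatibility and converts to a tensor before handing off to autograd.
    autograd_backward(weights, grad, torch.tensor(learning_rate))

weights, grad = torch.ones(4), torch.ones(4)
lookup_v1(weights, grad, 0.02)
```

Under the unified interface, the tensor arrives directly from Python, and only the final kernel boundary converts it back to a float.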

Differential Revision: D62784577

@facebook-github-bot (Contributor) commented:

This pull request was exported from Phabricator. Differential Revision: D62784577

netlify bot commented Oct 29, 2024:

Deploy Preview for pytorch-fbgemm-docs ready!

🔨 Latest commit: 304ba8e
🔍 Latest deploy log: https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/6723fd7427a57200084d7415
😎 Deploy Preview: https://deploy-preview-3287--pytorch-fbgemm-docs.netlify.app

spcyppt added a commit to spcyppt/FBGEMM that referenced this pull request Oct 29, 2024

@facebook-github-bot (Contributor) commented:

This pull request has been merged in fc822f2.

@facebook-github-bot (Contributor) commented:

This pull request has been reverted by dab5144.

q10 pushed a commit to q10/FBGEMM that referenced this pull request Apr 10, 2025