Add gradient accumulation and grad clip and update readme (grad reduce after accumulate, make lr scheduling consistent) #210

SamitHuang · 2023-04-24T09:19:42Z


Thank you for your contribution to the MindOCR repo.
Before submitting this PR, please make sure:

You have read the Contributing Guidelines on pull requests
Your code builds clean without any errors or warnings
You are using approved terminology
You have added unit tests

Motivation

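Add gradient accumulation so that training can use a large effective batch size under limited memory. Improvements over the previous PR: gradients are reduced after accumulation instead of at every step, and learning-rate scheduling stays consistent because the step still advances during accumulation even though the network is not updated.
Add gradient clipping as a configurable option for more stable training.
Update the README.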
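
The accumulation and clipping behaviour described above can be illustrated with a minimal, framework-free sketch. It is not the MindOCR implementation: the function names, the plain-list gradients, the SGD update, and the choice of clipping by global L2 norm are all assumptions made for illustration. It shows gradients being averaged over `accum_steps` micro-batches, the step index advancing on every micro-batch so the learning-rate schedule is unaffected by accumulation, and clipping applied once per weight update.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale grads so that their global L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm > max_norm:
        scale = max_norm / norm
        return [g * scale for g in grads]
    return list(grads)


def train_with_accumulation(weights, micro_batch_grads, lr_schedule,
                            accum_steps, max_norm):
    """Plain SGD with gradient accumulation (illustrative only).

    micro_batch_grads: one gradient list per micro-batch; a hypothetical
        stand-in for the results of real backward passes.
    lr_schedule: callable mapping the micro-step index to a learning rate.
    """
    accum = [0.0] * len(weights)
    num_updates = 0
    for step, grads in enumerate(micro_batch_grads):
        # Accumulate the mean gradient over accum_steps micro-batches.
        accum = [a + g / accum_steps for a, g in zip(accum, grads)]

        # `step` advances on every micro-batch, so the schedule sees the same
        # step count whether or not the weights are updated this iteration.
        if (step + 1) % accum_steps == 0:
            # Assumption: in distributed training, the cross-device gradient
            # reduce would happen here, once per update.
            clipped = clip_by_global_norm(accum, max_norm)
            lr = lr_schedule(step)
            weights = [w - lr * g for w, g in zip(weights, clipped)]
            accum = [0.0] * len(weights)
            num_updates += 1
    return weights, num_updates


if __name__ == "__main__":
    grads = [[1.0, 0.0], [3.0, 0.0], [0.0, 4.0], [0.0, 4.0]]
    final_weights, updates = train_with_accumulation(
        [1.0, 2.0], grads, lambda step: 0.5, accum_steps=2, max_norm=1.0)
    print(final_weights, updates)
```

With `accum_steps=2`, four micro-batches produce two weight updates, each computed from the clipped mean of two gradients.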

Test Plan

(How should this PR be tested? Do you require special setup to run the test or repro the fixed bug?)

Related Issues and PRs

(Is this PR part of a group of changes? Link the other relevant PRs and Issues here. Use https://help.github.com/en/articles/closing-issues-using-keywords for help on GitHub syntax)


SamitHuang requested review from zhtmike and HaoyangLee April 24, 2023 09:20

zhtmike approved these changes Apr 24, 2023


add gradident accumulation and grad clip and update readme (grad recu…

4fefdb1

…de after accumulate, consistent lr scheduling fixed)

SamitHuang force-pushed the accu_fix2 branch from da13a45 to 4fefdb1 Compare April 24, 2023 09:26

HaoyangLee approved these changes Apr 24, 2023


HaoyangLee merged commit 2fdf534 into mindspore-lab:main Apr 24, 2023

SamitHuang changed the title ~~Add gradident accumulation and grad clip and update readme (grad recu…~~ Add gradident accumulation and grad clip and update readme (grad reduce after accumulate, make lr scheduling consistent) Apr 24, 2023

SamitHuang changed the title ~~Add gradident accumulation and grad clip and update readme (grad reduce after accumulate, make lr scheduling consistent)~~ Add gradient accumulation and grad clip and update readme (grad reduce after accumulate, make lr scheduling consistent) Apr 24, 2023

colawyee pushed a commit that referenced this pull request Jan 2, 2024

support unclip-train (#210)

cda5602
