Benchmarking Memory Consumption of Optimizers: Adam vs. Adan #20

Open
SivilTaram opened this issue Jan 25, 2023 · 0 comments

Benchmarking Results

The memory benchmark was run with the following configuration:

  • vocab size: 49280
  • batch size: 1
  • sequence length: 2048
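
For a sense of scale, the config above already implies a large output tensor regardless of the optimizer. The sketch below is a back-of-the-envelope calculation and not part of the benchmark. It assumes the model materializes dense logits of shape (batch, sequence, vocab), and that elements are 4 bytes (fp32) or 2 bytes (fp16); the issue states neither. The helper name `logits_mib` is hypothetical.

```python
def logits_mib(batch_size: int, seq_len: int, vocab_size: int,
               bytes_per_elem: int = 4) -> float:
    """Size in MiB of a dense (batch, seq_len, vocab) tensor.

    bytes_per_elem is an assumption: 4 for fp32, 2 for fp16.
    """
    n_elems = batch_size * seq_len * vocab_size
    return n_elems * bytes_per_elem / 2**20


# Values taken from the benchmark config in this issue.
fp32_mib = logits_mib(batch_size=1, seq_len=2048, vocab_size=49280)
fp16_mib = logits_mib(batch_size=1, seq_len=2048, vocab_size=49280,
                      bytes_per_elem=2)
print(f"fp32 logits: {fp32_mib:.1f} MiB, fp16 logits: {fp16_mib:.1f} MiB")
```

Under these assumptions a single logits tensor takes a few hundred MiB, which is one reason the peak numbers reported here sit far above the model size.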
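
| Heads | Layers | Emb. Dim | Model Size (MB) | Adam Peak (MB) | Adan Peak (MB) | $\Delta$ (%) |
|------:|-------:|---------:|----------------:|---------------:|---------------:|-------------:|
| 6 | 6 | 768 | 81 | 4490 | 4490 | 0 |
| 12 | 6 | 768 | 81 | 5848 | 5848 | 0 |
| 16 | 6 | 768 | 81 | 6776 | 6776 | 0 |
| 6 | 12 | 768 | 124 | 7151 | 7153 | 0.03 |
| 12 | 12 | 768 | 124 | 9869 | 9871 | 0.02 |
| 16 | 12 | 768 | 124 | 11733 | 11735 | 0.02 |
| 16 | 6 | 1024 | 128 | 7302 | 7304 | 0.03 |
| 16 | 12 | 1024 | 203 | 12719 | 12721 | 0.02 |
| 6 | 24 | 768 | 209 | 12471 | 12475 | 0.03 |
| 12 | 24 | 768 | 209 | 17922 | 17922 | 0 |
| 16 | 24 | 768 | 209 | 21596 | 21600 | 0.02 |
| 6 | 6 | 1536 | 248 | 6905 | 8241 | 19.35 |
| 12 | 6 | 1536 | 248 | 8235 | 8539 | 3.69 |
| 16 | 6 | 1536 | 248 | 9141 | 9445 | 3.33 |
| 16 | 24 | 1024 | 354 | 23530 | 23534 | 0.02 |
| 16 | 6 | 2048 | 407 | 11098 | 12159 | 9.56 |
| 6 | 12 | 1536 | 418 | 11137 | 13778 | 23.71 |
| 12 | 12 | 1536 | 418 | 13390 | 14164 | 5.78 |
| 16 | 12 | 1536 | 418 | 15667 | 15976 | 1.97 |
| 16 | 6 | 2560 | 603 | 13967 | 18207 | 30.36 |
| 16 | 12 | 2048 | 709 | 18851 | 20954 | 11.16 |
| 6 | 24 | 1536 | 758 | 19660 | 24819 | 26.24 |
| 12 | 24 | 1536 | 758 | 25096 | 25406 | 1.24 |
| 16 | 24 | 1536 | 758 | 28720 | 29030 | 1.08 |
| 16 | 12 | 2560 | 1075 | 28475 | 32134 | 12.85 |
| 16 | 24 | 2048 | 1313 | 34357 | 38595 | 12.34 |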
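
The issue does not spell out how the $\Delta$ (%) column is computed. The sketch below assumes it is Adan's extra peak memory relative to Adam's peak, rounded to two decimals; this assumption reproduces the values listed in the table. The function name `relative_overhead` is hypothetical.

```python
def relative_overhead(adam_peak_mb: float, adan_peak_mb: float) -> float:
    """Extra peak memory of Adan relative to Adam, in percent.

    Assumed definition: 100 * (Adan - Adam) / Adam, rounded to 2 decimals.
    """
    return round(100.0 * (adan_peak_mb - adam_peak_mb) / adam_peak_mb, 2)


# Spot checks against three rows of the table above.
small = relative_overhead(4490, 4490)      # 6 heads, 6 layers, dim 768
mid = relative_overhead(6905, 8241)        # 6 heads, 6 layers, dim 1536
large = relative_overhead(13967, 18207)    # 16 heads, 6 layers, dim 2560
print(small, mid, large)
```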

Conclusion

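  • The extra memory consumption does not increase linearly with the size of the model: the 603 MB model incurs +4240 MB (30.36%), while the larger 758 MB model with 16 heads incurs only +310 MB (1.08%).
  • In most cases (19 of the 26 configurations), Adan's additional peak memory does not exceed 10%.
  • However, the overhead grows with the embedding dimension (Emb. Dim): every configuration with Emb. Dim ≤ 1024 stays at or below 0.03%, while every configuration with Emb. Dim ≥ 1536 shows at least 1.08% and up to 30.36%.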
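
The conclusions above can be checked directly against the reported peaks. The sketch below copies the (heads, layers, emb. dim, Adam peak, Adan peak) values from the table and recomputes the overhead, assuming it is defined as 100 * (Adan - Adam) / Adam; the variable names are illustrative only.

```python
from collections import defaultdict

# (heads, layers, emb_dim, adam_peak_mb, adan_peak_mb), copied from the table.
rows = [
    (6, 6, 768, 4490, 4490), (12, 6, 768, 5848, 5848),
    (16, 6, 768, 6776, 6776), (6, 12, 768, 7151, 7153),
    (12, 12, 768, 9869, 9871), (16, 12, 768, 11733, 11735),
    (16, 6, 1024, 7302, 7304), (16, 12, 1024, 12719, 12721),
    (6, 24, 768, 12471, 12475), (12, 24, 768, 17922, 17922),
    (16, 24, 768, 21596, 21600), (6, 6, 1536, 6905, 8241),
    (12, 6, 1536, 8235, 8539), (16, 6, 1536, 9141, 9445),
    (16, 24, 1024, 23530, 23534), (16, 6, 2048, 11098, 12159),
    (6, 12, 1536, 11137, 13778), (12, 12, 1536, 13390, 14164),
    (16, 12, 1536, 15667, 15976), (16, 6, 2560, 13967, 18207),
    (16, 12, 2048, 18851, 20954), (6, 24, 1536, 19660, 24819),
    (12, 24, 1536, 25096, 25406), (16, 24, 1536, 28720, 29030),
    (16, 12, 2560, 28475, 32134), (16, 24, 2048, 34357, 38595),
]

# Relative overhead per configuration, grouped by embedding dimension.
by_emb_dim = defaultdict(list)
for heads, layers, emb_dim, adam, adan in rows:
    by_emb_dim[emb_dim].append(100.0 * (adan - adam) / adam)

all_overheads = [o for group in by_emb_dim.values() for o in group]
within_10 = sum(o <= 10.0 for o in all_overheads)
print(f"{within_10} of {len(all_overheads)} configurations stay within 10%")

for emb_dim in sorted(by_emb_dim):
    group = by_emb_dim[emb_dim]
    print(f"emb_dim={emb_dim}: min={min(group):.2f}%  max={max(group):.2f}%")
```

Grouping by embedding dimension makes the gap between the 768/1024 configurations and the 1536+ configurations easy to see without reading the table row by row.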