Pull requests: patrick-toulme/axlearn

Pull requests list

neuron changes for 1B,3B,8B models
#45 opened Jan 2, 2025 by aws-mengchiy
skip previous trained batches
#43 opened Dec 20, 2024 by aws-zhenguo
skip previous trained batches
#42 opened Dec 20, 2024 by aws-zhenguo
imported os and added ckpt scripts
#41 opened Dec 19, 2024 by dgourab-aws
Use default remat policy
#36 opened Dec 13, 2024 by apoorvtintin
resume training with next batch of data
#32 opened Dec 11, 2024 by aws-zhenguo
logit_bias support for NEW_UNSHARDED_ATTN_KERNEL
#24 opened Dec 4, 2024 by HahTK
Jit cache
#23 opened Nov 26, 2024 by amithrm
Multi graph gradient accumulation
#5 opened Apr 2, 2024 by apoorvtintin
gradient accumulation using optax multisteps*
#2 opened Mar 25, 2024 by apoorvtintin