Open
Description
Right now our optimizers are low-bit, so they save a lot of memory. But optimizer state can also cause memory spikes, so it's common to page it out to CPU RAM. There's a prototype of this in #425, and we can combine that idea with low-bit optimizers.
A halfway solution would be to make sure our low-bit optimizers work well on CPU.
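
For reference, here is a rough sketch of what keeping optimizer state in CPU RAM could look like in plain PyTorch. This is not the #425 prototype: `CPUOffloadAdamW` is a hypothetical name made up for illustration, and `torch.optim.AdamW` is only a stand-in for whichever optimizer ends up being used. It assumes synchronous copies and skips pinned memory and overlap with compute, which a real implementation would probably want.

```python
import torch


class CPUOffloadAdamW:
    """Hypothetical wrapper (not the #425 prototype): AdamW state lives in CPU RAM."""

    def __init__(self, params, **adamw_kwargs):
        self.device_params = [p for p in params if p.requires_grad]
        # CPU master copies of the parameters. The optimizer allocates its state
        # (exp_avg, exp_avg_sq) next to these, so it never sits in accelerator memory.
        self.cpu_params = [
            p.detach().to("cpu", copy=True).requires_grad_(True)
            for p in self.device_params
        ]
        self.optim = torch.optim.AdamW(self.cpu_params, **adamw_kwargs)

    @torch.no_grad()
    def step(self):
        # 1. Move gradients from the device to the CPU copies.
        for p_dev, p_cpu in zip(self.device_params, self.cpu_params):
            p_cpu.grad = None if p_dev.grad is None else p_dev.grad.to("cpu")
        # 2. Run the optimizer update entirely on CPU.
        self.optim.step()
        # 3. Copy the updated parameters back to the device.
        for p_dev, p_cpu in zip(self.device_params, self.cpu_params):
            p_dev.copy_(p_cpu)

    def zero_grad(self):
        for p in self.device_params:
            p.grad = None


# Usage: runs on CPU-only machines too; the model could equally live on a GPU.
torch.manual_seed(0)
model = torch.nn.Linear(4, 2)
opt = CPUOffloadAdamW(model.parameters(), lr=1e-2)
model(torch.randn(8, 4)).sum().backward()
opt.step()
opt.zero_grad()
```

Whether a low-bit optimizer can replace the inner `torch.optim.AdamW` here depends on it running well on CPU, which is exactly the halfway step described above.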