How can I reduce the size of a model's parameters so that it fits on my GPUs? I have already tried 16-bit precision, but I also need to scale the model.
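
To make the size problem concrete, here is a rough sketch of the arithmetic, assuming weight memory is approximately the parameter count times the bytes per parameter. The parameter count and the helper name `param_memory_gib` are hypothetical placeholders, not taken from an actual model, and the estimate ignores activations, gradients, and optimizer state.

```python
def param_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Approximate memory needed to store the weights alone, in GiB."""
    return num_params * bytes_per_param / (1024 ** 3)


# Hypothetical parameter count, used only for illustration.
NUM_PARAMS = 1_000_000_000

# Bytes per parameter for a few common storage widths (8-bit listed for comparison).
for name, nbytes in [("32-bit", 4), ("16-bit", 2), ("8-bit", 1)]:
    print(f"{name}: {param_memory_gib(NUM_PARAMS, nbytes):.2f} GiB")
```

Under this assumption, moving from 32-bit to 16-bit halves the weight memory, so any further reduction has to come from fewer parameters, a narrower storage format, or splitting the weights across devices.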