Hi everyone,
I'm really interested in the S1 approach, especially its test-time scaling via "budget forcing" with the s1-32B model. I'm exploring whether this method could be combined with parameter-efficient fine-tuning techniques like LoRA adapters, particularly in a low-precision training setup.
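For concreteness, here is roughly what I have in mind on the budget-forcing side: an inference loop that, whenever the model tries to close its reasoning, strips the end-of-thinking delimiter and appends "Wait" to force extra test-time compute. This is only a sketch under assumptions: the checkpoint id `simplescaling/s1-32B`, the `<|im_start|>answer` delimiter, and the `generate_with_budget_forcing` helper are illustrative, and the actual S1 chat template may use different markers.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "simplescaling/s1-32B"        # assumed checkpoint id
END_OF_THINKING = "<|im_start|>answer"   # assumed end-of-reasoning delimiter

tok = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

def generate_with_budget_forcing(prompt: str, num_waits: int = 2,
                                 max_new_tokens: int = 512) -> str:
    """Force extra test-time compute: whenever the model tries to close its
    reasoning, cut the delimiter off and append 'Wait' so it keeps thinking."""
    text = prompt
    for _ in range(num_waits):
        inputs = tok(text, return_tensors="pt").to(model.device)
        out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
        text = tok.decode(out[0], skip_special_tokens=False)
        if END_OF_THINKING not in text:
            return text  # model never closed its reasoning (hit the token budget)
        # Drop the premature end-of-thinking marker and nudge the model onward.
        text = text.split(END_OF_THINKING)[0] + " Wait"
    # Final pass: let the model finish its reasoning and emit the answer.
    inputs = tok(text, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tok.decode(out[0], skip_special_tokens=False)
```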
My Questions/Goals:
- Has anyone attempted to integrate LoRA adapters into the S1 framework while leveraging low-precision training?
- What challenges might we face when combining LoRA with test-time compute enhancements like budget forcing?
- Could low-precision training with LoRA adapters offer further performance gains or efficiencies, especially on tasks such as mathematical reasoning? (A rough setup sketch is below.)
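On the fine-tuning side, this is the kind of QLoRA-style setup I'm picturing: the base model loaded in 4-bit via bitsandbytes with trainable LoRA adapters attached via peft. The checkpoint id, the `target_modules` list, and the hyperparameters are assumptions I haven't verified against the S1 release, so treat this as a starting point rather than a working recipe.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

MODEL_ID = "simplescaling/s1-32B"  # assumed checkpoint id

# 4-bit NF4 base weights with bf16 compute (QLoRA-style low-precision setup).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)  # re-enable grads where needed

# LoRA adapters on the attention projections (assumed module names).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```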
I believe this experiment could provide valuable insights into enhancing language model performance in resource-constrained environments. I’d love to hear if anyone has tried this or has thoughts/suggestions on how to proceed.
Looking forward to the community's feedback!
Thanks,
Gaurav Yadav.