Hi everyone,
I'm really interested in the S1 approach, especially its test-time scaling via "budget forcing" with the s1-32B model. I'm exploring whether this method could be combined with parameter-efficient fine-tuning techniques like LoRA adapters, particularly in a low-precision training setup.
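For concreteness, here is roughly what I have in mind on the budget-forcing side: an inference loop that, whenever the model tries to close its reasoning, strips the end-of-thinking delimiter and appends "Wait" to force extra test-time compute. This is only a sketch under assumptions: the checkpoint id `simplescaling/s1-32B`, the `<|im_start|>answer` delimiter, and the `generate_with_budget_forcing` helper are illustrative, and the actual S1 chat template may use different markers.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "simplescaling/s1-32B"        # assumed checkpoint id
END_OF_THINKING = "<|im_start|>answer"   # assumed end-of-reasoning delimiter

tok = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

def generate_with_budget_forcing(prompt: str, num_waits: int = 2,
                                 max_new_tokens: int = 512) -> str:
    """Force extra test-time compute: whenever the model tries to close its
    reasoning, cut the delimiter off and append 'Wait' so it keeps thinking."""
    text = prompt
    for _ in range(num_waits):
        inputs = tok(text, return_tensors="pt").to(model.device)
        out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
        text = tok.decode(out[0], skip_special_tokens=False)
        if END_OF_THINKING not in text:
            return text  # model never closed its reasoning (hit the token budget)
        # Drop the premature end-of-thinking marker and nudge the model onward.
        text = text.split(END_OF_THINKING)[0] + " Wait"
    # Final pass: let the model finish its reasoning and emit the answer.
    inputs = tok(text, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tok.decode(out[0], skip_special_tokens=False)
```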
My Questions/Goals:
- Has anyone attempted to integrate LoRA adapters into the S1 framework while leveraging low-precision training?
- What challenges might we face when combining LoRA with test-time compute enhancements like budget forcing?
- Could low-precision training with LoRA adapters offer further performance gains or efficiencies, especially on tasks such as mathematical reasoning? (A rough setup sketch is below.)
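On the fine-tuning side, this is the kind of QLoRA-style setup I'm picturing: the base model loaded in 4-bit via bitsandbytes with trainable LoRA adapters attached via peft. The checkpoint id, the `target_modules` list, and the hyperparameters are assumptions I haven't verified against the S1 release, so treat this as a starting point rather than a working recipe.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

MODEL_ID = "simplescaling/s1-32B"  # assumed checkpoint id

# 4-bit NF4 base weights with bf16 compute (QLoRA-style low-precision setup).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)  # re-enable grads where needed

# LoRA adapters on the attention projections (assumed module names).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```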
I believe this experiment could provide valuable insights into enhancing language model performance in resource-constrained environments. I’d love to hear if anyone has tried this or has thoughts/suggestions on how to proceed.
Looking forward to the community's feedback!
Thanks,
Gaurav Yadav.