Exploring LoRA Adapters & Low Precision Training in S1 for Enhanced Test-Time Scaling #101

Hi everyone,

I'm really interested in the S1 approach, especially its test-time scaling via "budget forcing" with the s1-32B model. I'm exploring whether this method can be combined with parameter-efficient fine-tuning techniques such as LoRA adapters, particularly in a low-precision training setup.
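
To make the idea concrete, here is a rough sketch of the training side I have in mind: LoRA adapters attached to a 4-bit quantized base model (QLoRA-style) using Hugging Face `transformers`, `peft`, and `bitsandbytes`. The model id, target modules, and hyperparameters below are my own assumptions, not anything taken from the s1 repo, so please treat them as placeholders.

```python
# Sketch only: LoRA on top of a 4-bit NF4 quantized base with bf16 compute
# (the "low-precision" part). Model id and target modules are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

MODEL_ID = "simplescaling/s1-32B"  # assumed HF id; swap in whichever checkpoint you use

# Quantize the frozen base weights to 4-bit; gradients flow only through the adapters.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters to the attention projections (module names are a guess
# for a Qwen2-style backbone; verify against the actual model).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights should be trainable
```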

My Questions/Goals:

  1. Has anyone attempted to integrate LoRA adapters into the S1 framework while leveraging low-precision training?
  2. What challenges might we face when combining LoRA with test-time compute enhancements like budget forcing? (A rough sketch of what I mean follows this list.)
  3. Could low-precision training with LoRA adapters offer further efficiency or performance gains, especially on tasks such as mathematical reasoning?
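
For question 2, this is roughly the inference loop I am picturing: the LoRA-adapted, low-precision model from the sketch above, wrapped in a simple budget-forcing loop that appends "Wait" whenever the model tries to stop reasoning early and then forces the answer once the budget or wait count is used up. The end-of-thinking delimiter, budget sizes, and decoding settings below are placeholders I made up; the real prompt format and stop tokens should come from the s1 repo.

```python
# Hypothetical end-of-thinking delimiter; check the actual s1 prompt format.
THINK_END = "<|im_start|>answer"

def budget_forced_generate(model, tokenizer, prompt,
                           think_budget=2048, max_waits=2):
    """Greedy sketch of budget forcing: if the model tries to end its
    reasoning before the waits are exhausted, cut at the delimiter and
    append "Wait" to extend test-time compute, then force the answer."""
    text = prompt
    for i in range(max_waits + 1):
        inputs = tokenizer(text, return_tensors="pt").to(model.device)
        out = model.generate(**inputs, max_new_tokens=think_budget, do_sample=False)
        new_text = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                                    skip_special_tokens=False)
        if THINK_END not in new_text or i == max_waits:
            # Either the thinking budget ran out or all "Wait" extensions are
            # used: keep the reasoning so far and move on to the answer.
            text += new_text.split(THINK_END)[0]
            break
        # The model tried to stop thinking early: keep the reasoning and
        # append "Wait" to squeeze out more reasoning tokens.
        text += new_text.split(THINK_END)[0] + "Wait"
    # Force the end-of-thinking delimiter and generate the final answer.
    inputs = tokenizer(text + THINK_END, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=512, do_sample=False)
    return tokenizer.decode(out[0], skip_special_tokens=True)
```

My open question is whether the adapter, trained in low precision, stays stable over the much longer reasoning traces that budget forcing produces.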

I believe this experiment could provide valuable insights into enhancing language model performance in resource-constrained environments. I’d love to hear if anyone has tried this or has thoughts/suggestions on how to proceed.

Looking forward to the community's feedback!

Thanks,
Gaurav Yadav.
