Add optional RMSNorm support to BitNet quantization (config + layers) #38087
Conversation
Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the "Ready for review" button.
Can you share a bit about the motivation for such a feature?
@SunMarc
Hi @Codys12, thanks for the PR 🤗 ! I'm not sure I understand the idea behind adding an extra RMSNorm here.
@MekkCyber @SunMarc Good point! In section 2/2.1 of the original BitNet paper (https://arxiv.org/pdf/2310.11453), the authors describe the reason for including the RMSNorm: it improves model performance at negligible compute/parameter cost. They include it in their modeling file, but others (see here) have tested this with alternative architectures (Llama, Mistral, DeepSeek V3, etc.) and observed an improvement. This change in the quantization config is a model-agnostic way to introduce the parameter, so a new modeling_*.py file is not required for every model you want to test this way. Additionally, the inclusion of this norm allows you to finetune existing models into this quantization (see here), as demonstrated by https://huggingface.co/codys12/bitnet-r1-32b and https://huggingface.co/codys12/bitnet-r1-8b. Let me know if anything is still unclear!
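To make the proposed interface concrete, here is a minimal usage sketch. It assumes the `BitNetQuantConfig` class and the `use_rms_norm` / `rms_norm_eps` argument names discussed in this PR; the exact loading flow for pre-quantized BitNet checkpoints may differ.

```python
from transformers import AutoModelForCausalLM, BitNetQuantConfig

# Illustrative sketch: the new flag rides along in the quantization config,
# so it round-trips through save_pretrained / from_pretrained like the
# existing BitNet settings.
quant_config = BitNetQuantConfig(
    use_rms_norm=True,   # apply RMSNorm to activations before quantising (new, defaults to False)
    rms_norm_eps=1e-6,   # epsilon used inside the RMSNorm (new)
)

# e.g. loading one of the checkpoints linked above with the flag enabled
model = AutoModelForCausalLM.from_pretrained(
    "codys12/bitnet-r1-8b",
    quantization_config=quant_config,
)
```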
Thanks for the explanation @Codys12, I see the idea behind this !
Thanks ! Just a couple of nits
@SunMarc Just made these changes, let me know if there is anything else I can do before merge!
Thanks @Codys12, can you please run
@MekkCyber
@MekkCyber @SunMarc Any ideas on CI here? Looking to help this move forward today.
Thanks !
try to run the docstring check (`python utils/check_docstrings.py`); it fails with:
Traceback (most recent call last):
File "/root/transformers/utils/check_docstrings.py", line 1467, in <module>
check_docstrings(overwrite=args.fix_and_overwrite, check_all=args.check_all)
File "/root/transformers/utils/check_docstrings.py", line 1456, in check_docstrings
raise ValueError(error_message)
ValueError: There was at least one problem when checking docstrings of public objects.
The following objects docstrings do not match their signature. Run `make fix-copies` to fix this. In some cases, this error may be raised incorrectly by the docstring checker. If you think this is the case, you can manually check the docstrings and then add the object name to `OBJECTS_TO_IGNORE` in `utils/check_docstrings.py`.
- BitNetQuantConfig
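For anyone hitting the same failure: the check wants every argument in the `__init__` signature documented in the class docstring using the standard transformers convention (type, *optional*, default value). A minimal sketch of the shape it expects for the two new arguments (illustrative only; the existing BitNet arguments and the real base class are omitted):

```python
# Illustrative sketch only: the point is that each __init__ argument appears in
# the docstring with a matching type, *optional* marker, and default value.
class BitNetQuantConfig:
    """
    Args:
        use_rms_norm (`bool`, *optional*, defaults to `False`):
            Whether to apply RMSNorm to the activations before quantising them.
        rms_norm_eps (`float`, *optional*, defaults to `1e-6`):
            Epsilon used inside the RMSNorm for numerical stability.
    """

    def __init__(self, use_rms_norm: bool = False, rms_norm_eps: float = 1e-6):
        self.use_rms_norm = use_rms_norm
        self.rms_norm_eps = rms_norm_eps
```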
@SunMarc Hmm, I changed it to optional but running
Wait, all tests are passing. Sick!
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks for the PR 🤗 !
What does this PR do?
Adds optional RMSNorm support to BitNet-style quantisation.
- Adds `use_rms_norm` (`bool`, default `False`) and `rms_norm_eps` (`float`, default `1e-6`) to `BitNetQuantConfig`, so the flag is serialisable via `save_pretrained` / `from_pretrained`.
- Updates `BitLinear` and `AutoBitLinear` to accept `use_rms_norm` and apply the reference `BitNetRMSNorm` to activations before quantisation (a sketch of this is included after the motivation section below).
Before submitting
- Updated `to_dict`, docstrings, and the model card.
- Ran `make style && make quality && make test` locally.
- Built the docs (`make docs`) – pushed logs to CI.
Motivation and context
RMSNorm stabilises the activations of low-bit networks; the BitNet paper shows a consistent perplexity drop when normalising pre-quantisation activations. This PR brings parity with the reference implementation while keeping the previous behaviour as the default.
No new external dependencies.
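To make the layer-side change concrete, here is a minimal sketch of what "apply RMSNorm to activations before quantisation" can look like in a BitLinear-style forward pass. This is illustrative only, with simplified names and simplified quantisation math; it is not the actual `BitLinear` / `AutoBitLinear` implementation from this PR.

```python
import torch
import torch.nn as nn


class BitNetRMSNorm(nn.Module):
    """Reference-style RMSNorm: x * rsqrt(mean(x^2) + eps), with a learned scale."""

    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        variance = x.pow(2).mean(-1, keepdim=True)
        return self.weight * x * torch.rsqrt(variance + self.eps)


class BitLinearSketch(nn.Module):
    """Simplified BitLinear-style layer: optional RMSNorm, then 8-bit per-token
    activation fake-quantisation and ternary weight fake-quantisation."""

    def __init__(self, in_features: int, out_features: int,
                 use_rms_norm: bool = False, rms_norm_eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.normal_(self.weight, std=0.02)
        self.norm = BitNetRMSNorm(in_features, rms_norm_eps) if use_rms_norm else None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.norm is not None:
            x = self.norm(x)  # normalise activations before quantising them

        # Fake-quantise activations to the int8 range, per token.
        act_scale = 127.0 / x.abs().max(dim=-1, keepdim=True).values.clamp(min=1e-5)
        x_q = (x * act_scale).round().clamp(-128, 127) / act_scale

        # Fake-quantise weights to {-1, 0, +1} with a per-tensor scale.
        w_scale = self.weight.abs().mean().clamp(min=1e-5)
        w_q = (self.weight / w_scale).round().clamp(-1, 1) * w_scale

        return nn.functional.linear(x_q, w_q)
```

The real layers pack the ternary weights and handle dtype/device details, but the ordering (normalise first, then quantise) is the behaviour the `use_rms_norm` flag toggles.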
Who can review?
Quantization / Accelerate folks for the code:
@SunMarc @MekkCyber
Docstrings & config: @stevhliu
Feel free to jump in with any feedback!