When will we support NVFP4? #16668

dywszaka · 2025-10-19T16:21:05Z

Recently, I’ve been trying to run an NVFP4 model using llama.cpp, so I started digging into its source code. From what I’ve seen, it seems possible to make it work by adding a dequantization function that supports NVFP4. However, I’m not entirely sure if that’s sufficient. Has anyone already implemented NVFP4 support in llama.cpp, or is there any plan for this feature to be officially supported in the future?
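For reference, a minimal sketch of what such a dequantization step might look like. It assumes the commonly described NVFP4 layout: blocks of 16 E2M1 (4-bit) values, one FP8 E4M3 scale per block, and one FP32 scale per tensor. The nibble order, the function names, and the way the per-tensor scale is applied are assumptions for illustration, not llama.cpp's actual block layout.

```python
# E2M1 magnitudes selected by the low 3 bits of a 4-bit code; bit 3 is the sign.
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def decode_e2m1(code: int) -> float:
    """Decode one 4-bit E2M1 code (0..15) to a float."""
    mag = E2M1_VALUES[code & 0x7]
    return -mag if code & 0x8 else mag


def decode_e4m3(byte: int) -> float:
    """Decode one FP8 E4M3 byte (bias 7, no infinities, 0x7F/0xFF are NaN)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0xF and man == 0x7:
        return float("nan")
    if exp == 0:
        # Subnormal: no implicit leading 1, fixed exponent of -6.
        return sign * (man / 8.0) * 2.0 ** -6
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)


def dequantize_nvfp4_block(packed: bytes, scale_byte: int, tensor_scale: float) -> list:
    """Dequantize one 16-element block.

    packed       -- 8 bytes, two 4-bit codes each (low nibble first: an assumption)
    scale_byte   -- the block's FP8 E4M3 scale
    tensor_scale -- the per-tensor FP32 scale (assumed to multiply the block scale)
    """
    if len(packed) != 8:
        raise ValueError("an NVFP4 block holds 16 values in 8 bytes")
    scale = decode_e4m3(scale_byte) * tensor_scale
    out = []
    for b in packed:
        out.append(decode_e2m1(b & 0xF) * scale)
        out.append(decode_e2m1(b >> 4) * scale)
    return out
```

With a block scale byte of `0x40` (2.0 in E4M3) and a tensor scale of 0.5, the effective scale is 1.0, so the codes `0x1` and `0x2` decode to 0.5 and 1.0. A dequantization routine alone covers only the read path; the tensor type, conversion, and backend kernels are separate pieces of work.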

am17an · 2025-10-19T17:14:16Z · Collaborator

Take a look at how MXFP4 is supported; NVFP4 will require similar changes.
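To make the comparison concrete, here is a small sketch of how the two formats differ at the block level. It assumes the usual descriptions of each: MXFP4 shares one E8M0 (power-of-two) scale across 32 elements, while NVFP4 shares one E4M3 scale across 16 elements and adds a per-tensor FP32 scale. The class and function names are made up for illustration and do not correspond to anything in llama.cpp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fp4BlockFormat:
    """Block-level parameters of a 4-bit floating-point format (illustrative)."""
    name: str
    block_size: int   # elements sharing one scale
    scale_bits: int   # storage for that scale
    scale_kind: str   # encoding of the scale

    def bits_per_weight(self) -> float:
        # 4 bits per element plus the block scale amortized over the block.
        # Any per-tensor scale is ignored here because it is negligible.
        return (self.block_size * 4 + self.scale_bits) / self.block_size


MXFP4 = Fp4BlockFormat("MXFP4", block_size=32, scale_bits=8, scale_kind="E8M0")
NVFP4 = Fp4BlockFormat("NVFP4", block_size=16, scale_bits=8, scale_kind="E4M3")


def decode_e8m0(byte: int) -> float:
    """Decode an E8M0 scale: a bare exponent with bias 127, 0xFF reserved for NaN."""
    if byte == 0xFF:
        return float("nan")
    return 2.0 ** (byte - 127)
```

Under these assumptions MXFP4 costs 4.25 bits per weight and NVFP4 costs 4.5. The element encoding (E2M1) is shared, so the differences are mostly in block size and scale decoding, which is why the MXFP4 changes are a plausible template.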

2 replies

dywszaka Oct 20, 2025
Author

Yes, I believe NVFP4 is quite similar to MXFP4. However, I ran into an issue: I couldn’t find any NVFP4 GGUF models to test with. :(

I’ve tried the following approaches:
1. Using existing models: I wasn’t able to find any NVFP4 models in GGUF format on Hugging Face.
2. Converting existing HF models: The convert_hf_to_gguf tool provided by llama.cpp failed to convert some NVFP4 models, such as NVFP4/Qwen3-0.6B-FP4, into GGUF format.
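When a conversion like this fails, one way to see what the converter would have to handle is to list the tensor names and dtypes stored in the checkpoint. The sketch below assumes the model is distributed as a `.safetensors` file and reads only its JSON header (an 8-byte little-endian length followed by the header itself) using the standard library. The function names are illustrative, and any file path passed in is the reader's own.

```python
import json
import struct
from collections import Counter


def parse_safetensors_header(blob: bytes) -> dict:
    """Parse the JSON header from the leading bytes of a safetensors file."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8:8 + header_len])
    # "__metadata__" is free-form file metadata, not a tensor entry.
    header.pop("__metadata__", None)
    return header


def read_safetensors_header(path: str) -> dict:
    """Read only the header of a safetensors file, without loading tensor data."""
    with open(path, "rb") as f:
        prefix = f.read(8)
        (header_len,) = struct.unpack("<Q", prefix)
        return parse_safetensors_header(prefix + f.read(header_len))


def summarize_dtypes(header: dict) -> Counter:
    """Count tensors per dtype string, e.g. to spot packed or FP8 scale tensors."""
    return Counter(info["dtype"] for info in header.values())
```

Printing each name with its `dtype` and `shape` shows how the quantized weights and their scales are actually stored, which is the information a converter needs before it can map them to a GGUF tensor type.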

am17an Oct 20, 2025
Collaborator

Take a look at the original PR that introduced MXFP4, #15091. You need to make the same changes for NVFP4 before you can convert to GGUF.
