Take a look at how MXFP4 is supported; NVFP4 will require similar changes.
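To make "similar changes" a bit more concrete, here is a rough comparison of the two block layouts as they are publicly described: MXFP4 groups 32 E2M1 elements under one 8-bit E8M0 scale, while NVFP4 groups 16 E2M1 elements under one 8-bit E4M3 scale (NVFP4 checkpoints typically also carry a per-tensor FP32 scale, which is not counted here). The `Fp4BlockFormat` class and its field names are hypothetical helpers for illustration only and do not correspond to anything in the llama.cpp source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fp4BlockFormat:
    """Hypothetical description of a block-scaled FP4 format (illustration only)."""
    name: str
    block_size: int      # number of 4-bit E2M1 elements sharing one scale
    scale_bits: int      # width of the per-block scale
    scale_encoding: str  # how the per-block scale is encoded

    def bytes_per_block(self) -> int:
        # 4 bits per element plus the per-block scale, in whole bytes
        return (self.block_size * 4 + self.scale_bits) // 8

    def bits_per_weight(self) -> float:
        # Effective storage cost per weight, ignoring any per-tensor scale
        return (self.block_size * 4 + self.scale_bits) / self.block_size


# Block sizes and scale encodings as given in the public format descriptions
MXFP4 = Fp4BlockFormat("MXFP4", block_size=32, scale_bits=8, scale_encoding="E8M0")
NVFP4 = Fp4BlockFormat("NVFP4", block_size=16, scale_bits=8, scale_encoding="E4M3")

if __name__ == "__main__":
    for fmt in (MXFP4, NVFP4):
        print(fmt.name, fmt.bytes_per_block(), "bytes/block,",
              fmt.bits_per_weight(), "bits/weight")
```

The element encoding is the same in both formats, so the differences a port would have to handle are mainly the block size and how the scale is decoded.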
Recently, I’ve been trying to run an NVFP4 model with llama.cpp, so I started digging into the source code. From what I’ve seen, it looks possible to make it work by adding a dequantization function that supports NVFP4, but I’m not sure that alone is sufficient. Has anyone already implemented NVFP4 support in llama.cpp, or is there a plan to support it officially?
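For reference, this is roughly what I mean by a dequantization function, as a minimal Python sketch rather than actual llama.cpp code. It assumes the commonly described NVFP4 layout: blocks of 16 E2M1 values, one FP8 E4M3 scale per block, and an optional per-tensor FP32 scale. The function names are my own, and the nibble order (low nibble first within each byte) is an assumption that would need to be checked against the actual checkpoint format.

```python
import math

# Magnitudes representable by the 3 low bits of an E2M1 value
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def decode_e2m1(nibble: int) -> float:
    """Decode one 4-bit E2M1 value (bit 3 is the sign)."""
    sign = -1.0 if nibble & 0x8 else 1.0
    return sign * E2M1_VALUES[nibble & 0x7]


def decode_e4m3(byte: int) -> float:
    """Decode one FP8 E4M3 value (bias 7, no infinities, single NaN pattern)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    mant = byte & 0x7
    if exp == 0xF and mant == 0x7:
        return math.nan
    if exp == 0:
        # Subnormal: no implicit leading one
        return sign * (mant / 8.0) * 2.0 ** -6
    return sign * (1.0 + mant / 8.0) * 2.0 ** (exp - 7)


def dequantize_nvfp4_block(packed: bytes, scale_byte: int,
                           tensor_scale: float = 1.0) -> list:
    """Dequantize one 16-element block stored as 8 packed bytes.

    Assumption: each byte holds two consecutive elements, low nibble first.
    """
    if len(packed) != 8:
        raise ValueError("an NVFP4 block holds 16 values in 8 bytes")
    scale = decode_e4m3(scale_byte) * tensor_scale
    out = []
    for b in packed:
        out.append(decode_e2m1(b & 0x0F) * scale)
        out.append(decode_e2m1(b >> 4) * scale)
    return out
```

For example, `dequantize_nvfp4_block(bytes([0x21] * 8), 0x40)` decodes a block whose scale byte encodes 2.0. A real implementation would presumably also need the type registration, conversion, and backend kernel work, which is the part I'm unsure about.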