added MXFP4 quantizer support to directly load GPT-OSS models via QEFFAutoModelForCausalLM #577

ochougul · 2025-09-26T11:12:17Z

added MXFP4 quantizer to match weight keys
added transform to dequantize MXFP4 to float32
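
For readers unfamiliar with the format, here is a minimal sketch of what dequantizing MXFP4 to float32 involves. It is not the transform added in this PR. It assumes the OCP Microscaling layout (4-bit E2M1 elements sharing one 8-bit power-of-two scale per block, normally 32 elements per block) and assumes the low nibble of each byte holds the earlier element. The names `FP4_VALUES` and `dequantize_mxfp4_block` are hypothetical.

```python
# Illustrative sketch only; names and nibble order are assumptions,
# not the code from this PR.

# The 16 E2M1 code points: 1 sign bit, 2 exponent bits, 1 mantissa bit.
FP4_VALUES = [
    0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0,
    -0.0, -0.5, -1.0, -1.5, -2.0, -3.0, -4.0, -6.0,
]


def dequantize_mxfp4_block(packed: bytes, scale_byte: int) -> list[float]:
    """Decode one block: two 4-bit elements per byte plus a shared E8M0 scale."""
    # E8M0 is a biased exponent, so the scale is 2^(scale_byte - 127).
    # The NaN encoding (scale_byte == 255) is not handled in this sketch.
    scale = 2.0 ** (scale_byte - 127)
    out = []
    for byte in packed:
        out.append(FP4_VALUES[byte & 0x0F] * scale)  # low nibble (assumed first)
        out.append(FP4_VALUES[byte >> 4] * scale)    # high nibble (assumed second)
    return out


# One byte 0x21 holds codes 1 (low nibble) and 2 (high nibble); scale is 2^1.
print(dequantize_mxfp4_block(bytes([0x21]), 128))
```

A real transform would apply the same lookup and scaling to whole tensors at once rather than looping in Python.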


examples/gpt_oss.py

vbaddi

LGTM, thanks :)
Let's merge this into add_gpt_oss. @quic-hemagnih, can you please initiate CI and then merge the add_gpt_oss branch to mainline?


…FAutoModelForCausalLM (#577) * added mxfp4 quantizer to match weights keys * added transform to dequantize mxfp4 to float32 --------- Signed-off-by: Onkar Chougule <ochougul@qti.qualcomm.com>

added MXFP4 quantizer support to directly load GPT-OSS models via QEF…

0b8b53d

…FAutoModelForCausalLM Signed-off-by: Onkar Chougule <ochougul@qti.qualcomm.com>

ochougul self-assigned this Sep 26, 2025

ochougul requested review from quic-amitraj, quic-hemagnih and quic-rishinr as code owners September 26, 2025 11:12

ochougul added the enhancement New feature or request label Sep 26, 2025

ochougul added 2 commits September 26, 2025 11:13

removed tokenizer from example script

208c5d7

Signed-off-by: Onkar Chougule <ochougul@qti.qualcomm.com>

cleaned example file

48fcd2a

Signed-off-by: Onkar Chougule <ochougul@qti.qualcomm.com>

vbaddi reviewed Sep 26, 2025

examples/gpt_oss.py: 2 review threads (outdated, resolved)

ochougul added the quantization label Sep 29, 2025

vbaddi approved these changes Sep 30, 2025

ochougul added 3 commits October 1, 2025 19:01

cleaned examples script

257271d

Signed-off-by: Onkar Chougule <ochougul@qti.qualcomm.com>

ran ruff format

eb218f4

Signed-off-by: Onkar Chougule <ochougul@qti.qualcomm.com>

added missing file

7838028

Signed-off-by: Onkar Chougule <ochougul@qti.qualcomm.com>

ochougul merged commit 3adccf6 into add_gpt_oss Oct 8, 2025
3 checks passed

ochougul deleted the mxfp4_quantizer_for_gptoss branch November 11, 2025 16:39
