

@xin3he xin3he commented Aug 24, 2023

Type of Change

feature

Description

Support the NF4/FP4 data types in weight-only quantization; allow tuning the dtype and compressing NF4/FP4 models.

Expected Behavior & Potential Risk

Unit tests pass.

How has this PR been tested?

Tested locally.

Dependency Change?

N/A

xin3he added the enhancement (New feature or request) and new feature labels Aug 24, 2023
xin3he changed the title from "Nf4" to "Support NF4/FP4 data type in weight-only" Aug 24, 2023
xin3he added 5 commits August 25, 2023 09:11
Signed-off-by: Xin He <xin3.he@intel.com>
Signed-off-by: Xin He <xin3.he@intel.com>
Signed-off-by: Xin He <xin3.he@intel.com>
Signed-off-by: Xin He <xin3.he@intel.com>
Signed-off-by: Xin He <xin3.he@intel.com>

hshen14 commented Aug 25, 2023

Which FP4 format? Should specify FP4_E3M0, FP4_E2M1. You can align with Penghui.


xin3he commented Aug 25, 2023

> Which FP4 format? Should specify FP4_E3M0, FP4_E2M1. You can align with Penghui.

FLOAT_MAPPING = {'nf4': NF4, 'fp4': FP4_BNB, 'fp4_bnb': FP4_BNB, 'fp4_e2m1': FP4_E2M1, 'e2m1': FP4_E2M1}
By default, 'fp4' maps to the FP4 format used by bitsandbytes. Developers can explicitly select either 'fp4_e2m1' or 'fp4_bnb'.
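For readers following the alias discussion, here is a minimal sketch of how such a mapping can be resolved and how a weight vector can be snapped onto a 4-bit value grid. It is not the code from this PR: `DTYPE_ALIASES`, `canonical_dtype`, and `quantize_to_grid` are hypothetical names, and the grid is the generic set of E2M1 magnitudes (0, 0.5, 1, 1.5, 2, 3, 4, 6 with both signs), which may differ from the `FP4_E2M1` table the PR actually uses.

```python
# Hypothetical alias table mirroring the keys of FLOAT_MAPPING above.
# The canonical names on the right are an assumption for illustration.
DTYPE_ALIASES = {
    "nf4": "nf4",
    "fp4": "fp4_bnb",       # plain 'fp4' defaults to the bitsandbytes format
    "fp4_bnb": "fp4_bnb",
    "fp4_e2m1": "fp4_e2m1",
    "e2m1": "fp4_e2m1",
}

# Generic E2M1 magnitudes; an assumption, not the table from this PR.
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
E2M1_GRID = sorted({sign * m for m in E2M1_MAGNITUDES for sign in (-1.0, 1.0)})


def canonical_dtype(name):
    """Resolve a user-facing dtype string to its canonical name."""
    key = name.lower()
    if key not in DTYPE_ALIASES:
        raise ValueError(f"unsupported dtype: {name!r}")
    return DTYPE_ALIASES[key]


def quantize_to_grid(weights, grid):
    """Scale weights so the largest magnitude hits the grid maximum,
    snap each value to the nearest grid entry, then scale back."""
    absmax = max(abs(w) for w in weights)
    if absmax == 0:
        return list(weights)
    scale = absmax / max(abs(g) for g in grid)
    return [min(grid, key=lambda g: abs(w / scale - g)) * scale for w in weights]


if __name__ == "__main__":
    print(canonical_dtype("fp4"))
    print(quantize_to_grid([6.0, 1.2, -3.4, 0.1], E2M1_GRID))
```

Round-to-nearest against a per-group scale is the simplest possible scheme; it ignores group size, asymmetric ranges, and packing two 4-bit codes into one byte, all of which a real implementation needs.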

@yiliu30 yiliu30 self-requested a review August 25, 2023 07:44
Signed-off-by: Xin He <xin3.he@intel.com>
xin3he added 3 commits August 25, 2023 17:23
Signed-off-by: Xin He <xin3.he@intel.com>
Signed-off-by: Xin He <xin3.he@intel.com>
Signed-off-by: Xin He <xin3.he@intel.com>

xin3he commented Aug 25, 2023

GPTQ and TEQ do not support NF4/FP4 yet; that will be added in later PRs.

@xin3he xin3he merged commit 3d11b5e into master Aug 26, 2023
@xin3he xin3he deleted the nf4 branch August 26, 2023 09:40
XuehaoSun pushed a commit that referenced this pull request Sep 1, 2023
* support NF4/FP4 data type in weight-only RTN & AWQ algo, allow tuning dtype and compressing nf4/fp4 model

Signed-off-by: Xin He <xin3.he@intel.com>

---------

Signed-off-by: Xin He <xin3.he@intel.com>
Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
