For a fair comparison, we used 512 samples from Pile-10k to calibrate all methods. Due to memory constraints, we kept AWQ's original sequence length of 512, while GPTQ and our approach use a sequence length of 2048. We enabled act-order and true-sequential in GPTQ; the notation GPTQ* indicates that we adjusted the random seed or data preprocessing to work around a non-positive-definite Hessian matrix or other issues.
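The configuration labels used in the result captions (W4G-1, W4G128, W3G128) are commonly read as "weight bit-width" followed by "group size", with a group size of -1 meaning one quantization group per weight row. As a rough illustration of what those two knobs control, here is a minimal pure-Python sketch of asymmetric min-max round-to-nearest quantization. The function name `rtn_quantize` and the min-max scheme are assumptions made for this example; this is not the exact RTN baseline used to produce the numbers.

```python
def rtn_quantize(weights, bits=4, group_size=128):
    """Quantize-dequantize one weight row with round-to-nearest (illustrative only).

    group_size=-1 is treated as a single group spanning the whole row,
    which is how the "G-1" label is assumed to be interpreted here.
    """
    if group_size == -1:
        group_size = len(weights)
    qmax = 2 ** bits - 1  # e.g. 15 for 4-bit, 7 for 3-bit
    dequantized = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        # Each group gets its own scale and offset; fall back to 1.0
        # when the group is constant to avoid dividing by zero.
        scale = (hi - lo) / qmax or 1.0
        for w in group:
            q = min(max(round((w - lo) / scale), 0), qmax)  # integer code
            dequantized.append(q * scale + lo)              # value the model sees
    return dequantized


row = [0.12, -0.53, 0.98, -1.10, 0.07, 0.44, -0.21, 0.65]
w4_per_row = rtn_quantize(row, bits=4, group_size=-1)
w3_grouped = rtn_quantize(row, bits=3, group_size=4)  # tiny group size, demo only
```

Smaller groups give each block of weights its own scale, which tightens the worst-case rounding error at the cost of storing more scales and offsets.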
1. Accuracies $\uparrow$ across 11 tasks (0-shot) of LLaMA and Mistral models at W4G-1.

| Model | Method | Mmlu | Lamb. | Hella. | Wino. | Piqa | Truth. | Open. | Boolq | RTE | ARC-e | ARC-c. | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mistral-7B | FP16 | 61.35 | 75.68 | 61.27 | 74.03 | 80.79 | 28.03 | 32.80 | 83.67 | 67.51 | 80.81 | 50.34 | 63.30 |
| | RTN | 55.92 | 66.10 | 59.01 | 71.35 | 80.14 | 24.85 | 29.00 | 79.17 | 57.76 | 77.95 | 45.99 | 58.84 |
| | GPTQ | 58.22 | 73.45 | 59.47 | 74.03 | 80.20 | 26.93 | 31.00 | 81.50 | 64.98 | 78.24 | 47.01 | 61.37 |
| | AWQ | 57.20 | 71.45 | 59.21 | 73.64 | 79.43 | 25.34 | 30.40 | 82.69 | 68.95 | 79.25 | 47.44 | 61.36 |
| | Ours | 59.52 | 73.76 | 60.75 | 73.32 | 80.09 | 27.17 | 33.00 | 82.02 | 66.07 | 80.47 | 49.49 | 62.33 |
| V2-7B | FP16 | 42.69 | 73.90 | 57.15 | 68.90 | 78.07 | 25.21 | 31.40 | 77.74 | 62.82 | 76.35 | 43.52 | 57.98 |
| | RTN | 36.87 | 67.96 | 55.63 | 68.51 | 76.82 | 26.19 | 30.60 | 73.64 | 58.84 | 74.07 | 41.30 | 55.49 |
| | GPTQ | 39.66 | 71.92 | 55.89 | 68.03 | 77.58 | 25.09 | 30.20 | 76.67 | 62.09 | 75.55 | 41.72 | 56.76 |
| | AWQ | 40.24 | 71.20 | 56.26 | 69.61 | 76.93 | 26.07 | 32.60 | 77.31 | 63.18 | 75.00 | 41.30 | 57.25 |
| | Ours | 39.97 | 71.63 | 56.52 | 68.43 | 77.91 | 25.70 | 31.60 | 76.18 | 65.70 | 76.01 | 42.58 | 57.48 |
| V2-13B | FP16 | 52.86 | 76.77 | 60.04 | 72.14 | 79.05 | 25.95 | 35.20 | 80.55 | 65.34 | 79.38 | 48.38 | 61.42 |
| | RTN | 50.37 | 74.35 | 59.12 | 71.98 | 79.00 | 24.85 | 33.00 | 81.77 | 64.98 | 79.08 | 46.59 | 60.46 |
| | GPTQ | 51.14 | 75.37 | 59.14 | 72.06 | 78.02 | 25.34 | 32.20 | 80.46 | 62.09 | 77.36 | 44.54 | 59.79 |
| | AWQ | 51.16 | 75.98 | 59.51 | 70.80 | 78.40 | 25.21 | 34.60 | 78.26 | 66.79 | 79.12 | 46.59 | 60.58 |
| | Ours | 52.30 | 75.96 | 59.79 | 72.30 | 78.84 | 25.58 | 34.00 | 80.15 | 66.79 | 79.38 | 48.12 | 61.20 |
| V2-70B | FP16 | 66.23 | 79.64 | 64.77 | 77.98 | 82.15 | 30.60 | 37.20 | 83.70 | 67.87 | 82.70 | 54.44 | 66.12 |
| | RTN | 63.85 | 77.62 | 63.38 | 76.72 | 81.50 | 28.89 | 37.80 | 83.39 | 68.23 | 81.99 | 54.10 | 65.22 |
| | GPTQ | 64.81 | 79.27 | 63.86 | 76.87 | 81.61 | 31.46 | 36.40 | 82.23 | 70.04 | 82.53 | 54.18 | 65.75 |
| | AWQ | 65.08 | 78.77 | 64.14 | 77.11 | 81.45 | 30.48 | 37.20 | 83.64 | 72.92 | 82.49 | 55.80 | 66.28 |
| | Ours | 65.43 | 79.55 | 64.47 | 78.06 | 82.10 | 30.60 | 36.40 | 83.91 | 71.12 | 82.53 | 54.78 | 66.27 |
| V1-7B | FP16 | 32.74 | 73.53 | 56.94 | 70.01 | 78.67 | 22.03 | 34.60 | 75.08 | 66.43 | 75.25 | 41.81 | 57.01 |
| | RTN | 31.34 | 70.02 | 55.35 | 69.77 | 77.69 | 20.32 | 32.60 | 73.43 | 59.57 | 74.45 | 41.30 | 55.08 |
| | GPTQ | 29.06 | 71.08 | 55.11 | 70.01 | 77.37 | 20.93 | 32.20 | 72.69 | 63.90 | 74.66 | 41.64 | 55.33 |
| | AWQ | 33.33 | 70.81 | 55.98 | 68.27 | 78.07 | 21.18 | 31.40 | 74.37 | 64.62 | 74.03 | 41.21 | 55.75 |
| | Ours | 31.80 | 71.96 | 56.57 | 69.53 | 79.00 | 21.91 | 33.20 | 75.72 | 66.79 | 74.83 | 43.09 | 56.76 |
| V1-13B | FP16 | 44.21 | 76.21 | 59.92 | 72.77 | 79.16 | 25.70 | 33.20 | 77.89 | 70.76 | 77.40 | 46.42 | 60.33 |
| | RTN | 39.57 | 70.93 | 58.82 | 71.98 | 78.02 | 24.85 | 32.00 | 78.20 | 66.43 | 75.67 | 44.62 | 58.28 |
| | GPTQ* | 40.01 | 74.67 | 58.92 | 71.03 | 78.45 | 26.44 | 33.60 | 77.09 | 68.23 | 76.85 | 44.97 | 59.12 |
| | AWQ | 44.56 | 74.13 | 59.13 | 71.27 | 78.94 | 25.83 | 33.20 | 76.42 | 66.06 | 76.89 | 46.67 | 59.37 |
| | Ours | 43.94 | 75.82 | 59.51 | 72.22 | 78.78 | 25.70 | 32.80 | 77.34 | 67.51 | 76.47 | 46.67 | 59.71 |
| V1-30B | FP16 | 55.14 | 77.55 | 63.33 | 75.85 | 81.12 | 28.27 | 36.00 | 82.78 | 66.79 | 80.39 | 52.90 | 63.65 |
| | RTN | 53.05 | 75.65 | 62.08 | 74.82 | 80.09 | 25.95 | 35.80 | 81.87 | 63.54 | 79.76 | 50.26 | 62.08 |
| | GPTQ | 53.04 | 77.22 | 61.95 | 73.80 | 80.69 | 27.29 | 34.60 | 81.07 | 66.06 | 78.79 | 49.15 | 62.15 |
| | AWQ | 54.13 | 76.77 | 62.78 | 74.11 | 81.07 | 27.78 | 35.00 | 82.66 | 67.15 | 79.97 | 51.71 | 63.01 |
| | Ours | 54.72 | 77.84 | 62.91 | 75.06 | 80.69 | 26.68 | 36.40 | 82.60 | 66.79 | 80.13 | 52.13 | 63.27 |
| V1-65B | FP16 | 59.79 | 79.12 | 64.53 | 77.35 | 81.23 | 27.91 | 38.00 | 84.86 | 69.68 | 81.36 | 52.82 | 65.15 |
| | RTN | 58.74 | 76.42 | 64.12 | 76.72 | 81.01 | 29.25 | 38.60 | 84.13 | 70.40 | 80.72 | 51.88 | 64.73 |
| | GPTQ* | 59.10 | 78.17 | 63.78 | 75.69 | 81.34 | 28.27 | 38.40 | 83.76 | 68.59 | 80.98 | 51.62 | 64.52 |
| | AWQ | 58.86 | 77.37 | 63.86 | 76.56 | 80.85 | 28.27 | 35.20 | 83.94 | 71.48 | 78.75 | 50.94 | 64.19 |
| | Ours | 59.21 | 79.16 | 64.37 | 76.64 | 81.34 | 26.81 | 37.80 | 84.40 | 69.68 | 80.98 | 51.79 | 64.74 |

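The Avg. value in each row appears to be the unweighted mean of the 11 task accuracies; this is inferred from the numbers, not stated in the text. A quick check on the Mistral-7B FP16 row, with the values copied from the results above and the variable names chosen for this example only:

```python
# Task accuracies for the Mistral-7B FP16 row, as reported above.
mistral_fp16 = {
    "Mmlu": 61.35, "Lamb.": 75.68, "Hella.": 61.27, "Wino.": 74.03,
    "Piqa": 80.79, "Truth.": 28.03, "Open.": 32.80, "Boolq": 83.67,
    "RTE": 67.51, "ARC-e": 80.81, "ARC-c.": 50.34,
}

# Assumption: Avg. is a plain arithmetic mean, with every task weighted equally.
avg = sum(mistral_fp16.values()) / len(mistral_fp16)
print(f"{avg:.2f}")  # 63.30, matching the reported Avg. for this row
```

Because every task counts equally, a large swing on a single small benchmark such as RTE moves the average as much as the same swing on Mmlu, which is worth keeping in mind when comparing methods whose averages differ by a few tenths of a point.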
2. Accuracies $\uparrow$ across 11 tasks (0-shot) of LLaMA and Mistral models at W4G128.

| Model | Method | Mmlu | Lamb. | Hella. | Wino. | Piqa | Truth. | Open. | Boolq | RTE | ARC-e | ARC-c. | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mistral-7B | FP16 | 61.35 | 75.68 | 61.27 | 74.03 | 80.79 | 28.03 | 32.80 | 83.67 | 67.51 | 80.81 | 50.34 | 63.30 |
| | RTN | 59.72 | 74.44 | 61.06 | 73.40 | 80.36 | 27.17 | 32.60 | 83.67 | 64.62 | 79.63 | 49.32 | 62.36 |
| | GPTQ | 59.17 | 74.52 | 60.37 | 74.90 | 80.58 | 26.68 | 31.00 | 83.33 | 67.15 | 79.67 | 48.12 | 62.32 |
| | AWQ | 60.20 | 75.14 | 60.43 | 73.80 | 80.03 | 27.05 | 30.40 | 84.01 | 62.09 | 80.39 | 50.26 | 62.16 |
| | Ours | 60.47 | 75.59 | 61.03 | 73.88 | 80.09 | 27.54 | 31.60 | 83.09 | 66.07 | 79.97 | 49.49 | 62.62 |
| V2-7B | FP16 | 42.69 | 73.90 | 57.15 | 68.90 | 78.07 | 25.21 | 31.40 | 77.74 | 62.82 | 76.35 | 43.52 | 57.98 |
| | RTN | 40.91 | 72.44 | 56.91 | 68.35 | 77.58 | 24.97 | 31.20 | 77.61 | 56.32 | 76.26 | 43.52 | 56.92 |
| | GPTQ | 42.57 | 73.28 | 56.36 | 69.06 | 78.02 | 25.34 | 30.20 | 75.72 | 57.04 | 75.63 | 42.15 | 56.85 |
| | AWQ | 41.00 | 72.60 | 56.40 | 68.98 | 77.31 | 25.70 | 31.60 | 78.75 | 58.48 | 76.14 | 43.86 | 57.35 |
| | Ours | 41.82 | 72.75 | 56.79 | 68.67 | 78.13 | 25.58 | 30.20 | 77.49 | 63.54 | 75.76 | 42.58 | 57.57 |
| V2-13B | FP16 | 52.86 | 76.77 | 60.04 | 72.14 | 79.05 | 25.95 | 35.20 | 80.55 | 65.34 | 79.38 | 48.38 | 61.42 |
| | RTN | 52.10 | 76.27 | 59.77 | 72.14 | 78.62 | 24.72 | 34.20 | 80.24 | 62.09 | 79.00 | 47.95 | 60.65 |
| | GPTQ | 52.66 | 76.54 | 59.76 | 72.14 | 78.35 | 25.70 | 34.00 | 79.33 | 66.43 | 78.58 | 47.53 | 61.00 |
| | AWQ | 52.39 | 76.89 | 59.97 | 73.24 | 79.00 | 25.21 | 32.60 | 80.40 | 63.54 | 79.04 | 47.70 | 60.91 |
| | Ours | 51.92 | 76.46 | 59.87 | 71.67 | 79.00 | 25.83 | 35.20 | 79.60 | 63.54 | 79.25 | 47.01 | 60.85 |
| V2-70B | FP16 | 66.23 | 79.64 | 64.77 | 77.98 | 82.15 | 30.60 | 37.20 | 83.70 | 67.87 | 82.70 | 54.44 | 66.12 |
| | RTN | 64.91 | 79.06 | 63.93 | 78.14 | 81.66 | 30.11 | 37.00 | 83.61 | 68.59 | 82.79 | 54.78 | 65.87 |
| | GPTQ | 65.63 | 79.22 | 64.45 | 78.22 | 81.88 | 31.09 | 37.00 | 84.19 | 69.31 | 82.79 | 54.61 | 66.22 |
| | AWQ | 65.79 | 79.76 | 64.48 | 77.58 | 82.32 | 30.72 | 38.00 | 83.06 | 68.95 | 82.70 | 55.12 | 66.23 |
| | Ours | 65.65 | 79.49 | 64.60 | 78.30 | 82.05 | 31.58 | 37.40 | 84.83 | 68.95 | 82.87 | 54.52 | 66.39 |
| V1-7B | FP16 | 32.74 | 73.53 | 56.94 | 70.01 | 78.67 | 22.03 | 34.60 | 75.08 | 66.43 | 75.25 | 41.81 | 57.01 |
| | RTN | 32.63 | 72.31 | 56.26 | 70.01 | 78.45 | 20.93 | 33.60 | 74.74 | 64.26 | 74.71 | 42.75 | 56.42 |
| | GPTQ | 31.16 | 72.40 | 55.85 | 70.09 | 78.13 | 22.28 | 30.40 | 74.65 | 64.26 | 74.20 | 40.19 | 55.78 |
| | AWQ | 33.42 | 72.95 | 56.30 | 68.75 | 77.97 | 21.42 | 32.80 | 74.89 | 62.09 | 75.00 | 41.21 | 56.07 |
| | Ours | 32.15 | 72.85 | 56.45 | 70.17 | 78.51 | 22.28 | 32.80 | 75.14 | 67.87 | 75.13 | 41.89 | 56.84 |
| V1-13B | FP16 | 44.21 | 76.21 | 59.92 | 72.77 | 79.16 | 25.70 | 33.20 | 77.89 | 70.76 | 77.40 | 46.42 | 60.33 |
| | RTN | 42.71 | 75.26 | 59.30 | 72.53 | 79.54 | 25.95 | 32.60 | 76.76 | 65.34 | 76.98 | 45.82 | 59.34 |
| | GPTQ* | 42.65 | 75.41 | 59.51 | 72.93 | 79.33 | 24.97 | 32.40 | 77.49 | 68.23 | 76.89 | 45.56 | 59.58 |
| | AWQ | 42.66 | 75.76 | 59.50 | 72.77 | 78.89 | 26.56 | 33.60 | 77.46 | 68.59 | 76.94 | 45.48 | 59.84 |
| | Ours | 42.27 | 76.17 | 59.53 | 73.56 | 79.33 | 25.70 | 32.80 | 78.20 | 70.04 | 76.94 | 46.25 | 60.07 |
| V1-30B | FP16 | 55.14 | 77.55 | 63.33 | 75.85 | 81.12 | 28.27 | 36.00 | 82.78 | 66.79 | 80.39 | 52.90 | 63.65 |
| | RTN | 54.24 | 77.02 | 62.90 | 74.35 | 80.52 | 27.29 | 34.20 | 81.96 | 67.15 | 80.89 | 52.05 | 62.96 |
| | GPTQ | 54.20 | 77.41 | 62.79 | 75.14 | 80.41 | 27.54 | 34.60 | 81.93 | 67.51 | 80.05 | 50.51 | 62.92 |
| | AWQ | 55.14 | 77.49 | 63.08 | 75.77 | 80.52 | 27.29 | 34.20 | 82.87 | 67.15 | 80.43 | 52.90 | 63.35 |
| | Ours | 54.68 | 77.90 | 62.93 | 74.82 | 80.47 | 28.15 | 35.80 | 82.39 | 66.79 | 80.13 | 51.11 | 63.20 |
| V1-65B | FP16 | 59.79 | 79.12 | 64.53 | 77.35 | 81.23 | 27.91 | 38.00 | 84.86 | 69.68 | 81.36 | 52.82 | 65.15 |
| | RTN | 59.53 | 79.51 | 64.63 | 77.35 | 80.96 | 27.91 | 38.40 | 84.43 | 71.48 | 81.48 | 52.22 | 65.26 |
| | GPTQ* | 60.47 | 78.79 | 64.45 | 76.24 | 81.18 | 28.03 | 37.40 | 83.85 | 68.95 | 81.57 | 53.07 | 64.91 |
| | AWQ | 59.45 | 79.31 | 64.67 | 76.72 | 81.56 | 28.15 | 38.00 | 84.43 | 71.12 | 81.10 | 52.13 | 65.15 |
| | Ours | 58.93 | 79.22 | 64.48 | 77.03 | 81.28 | 27.91 | 38.60 | 84.31 | 70.76 | 81.19 | 52.22 | 65.08 |

3. Accuracies $\uparrow$ across 11 tasks (0-shot) of LLaMA and Mistral models at W3G128.

| Model | Method | Mmlu | Lamb. | Hella. | Wino. | Piqa | Truth. | Open. | Boolq | RTE | ARC-e | ARC-c. | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mistral-7B | FP16 | 61.35 | 75.68 | 61.27 | 74.03 | 80.79 | 28.03 | 32.80 | 83.67 | 67.51 | 80.81 | 50.34 | 63.30 |
| | RTN | 53.49 | 68.74 | 58.12 | 68.27 | 79.33 | 24.60 | 29.60 | 79.97 | 57.40 | 76.89 | 43.77 | 58.20 |
| | GPTQ | 55.84 | 73.04 | 57.61 | 70.24 | 78.67 | 24.85 | 30.80 | 81.44 | 63.54 | 77.27 | 45.65 | 59.91 |
| | AWQ | 55.61 | 73.69 | 57.86 | 71.27 | 79.82 | 26.07 | 29.00 | 81.10 | 59.21 | 79.00 | 46.93 | 59.96 |
| | Ours | 57.54 | 73.01 | 59.60 | 72.85 | 79.54 | 25.70 | 31.60 | 81.74 | 58.12 | 78.70 | 46.33 | 60.43 |
| V2-7B | FP16 | 42.69 | 73.90 | 57.15 | 68.90 | 78.07 | 25.21 | 31.40 | 77.74 | 62.82 | 76.35 | 43.52 | 57.98 |
| | RTN | 34.22 | 65.96 | 54.90 | 67.56 | 76.28 | 24.48 | 30.80 | 71.68 | 54.51 | 72.98 | 38.57 | 53.81 |
| | GPTQ | 36.11 | 69.61 | 53.66 | 68.59 | 76.01 | 21.91 | 27.80 | 73.43 | 54.51 | 73.74 | 40.19 | 54.14 |
| | AWQ | 35.82 | 69.90 | 54.98 | 67.40 | 76.01 | 25.21 | 29.80 | 74.68 | 57.76 | 74.07 | 41.64 | 55.21 |
| | Ours | 40.13 | 71.01 | 55.33 | 68.27 | 76.82 | 25.34 | 32.80 | 75.32 | 60.29 | 75.25 | 42.92 | 56.68 |
| V2-13B | FP16 | 52.86 | 76.77 | 60.04 | 72.14 | 79.05 | 25.95 | 35.20 | 80.55 | 65.34 | 79.38 | 48.38 | 61.42 |
| | RTN | 48.01 | 72.33 | 57.74 | 70.72 | 78.07 | 25.21 | 32.00 | 77.28 | 60.65 | 77.69 | 44.62 | 58.57 |
| | GPTQ | 49.56 | 75.24 | 57.83 | 70.88 | 78.56 | 24.97 | 33.40 | 78.44 | 62.82 | 77.99 | 45.65 | 59.58 |
| | AWQ | 49.77 | 75.22 | 58.58 | 71.82 | 77.75 | 24.11 | 34.20 | 79.97 | 53.43 | 77.95 | 44.62 | 58.86 |
| | Ours | 49.64 | 75.20 | 59.11 | 71.59 | 78.29 | 24.85 | 34.20 | 78.47 | 58.12 | 78.58 | 45.82 | 59.44 |
| V2-70B | FP16 | 66.23 | 79.64 | 64.77 | 77.98 | 82.15 | 30.60 | 37.20 | 83.70 | 67.87 | 82.70 | 54.44 | 66.12 |
| | RTN | 61.15 | 77.95 | 61.98 | 77.90 | 80.79 | 29.74 | 36.00 | 81.28 | 64.62 | 81.10 | 52.39 | 64.08 |
| | GPTQ | 63.15 | 79.06 | 62.94 | 77.66 | 81.45 | 30.72 | 36.20 | 81.53 | 67.87 | 81.65 | 53.67 | 65.08 |
| | AWQ | 64.09 | 79.47 | 63.75 | 76.48 | 81.77 | 29.74 | 37.20 | 82.69 | 66.06 | 81.40 | 53.67 | 65.12 |
| | Ours | 64.94 | 78.89 | 63.83 | 76.56 | 81.50 | 31.21 | 37.20 | 81.41 | 68.59 | 81.73 | 52.56 | 65.31 |
| V1-7B | FP16 | 32.74 | 73.53 | 56.94 | 70.01 | 78.67 | 22.03 | 34.60 | 75.08 | 66.43 | 75.25 | 41.81 | 57.01 |
| | RTN | 28.00 | 67.67 | 53.43 | 66.38 | 76.50 | 21.42 | 31.20 | 72.72 | 59.21 | 70.92 | 38.31 | 53.25 |
| | GPTQ | 30.16 | 66.31 | 53.92 | 67.48 | 76.82 | 21.42 | 29.60 | 71.31 | 59.21 | 72.22 | 38.74 | 53.38 |
| | AWQ | 30.33 | 70.19 | 54.53 | 68.98 | 76.71 | 20.81 | 31.60 | 74.68 | 64.62 | 73.23 | 38.91 | 54.96 |
| | Ours | 25.85 | 70.95 | 55.45 | 69.69 | 77.37 | 21.66 | 32.00 | 73.88 | 60.29 | 73.48 | 39.33 | 54.54 |
| V1-13B | FP16 | 44.21 | 76.21 | 59.92 | 72.77 | 79.16 | 25.70 | 33.20 | 77.89 | 70.76 | 77.40 | 46.42 | 60.33 |
| | RTN | 34.87 | 69.65 | 57.25 | 70.48 | 77.31 | 26.93 | 32.00 | 71.44 | 62.82 | 75.63 | 43.94 | 56.57 |
| | GPTQ | 35.51 | 73.08 | 57.89 | 70.80 | 77.37 | 24.48 | 31.40 | 77.52 | 62.82 | 74.41 | 43.26 | 57.14 |
| | AWQ | 40.53 | 73.94 | 57.89 | 69.53 | 78.94 | 26.68 | 33.40 | 74.83 | 65.34 | 75.93 | 45.05 | 58.37 |
| | Ours | 39.16 | 75.22 | 58.64 | 71.59 | 78.94 | 25.95 | 35.20 | 76.30 | 65.34 | 76.52 | 45.39 | 58.93 |
| V1-30B | FP16 | 55.14 | 77.55 | 63.33 | 75.85 | 81.12 | 28.27 | 36.00 | 82.78 | 66.79 | 80.39 | 52.90 | 63.65 |
| | RTN | 52.41 | 75.08 | 61.45 | 74.27 | 79.87 | 25.95 | 33.00 | 81.38 | 65.34 | 79.12 | 48.89 | 61.52 |
| | GPTQ | 51.39 | 74.97 | 60.35 | 75.30 | 79.60 | 26.93 | 34.80 | 82.75 | 64.62 | 78.11 | 48.46 | 61.57 |
| | AWQ | 53.84 | 76.71 | 61.94 | 75.14 | 80.03 | 25.34 | 34.40 | 81.90 | 67.15 | 79.59 | 50.77 | 62.44 |
| | Ours | 54.39 | 77.49 | 62.13 | 74.03 | 80.47 | 27.30 | 35.00 | 79.76 | 68.59 | 79.46 | 48.98 | 62.51 |
| V1-65B | FP16 | 59.79 | 79.12 | 64.53 | 77.35 | 81.23 | 27.91 | 38.00 | 84.86 | 69.68 | 81.36 | 52.82 | 65.15 |
| | RTN | 57.47 | 77.43 | 63.23 | 75.93 | 80.41 | 28.64 | 38.40 | 82.69 | 66.43 | 80.22 | 51.19 | 63.82 |
| | GPTQ* | 57.92 | 78.69 | 62.98 | 76.87 | 80.63 | 27.66 | 37.60 | 84.16 | 68.95 | 80.89 | 51.19 | 64.32 |
| | AWQ | 58.87 | 77.94 | 63.77 | 75.37 | 80.96 | 27.66 | 36.80 | 85.02 | 71.12 | 81.10 | 50.34 | 64.45 |
| | Ours | 58.30 | 78.11 | 63.60 | 76.56 | 80.85 | 29.50 | 37.80 | 84.80 | 70.04 | 80.22 | 50.68 | 64.59 |

4. Accuracies $\uparrow$ across 11 tasks (0-shot) of LLaMA and Mistral models at W2G128.