For a fair comparison, we used 512 calibration samples from Pile-10k for all methods. Due to memory constraints, we kept the original sequence length of 512 for AWQ, whereas GPTQ and our approach use a sequence length of 2048. We enabled act-order and true-sequential in GPTQ. The notation GPTQ* indicates that we adjusted the random seed or the data preprocessing to work around a non-positive-definite Hessian matrix or other issues.


1. Accuracies $\uparrow$ across 11 tasks (0-shot) for LLaMA and Mistral models at W4G-1.

| Model | Method | Mmlu | Lamb. | Hella. | Wino. | Piqa | Truth. | Open. | Boolq | RTE | ARC-e | ARC-c | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mistral-7B | FP16 | 61.35 | 75.68 | 61.27 | 74.03 | 80.79 | 28.03 | 32.80 | 83.67 | 67.51 | 80.81 | 50.34 | 63.30 |
| | RTN | 55.92 | 66.10 | 59.01 | 71.35 | 80.14 | 24.85 | 29.00 | 79.17 | 57.76 | 77.95 | 45.99 | 58.84 |
| | GPTQ | 58.22 | 73.45 | 59.47 | 74.03 | 80.20 | 26.93 | 31.00 | 81.50 | 64.98 | 78.24 | 47.01 | 61.37 |
| | AWQ | 57.20 | 71.45 | 59.21 | 73.64 | 79.43 | 25.34 | 30.40 | 82.69 | 68.95 | 79.25 | 47.44 | 61.36 |
| | Ours | 59.52 | 73.76 | 60.75 | 73.32 | 80.09 | 27.17 | 33.00 | 82.02 | 66.07 | 80.47 | 49.49 | 62.33 |
| V2-7B | FP16 | 42.69 | 73.90 | 57.15 | 68.90 | 78.07 | 25.21 | 31.40 | 77.74 | 62.82 | 76.35 | 43.52 | 57.98 |
| | RTN | 36.87 | 67.96 | 55.63 | 68.51 | 76.82 | 26.19 | 30.60 | 73.64 | 58.84 | 74.07 | 41.30 | 55.49 |
| | GPTQ | 39.66 | 71.92 | 55.89 | 68.03 | 77.58 | 25.09 | 30.20 | 76.67 | 62.09 | 75.55 | 41.72 | 56.76 |
| | AWQ | 40.24 | 71.20 | 56.26 | 69.61 | 76.93 | 26.07 | 32.60 | 77.31 | 63.18 | 75.00 | 41.30 | 57.25 |
| | Ours | 39.97 | 71.63 | 56.52 | 68.43 | 77.91 | 25.70 | 31.60 | 76.18 | 65.70 | 76.01 | 42.58 | 57.48 |
| V2-13B | FP16 | 52.86 | 76.77 | 60.04 | 72.14 | 79.05 | 25.95 | 35.20 | 80.55 | 65.34 | 79.38 | 48.38 | 61.42 |
| | RTN | 50.37 | 74.35 | 59.12 | 71.98 | 79.00 | 24.85 | 33.00 | 81.77 | 64.98 | 79.08 | 46.59 | 60.46 |
| | GPTQ | 51.14 | 75.37 | 59.14 | 72.06 | 78.02 | 25.34 | 32.20 | 80.46 | 62.09 | 77.36 | 44.54 | 59.79 |
| | AWQ | 51.16 | 75.98 | 59.51 | 70.80 | 78.40 | 25.21 | 34.60 | 78.26 | 66.79 | 79.12 | 46.59 | 60.58 |
| | Ours | 52.30 | 75.96 | 59.79 | 72.30 | 78.84 | 25.58 | 34.00 | 80.15 | 66.79 | 79.38 | 48.12 | 61.20 |
| V2-70B | FP16 | 66.23 | 79.64 | 64.77 | 77.98 | 82.15 | 30.60 | 37.20 | 83.70 | 67.87 | 82.70 | 54.44 | 66.12 |
| | RTN | 63.85 | 77.62 | 63.38 | 76.72 | 81.50 | 28.89 | 37.80 | 83.39 | 68.23 | 81.99 | 54.10 | 65.22 |
| | GPTQ | 64.81 | 79.27 | 63.86 | 76.87 | 81.61 | 31.46 | 36.40 | 82.23 | 70.04 | 82.53 | 54.18 | 65.75 |
| | AWQ | 65.08 | 78.77 | 64.14 | 77.11 | 81.45 | 30.48 | 37.20 | 83.64 | 72.92 | 82.49 | 55.80 | 66.28 |
| | Ours | 65.43 | 79.55 | 64.47 | 78.06 | 82.10 | 30.60 | 36.40 | 83.91 | 71.12 | 82.53 | 54.78 | 66.27 |
| V1-7B | FP16 | 32.74 | 73.53 | 56.94 | 70.01 | 78.67 | 22.03 | 34.60 | 75.08 | 66.43 | 75.25 | 41.81 | 57.01 |
| | RTN | 31.34 | 70.02 | 55.35 | 69.77 | 77.69 | 20.32 | 32.60 | 73.43 | 59.57 | 74.45 | 41.30 | 55.08 |
| | GPTQ | 29.06 | 71.08 | 55.11 | 70.01 | 77.37 | 20.93 | 32.20 | 72.69 | 63.90 | 74.66 | 41.64 | 55.33 |
| | AWQ | 33.33 | 70.81 | 55.98 | 68.27 | 78.07 | 21.18 | 31.40 | 74.37 | 64.62 | 74.03 | 41.21 | 55.75 |
| | Ours | 31.80 | 71.96 | 56.57 | 69.53 | 79.00 | 21.91 | 33.20 | 75.72 | 66.79 | 74.83 | 43.09 | 56.76 |
| V1-13B | FP16 | 44.21 | 76.21 | 59.92 | 72.77 | 79.16 | 25.70 | 33.20 | 77.89 | 70.76 | 77.40 | 46.42 | 60.33 |
| | RTN | 39.57 | 70.93 | 58.82 | 71.98 | 78.02 | 24.85 | 32.00 | 78.20 | 66.43 | 75.67 | 44.62 | 58.28 |
| | GPTQ* | 40.01 | 74.67 | 58.92 | 71.03 | 78.45 | 26.44 | 33.60 | 77.09 | 68.23 | 76.85 | 44.97 | 59.12 |
| | AWQ | 44.56 | 74.13 | 59.13 | 71.27 | 78.94 | 25.83 | 33.20 | 76.42 | 66.06 | 76.89 | 46.67 | 59.37 |
| | Ours | 43.94 | 75.82 | 59.51 | 72.22 | 78.78 | 25.70 | 32.80 | 77.34 | 67.51 | 76.47 | 46.67 | 59.71 |
| V1-30B | FP16 | 55.14 | 77.55 | 63.33 | 75.85 | 81.12 | 28.27 | 36.00 | 82.78 | 66.79 | 80.39 | 52.90 | 63.65 |
| | RTN | 53.05 | 75.65 | 62.08 | 74.82 | 80.09 | 25.95 | 35.80 | 81.87 | 63.54 | 79.76 | 50.26 | 62.08 |
| | GPTQ | 53.04 | 77.22 | 61.95 | 73.80 | 80.69 | 27.29 | 34.60 | 81.07 | 66.06 | 78.79 | 49.15 | 62.15 |
| | AWQ | 54.13 | 76.77 | 62.78 | 74.11 | 81.07 | 27.78 | 35.00 | 82.66 | 67.15 | 79.97 | 51.71 | 63.01 |
| | Ours | 54.72 | 77.84 | 62.91 | 75.06 | 80.69 | 26.68 | 36.40 | 82.60 | 66.79 | 80.13 | 52.13 | 63.27 |
| V1-65B | FP16 | 59.79 | 79.12 | 64.53 | 77.35 | 81.23 | 27.91 | 38.00 | 84.86 | 69.68 | 81.36 | 52.82 | 65.15 |
| | RTN | 58.74 | 76.42 | 64.12 | 76.72 | 81.01 | 29.25 | 38.60 | 84.13 | 70.40 | 80.72 | 51.88 | 64.73 |
| | GPTQ* | 59.10 | 78.17 | 63.78 | 75.69 | 81.34 | 28.27 | 38.40 | 83.76 | 68.59 | 80.98 | 51.62 | 64.52 |
| | AWQ | 58.86 | 77.37 | 63.86 | 76.56 | 80.85 | 28.27 | 35.20 | 83.94 | 71.48 | 78.75 | 50.94 | 64.19 |
| | Ours | 59.21 | 79.16 | 64.37 | 76.64 | 81.34 | 26.81 | 37.80 | 84.40 | 69.68 | 80.98 | 51.79 | 64.74 |
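
The Avg. column appears to be the unweighted mean of the 11 per-task accuracies, rounded to two decimals. The sketch below checks this against the Mistral-7B FP16 row; the helper name `average_accuracy` is illustrative and not part of any benchmark tooling.

```python
# Per-task zero-shot accuracies for the Mistral-7B FP16 row, in column order:
# Mmlu, Lamb., Hella., Wino., Piqa, Truth., Open., Boolq, RTE, ARC-e, ARC-c
mistral_7b_fp16 = [61.35, 75.68, 61.27, 74.03, 80.79, 28.03,
                   32.80, 83.67, 67.51, 80.81, 50.34]


def average_accuracy(task_scores):
    """Unweighted mean over tasks, rounded to two decimals like the table."""
    return round(sum(task_scores) / len(task_scores), 2)


# Compare against the reported Avg. value of 63.30 for this row.
print(average_accuracy(mistral_7b_fp16))
```

Because every task carries equal weight, a large swing on a single small benchmark such as RTE moves the average as much as the same swing on Mmlu.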

2. Accuracies $\uparrow$ across 11 tasks (0-shot) for LLaMA and Mistral models at W4G128.

| Model | Method | Mmlu | Lamb. | Hella. | Wino. | Piqa | Truth. | Open. | Boolq | RTE | ARC-e | ARC-c | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mistral-7B | FP16 | 61.35 | 75.68 | 61.27 | 74.03 | 80.79 | 28.03 | 32.80 | 83.67 | 67.51 | 80.81 | 50.34 | 63.30 |
| | RTN | 59.72 | 74.44 | 61.06 | 73.40 | 80.36 | 27.17 | 32.60 | 83.67 | 64.62 | 79.63 | 49.32 | 62.36 |
| | GPTQ | 59.17 | 74.52 | 60.37 | 74.90 | 80.58 | 26.68 | 31.00 | 83.33 | 67.15 | 79.67 | 48.12 | 62.32 |
| | AWQ | 60.20 | 75.14 | 60.43 | 73.80 | 80.03 | 27.05 | 30.40 | 84.01 | 62.09 | 80.39 | 50.26 | 62.16 |
| | Ours | 60.47 | 75.59 | 61.03 | 73.88 | 80.09 | 27.54 | 31.60 | 83.09 | 66.07 | 79.97 | 49.49 | 62.62 |
| V2-7B | FP16 | 42.69 | 73.90 | 57.15 | 68.90 | 78.07 | 25.21 | 31.40 | 77.74 | 62.82 | 76.35 | 43.52 | 57.98 |
| | RTN | 40.91 | 72.44 | 56.91 | 68.35 | 77.58 | 24.97 | 31.20 | 77.61 | 56.32 | 76.26 | 43.52 | 56.92 |
| | GPTQ | 42.57 | 73.28 | 56.36 | 69.06 | 78.02 | 25.34 | 30.20 | 75.72 | 57.04 | 75.63 | 42.15 | 56.85 |
| | AWQ | 41.00 | 72.60 | 56.40 | 68.98 | 77.31 | 25.70 | 31.60 | 78.75 | 58.48 | 76.14 | 43.86 | 57.35 |
| | Ours | 41.82 | 72.75 | 56.79 | 68.67 | 78.13 | 25.58 | 30.20 | 77.49 | 63.54 | 75.76 | 42.58 | 57.57 |
| V2-13B | FP16 | 52.86 | 76.77 | 60.04 | 72.14 | 79.05 | 25.95 | 35.20 | 80.55 | 65.34 | 79.38 | 48.38 | 61.42 |
| | RTN | 52.10 | 76.27 | 59.77 | 72.14 | 78.62 | 24.72 | 34.20 | 80.24 | 62.09 | 79.00 | 47.95 | 60.65 |
| | GPTQ | 52.66 | 76.54 | 59.76 | 72.14 | 78.35 | 25.70 | 34.00 | 79.33 | 66.43 | 78.58 | 47.53 | 61.00 |
| | AWQ | 52.39 | 76.89 | 59.97 | 73.24 | 79.00 | 25.21 | 32.60 | 80.40 | 63.54 | 79.04 | 47.70 | 60.91 |
| | Ours | 51.92 | 76.46 | 59.87 | 71.67 | 79.00 | 25.83 | 35.20 | 79.60 | 63.54 | 79.25 | 47.01 | 60.85 |
| V2-70B | FP16 | 66.23 | 79.64 | 64.77 | 77.98 | 82.15 | 30.60 | 37.20 | 83.70 | 67.87 | 82.70 | 54.44 | 66.12 |
| | RTN | 64.91 | 79.06 | 63.93 | 78.14 | 81.66 | 30.11 | 37.00 | 83.61 | 68.59 | 82.79 | 54.78 | 65.87 |
| | GPTQ | 65.63 | 79.22 | 64.45 | 78.22 | 81.88 | 31.09 | 37.00 | 84.19 | 69.31 | 82.79 | 54.61 | 66.22 |
| | AWQ | 65.79 | 79.76 | 64.48 | 77.58 | 82.32 | 30.72 | 38.00 | 83.06 | 68.95 | 82.70 | 55.12 | 66.23 |
| | Ours | 65.65 | 79.49 | 64.60 | 78.30 | 82.05 | 31.58 | 37.40 | 84.83 | 68.95 | 82.87 | 54.52 | 66.39 |
| V1-7B | FP16 | 32.74 | 73.53 | 56.94 | 70.01 | 78.67 | 22.03 | 34.60 | 75.08 | 66.43 | 75.25 | 41.81 | 57.01 |
| | RTN | 32.63 | 72.31 | 56.26 | 70.01 | 78.45 | 20.93 | 33.60 | 74.74 | 64.26 | 74.71 | 42.75 | 56.42 |
| | GPTQ | 31.16 | 72.40 | 55.85 | 70.09 | 78.13 | 22.28 | 30.40 | 74.65 | 64.26 | 74.20 | 40.19 | 55.78 |
| | AWQ | 33.42 | 72.95 | 56.30 | 68.75 | 77.97 | 21.42 | 32.80 | 74.89 | 62.09 | 75.00 | 41.21 | 56.07 |
| | Ours | 32.15 | 72.85 | 56.45 | 70.17 | 78.51 | 22.28 | 32.80 | 75.14 | 67.87 | 75.13 | 41.89 | 56.84 |
| V1-13B | FP16 | 44.21 | 76.21 | 59.92 | 72.77 | 79.16 | 25.70 | 33.20 | 77.89 | 70.76 | 77.40 | 46.42 | 60.33 |
| | RTN | 42.71 | 75.26 | 59.30 | 72.53 | 79.54 | 25.95 | 32.60 | 76.76 | 65.34 | 76.98 | 45.82 | 59.34 |
| | GPTQ* | 42.65 | 75.41 | 59.51 | 72.93 | 79.33 | 24.97 | 32.40 | 77.49 | 68.23 | 76.89 | 45.56 | 59.58 |
| | AWQ | 42.66 | 75.76 | 59.50 | 72.77 | 78.89 | 26.56 | 33.60 | 77.46 | 68.59 | 76.94 | 45.48 | 59.84 |
| | Ours | 42.27 | 76.17 | 59.53 | 73.56 | 79.33 | 25.70 | 32.80 | 78.20 | 70.04 | 76.94 | 46.25 | 60.07 |
| V1-30B | FP16 | 55.14 | 77.55 | 63.33 | 75.85 | 81.12 | 28.27 | 36.00 | 82.78 | 66.79 | 80.39 | 52.90 | 63.65 |
| | RTN | 54.24 | 77.02 | 62.90 | 74.35 | 80.52 | 27.29 | 34.20 | 81.96 | 67.15 | 80.89 | 52.05 | 62.96 |
| | GPTQ | 54.20 | 77.41 | 62.79 | 75.14 | 80.41 | 27.54 | 34.60 | 81.93 | 67.51 | 80.05 | 50.51 | 62.92 |
| | AWQ | 55.14 | 77.49 | 63.08 | 75.77 | 80.52 | 27.29 | 34.20 | 82.87 | 67.15 | 80.43 | 52.90 | 63.35 |
| | Ours | 54.68 | 77.90 | 62.93 | 74.82 | 80.47 | 28.15 | 35.80 | 82.39 | 66.79 | 80.13 | 51.11 | 63.20 |
| V1-65B | FP16 | 59.79 | 79.12 | 64.53 | 77.35 | 81.23 | 27.91 | 38.00 | 84.86 | 69.68 | 81.36 | 52.82 | 65.15 |
| | RTN | 59.53 | 79.51 | 64.63 | 77.35 | 80.96 | 27.91 | 38.40 | 84.43 | 71.48 | 81.48 | 52.22 | 65.26 |
| | GPTQ* | 60.47 | 78.79 | 64.45 | 76.24 | 81.18 | 28.03 | 37.40 | 83.85 | 68.95 | 81.57 | 53.07 | 64.91 |
| | AWQ | 59.45 | 79.31 | 64.67 | 76.72 | 81.56 | 28.15 | 38.00 | 84.43 | 71.12 | 81.10 | 52.13 | 65.15 |
| | Ours | 58.93 | 79.22 | 64.48 | 77.03 | 81.28 | 27.91 | 38.60 | 84.31 | 70.76 | 81.19 | 52.22 | 65.08 |
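
The labels W4G-1 and W4G128 in the captions are read here as weight bit-width followed by group size, with -1 meaning one group per output channel; this document does not define the notation, so that reading is an assumption. Under it, the RTN (round-to-nearest) baseline can be sketched in pure Python as below. The asymmetric min/max scheme and the function name `rtn_quantize_row` are illustrative choices, not necessarily the exact configuration behind the reported numbers.

```python
def rtn_quantize_row(weights, bits=4, group_size=128):
    """Round-to-nearest asymmetric quantization of one non-empty weight row.

    group_size=-1 treats the whole row as a single group (per-channel).
    Returns the dequantized ("fake-quantized") weights as floats.
    """
    if group_size == -1:
        group_size = len(weights)
    max_int = 2 ** bits - 1  # largest integer code, e.g. 15 for 4 bits
    out = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        w_min, w_max = min(group), max(group)
        # Each group gets its own scale; fall back to 1.0 for a constant group.
        scale = (w_max - w_min) / max_int or 1.0
        zero_point = round(-w_min / scale)
        for w in group:
            # Round to the nearest integer code and clamp to the valid range.
            q = min(max(round(w / scale) + zero_point, 0), max_int)
            out.append((q - zero_point) * scale)
    return out


# Hypothetical toy row of 16 weights, quantized as a single 4-bit group.
row = [i / 10 for i in range(-8, 8)]
print(rtn_quantize_row(row, bits=4, group_size=-1))
```

Smaller groups give each scale fewer values to cover, which reduces rounding error at the cost of storing more scales and zero points. With fewer bits there are only `2 ** bits` levels per group, so the choice of scale matters far more.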

3. Accuracies $\uparrow$ across 11 tasks (0-shot) for LLaMA and Mistral models at W3G128.

| Model | Method | Mmlu | Lamb. | Hella. | Wino. | Piqa | Truth. | Open. | Boolq | RTE | ARC-e | ARC-c | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mistral-7B | FP16 | 61.35 | 75.68 | 61.27 | 74.03 | 80.79 | 28.03 | 32.80 | 83.67 | 67.51 | 80.81 | 50.34 | 63.30 |
| | RTN | 53.49 | 68.74 | 58.12 | 68.27 | 79.33 | 24.60 | 29.60 | 79.97 | 57.40 | 76.89 | 43.77 | 58.20 |
| | GPTQ | 55.84 | 73.04 | 57.61 | 70.24 | 78.67 | 24.85 | 30.80 | 81.44 | 63.54 | 77.27 | 45.65 | 59.91 |
| | AWQ | 55.61 | 73.69 | 57.86 | 71.27 | 79.82 | 26.07 | 29.00 | 81.10 | 59.21 | 79.00 | 46.93 | 59.96 |
| | Ours | 57.54 | 73.01 | 59.60 | 72.85 | 79.54 | 25.70 | 31.60 | 81.74 | 58.12 | 78.70 | 46.33 | 60.43 |
| V2-7B | FP16 | 42.69 | 73.90 | 57.15 | 68.90 | 78.07 | 25.21 | 31.40 | 77.74 | 62.82 | 76.35 | 43.52 | 57.98 |
| | RTN | 34.22 | 65.96 | 54.90 | 67.56 | 76.28 | 24.48 | 30.80 | 71.68 | 54.51 | 72.98 | 38.57 | 53.81 |
| | GPTQ | 36.11 | 69.61 | 53.66 | 68.59 | 76.01 | 21.91 | 27.80 | 73.43 | 54.51 | 73.74 | 40.19 | 54.14 |
| | AWQ | 35.82 | 69.90 | 54.98 | 67.40 | 76.01 | 25.21 | 29.80 | 74.68 | 57.76 | 74.07 | 41.64 | 55.21 |
| | Ours | 40.13 | 71.01 | 55.33 | 68.27 | 76.82 | 25.34 | 32.80 | 75.32 | 60.29 | 75.25 | 42.92 | 56.68 |
| V2-13B | FP16 | 52.86 | 76.77 | 60.04 | 72.14 | 79.05 | 25.95 | 35.20 | 80.55 | 65.34 | 79.38 | 48.38 | 61.42 |
| | RTN | 48.01 | 72.33 | 57.74 | 70.72 | 78.07 | 25.21 | 32.00 | 77.28 | 60.65 | 77.69 | 44.62 | 58.57 |
| | GPTQ | 49.56 | 75.24 | 57.83 | 70.88 | 78.56 | 24.97 | 33.40 | 78.44 | 62.82 | 77.99 | 45.65 | 59.58 |
| | AWQ | 49.77 | 75.22 | 58.58 | 71.82 | 77.75 | 24.11 | 34.20 | 79.97 | 53.43 | 77.95 | 44.62 | 58.86 |
| | Ours | 49.64 | 75.20 | 59.11 | 71.59 | 78.29 | 24.85 | 34.20 | 78.47 | 58.12 | 78.58 | 45.82 | 59.44 |
| V2-70B | FP16 | 66.23 | 79.64 | 64.77 | 77.98 | 82.15 | 30.60 | 37.20 | 83.70 | 67.87 | 82.70 | 54.44 | 66.12 |
| | RTN | 61.15 | 77.95 | 61.98 | 77.90 | 80.79 | 29.74 | 36.00 | 81.28 | 64.62 | 81.10 | 52.39 | 64.08 |
| | GPTQ | 63.15 | 79.06 | 62.94 | 77.66 | 81.45 | 30.72 | 36.20 | 81.53 | 67.87 | 81.65 | 53.67 | 65.08 |
| | AWQ | 64.09 | 79.47 | 63.75 | 76.48 | 81.77 | 29.74 | 37.20 | 82.69 | 66.06 | 81.40 | 53.67 | 65.12 |
| | Ours | 64.94 | 78.89 | 63.83 | 76.56 | 81.50 | 31.21 | 37.20 | 81.41 | 68.59 | 81.73 | 52.56 | 65.31 |
| V1-7B | FP16 | 32.74 | 73.53 | 56.94 | 70.01 | 78.67 | 22.03 | 34.60 | 75.08 | 66.43 | 75.25 | 41.81 | 57.01 |
| | RTN | 28.00 | 67.67 | 53.43 | 66.38 | 76.50 | 21.42 | 31.20 | 72.72 | 59.21 | 70.92 | 38.31 | 53.25 |
| | GPTQ | 30.16 | 66.31 | 53.92 | 67.48 | 76.82 | 21.42 | 29.60 | 71.31 | 59.21 | 72.22 | 38.74 | 53.38 |
| | AWQ | 30.33 | 70.19 | 54.53 | 68.98 | 76.71 | 20.81 | 31.60 | 74.68 | 64.62 | 73.23 | 38.91 | 54.96 |
| | Ours | 25.85 | 70.95 | 55.45 | 69.69 | 77.37 | 21.66 | 32.00 | 73.88 | 60.29 | 73.48 | 39.33 | 54.54 |
| V1-13B | FP16 | 44.21 | 76.21 | 59.92 | 72.77 | 79.16 | 25.70 | 33.20 | 77.89 | 70.76 | 77.40 | 46.42 | 60.33 |
| | RTN | 34.87 | 69.65 | 57.25 | 70.48 | 77.31 | 26.93 | 32.00 | 71.44 | 62.82 | 75.63 | 43.94 | 56.57 |
| | GPTQ | 35.51 | 73.08 | 57.89 | 70.80 | 77.37 | 24.48 | 31.40 | 77.52 | 62.82 | 74.41 | 43.26 | 57.14 |
| | AWQ | 40.53 | 73.94 | 57.89 | 69.53 | 78.94 | 26.68 | 33.40 | 74.83 | 65.34 | 75.93 | 45.05 | 58.37 |
| | Ours | 39.16 | 75.22 | 58.64 | 71.59 | 78.94 | 25.95 | 35.20 | 76.30 | 65.34 | 76.52 | 45.39 | 58.93 |
| V1-30B | FP16 | 55.14 | 77.55 | 63.33 | 75.85 | 81.12 | 28.27 | 36.00 | 82.78 | 66.79 | 80.39 | 52.90 | 63.65 |
| | RTN | 52.41 | 75.08 | 61.45 | 74.27 | 79.87 | 25.95 | 33.00 | 81.38 | 65.34 | 79.12 | 48.89 | 61.52 |
| | GPTQ | 51.39 | 74.97 | 60.35 | 75.30 | 79.60 | 26.93 | 34.80 | 82.75 | 64.62 | 78.11 | 48.46 | 61.57 |
| | AWQ | 53.84 | 76.71 | 61.94 | 75.14 | 80.03 | 25.34 | 34.40 | 81.90 | 67.15 | 79.59 | 50.77 | 62.44 |
| | Ours | 54.39 | 77.49 | 62.13 | 74.03 | 80.47 | 27.30 | 35.00 | 79.76 | 68.59 | 79.46 | 48.98 | 62.51 |
| V1-65B | FP16 | 59.79 | 79.12 | 64.53 | 77.35 | 81.23 | 27.91 | 38.00 | 84.86 | 69.68 | 81.36 | 52.82 | 65.15 |
| | RTN | 57.47 | 77.43 | 63.23 | 75.93 | 80.41 | 28.64 | 38.40 | 82.69 | 66.43 | 80.22 | 51.19 | 63.82 |
| | GPTQ* | 57.92 | 78.69 | 62.98 | 76.87 | 80.63 | 27.66 | 37.60 | 84.16 | 68.95 | 80.89 | 51.19 | 64.32 |
| | AWQ | 58.87 | 77.94 | 63.77 | 75.37 | 80.96 | 27.66 | 36.80 | 85.02 | 71.12 | 81.10 | 50.34 | 64.45 |
| | Ours | 58.30 | 78.11 | 63.60 | 76.56 | 80.85 | 29.50 | 37.80 | 84.80 | 70.04 | 80.22 | 50.68 | 64.59 |

4. Accuracies $\uparrow$ across 11 tasks (0-shot) for LLaMA and Mistral models at W2G128.

| Model | Method | Mmlu | Lamb. | Hella. | Wino. | Piqa | Truth. | Open. | Boolq | RTE | ARC-e | ARC-c | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mistral-7B | FP16 | 61.35 | 75.68 | 61.27 | 74.03 | 80.79 | 28.03 | 32.80 | 83.67 | 67.51 | 80.81 | 50.34 | 63.30 |
| | RTN | 23.45 | 0.14 | 27.43 | 49.64 | 54.30 | 24.24 | 15.20 | 38.69 | 51.99 | 29.08 | 21.59 | 30.52 |
| | GPTQ | 25.23 | 30.47 | 38.28 | 53.83 | 64.91 | 24.11 | 17.40 | 58.29 | 50.90 | 47.77 | 24.57 | 39.61 |
| | AWQ | 25.38 | 0.00 | 25.71 | 52.01 | 51.58 | 23.99 | 17.60 | 37.83 | 47.29 | 26.98 | 22.27 | 30.06 |
| | Ours | 40.46 | 58.61 | 50.87 | 62.90 | 75.84 | 24.85 | 22.80 | 78.56 | 57.04 | 70.88 | 37.03 | 52.71 |
| V2-7B | FP16 | 42.69 | 73.90 | 57.15 | 68.90 | 78.07 | 25.21 | 31.40 | 77.74 | 62.82 | 76.35 | 43.52 | 57.98 |
| | RTN | 23.98 | 0.02 | 26.04 | 49.49 | 52.50 | 24.85 | 15.20 | 41.01 | 49.10 | 27.48 | 19.71 | 29.94 |
| | GPTQ | 23.65 | 11.72 | 32.59 | 55.17 | 58.32 | 25.95 | 15.80 | 52.14 | 51.99 | 40.45 | 21.25 | 35.37 |
| | AWQ | 25.38 | 0.00 | 25.69 | 49.96 | 52.34 | 23.75 | 17.80 | 37.83 | 52.71 | 24.62 | 21.08 | 30.10 |
| | Ours | 27.20 | 55.25 | 47.35 | 61.01 | 72.96 | 24.85 | 25.60 | 68.07 | 54.51 | 65.99 | 32.25 | 48.64 |
| V2-13B | FP16 | 52.86 | 76.77 | 60.04 | 72.14 | 79.05 | 25.95 | 35.20 | 80.55 | 65.34 | 79.38 | 48.38 | 61.42 |
| | RTN | 23.77 | 7.47 | 33.08 | 49.01 | 57.94 | 26.19 | 16.00 | 47.74 | 53.43 | 32.03 | 21.93 | 33.51 |
| | GPTQ | 24.69 | 45.20 | 41.06 | 55.80 | 67.08 | 23.26 | 19.80 | 54.40 | 52.35 | 55.60 | 27.82 | 42.46 |
| | AWQ | 27.04 | 0.00 | 25.80 | 51.85 | 52.99 | 23.62 | 13.60 | 62.17 | 47.29 | 26.22 | 23.12 | 32.16 |
| | Ours | 34.33 | 63.92 | 53.35 | 64.33 | 76.17 | 25.70 | 26.00 | 72.75 | 61.73 | 71.17 | 38.57 | 53.46 |
| V2-70B | FP16 | 66.23 | 79.64 | 64.77 | 77.98 | 82.15 | 30.60 | 37.20 | 83.70 | 67.87 | 82.70 | 54.44 | 66.12 |
| | RTN | 24.20 | 20.18 | 40.88 | 54.85 | 63.87 | 24.11 | 17.60 | 43.06 | 53.07 | 50.51 | 27.22 | 38.14 |
| | GPTQ | 23.12 | 0.00 | 25.04 | 49.57 | 49.51 | 0.00 | 27.60 | 37.83 | 52.71 | 25.08 | 22.70 | 28.47 |
| | AWQ | 24.46 | 0.00 | 25.46 | 51.38 | 52.50 | 23.50 | 14.20 | 62.17 | 52.71 | 25.76 | 22.35 | 32.23 |
| | Ours | 54.04 | 72.97 | 59.65 | 74.90 | 79.00 | 29.01 | 34.80 | 79.63 | 69.68 | 78.37 | 46.59 | 61.69 |
| V1-7B | FP16 | 32.74 | 73.53 | 56.94 | 70.01 | 78.67 | 22.03 | 34.60 | 75.08 | 66.43 | 75.25 | 41.81 | 57.01 |
| | RTN | 24.36 | 0.52 | 27.24 | 49.25 | 54.24 | 24.24 | 15.20 | 39.63 | 57.40 | 27.86 | 21.84 | 31.07 |
| | GPTQ | 22.95 | 12.75 | 33.36 | 51.70 | 60.07 | 23.99 | 13.40 | 48.62 | 53.07 | 40.82 | 21.50 | 34.75 |
| | AWQ | 23.12 | 0.00 | 25.37 | 53.28 | 52.56 | 25.21 | 13.80 | 37.83 | 52.71 | 25.63 | 22.53 | 30.18 |
| | Ours | 24.46 | 13.53 | 42.16 | 56.99 | 70.02 | 24.60 | 25.20 | 62.91 | 47.29 | 60.90 | 31.74 | 41.80 |
| V1-13B | FP16 | 44.21 | 76.21 | 59.92 | 72.77 | 79.16 | 25.70 | 33.20 | 77.89 | 70.76 | 77.40 | 46.42 | 60.33 |
| | RTN | 24.66 | 4.97 | 29.67 | 49.33 | 57.24 | 25.58 | 12.40 | 44.10 | 53.79 | 32.07 | 22.01 | 32.35 |
| | GPTQ* | 26.43 | 40.48 | 39.47 | 58.25 | 66.97 | 23.50 | 18.60 | 52.78 | 50.54 | 51.52 | 25.00 | 41.23 |
| | AWQ | 27.04 | 0.00 | 25.59 | 50.36 | 53.05 | 24.11 | 15.60 | 62.17 | 47.29 | 25.97 | 23.21 | 32.22 |
| | Ours | 31.87 | 59.65 | 51.25 | 67.64 | 76.28 | 25.58 | 27.80 | 69.11 | 58.48 | 70.71 | 37.12 | 52.32 |
| V1-30B | FP16 | 55.14 | 77.55 | 63.33 | 75.85 | 81.12 | 28.27 | 36.00 | 82.78 | 66.79 | 80.39 | 52.90 | 63.65 |
| | RTN | 23.24 | 5.55 | 27.22 | 53.99 | 56.80 | 21.79 | 18.20 | 51.65 | 53.07 | 36.74 | 21.33 | 33.60 |
| | GPTQ | 30.47 | 49.93 | 45.05 | 61.88 | 68.88 | 23.26 | 22.60 | 68.29 | 51.99 | 60.69 | 30.72 | 46.70 |
| | AWQ | 27.04 | 0.00 | 25.41 | 50.20 | 52.94 | 24.48 | 16.60 | 62.17 | 47.29 | 24.71 | 23.38 | 32.20 |
| | Ours | 40.83 | 67.92 | 56.73 | 68.90 | 76.17 | 24.36 | 31.60 | 75.54 | 62.45 | 74.92 | 42.41 | 56.53 |
| V1-65B | FP16 | 59.79 | 79.12 | 64.53 | 77.35 | 81.23 | 27.91 | 38.00 | 84.86 | 69.68 | 81.36 | 52.82 | 65.15 |
| | RTN | 24.48 | 32.78 | 43.59 | 57.85 | 67.52 | 22.89 | 22.80 | 61.53 | 50.54 | 52.10 | 28.24 | 42.21 |
| | GPTQ* | 37.06 | 67.44 | 53.97 | 69.46 | 76.44 | 24.36 | 28.00 | 73.64 | 60.29 | 71.34 | 38.57 | 54.60 |
| | AWQ | 25.38 | 0.00 | 25.58 | 49.96 | 53.10 | 24.24 | 11.00 | 37.83 | 52.71 | 24.96 | 22.44 | 29.75 |
| | Ours | 47.21 | 72.07 | 60.06 | 73.24 | 78.62 | 25.46 | 34.20 | 80.64 | 62.82 | 77.48 | 46.76 | 59.87 |
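
At 2 bits the spread between methods is far wider than at 3 or 4 bits, so it can help to read the table as points lost relative to FP16. A minimal sketch using the Mistral-7B Avg. values from the W2G128 table above; the helper name `gap_to_fp16` is illustrative.

```python
# Avg. column for Mistral-7B at W2G128, copied from the table above.
avg_w2g128_mistral_7b = {
    "FP16": 63.30,
    "RTN": 30.52,
    "GPTQ": 39.61,
    "AWQ": 30.06,
    "Ours": 52.71,
}


def gap_to_fp16(avg_by_method, baseline="FP16"):
    """Return, per quantized method, the accuracy points lost versus the baseline."""
    base = avg_by_method[baseline]
    return {
        method: round(base - avg, 2)
        for method, avg in avg_by_method.items()
        if method != baseline
    }


for method, gap in gap_to_fp16(avg_w2g128_mistral_7b).items():
    print(f"{method}: -{gap:.2f} points vs FP16")
```

The same helper can be pointed at any model block in any of the four tables by swapping in that block's Avg. values.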