Slowdown and Higher Memory Consumption for GPTQ-LoRA with Bfloat16 #84

Open · achew010 opened this issue Sep 12, 2024 · 1 comment
Labels: question (Further information is requested)

achew010 commented Sep 12, 2024

Description

Regression Test for Loss, Memory, Throughput

Comparisons on loss, memory, and throughput for Full-FT and PEFT:

  • QLoRA: status quo on the switch from torch_dtype=float16 (Reference) to torch_dtype=bfloat16 (New).
  • GPTQ-LoRA: higher memory consumption and lower throughput with the switch to bfloat16 (a hedged loading sketch follows this list).
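For context, a minimal sketch (not the repository's actual benchmark harness) of how the torch_dtype switch is applied when loading a GPTQ-quantized base model for LoRA tuning; the model ID and LoRA hyperparameters below are placeholders.

import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Hypothetical pre-quantized GPTQ checkpoint; substitute the model under test.
MODEL_ID = "TheBloke/Llama-2-7B-GPTQ"

# The "Reference" run uses torch_dtype=torch.float16; the "New" run switches to bfloat16.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # the dtype under test
    device_map="auto",
)

# Attach LoRA adapters to the quantized base model (placeholder settings).
lora_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora_config)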

See Outliers

Subset of the outliers, processed into the table below:

import pandas as pd

# Load the outlier rows exported from the regression run.
A = pd.read_csv('outliers.1.csv', index_col=None)

As = []
for tag, G in A.groupby('scenario'):
    # A regression is a metric that got worse: higher for loss and memory,
    # lower for throughput.
    reg = G.reference < G.new        # rows where the new value is higher (worse)
    if tag == 'train_tokens_per_second':
        reg = ~reg                   # for throughput, lower is worse
    As.append(G.loc[reg])

A = pd.concat(As)
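As a possible follow-up (assuming the same reference/new columns), the size of each regression could be summarized as a percentage change per scenario:

# Percentage change of the regressed rows, grouped by scenario.
A['pct_change'] = (A.new - A.reference) / A.reference * 100
print(A.groupby('scenario')['pct_change'].describe())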

(screenshot: table of regressed outlier rows by scenario)

fabianlim commented Sep 12, 2024

@achew010 are we positive the slowdown only affects GPTQ-LoRA and nothing else (e.g., full fine-tuning, regular PEFT)? I remember you used to print out a table; can we check that as well?
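A minimal sketch of the kind of per-framework comparison table mentioned above, assuming the raw results also carry a framework_config column (e.g., Full-FT, PEFT, QLoRA, GPTQ-LoRA) alongside scenario, reference, and new; that column name is an assumption, not the repo's actual schema.

import pandas as pd

raw = pd.read_csv('outliers.1.csv', index_col=None)
table = raw.pivot_table(
    index='framework_config',   # assumed column distinguishing Full-FT / PEFT / QLoRA / GPTQ-LoRA
    columns='scenario',
    values=['reference', 'new'],
    aggfunc='mean',
)
print(table.round(2))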
