Hello.
First of all, thanks for sharing the BitNet training code.
I have a question about GPU memory usage.
As I understand it, BitNet should reduce VRAM usage compared to fp16/bf16 precision.
However, when I comment out this line in train_bitnet.py:

```python
model = apply_bitlinear(model, target_layers=target_layers)  # comment this to train og llama
```

memory usage drops by about 2 GB (13 GB with the BitLinear layers vs. 11 GB without).
Shouldn't using BitNet result in lower memory usage, not higher?
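For context, my rough mental model of what apply_bitlinear sets up is something like the sketch below. This is my own guess at a BitNet-b1.58-style quantization-aware BitLinear, not the repo's actual code; I'm assuming apply_bitlinear just swaps the nn.Linear modules named in target_layers for something like this:

```python
import torch.nn as nn
import torch.nn.functional as F

class BitLinear(nn.Linear):
    """Rough sketch of a BitNet-b1.58-style quantization-aware linear layer (my guess)."""

    def forward(self, x):
        w = self.weight  # latent weight, still stored and updated in full precision

        # Ternarize weights to {-1, 0, +1} * scale, with a straight-through
        # estimator so gradients flow back to the latent full-precision weight.
        scale_w = w.abs().mean().clamp(min=1e-5)
        w_q = (w / scale_w).round().clamp(-1, 1) * scale_w
        w_q = w + (w_q - w).detach()

        # 8-bit absmax activation quantization, also straight-through.
        scale_x = x.abs().max(dim=-1, keepdim=True).values.clamp(min=1e-5) / 127.0
        x_q = (x / scale_x).round().clamp(-128, 127) * scale_x
        x_q = x + (x_q - x).detach()

        return F.linear(x_q, w_q, self.bias)
```

If that's roughly right, the latent weights stay in full precision during training and extra quantized tensors are materialized in every forward pass, so I can see why VRAM might not drop, but the extra ~2 GB still surprises me.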
Thanks.