Description
By default, Linux limits tmpfs to 50% of system RAM. This is normally a good thing, but the simple convert scripts write all tensor data to a temporary file, typically on tmpfs, before saving the output file, so conversion fails with this exception if the converted model is larger than half of system RAM (ref):
```
Traceback (most recent call last):
  File "/home/cebtenzzre/src/forks/llama.cpp/convert-baichuan-hf-to-gguf.py", line 279, in <module>
    gguf_writer.add_tensor(new_name, data)
  File "/home/cebtenzzre/src/forks/llama.cpp/gguf-py/gguf/gguf.py", line 622, in add_tensor
    tensor.tofile(self.temp_file)
OSError: Not enough free space to write 140247040 bytes
```
This is annoying. You can set `TMPDIR=/var/tmp` to work around this, but then you need twice as much free disk space as the size of the output file (once for the temp file and once for the final output) - which I don't always have.
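For what it's worth, the workaround works because Python's `tempfile` module honors `TMPDIR`:

```python
import tempfile

# with TMPDIR=/var/tmp in the environment this prints /var/tmp;
# otherwise it falls back to /tmp, which is tmpfs on many distros
print(tempfile.gettempdir())
```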
The smallest change that would fix this problem would be to provide a way to effectively disable `use_temp_file` on Linux, while still supporting e.g. `/var/tmp` if desired. That way, I could leverage 100% of my RAM, plus my swap space, to convert these models. If we choose this route, we should make sure each tensor's converted data is freed as soon as it is written, to avoid unnecessary swapping - right now it is not.
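For concreteness, here is a minimal sketch of what that staging logic could look like. `TensorStager` and its method names are hypothetical, not the current gguf-py API; the `SpooledTemporaryFile` branch mirrors what `add_tensor` already does today:

```python
import shutil
import tempfile

import numpy as np


class TensorStager:
    """Hypothetical sketch: stage tensor data in a temp file (current
    behavior) or directly in memory (the proposed opt-out)."""

    def __init__(self, use_temp_file: bool = True):
        self.use_temp_file = use_temp_file
        if use_temp_file:
            # spills to disk past max_size; honors TMPDIR=/var/tmp
            self.temp_file = tempfile.SpooledTemporaryFile(
                mode="w+b", max_size=256 * 1024 * 1024)
        else:
            self.tensors: list[np.ndarray] = []

    def add_tensor(self, data: np.ndarray) -> None:
        if self.use_temp_file:
            data.tofile(self.temp_file)
        else:
            self.tensors.append(data)

    def write_tensor_data(self, fout) -> None:
        if self.use_temp_file:
            self.temp_file.seek(0)
            shutil.copyfileobj(self.temp_file, fout)
            self.temp_file.close()
        else:
            # drop each buffer as soon as it is written, so memory is
            # reclaimed incrementally instead of swapping the whole model
            while self.tensors:
                self.tensors.pop(0).tofile(fout)
```

Keeping the temp-file path as the default would leave `/var/tmp` available on machines where RAM plus swap isn't enough.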
We can't just change the simple scripts to convert on-the-fly: GGUF stores all of the tensor metadata ahead of the tensor data, and since the scripts load one pytorch file at a time, collecting that metadata up front would mean a second pass over the input tensors, with a high I/O cost.
We could make convert.py's LazyUnpickler part of the gguf module. Lazy loading exposes tensor shapes and types without reading any data, so the metadata pass is nearly free and memory usage drops to roughly one tensor at a time instead of the entire model, while hopefully still keeping the non-LLaMA conversion scripts relatively simple.
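To illustrate why this helps, here is a simplified stand-in for the lazy-loading idea; `LazyTensor` and `write_tensors` are illustrative names, not convert.py's actual classes:

```python
from dataclasses import dataclass
from typing import BinaryIO, Callable

import numpy as np


@dataclass
class LazyTensor:
    # shape/dtype come from the pickle metadata; load() reads the
    # actual data from the checkpoint only when called
    shape: tuple[int, ...]
    dtype: np.dtype
    load: Callable[[], np.ndarray]


def write_tensors(fout: BinaryIO, tensors: dict[str, LazyTensor]) -> None:
    # metadata pass is nearly free: shapes and dtypes are known without
    # reading any tensor data, so the header can be written up front ...
    for name, t in tensors.items():
        info = f"{name}: shape={t.shape} dtype={t.dtype}\n"
        fout.write(info.encode())  # stand-in for the real GGUF tensor info
    # ... and the data pass materializes one tensor at a time, so peak
    # memory is roughly one tensor instead of the whole model
    for t in tensors.values():
        data = t.load()
        data.tofile(fout)
        del data  # free before loading the next tensor
```

In convert.py, `load()` ultimately reads the tensor's storage out of the checkpoint on demand, so neither a temp file nor a second full read of the input is needed.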
@ggerganov what do you think?