Controlling Contiguous GPU Memory Allocation via Environment Variable #13350
Closed · thomasbergersen started this conversation in Ideas
Replies: 0 comments
Can an environment variable (e.g., `export LLAMA_CUDA_FORCE_MAX_MEM=8192`) be added to enforce a cap on contiguous GPU memory allocation? On an RTX 6000 Ada, if llama-server is launched first, vLLM throws an OOM error, but if vLLM is initialized first, the OOM does not occur.
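To make the request concrete, here is a minimal sketch of how such a cap could be read inside the CUDA backend. Note that `LLAMA_CUDA_FORCE_MAX_MEM` is the variable proposed above and does not exist in llama.cpp today; both helper functions here are hypothetical, assuming the value is given in MiB:

```cpp
#include <cstdlib>
#include <cstddef>

// Hypothetical helper (not part of llama.cpp today): reads the proposed
// LLAMA_CUDA_FORCE_MAX_MEM variable (value in MiB) and returns the
// allocation cap in bytes, or 0 if no cap was requested.
static size_t cuda_max_mem_bytes() {
    const char * s = std::getenv("LLAMA_CUDA_FORCE_MAX_MEM");
    if (s == nullptr || *s == '\0') {
        return 0; // variable unset: no cap
    }
    return (size_t) std::strtoull(s, nullptr, 10) * 1024u * 1024u;
}

// Sketch of how a CUDA buffer allocation path could consult the cap
// before committing device memory.
static bool allocation_allowed(size_t requested, size_t already_allocated) {
    const size_t cap = cuda_max_mem_bytes();
    return cap == 0 || already_allocated + requested <= cap;
}
```

For comparison, vLLM already exposes a similar knob via its `--gpu-memory-utilization` flag, which is why starting vLLM first avoids the OOM: it reserves its budgeted fraction of the device up front.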