[ROCm] Using a more precise memory profiling (vllm-project#12624)

Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
Signed-off-by: saeediy <saidakbarp@gmail.com>
Said-Akbar · Mar 7, 2025 · 20a30bf
1 parent f422c49
Showing 1 changed file with 2 additions and 1 deletion.
diff --git a/vllm/platforms/rocm.py b/vllm/platforms/rocm.py
@@ -169,4 +169,5 @@ def get_current_memory_usage(cls,
                                  device: Optional[torch.types.Device] = None
                                  ) -> float:
         torch.cuda.reset_peak_memory_stats(device)
-        return torch.cuda.max_memory_allocated(device)
+        return torch.cuda.mem_get_info(device)[1] - torch.cuda.mem_get_info(
+            device)[0]
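
The change swaps an allocator-level statistic for a device-level one. `torch.cuda.max_memory_allocated` reports the peak number of bytes held by tensors in PyTorch's caching allocator, while `torch.cuda.mem_get_info` returns a `(free, total)` pair in bytes for the whole device, so `total - free` also counts memory that PyTorch's allocator does not track. A minimal sketch of the same computation follows. The helper names `used_bytes` and `current_memory_usage` are hypothetical and not part of vLLM, the sketch omits the `reset_peak_memory_stats` call that the patched method still makes, and it queries the device once rather than twice so that both values come from the same snapshot.

```python
from typing import Optional, Tuple


def used_bytes(mem_info: Tuple[int, int]) -> int:
    """Return device-wide used memory from a (free, total) pair in bytes."""
    free_bytes, total_bytes = mem_info
    return total_bytes - free_bytes


def current_memory_usage(device=None) -> Optional[int]:
    """Hypothetical helper: used bytes on `device`, or None without a GPU."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # mem_get_info returns (free, total); unpack a single query so both
    # numbers describe the same instant.
    return used_bytes(torch.cuda.mem_get_info(device))


if __name__ == "__main__":
    print(current_memory_usage())
```

On a machine without PyTorch or without a visible GPU, `current_memory_usage()` returns `None`; otherwise it returns an integer byte count comparable to the value the patched method computes.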