Bug Description
Everything works well when I'm using 1 GPU, but as soon as I try to load a model on 4 separate GPUs, I get this error:
MODEL_LOG - RuntimeError: [Error thrown at core/runtime/TRTEngine.cpp:42] Expected most_compatible_device to be true but got false
MODEL_LOG - No compatible device was found for instantiating TensorRT engine
To Reproduce
Steps to reproduce the behavior:
Compile a Torch-TensorRT TorchScript (.ts) model and load it on 4 different GPUs. I don't know whether this is specific to TorchServe or a general Torch-TensorRT issue.
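For reference, a minimal sketch of how such a .ts file can be produced with Torch-TensorRT (resnet18 and the input shape are only placeholders here, not the actual model):

import torch
import torch_tensorrt
import torchvision

# Placeholder network; shown only to illustrate the compile-and-save step.
model = torchvision.models.resnet18().eval().cuda()
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch.float},
)
torch.jit.save(trt_model, "model.ts")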
Here's the simple version (TorchServe Handler):
import torch

def initialize(self, ctx):
    properties = ctx.system_properties
    # TorchServe assigns each worker a gpu_id via the context's system properties.
    self.device = torch.device("cuda:" + str(properties.get("gpu_id")) if torch.cuda.is_available() else "cpu")
    self.model = torch.jit.load('model.ts')
I'm not sure whether it's related to this issue. From what I can tell, it seems like I need to restrict the CUDA context; however, the GPU is already assigned in the handler. I tried the following, but it still fails with the same error.
import torch
import torch_tensorrt

def initialize(self, ctx):
    properties = ctx.system_properties
    self.device = torch.device("cuda:" + str(properties.get("gpu_id")) if torch.cuda.is_available() else "cpu")
    # Pin both the PyTorch and the Torch-TensorRT runtime to the assigned GPU.
    torch.cuda.set_device(self.device)
    torch_tensorrt.set_device(int(properties.get("gpu_id")))
    with torch.cuda.device(int(properties.get("gpu_id"))):
        self.model = torch.jit.load('model.ts')
    self.model.to(self.device)
    self.model.eval()
I also tried mapping the model directly onto the GPU at load time, but hit the same problem.
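Concretely, that attempt was along the lines of this sketch, using torch.jit.load's map_location argument (in the real handler the GPU id comes from the context, as above):

import torch

# "Mapping straight to the GPU on load": pass the target device as map_location.
device = torch.device("cuda:1")   # same result for any of the four GPU ids
model = torch.jit.load("model.ts", map_location=device)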
Expected behavior
The .ts model loads on the specified GPU id without any issues.
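In other words, something like the following standalone sketch (independent of TorchServe, which might also help show whether the failure is TorchServe-specific) should succeed on every GPU:

import torch
import torch_tensorrt

# Expectation: the same compiled .ts engine loads cleanly on each of the 4 GPUs.
for gpu_id in range(torch.cuda.device_count()):
    torch_tensorrt.set_device(gpu_id)
    with torch.cuda.device(gpu_id):
        model = torch.jit.load("model.ts", map_location=f"cuda:{gpu_id}")
        model.eval()
        # Expected: no "No compatible device was found" error on any GPU.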
Environment
Official PyTorch image: nvcr.io/nvidia/pytorch:22.12-py3
GPUs: 4x NVIDIA A10G
PyTorch: 1.14.0a0+410ce96
Ubuntu 20.04 including Python 3.8
NVIDIA CUDA® 11.8.0
NVIDIA cuBLAS 11.11.3.6
NVIDIA cuDNN 8.7.0.84
NVIDIA NCCL 2.15.5 (optimized for NVIDIA NVLink®)
NVIDIA RAPIDS™ 22.10.01 (For x86, only these libraries are included: cudf, xgboost, rmm, cuml, and cugraph.)
Apex
rdma-core 36.0
NVIDIA HPC-X 2.13
OpenMPI 4.1.4+
GDRCopy 2.3
TensorBoard 2.9.0
Nsight Compute 2022.3.0.0
Nsight Systems 2022.4.2.1
NVIDIA TensorRT™ 8.5.1
Torch-TensorRT 1.1.0a0
NVIDIA DALI® 1.20.0
MAGMA 2.6.2
JupyterLab 2.3.2 including Jupyter-TensorBoard
TransformerEngine 0.3.0