Running two different models on the same machine

I want to run two different models on the same machine. Right now, I'm creating two separate `AsyncLLMEngine` objects whose `gpu_memory_utilization` values add up to 1, but I'm getting CUDA OOM errors.
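
A minimal sketch of the setup described above, to make the question concrete. The model names, the 0.5/0.5 split, and the helper names `build_engines` and `total_fraction` are placeholders for illustration, not the exact code or values in use:

```python
# Placeholder model names and memory split (assumptions, not the real values).
MODEL_FRACTIONS = {
    "facebook/opt-125m": 0.5,
    "facebook/opt-350m": 0.5,
}


def total_fraction(model_fractions):
    """Sum of the per-engine gpu_memory_utilization values."""
    return sum(model_fractions.values())


def build_engines(model_fractions):
    """Create one AsyncLLMEngine per model in the same process."""
    # Imported here so the rest of the file loads on a machine without vLLM.
    from vllm import AsyncEngineArgs, AsyncLLMEngine

    engines = {}
    for model, fraction in model_fractions.items():
        # Each engine gets its own fraction of GPU memory.
        args = AsyncEngineArgs(model=model, gpu_memory_utilization=fraction)
        engines[model] = AsyncLLMEngine.from_engine_args(args)
    return engines


# On a GPU machine with vLLM installed:
# engines = build_engines(MODEL_FRACTIONS)
```

With a setup along these lines, the fractions sum to 1 and the CUDA OOM error appears.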

What would be the right way to do this? Thanks!