### System Info

TGI version 3.0.1, official Docker image. Thanks for the amazing recent releases 🤗

Running inside a Kubernetes deployment with a 256Gi memory request and an shm volume. Prefix caching and chunking are enabled.

Codestral-22B works fine on 2xH100 but not on 4, i.e. CUDA_VISIBLE_DEVICES=0,1,2,3. Loading Llama-3.1-70B works fine on the same config with 4xH100.

### Information

- [X] Docker
- [ ] The CLI directly

### Tasks

- [X] An officially supported command
- [ ] My own modifications

### Reproduction

Start TGI with Codestral-22B on 4xH100; it gets stuck at the model warm-up phase. A minimal launch sketch is included below.

### Expected behavior

Autoconfiguration succeeds and the model warms up for Codestral-22B on 4xH100, as it does on 2xH100.
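
For reference, a minimal sketch of the launch command (the model id, port mapping, and shm size below are representative assumptions, not copied from the Kubernetes config; only standard docker and TGI flags are used):

```shell
# Hypothetical reproduction: run the official TGI 3.0.1 image,
# pin 4 GPUs, and shard the (assumed) Codestral-22B model across them.
docker run --gpus all --shm-size 1g \
  -e CUDA_VISIBLE_DEVICES=0,1,2,3 \
  -p 8080:80 \
  ghcr.io/huggingface/text-generation-inference:3.0.1 \
  --model-id mistralai/Codestral-22B-v0.1 \
  --num-shard 4
```

With `--num-shard 2` (and `CUDA_VISIBLE_DEVICES=0,1`) the same command completes warm-up and serves requests.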