Support Kubernetes for Distributed Serving

Supporting only Ray for distributed inference will significantly limit adoption of this tool, even if it truly is more performant than TGI. TGI can be run as a black-box image on Kubernetes with support for sharded models, and vLLM should support this as well.