document for huggingface(vllm) servingruntime for multi-node #402

Jooho · 2024-10-08T14:04:13Z

"Fixes #issue-number" or "Add description of the problem this PR solves"

Proposed Changes

This PR adds new documentation for setting up multi-node/multi-GPU inference using the Hugging Face LLM Serving Runtime. It includes detailed instructions on prerequisites, key configurations, model inference, and sample requests for the OpenAI completions and chat endpoints. The documentation aims to improve user understanding and streamline deployment for developers who want to run Hugging Face models in a Kubernetes environment.

This documentation is valid only after kserve/kserve#3972 is merged.

netlify · 2024-10-08T14:04:29Z

✅ Deploy Preview for elastic-nobel-0aef7a ready!

Name	Link
🔨 Latest commit	`4e7cca4`
🔍 Latest deploy log	https://app.netlify.com/sites/elastic-nobel-0aef7a/deploys/6707f8559eecb3000885828d
😎 Deploy Preview	https://deploy-preview-402--elastic-nobel-0aef7a.netlify.app

Signed-off-by: jooho lee <jlee@redhat.com>

Jooho mentioned this pull request Oct 8, 2024

Support Multi-Node Inference and Serving. kserve/kserve#3870

Open

4 tasks

Jooho marked this pull request as draft October 8, 2024 15:13

document for huggingface(vllm) servingruntime for multi-node

4e7cca4


Jooho force-pushed the multi-node branch from dda0af9 to 4e7cca4 on October 10, 2024 15:52
