Commit 5f208ed (1 parent: 6293538)

Update script

Signed-off-by: kerthcet <kerthcet@gmail.com>

File tree: 2 files changed, +4 −4 lines

README.md

Lines changed: 2 additions & 2 deletions

```diff
@@ -36,7 +36,7 @@ Easy, advanced inference platform for large language models on Kubernetes
 - **Accelerator Fungibility**: llmaz supports serving the same LLM with various accelerators to optimize cost and performance.
 - **SOTA Inference**: llmaz supports the latest cutting-edge researches like [Speculative Decoding](https://arxiv.org/abs/2211.17192) or [Splitwise](https://arxiv.org/abs/2311.18677)(WIP) to run on Kubernetes.
 - **Various Model Providers**: llmaz supports a wide range of model providers, such as [HuggingFace](https://huggingface.co/), [ModelScope](https://www.modelscope.cn), ObjectStores. llmaz will automatically handle the model loading, requiring no effort from users.
-- **Multi-hosts Support**: llmaz supports both single-host and multi-hosts scenarios with [LWS](https://github.com/kubernetes-sigs/lws) from day 0.
+- **Multi-Host Support**: llmaz supports both single-host and multi-host scenarios with [LWS](https://github.com/kubernetes-sigs/lws) from day 0.
 - **Scaling Efficiency**: llmaz supports horizontal scaling with [HPA](./docs/examples/hpa/README.md) by default and will integrate with autoscaling components like [Cluster-Autoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler) or [Karpenter](https://github.com/kubernetes-sigs/karpenter) for smart scaling across different clouds.
@@ -50,7 +50,7 @@ Read the [Installation](./docs/installation.md) for guidance.
 Here's a toy example for deploying `facebook/opt-125m`, all you need to do
 is to apply a `Model` and a `Playground`.
 
-If you're running on CPUs, you can refer to [llama.cpp](/docs/examples/llamacpp/README.md), or more examples [here](/docs/examples/README.md).
+If you're running on CPUs, you can refer to [llama.cpp](/docs/examples/llamacpp/README.md), or more [examples](/docs/examples/README.md) here.
 
 > Note: if your model needs Huggingface token for weight downloads, please run `kubectl create secret generic modelhub-secret --from-literal=HF_TOKEN=<your token>` ahead.
```

hack/update-codegen.sh

Lines changed: 2 additions & 2 deletions

```diff
@@ -14,8 +14,8 @@ source "${CODEGEN_PKG}/kube_codegen.sh"
 
 # TODO: remove the workaround when the issue is solved in the code-generator
 # (https://github.com/kubernetes/code-generator/issues/165).
-# Here, we create the soft link named "x-k8s.io" to the parent directory of
-# LeaderWorkerSet to ensure the layout required by the kube_codegen.sh script.
+# Here, we create the soft link named "github.com" to the parent directory of
+# llmaz to ensure the layout required by the kube_codegen.sh script.
 mkdir -p github.com && ln -s ../.. github.com/inftyai
 trap "rm -r github.com" EXIT
```
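The layout workaround above can be reproduced in isolation: `kube_codegen.sh` expects sources to sit under their full Go import path (`github.com/inftyai/llmaz`), so the script creates a `github.com/inftyai` symlink pointing two directories up, making the repo root reachable at that path. A minimal sketch, run in a throwaway directory (the `llmaz` directory name and `main.go` file here are illustrative, not taken from the repo):

```shell
set -e
# Scratch directory standing in for the parent of the repo checkout.
tmp="$(mktemp -d)"
mkdir -p "$tmp/llmaz" && cd "$tmp/llmaz"
touch main.go   # stand-in for the repo's source files

# github.com/inftyai -> ../.. (the parent of the repo), so
# github.com/inftyai/llmaz/ resolves back to this repo root.
mkdir -p github.com && ln -s ../.. github.com/inftyai

test -f github.com/inftyai/llmaz/main.go && echo "layout OK"
rm -r github.com   # mirror the script's cleanup trap on EXIT
```

The cleanup matters: without the `trap "rm -r github.com" EXIT` in the real script, an aborted run would leave a self-referencing symlink inside the repo.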
