Commit 4c3aeef

[Fix] Minor Fixs for Tutorial and Bumped version to 0.0.9 (#154)
* Updated AWS, bumped release version
  Signed-off-by: hanchenli <lihanc2002@gmail.com>
* minor fix to readme in aws
  Signed-off-by: hanchenli <lihanc2002@gmail.com>
* minor fix to readme in aws
  Signed-off-by: hanchenli <lihanc2002@gmail.com>
* fix format
  Signed-off-by: hanchenli <lihanc2002@gmail.com>
* fix folder
  Signed-off-by: hanchenli <lihanc2002@gmail.com>

---------
Signed-off-by: hanchenli <lihanc2002@gmail.com>
1 parent a5717ac commit 4c3aeef

8 files changed: 65 additions & 64 deletions

README.md

Lines changed: 6 additions & 5 deletions
@@ -17,11 +17,12 @@
 ## Step-By-Step Tutorials

 0. How To [*Install Kubernetes (kubectl, helm, minikube, etc)*](https://github.com/vllm-project/production-stack/blob/main/tutorials/00-install-kubernetes-env.md)?
-1. How To [*Setup a Minimal vLLM Production Stack*](https://github.com/vllm-project/production-stack/blob/main/tutorials/01-minimal-helm-installation.md)?
-2. How To [*Customize vLLM Configs (optional)*](https://github.com/vllm-project/production-stack/blob/main/tutorials/02-basic-vllm-config.md)?
-3. How to [*Load Your LLM Weights*](https://github.com/vllm-project/production-stack/blob/main/tutorials/03-load-model-from-pv.md)?
-4. How to [*Launch Different LLMs in vLLM Production Stack*](https://github.com/vllm-project/production-stack/blob/main/tutorials/04-launch-multiple-model.md)?
-5. How to [*Enable KV Cache Offloading with LMCache*](https://github.com/vllm-project/production-stack/blob/main/tutorials/05-offload-kv-cache.md)?
+1. How to [*Deploy Production Stack on Major Cloud Platforms (AWS, GCP, Azure)*](https://github.com/vllm-project/production-stack/blob/main/tutorials/cloud_deployments)?
+2. How To [*Setup a Minimal vLLM Production Stack*](https://github.com/vllm-project/production-stack/blob/main/tutorials/01-minimal-helm-installation.md)?
+3. How To [*Customize vLLM Configs (optional)*](https://github.com/vllm-project/production-stack/blob/main/tutorials/02-basic-vllm-config.md)?
+4. How to [*Load Your LLM Weights*](https://github.com/vllm-project/production-stack/blob/main/tutorials/03-load-model-from-pv.md)?
+5. How to [*Launch Different LLMs in vLLM Production Stack*](https://github.com/vllm-project/production-stack/blob/main/tutorials/04-launch-multiple-model.md)?
+6. How to [*Enable KV Cache Offloading with LMCache*](https://github.com/vllm-project/production-stack/blob/main/tutorials/05-offload-kv-cache.md)?

 ## Architecture

deployment_on_cloud/aws/Readme.md

Lines changed: 4 additions & 2 deletions
@@ -1,16 +1,18 @@
 # Setting up EKS vLLM stack with one command

 This script automatically configures a EKS LLM inference cluster.
-Make sure your AWS cli is set up, logged in, and region set up. You have eksctl, kubectl, helm installed.
+Make sure your AWS cli (v2) is installed, logged in, and region set up. You have eksctl, kubectl, helm installed.

 Modify fields production_stack_specification.yaml and execute as:

 ```bash
 bash entry_point.sh YOUR_AWSREGION YAML_FILE_PATH
 ```

-Clean up the service (not the VPC) with:
+Clean up the service with:

 ```bash
 bash clean_up.sh production-stack YOUR_AWSREGION
 ```
+
+You may also want to manually delete the VPC and clean up the cloud formation in the AWS Console.
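
Because this commit comments out the CloudFormation cleanup in clean_up.sh, the stacks left for manual deletion are presumably the two that the earlier version of the script targeted. The following is a minimal sketch that prints those names for a given cluster. It assumes the cluster was created by eksctl with a node group named `gpu-nodegroup`, as in the removed lines; the helper name `eksctl_stack_names` is hypothetical and not part of the repository.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the CloudFormation stack names that the earlier
# version of clean_up.sh deleted for a given eksctl cluster name.
eksctl_stack_names() {
    local cluster_name="$1"
    printf '%s\n' \
        "eksctl-${cluster_name}-cluster" \
        "eksctl-${cluster_name}-cluster-nodegroup-gpu-nodegroup"
}

# Example with the cluster name used in the Readme above.
eksctl_stack_names "production-stack"
# prints:
#   eksctl-production-stack-cluster
#   eksctl-production-stack-cluster-nodegroup-gpu-nodegroup
```

Each printed name can then be searched for in the CloudFormation console, or passed to the `aws cloudformation describe-stacks --stack-name ... --region ...` call that the removed script lines used.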

deployment_on_cloud/aws/clean_up.sh

Lines changed: 46 additions & 49 deletions
@@ -54,9 +54,9 @@ for TG_ARN in $TG_ARNs; do
     aws elbv2 delete-target-group --target-group-arn "$TG_ARN" --region "$REGION"
 done

-# Delete NAT Gateways
+# # Delete NAT Gateways
 echo "Deleting NAT Gateways..."
-NAT_GATEWAYS=$(aws ec2 describe-nat-gateways --filter "Name=tag:eks:cluster-name,Values=$CLUSTER_NAME" --query "NatGateways[].NatGatewayId" --output text --region "$REGION")
+NAT_GATEWAYS=$(aws ec2 describe-nat-gateways --filter "Name=tag:Name,Values=eksctl-${CLUSTER_NAME}-cluster/NATGateway" --query "NatGateways[].NatGatewayId" --output text --region "$REGION")
 for NAT_ID in $NAT_GATEWAYS; do
     aws ec2 delete-nat-gateway --nat-gateway-id "$NAT_ID" --region "$REGION"
     echo "Waiting for NAT Gateway $NAT_ID to be deleted..."
@@ -72,30 +72,27 @@ for EIP in $EIP_ALLOCS; do
 done

 # Release EFS and the created security group
-while read -r fs_id; do
-    echo "Processing File System: $fs_id"
+read -r fs_id < temp.txt
+echo "Processing File System: $fs_id"

-    # Get the list of mount targets
-    mount_targets=$(aws efs describe-mount-targets --file-system-id "$fs_id" --query "MountTargets[*].MountTargetId" --output text)
+# Get the list of mount targets
+mount_targets=$(aws efs describe-mount-targets --file-system-id "$fs_id" --query "MountTargets[*].MountTargetId" --output text)

-    # Delete each mount target
-    for mt_id in $mount_targets; do
-        echo "Deleting Mount Target: $mt_id"
-        aws efs delete-mount-target --mount-target-id "$mt_id"
-    done
-
-    # Wait for mount targets to be deleted (optional, prevents API conflicts)
-    while [[ -n $(aws efs describe-mount-targets --file-system-id "$fs_id" --query "MountTargets[*].MountTargetId" --output text) ]]; do
-        echo "Waiting for mount targets to be deleted..."
-        sleep 10
-    done
-
-    # Delete the file system
-    echo "Deleting File System: $fs_id"
-    aws efs delete-file-system --file-system-id "$fs_id"
+# Delete each mount target
+for mt_id in $mount_targets; do
+    echo "Deleting Mount Target: $mt_id"
+    aws efs delete-mount-target --mount-target-id "$mt_id"
+done

-done < temp.txt
+# Wait for mount targets to be deleted (optional, prevents API conflicts)
+while [[ -n $(aws efs describe-mount-targets --file-system-id "$fs_id" --query "MountTargets[*].MountTargetId" --output text) ]]; do
+    echo "Waiting for mount targets to be deleted..."
+    sleep 10
+done

+# Delete the file system
+echo "Deleting File System: $fs_id"
+aws efs delete-file-system --file-system-id "$fs_id"

 for sg in $(aws ec2 describe-security-groups --filters "Name=group-name,Values=efs-sg" --query "SecurityGroups[*].GroupId" --output text); do

@@ -113,32 +110,32 @@ aws eks delete-cluster --name "$CLUSTER_NAME" --region "$REGION"
 echo "Waiting for cluster $CLUSTER_NAME to be deleted..."
 aws eks wait cluster-deleted --name "$CLUSTER_NAME" --region "$REGION"

-# Delete CloudFormation Stack
-echo "Checking if CloudFormation stack exists for EKS cluster..."
-STACK_NAME="eksctl-${CLUSTER_NAME}-cluster"
-STACK_STATUS=$(aws cloudformation describe-stacks --stack-name "$STACK_NAME" --region "$REGION" --query "Stacks[0].StackStatus" --output text 2>/dev/null)
-
-if [ -n "$STACK_STATUS" ]; then
-    echo "Deleting CloudFormation stack: $STACK_NAME"
-    aws cloudformation delete-stack --stack-name "$STACK_NAME" --region "$REGION"
-    echo "Waiting for CloudFormation stack $STACK_NAME to be deleted..."
-    aws cloudformation wait stack-delete-complete --stack-name "$STACK_NAME" --region "$REGION"
-    echo "CloudFormation stack $STACK_NAME has been deleted successfully!"
-else
-    echo "CloudFormation stack $STACK_NAME not found, skipping..."
-fi
-
-STACK_NAME="eksctl-${CLUSTER_NAME}-cluster-nodegroup-gpu-nodegroup"
-STACK_STATUS=$(aws cloudformation describe-stacks --stack-name "$STACK_NAME" --region "$REGION" --query "Stacks[0].StackStatus" --output text 2>/dev/null)
-
-if [ -n "$STACK_STATUS" ]; then
-    echo "Deleting CloudFormation stack: $STACK_NAME"
-    aws cloudformation delete-stack --stack-name "$STACK_NAME" --region "$REGION"
-    echo "Waiting for CloudFormation stack $STACK_NAME to be deleted..."
-    aws cloudformation wait stack-delete-complete --stack-name "$STACK_NAME" --region "$REGION"
-    echo "CloudFormation stack $STACK_NAME has been deleted successfully!"
-else
-    echo "CloudFormation stack $STACK_NAME not found, skipping..."
-fi
+# Clean up VPC
+# echo "Cleaning up VPC..."
+# VPC_ID=$(aws ec2 describe-vpcs \
+# --filters "Name=tag:Name,Values=eksctl-${CLUSTER_NAME}-cluster/VPC" \
+# --query "Vpcs[0].VpcId" \
+# --output text \
+# --region "$REGION")
+# if [ -n "$VPC_ID" ]; then
+# echo "Deleting VPC: $VPC_ID"
+# aws ec2 delete-vpc --vpc-id "$VPC_ID" --region "$REGION"
+# else
+# echo "VPC not found, skipping..."
+# fi
+
+# Delete CloudFormation Stackecho "Deleting CloudFormation stacks..."
+# STACKS=( "eksctl-${CLUSTER_NAME}-cluster" "eksctl-${CLUSTER_NAME}-cluster-nodegroup-gpu-nodegroup" )
+# for STACK_NAME in "${STACKS[@]}"; do
+# STACK_STATUS=$(aws cloudformation describe-stacks --stack-name "$STACK_NAME" --region "$REGION" --query "Stacks[0].StackStatus" --output text 2>/dev/null)
+# if [ -n "$STACK_STATUS" ]; then
+# echo "Deleting CloudFormation stack: $STACK_NAME"
+# aws cloudformation delete-stack --stack-name "$STACK_NAME" --region "$REGION"
+# echo "Waiting for CloudFormation stack $STACK_NAME to be deleted..."
+# aws cloudformation wait stack-delete-complete --stack-name "$STACK_NAME" --region "$REGION"
+# else
+# echo "CloudFormation stack $STACK_NAME not found, skipping..."
+# fi
+# done

 echo "EKS cluster $CLUSTER_NAME cleanup completed successfully!"
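
The EFS hunk above replaces the `while read` loop with a single `read -r fs_id < temp.txt`, so the cleanup now depends on `temp.txt` existing and holding a file system ID on its first line. If the file is missing or empty, `fs_id` ends up empty and the later `aws efs` calls run with an empty ID. A minimal sketch of a guard follows. It is not part of this commit, and the helper name `read_efs_id` and the sample ID are hypothetical.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the first line of the EFS ID file
# (default: temp.txt), or return non-zero if the file is missing or empty.
read_efs_id() {
    local id_file="${1:-temp.txt}"
    local fs_id=""
    if [[ ! -r "$id_file" ]]; then
        echo "EFS ID file not found: $id_file" >&2
        return 1
    fi
    # read returns non-zero when the last line has no trailing newline,
    # but it still assigns the variable, so the status is ignored here.
    read -r fs_id < "$id_file" || true
    if [[ -z "$fs_id" ]]; then
        echo "EFS ID file is empty: $id_file" >&2
        return 1
    fi
    printf '%s\n' "$fs_id"
}

# Demo with a throwaway file and a made-up ID.
tmp_file="$(mktemp)"
echo "fs-0123456789abcdef0" > "$tmp_file"
read_efs_id "$tmp_file"
rm -f "$tmp_file"
```

In clean_up.sh this could wrap the EFS section, for example `if fs_id=$(read_efs_id temp.txt); then ... fi`, so that the mount target and file system deletion only run when an ID was actually recorded by set_up_efs.sh.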

deployment_on_cloud/aws/entry_point.sh

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ eksctl create iamserviceaccount \

 #create pv after modify the filesys id to be the filesys id
 #storage needed is based on model weights
-EFS_ID=$(cat temp.text)
+EFS_ID=$(cat temp.txt)

 cat <<EOF > efs-pv.yaml
 apiVersion: v1

deployment_on_cloud/aws/set_up_efs.sh

Lines changed: 1 addition & 1 deletion
@@ -70,4 +70,4 @@ done

 echo "EFS setup complete!"
 echo "File System ID: $EFS_ID"
-echo "$EFS_ID" > temp.text
+echo "$EFS_ID" > temp.txt

helm/Chart.yaml

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ type: application
 # This is the chart version. This version number should be incremented each time you make changes
 # to the chart and its templates, including the app version.
 # Versions are expected to follow Semantic Versioning (https://semver.org/)
-version: 0.0.8
+version: 0.0.9

 maintainers:
 - name: apostac

tutorials/deployments/01-AWS-EKS-deployment.md renamed to tutorials/cloud_deployments/01-AWS-EKS-deployment.md

Lines changed: 6 additions & 5 deletions
@@ -6,14 +6,13 @@ This guide walks you through the script that sets up a vLLM production-stack on

 Before running this setup, ensure you have:

-1. AWS CLI installed and configured with credential and region set up.
-2. AWS eksctl
-3. Kubectl
-4. Helm
+1. AWS CLI (version higher than v2) installed and configured with credential and region [[Link]](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)
+2. AWS eksctl [[Link]](https://eksctl.io/installation/)
+3. Kubectl and Helm [[Link]](https://github.com/vllm-project/production-stack/blob/main/tutorials/00-install-kubernetes-env.md)

 ## TLDR

-To run the service
+To run the service, go into the "deployment_on_cloud/aws" folder and run:

 ```bash
 bash entry_point.sh YOUR_AWSREGION EXAMPLE_YAML_PATH
@@ -243,6 +242,8 @@ This step cleans up EKS, mount-points, created security groups, EFS.
 bash clean_up.sh "$CLUSTER_NAME" "$AWS_REGION"
 ```

+You may also want to manually delete the VPC and clean up the cloud formation in the AWS Console.
+
 ## Summary

 This tutorial covers:
File renamed without changes.
