
[Usage]: File Access Error When Using RunAI Model Streamer with S3 in VLLM #12311

Closed
@ghost

Description

Your current environment

I am encountering a persistent issue when attempting to serve a model from an S3 bucket using the vllm serve command with the --load-format runai_streamer option. Despite having proper access to the S3 bucket, with all required files present, the process fails with a "File access error." The details of the issue are below:

Command Used:
vllm serve s3://hip-general/benchmark-model-loading/ --load-format runai_streamer

Error Message:
Exception: Could not send runai_request to libstreamer due to: b'File access error'

Environment Details:
vLLM version: 0.6.6
Python version: 3.12
RunAI Model Streamer version: 0.11.2
S3 Region: us-west-2


Files in S3 Bucket:
config.json
generation_config.json
model-00001-of-00004.safetensors
model-00002-of-00004.safetensors
model-00003-of-00004.safetensors
model-00004-of-00004.safetensors
model.safetensors.index.json
special_tokens_map.json
tokenizer.json
tokenizer_config.json

My deployment file is:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: benchmark-model-8b
  namespace: workload
spec:
  replicas: 1
  selector:
    matchLabels:
      app: benchmark-model-8b
  strategy:
    type: Recreate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: benchmark-model-8b
    spec:
      containers:
      - command:
        - sh
        - -c
        - exec tail -f /dev/null
        env:
        - name: HF_HOME
          value: /huggingface
        - name: HUGGINGFACE_HUB_CACHE
          value: /huggingface/hub
        - name: HF_HUB_ENABLE_HF_TRANSFER
          value: "False"
        - name: HUGGING_FACE_HUB_TOKEN
          value: ""
        image: vllm/vllm-openai:v0.6.6
        imagePullPolicy: IfNotPresent
        name: benchmark-model-8b
        ports:
        - containerPort: 8888
          name: http
          protocol: TCP
        resources:
          limits:
            nvidia.com/gpu: "1"
          requests:
            cpu: "5"
            memory: 128Gi
        securityContext:
          capabilities:
            add:
            - SYS_ADMIN
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /huggingface
          name: hf-volume
        - mountPath: /dev/shm
          name: dshm
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - name: hf-volume
        persistentVolumeClaim:
          claimName: benchmark-model-pvc
      - emptyDir:
          medium: Memory
          sizeLimit: 90Gi
        name: dshm
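For context: the deployment above defines no AWS-related environment variables, and the RunAI Model Streamer resolves S3 credentials the same way the AWS SDK does (environment variables, shared credentials file, or an attached IAM role). If the pod has none of these, an S3 read would be expected to fail. A hedged sketch of the env entries that could be added to the container spec — the Secret name `aws-creds` and its keys are hypothetical, not from this deployment:

```yaml
        env:
        - name: AWS_DEFAULT_REGION          # bucket region from the report
          value: us-west-2
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: aws-creds               # hypothetical Secret holding the credentials
              key: access-key-id
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: aws-creds
              key: secret-access-key
```

Whether this resolves the "File access error" depends on whether the pod was relying on some other credential source; it is only one plausible cause to rule out.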

