[Usage]: File Access Error When Using RunAI Model Streamer with S3 in vLLM #12311

### Your current environment

```text
I am hitting a persistent error when serving a model from an S3 bucket using the vllm serve command with the --load-format runai_streamer option. I have access to the S3 bucket and all required files are present, yet the process fails with a "File access error". Details are below:

Command Used:
vllm serve s3://hip-general/benchmark-model-loading/ --load-format runai_streamer

Error Message:
Exception: Could not send runai_request to libstreamer due to: b'File access error'

Environment Details:
vLLM version: 0.6.6
Python version: 3.12
RunAI Model Streamer version: 0.11.2
S3 Region: us-west-2

Files in S3 Bucket:
config.json
generation_config.json
model-00001-of-00004.safetensors
model-00002-of-00004.safetensors
model-00003-of-00004.safetensors
model-00004-of-00004.safetensors
model.safetensors.index.json
special_tokens_map.json
tokenizer.json
tokenizer_config.json
```
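One way to separate a plain permissions problem from a streamer-specific one is to HEAD each listed object from inside the same container. The following is a minimal sketch, not part of the original report: the helper names `split_s3_uri` and `check_objects` are hypothetical, and it assumes a boto3-style client is passed in. It exercises boto3's credential chain, which may not be the same one the streamer's native library resolves.

```python
from urllib.parse import urlparse

# File names copied from the bucket listing in this report.
EXPECTED_FILES = [
    "config.json",
    "generation_config.json",
    "model-00001-of-00004.safetensors",
    "model-00002-of-00004.safetensors",
    "model-00003-of-00004.safetensors",
    "model-00004-of-00004.safetensors",
    "model.safetensors.index.json",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
]


def split_s3_uri(uri):
    """Split 's3://bucket/prefix/' into (bucket, prefix ending in '/')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    prefix = parsed.path.strip("/")
    return parsed.netloc, (prefix + "/" if prefix else "")


def check_objects(client, uri, names=EXPECTED_FILES):
    """HEAD every expected object; map each key to 'ok' or an error string."""
    bucket, prefix = split_s3_uri(uri)
    results = {}
    for name in names:
        key = prefix + name
        try:
            client.head_object(Bucket=bucket, Key=key)
            results[key] = "ok"
        except Exception as exc:  # botocore raises ClientError on 403/404
            results[key] = f"error: {exc}"
    return results
```

With boto3 installed, this could be called as `check_objects(boto3.client("s3", region_name="us-west-2"), "s3://hip-general/benchmark-model-loading/")`. Any key that is not reported as `ok` points at credentials, region, or key naming rather than at the streamer itself.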


### My deployment file

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: benchmark-model-8b
  namespace: workload
spec:
  replicas: 1
  selector:
    matchLabels:
      app: benchmark-model-8b
  strategy:
    type: Recreate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: benchmark-model-8b
    spec:
      containers:
      - command:
        - sh
        - -c
        - exec tail -f /dev/null
        env:
        - name: HF_HOME
          value: /huggingface
        - name: HUGGINGFACE_HUB_CACHE
          value: /huggingface/hub
        - name: HF_HUB_ENABLE_HF_TRANSFER
          value: "False"
        - name: HUGGING_FACE_HUB_TOKEN
          value: ""
        image: vllm/vllm-openai:v0.6.6
        imagePullPolicy: IfNotPresent
        name: benchmark-model-8b
        ports:
        - containerPort: 8888
          name: http
          protocol: TCP
        resources:
          limits:
            nvidia.com/gpu: "1"
          requests:
            cpu: "5"
            memory: 128Gi
        securityContext:
          capabilities:
            add:
            - SYS_ADMIN
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /huggingface
          name: hf-volume
        - mountPath: /dev/shm
          name: dshm
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - name: hf-volume
        persistentVolumeClaim:
          claimName: benchmark-model-pvc
      - emptyDir:
          medium: Memory
          sizeLimit: 90Gi
        name: dshm
```
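
The `env` section of this manifest defines only Hugging Face variables, so any AWS credentials or region setting would have to reach the container some other way. A small sketch like the one below, run inside the pod, shows which of the standard AWS SDK environment variables are visible to the process. The function names are hypothetical, and whether the streamer's native library honors each of these variables is an assumption, not something this report confirms. Only presence is reported, never the values.

```python
import os

# Standard AWS SDK environment variables. Whether libstreamer reads each
# of them is an assumption; this list is illustrative, not exhaustive.
AWS_ENV_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
    "AWS_REGION",
    "AWS_DEFAULT_REGION",
    "AWS_PROFILE",
    "AWS_ROLE_ARN",
    "AWS_WEB_IDENTITY_TOKEN_FILE",
)


def aws_env_report(env=None):
    """Map each variable name to True if it is set to a non-empty value."""
    env = os.environ if env is None else env
    return {name: bool(env.get(name)) for name in AWS_ENV_VARS}


def format_report(report):
    """Render the report one variable per line without exposing values."""
    return "\n".join(
        f"{name}: {'set' if present else 'missing'}"
        for name, present in report.items()
    )
```

Calling `print(format_report(aws_env_report()))` in the container gives a quick view of what the serving process can see before `vllm serve` is started.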





### Before submitting a new issue...

- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.
