Loading adapter from S3 fails #155

Closed
2 of 4 tasks
joaopcm1996 opened this issue Jan 2, 2024 · 6 comments · Fixed by #161 or #246
Labels: bug (Something isn't working)

joaopcm1996 commented Jan 2, 2024

I uploaded the adapter used in the default example in the LoRAX docs to a personal lorax-adapters bucket on S3, with no change to the directory structure:

aws s3 ls s3://lorax-adapters/adapter-1/ --human-readable

2024-01-02 16:15:10    1.5 KiB .gitattributes
2024-01-02 16:15:10    1.3 KiB README.md
2024-01-02 16:15:10  501 Bytes adapter_config.json
2024-01-02 16:15:10   13.0 MiB adapter_model.bin
2024-01-02 16:15:10    4.0 KiB training_args.bin

I tried to load the adapter dynamically from S3 using the Python client and got:

Traceback (most recent call last):
  File "/home/ubuntu/user/client.py", line 8, in <module>
    print(client.generate(prompt, max_new_tokens=64, adapter_id=adapter_id, adapter_source=adapter_source).generated_text)
  File "/opt/conda/envs/pytorch/lib/python3.10/site-packages/lorax/client.py", line 157, in generate
    raise parse_error(resp.status_code, payload)
lorax.errors.GenerationError: Request failed during generation: Server error: PREDIBASE_MODEL_BUCKET environment variable is not set

This env var is not documented; happy to contribute a docs fix. After relaunching the container with PREDIBASE_MODEL_BUCKET set, I got:

Traceback (most recent call last):
  File "/home/ubuntu/user/client.py", line 8, in <module>
    print(client.generate(prompt, max_new_tokens=64, adapter_id=adapter_id, adapter_source=adapter_source).generated_text)
  File "/opt/conda/envs/pytorch/lib/python3.10/site-packages/lorax/client.py", line 157, in generate
    raise parse_error(resp.status_code, payload)
lorax.errors.GenerationError: Request failed during generation: Server error: No .bin weights found for model s3://lorax-adapters/adapter-1

even though I have verified that adapter_model.bin is present at the provided path.

I'm running the latest Docker image on an EC2 g5 instance; the instance role has S3 read permissions.

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

#!/bin/bash

# Create bucket, assuming AWS creds are configured
acct_id=$(aws sts get-caller-identity --query "Account" --output text)
bucket_name=${acct_id}-lorax-adapters

aws s3 mb s3://${bucket_name}

# Install HF HUB CLI
pip install -U "huggingface_hub[cli]"
huggingface-cli download vineetsharma/qlora-adapter-Mistral-7B-Instruct-v0.1-gsm8k --local-dir ./mistral-adapter

# Upload adapter to S3
aws s3 cp ./mistral-adapter s3://${bucket_name}/adapter-1 --recursive

# Run LoRAX docker container with PREDIBASE_MODEL_BUCKET set to new bucket name
model=mistralai/Mistral-7B-Instruct-v0.1
volume=$PWD/data  # share a volume with the container as a weight cache

docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data -e PREDIBASE_MODEL_BUCKET=${bucket_name} \
    ghcr.io/predibase/lorax:latest --model-id $model
# Then, from Python:

from lorax import Client

bucket_name = "<REPLACE_BUCKET_NAME>"
client = Client("http://127.0.0.1:8080")
prompt = "[INST] Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May? [/INST]"
adapter_id = f"s3://{bucket_name}/adapter-1"
adapter_source = "s3"

print(client.generate(prompt, max_new_tokens=64, adapter_id=adapter_id, adapter_source=adapter_source).generated_text)

Expected behavior

Adapter is downloaded from S3 and loaded to GPU for inference.

tgaddair added the bug label Jan 3, 2024
tgaddair (Contributor) commented Jan 3, 2024

Thanks for reporting @joaopcm1996. This code path hasn't been tested thoroughly outside of our internal usage, so I'll take a look and see what the issue is.

tgaddair self-assigned this Jan 3, 2024
tgaddair (Contributor) commented Jan 3, 2024

@joaopcm1996 assuming you set PREDIBASE_MODEL_BUCKET to bucket_name (or whatever the true bucket name is), can you try setting adapter_id = "adapter-1" in your script? I believe this is the form the s3 source currently expects (which is out of date with the docs; I'll fix this as soon as possible to be consistent with the docs).
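In other words, with PREDIBASE_MODEL_BUCKET set, the full s3:// URI from the docs reduces to a bucket-relative prefix. A minimal sketch of that mapping (the helper name is made up for illustration and is not part of the lorax client):

```python
def to_relative_adapter_id(adapter_id: str, bucket: str) -> str:
    """Strip a full s3:// URI down to the bucket-relative prefix.

    Hypothetical helper, illustration only: with PREDIBASE_MODEL_BUCKET
    set to `bucket`, the s3 source currently expects the relative form.
    """
    prefix = f"s3://{bucket}/"
    if adapter_id.startswith(prefix):
        return adapter_id[len(prefix):]
    return adapter_id  # already relative; pass through unchanged

# The docs-style URI reduces to the form that works today:
print(to_relative_adapter_id("s3://lorax-adapters/adapter-1", "lorax-adapters"))
# → adapter-1
```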

joaopcm1996 (Author) commented Jan 3, 2024

Thanks @tgaddair, that solved it!

tgaddair (Contributor) commented Jan 4, 2024

Thanks for confirming, @joaopcm1996! This should be fixed going forward by #161.

joaopcm1996 (Author) commented

@tgaddair I believe there is still a discrepancy between the docs and the behaviour: an absolute S3 path (s3://bucket/adapter_prefix) does not work as adapter_id, even though the docs indicate it should.

joaopcm1996 mentioned this issue Feb 1, 2024
DhruvaBansal00 (Contributor) commented

^ Facing the same issue @tgaddair
