ValueError: Adapter '/data/llama2-lora' is not compatible with model '/data/Llama-2-7b-chat-hf'. Use --model-id '/new-model/llama2-7b/Llama-2-7b-chat-hf' instead. #172

Open
Senna1960321 opened this issue Jan 10, 2024 · 5 comments
Labels: question (Further information is requested)

Comments


System Info

2024-01-10T09:14:20.356771Z INFO lorax_launcher: Args { model_id: "/data/Llama-2-7b-chat-hf", adapter_id: "/data/llama2-lora", source: "hub", adapter_source: "hub", revision: None, validation_workers: 2, sharded: None, num_shard: None, quantize: None, compile: false, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_input_length: 1023, max_total_tokens: 1024, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 1024, max_batch_total_tokens: Some(1024), max_waiting_tokens: 20, max_active_adapters: 128, adapter_cycle_time_s: 2, hostname: "e2bcf2fc09e3", port: 80, shard_uds_path: "/tmp/lorax-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, env: false, download_only: false }
2024-01-10T09:14:20.356869Z INFO download: lorax_launcher: Starting download process.
2024-01-10T09:14:23.227147Z WARN lorax_launcher: cli.py:145 No safetensors weights found for model /data/Llama-2-7b-chat-hf at revision None. Converting PyTorch weights to safetensors.

2024-01-10T09:14:25.972567Z INFO lorax_launcher: convert.py:114 Convert: [1/2] -- Took: 0:00:02.741882

2024-01-10T09:14:33.450451Z INFO lorax_launcher: convert.py:114 Convert: [2/2] -- Took: 0:00:07.477435

2024-01-10T09:14:33.450778Z INFO lorax_launcher: cli.py:104 Files are already present on the host. Skipping download.

2024-01-10T09:14:33.972217Z INFO download: lorax_launcher: Successfully downloaded weights.
2024-01-10T09:14:33.972518Z INFO shard-manager: lorax_launcher: Starting shard rank=0
2024-01-10T09:14:37.373745Z INFO lorax_launcher: flash_llama.py:74 Merging adapter weights from adapter_id /data/llama2-lora into model weights.

2024-01-10T09:14:37.375075Z ERROR lorax_launcher: server.py:235 Error when initializing model

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

volume=/home/user/data
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data -it ghcr.io/predibase/lorax:latest --model-id /data/Llama-2-7b-chat-hf --adapter-id /data/llama2-lora --max-input-length 1023 --max-total-tokens 1024 --max-batch-total-tokens 1024 --max-batch-prefill-tokens 1024

Expected behavior

I trained this LoRA on my local Llama 2 model, so why is it not compatible?
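
For context on where the '/new-model/llama2-7b/Llama-2-7b-chat-hf' path in the error likely comes from: a PEFT adapter directory normally contains an adapter_config.json whose base_model_name_or_path field records the base model path used at training time, and LoRAX presumably compares that value against --model-id. A minimal sketch to inspect it (adapter_config.json and base_model_name_or_path are standard PEFT fields; the comparison logic is an assumption inferred from the error message):

import json
import os

# Print which base model the adapter says it was trained against.
# That LoRAX checks this value against --model-id is an assumption
# based on the error message above, not confirmed from its source.
adapter_dir = "/data/llama2-lora"  # path as seen inside the container
with open(os.path.join(adapter_dir, "adapter_config.json")) as f:
    cfg = json.load(f)
print("adapter expects base model:", cfg.get("base_model_name_or_path"))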

Senna1960321 (Author) commented Jan 10, 2024

I also tried another way:

volume=/home/user/data
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data -it ghcr.io/predibase/lorax:latest --model-id /data/Llama-2-7b-chat-hf --max-input-length 1023 --max-total-tokens 1024 --max-batch-total-tokens 1024 --max-batch-prefill-tokens 1024

check.py:

from lorax import Client

client = Client("http://127.0.0.1:8080")

prompt = "[INST] Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May? [/INST]"
print(client.generate(prompt, max_new_tokens=64).generated_text)

adapter_id = "/data/llama2-lora"
adapter_source = "local"
print(client.generate(prompt, max_new_tokens=64, adapter_id=adapter_id, adapter_source=adapter_source).generated_text)

python check.py
To find out how many clips Natalia sold altogether in April and May, we need to use the information given in the problem.

In April, Natalia sold clips to 48 of her friends. So, she sold a total of 48 clips in April.

In
Traceback (most recent call last):
File "check.py", line 12, in
print(client.generate(prompt, max_new_tokens=64, adapter_id=adapter_id, adapter_source=adapter_source).generated_text)
File "/home/azureuser/anaconda3/lib/python3.8/site-packages/lorax/client.py", line 157, in generate
raise parse_error(resp.status_code, payload)
lorax.errors.GenerationError: Request failed during generation: Server error: Incorrect path_or_model_id: '/new-model/llama2-7b/Llama-2-7b-chat-hf'. Please provide either the path to a local folder or the repo_id of a model on the Hub.
@tgaddair
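
For reference, the dynamic adapter call above can also be issued as a plain HTTP request; a minimal sketch with requests, assuming LoRAX exposes a /generate endpoint that accepts adapter_id and adapter_source under parameters (the payload shape is assumed to mirror the Python client call in check.py):

import requests

# Hypothetical equivalent of the client.generate(...) call in check.py.
prompt = "[INST] Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May? [/INST]"
resp = requests.post(
    "http://127.0.0.1:8080/generate",
    json={
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": 64,
            "adapter_id": "/data/llama2-lora",
            "adapter_source": "local",
        },
    },
)
print(resp.json())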

abhibst commented Jan 10, 2024

Maybe it's similar to this issue: #51

Senna1960321 (Author) commented

> Maybe it's similar to this issue: #51

@abhibst I have already tried that solution, but it still errors.
python check.py
To find out how many clips Natalia sold altogether in April and May, we need to use the information given in the problem.

In April, Natalia sold clips to 48 of her friends. So, she sold a total of 48 clips in April.

In
Traceback (most recent call last):
File "check.py", line 12, in
print(client.generate(prompt, max_new_tokens=64, adapter_id=adapter_id, adapter_source=adapter_source).generated_text)
File "/home/azureuser/anaconda3/lib/python3.8/site-packages/lorax/client.py", line 157, in generate
raise parse_error(resp.status_code, payload)
lorax.errors.GenerationError: Request failed during generation: Server error: Incorrect path_or_model_id: '/new-model/llama2-7b/Llama-2-7b-chat-hf'. Please provide either the path to a local folder or the repo_id of a model on the Hub.
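
One possible workaround, assuming the incompatibility comes from the adapter's recorded base_model_name_or_path pointing at the training-time path rather than a path that exists inside the container, is to rewrite that field before starting the server. A sketch, run on the host (the field name is standard PEFT; its effect on LoRAX's check is an assumption):

import json

# Point the adapter's recorded base model path at the path the model is
# mounted at inside the container. Assumption: LoRAX's compatibility check
# reads base_model_name_or_path from adapter_config.json.
cfg_path = "/home/user/data/llama2-lora/adapter_config.json"  # host-side path
with open(cfg_path) as f:
    cfg = json.load(f)
cfg["base_model_name_or_path"] = "/data/Llama-2-7b-chat-hf"  # container path
with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)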

tgaddair (Contributor) commented

Hey @Senna1960321, sorry for the late reply!

For the first error you saw:

ValueError: Adapter '/data/llama2-lora' is not compatible with model '/data/Llama-2-7b-chat-hf'. Use --model-id '/new-model/llama2-7b/Llama-2-7b-chat-hf' instead.

This suggests you're running an older version of LoRAX. The error message was changed to a warning in #58. Can you try running docker pull ghcr.io/predibase/lorax:latest to get the latest image?

If you're still running into issues after that, then for the more recent errors, can you share the output of the following commands run from outside the container?

ls /home/user/data/Llama-2-7b-chat-hf

ls /home/user/data/llama2-lora

The error message is odd because it seems to suggest that it's looking for a model with path /new-model/llama2-7b/Llama-2-7b-chat-hf.

tgaddair added the question label on Jan 12, 2024
Senna1960321 (Author) commented

@tgaddair Thanks for your reply. I solved this problem by setting volume=new-model/llama2-7b and then running:

docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/new-model/llama2-7b -it ghcr.io/predibase/lorax:latest --model-id /new-model/llama2-7b/Llama-2-7b-chat-hf --adapter-id /new-model/llama2-7b/llama2-lora

I fine-tuned this LoRA from Llama-2-7b-chat-hf at the path /new-model/llama2-7b/Llama-2-7b-chat-hf; I don't know why LoRAX only recognizes that path.

I have another question: when I use LoRAX, the inference answers are worse than with the normal way, even on the dataset I fine-tuned on, although the decrease in the quality of the generated responses is not that pronounced.
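
On the quality question: a useful baseline is to run the same prompt through the adapter with plain transformers + peft and compare the output against what LoRAX serves; if the served output is noticeably worse, that points at serving-side factors (e.g. dtype or weight merging) rather than the adapter itself. A minimal sketch, assuming the paths from this comment:

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_path = "/new-model/llama2-7b/Llama-2-7b-chat-hf"
adapter_path = "/new-model/llama2-7b/llama2-lora"

# Load the base model and apply the LoRA adapter with peft directly.
tokenizer = AutoTokenizer.from_pretrained(base_path)
base = AutoModelForCausalLM.from_pretrained(
    base_path, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_path)

prompt = "[INST] Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))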
