ValueError: Adapter '/data/llama2-lora' is not compatible with model '/data/Llama-2-7b-chat-hf'. Use --model-id '/new-model/llama2-7b/Llama-2-7b-chat-hf' instead. #172
Comments
I also tried another way, with volume=/home/user/data and the following check.py:

```python
from lorax import Client

client = Client("http://127.0.0.1:8080")
prompt = "[INST] Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May? [/INST]"
adapter_id = "/data/llama2-lora"
```

Running python check.py prints:

In April, Natalia sold clips to 48 of her friends. So, she sold a total of 48 clips in April. In
Maybe it looks like this:
@abhibst I have already tried this solution, but it still errors. The output is still:

In April, Natalia sold clips to 48 of her friends. So, she sold a total of 48 clips in April. In
Hey @Senna1960321, sorry for the late reply! For the first error you saw:
This suggests you're running an older version of LoRAX. The error message was changed to a warning in #58. Can you try pulling the latest image and running again? If you're still running into issues after that, then for the more recent errors, can you share the output of the following commands, run from outside the container?
The error message is odd because it seems to suggest that it's looking for a model with the path /new-model/llama2-7b/Llama-2-7b-chat-hf.
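For anyone hitting the same error: PEFT-style adapter directories contain an adapter_config.json that records a base_model_name_or_path, and a compatibility check like the one in the error above most likely compares that recorded path against the --model-id the server was launched with. A minimal sketch of such a check, assuming that layout (the function names here are illustrative, not LoRAX's actual code):

```python
import json
from pathlib import Path

def adapter_base_model(adapter_dir: str) -> str:
    """Read the base model path recorded in a PEFT adapter's adapter_config.json."""
    config_path = Path(adapter_dir) / "adapter_config.json"
    with open(config_path) as f:
        config = json.load(f)
    return config["base_model_name_or_path"]

def is_compatible(adapter_dir: str, model_id: str) -> bool:
    """Adapter is considered compatible only if the recorded base model matches --model-id."""
    return adapter_base_model(adapter_dir) == model_id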
@tgaddair Thanks for your reply! I solved this problem by setting volume=new-model/llama2-7b and then relaunching the container.
System Info
2024-01-10T09:14:20.356771Z INFO lorax_launcher: Args { model_id: "/data/Llama-2-7b-chat-hf", adapter_id: "/data/llama2-lora", source: "hub", adapter_source: "hub", revision: None, validation_workers: 2, sharded: None, num_shard: None, quantize: None, compile: false, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_input_length: 1023, max_total_tokens: 1024, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 1024, max_batch_total_tokens: Some(1024), max_waiting_tokens: 20, max_active_adapters: 128, adapter_cycle_time_s: 2, hostname: "e2bcf2fc09e3", port: 80, shard_uds_path: "/tmp/lorax-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, env: false, download_only: false }
2024-01-10T09:14:20.356869Z INFO download: lorax_launcher: Starting download process.
2024-01-10T09:14:23.227147Z WARN lorax_launcher: cli.py:145 No safetensors weights found for model /data/Llama-2-7b-chat-hf at revision None. Converting PyTorch weights to safetensors.
2024-01-10T09:14:25.972567Z INFO lorax_launcher: convert.py:114 Convert: [1/2] -- Took: 0:00:02.741882
2024-01-10T09:14:33.450451Z INFO lorax_launcher: convert.py:114 Convert: [2/2] -- Took: 0:00:07.477435
2024-01-10T09:14:33.450778Z INFO lorax_launcher: cli.py:104 Files are already present on the host. Skipping download.
2024-01-10T09:14:33.972217Z INFO download: lorax_launcher: Successfully downloaded weights.
2024-01-10T09:14:33.972518Z INFO shard-manager: lorax_launcher: Starting shard rank=0
2024-01-10T09:14:37.373745Z INFO lorax_launcher: flash_llama.py:74 Merging adapter weights from adapter_id /data/llama2-lora into model weights.
2024-01-10T09:14:37.375075Z ERROR lorax_launcher: server.py:235 Error when initializing model
Information
Tasks
Reproduction
volume=/home/user/data
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data -it ghcr.io/predibase/lorax:latest --model-id /data/Llama-2-7b-chat-hf --adapter-id /data/llama2-lora --max-input-length 1023 --max-total-tokens 1024 --max-batch-total-tokens 1024 --max-batch-prefill-tokens 1024
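For reference, the -v $volume:/data bind mount in the command above means every container path under /data (including the values passed to --model-id and --adapter-id) corresponds to a host path under $volume. A quick sketch of the mapping assumed here (the host directory names are taken from the reproduction, not verified):

```shell
volume=/home/user/data
# Host path                      -> path the container (and --model-id/--adapter-id) sees
# $volume/Llama-2-7b-chat-hf     -> /data/Llama-2-7b-chat-hf
# $volume/llama2-lora            -> /data/llama2-lora
echo "host: $volume/llama2-lora -> container: /data/llama2-lora"
```

If either directory lives outside $volume on the host, the container cannot see it, and path-based lookups like the one in the error will fail.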
Expected behavior
I trained this LoRA on my local Llama 2 model, so why is it not compatible?