AttributeError: 'RWConfig' object has no attribute 'num_ln_in_parallel_attn' #2349
Closed
Description
System Info
image: ghcr.io/huggingface/text-generation-inference:2.2.0-rocm
model: tiiuae/falcon-40b-instruct
GPU: MI250
Information
- Docker
- The CLI directly
Tasks
- An officially supported command
- My own modifications
Reproduction
- Run docker run --gpus all --shm-size 1g -e HF_TOKEN=$token -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:2.2.0-rocm --model-id tiiuae/falcon-40b-instruct --num-shard 2
Expected behavior
The model should load and be sharded across 2 GPUs. Instead, the run fails with the AttributeError quoted in the title.
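For illustration only, here is a minimal sketch of the kind of failure named in the title. It does not use the text-generation-inference code: `RWConfigSketch` and `read_num_ln` are hypothetical names, and the fallback value is an assumption, not the project's actual fix. It only shows that reading an attribute the config object never defined raises AttributeError, while `getattr` with a default does not.

```python
class RWConfigSketch:
    """Hypothetical stand-in for a config class that never defines
    num_ln_in_parallel_attn (not the real RWConfig)."""

    def __init__(self, parallel_attn: bool = True):
        self.parallel_attn = parallel_attn


def read_num_ln(config, default=None):
    # getattr with a default returns `default` instead of raising
    # AttributeError when the attribute is missing.
    return getattr(config, "num_ln_in_parallel_attn", default)


config = RWConfigSketch()

# Direct attribute access reproduces the shape of the reported error.
try:
    config.num_ln_in_parallel_attn
    raised = False
    message = ""
except AttributeError as err:
    raised = True
    message = str(err)

# Guarded access falls back to the supplied default (an assumed value).
value = read_num_ln(config)

print(raised, message)
print(value)
```

Whether a default is the right behavior for this model, and which default, would need to be confirmed by the maintainers.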