Commit

Fix nccl regression on PyTorch 2.3 upgrade (huggingface#2099)
* fix nccl issue

* add note in dockerfile

* use v2.22.3 that also fixes @samsamoa's repro

* poetry actually can't handle the conflict between torch and nccl

* set LD_PRELOAD
fxmarty authored and yuanwu2017 committed Sep 25, 2024
1 parent 48f1196 commit eaaea91
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion server/Makefile
@@ -35,5 +35,5 @@ run-dev:
 	SAFETENSORS_FAST_GPU=1 python -m torch.distributed.run --nproc_per_node=2 text_generation_server/cli.py serve bigscience/bloom-560m --sharded

 export-requirements:
-	poetry export -o requirements_cuda.txt --without-hashes
+	poetry export -o requirements_cuda.txt --without-hashes --with cuda
 	poetry export -o requirements_rocm.txt --without-hashes
