
Initial support for ppc64le #1316

Merged
merged 1 commit into bitsandbytes-foundation:main on Aug 22, 2024

Conversation

mgiessing
Contributor

Initial support for PowerPC architecture following the design pattern introduced by the aarch64 PR

Signed-off-by: mgiessing <marvin.giessing@gmail.com>
@matthewdouglas
Member

Thanks @mgiessing! I am not sure that we'll be able to commit to distributing binary wheels for ppc64le, but we certainly welcome source compatibility!

I'll do some quick regression testing on this over the next couple days, but at first glance it looks good!

PowerPC support was deprecated in CUDA 12.4 and removed in 12.5. However, my understanding is that there are still many AC922 systems with V100 GPUs out there, and maybe even some Power8 S822LC systems with P100s, so to add context and further advocate:

Example operational supercomputers:

Just a reference note: relates to #652

@mgiessing
Contributor Author

Thanks a lot @matthewdouglas!
Yeah - I do not expect binary wheels either, but as you said there are many people/organisations running P100/V100/T4 GPUs on Power Systems, so source compatibility is desired :-)

Btw, I found that I had to rename libbitsandbytes_cuda122.so to libbitsandbytes_cuda122_nocublaslt.so, otherwise it would crash during the test... not sure whether I did something wrong during the build, which was the following:

## System: AC922 // CUDA 12.2 // RHEL8.9

# Create the bnb environment and install dependencies via micromamba and the rocketce channel
micromamba create -n bnb \
    -c rocketce \
    -c defaults \
    python=3.10 \
    pytorch==2.1.2 \
    pandas \
    scipy \
    matplotlib && micromamba clean --all --yes

micromamba activate bnb

# Install remaining dependencies via PyPI (quote the version constraint so the shell doesn't treat ">" as a redirect)
pip3 install lion-pytorch wheel einops pytest "setuptools>=63" transformers accelerate

git clone https://github.com/mgiessing/bitsandbytes
cd bitsandbytes

export PATH=$PATH:/usr/local/cuda/bin

cmake -DCMAKE_CUDA_ARCHITECTURES=70 -DCOMPUTE_BACKEND=cuda -S .

make -j$(nproc)
pip install -e .

# Workaround: also provide the _nocublaslt variant, which the library tried to load on this system
cp bitsandbytes/libbitsandbytes*.so bitsandbytes/libbitsandbytes_cuda122_nocublaslt.so

# Simple test to check that the installation works
python3 -m bitsandbytes
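
As an aside (not part of the PR): a minimal hedged sketch of a functional check beyond the import test, assuming a CUDA device is visible to PyTorch and using the public bnb.optim.Adam8bit optimizer; the tiny model and layer sizes are arbitrary, for illustration only.

# Optional sanity check: one 8-bit Adam step on a tiny model (illustrative only)
python3 - <<'EOF'
import torch
import bitsandbytes as bnb

model = torch.nn.Linear(64, 64).cuda()                 # arbitrary small module
opt = bnb.optim.Adam8bit(model.parameters(), lr=1e-3)  # 8-bit optimizer from bitsandbytes

loss = model(torch.randn(8, 64, device="cuda")).sum()
loss.backward()
opt.step()
print("8-bit optimizer step OK")
EOF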

@matthewdouglas
Member

matthewdouglas commented Aug 16, 2024

If you add -DNO_CUBLASLT=ON to the cmake step it will build libbitsandbytes_cuda122_nocublaslt.so.
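
For illustration, a sketch of the configure step from the build log above with that flag added (only -DNO_CUBLASLT=ON is new; the other flags reuse what was already shown):

# Build the no-cublasLt variant directly instead of copying/renaming the library afterwards
cmake -DCMAKE_CUDA_ARCHITECTURES=70 -DCOMPUTE_BACKEND=cuda -DNO_CUBLASLT=ON -S .
make -j$(nproc)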

@matthewdouglas matthewdouglas merged commit 432a4f4 into bitsandbytes-foundation:main Aug 22, 2024
28 checks passed
matthewdouglas pushed a commit to matthewdouglas/bitsandbytes that referenced this pull request Oct 28, 2024
Signed-off-by: mgiessing <marvin.giessing@gmail.com>