Add flash-attn #26239
Flash Attention: Fast and Memory-Efficient Exact Attention! Repo at https://github.com/Dao-AILab/flash-attention
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR (
To try and fix `OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root`
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR ( I do have some suggestions for making it better though... For recipes/flash-attn:
This flash-attn library only runs on Linux with CUDA GPUs if I'm not mistaken.
Needed to compile flash-attn on CUDA 12.0 in conda-forge.
Hi! This is the friendly automated conda-forge-linting service. I wanted to let you know that I linted all conda-recipes in your PR ( Here's what I've got... For recipes/flash-attn:
Hi! This is the friendly automated conda-forge-linting service. I wanted to let you know that I linted all conda-recipes in your PR ( Here's what I've got... For recipes/flash-attn:
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR (
recipes/flash-attn/meta.yaml
  number: 0
  script: {{ PYTHON }} -m pip install . -vvv --no-deps --no-build-isolation
  script_env:
    - FLASH_ATTENTION_FORCE_BUILD=TRUE
If you don't set `FLASH_ATTENTION_FORCE_BUILD=TRUE`, the package tries to download pre-built binaries instead of building them. Pre-built binaries are not allowed on our channel; you must compile all binaries with our toolchains.
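To picture the behaviour described above, here is a minimal sketch of an environment variable gating a download-or-compile decision. It is not flash-attn's actual `setup.py`: the helper name `should_build_from_source` and the exact string comparison are assumptions made for illustration.

```python
import os


def should_build_from_source(environ=None):
    """Return True when the build should compile locally rather than fetch a wheel.

    Hypothetical helper: the name and the exact comparison are assumptions,
    not code copied from flash-attn.
    """
    environ = os.environ if environ is None else environ
    # Only the literal string "TRUE" opts in; an unset variable means "try a download".
    return environ.get("FLASH_ATTENTION_FORCE_BUILD", "FALSE") == "TRUE"


# With the variable exported (as script_env does in the recipe), compilation is forced.
print(should_build_from_source({"FLASH_ATTENTION_FORCE_BUILD": "TRUE"}))
# Without it, this sketch falls back to the download path.
print(should_build_from_source({}))
```

Listing the variable under `script_env` in the recipe is what makes it visible to the build script inside the conda-build environment.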
Thanks @carterbox for helping out and pushing that super helpful patch! I kinda just started from the output of `grayskull pypi flash-attn` and hoped that it would work almost out of the box (after adding in the correct CUDA-related dependencies).
It seems like the linux-cuda builds timed out, so I'll kickstart that again, and maybe do some testing locally to make sure it works.
Because this package takes so long to compile, it probably exceeds the CI limits here. Try adjusting the Azure timeout to 6 hours. I believe that is the longest allowed. https://conda-forge.org/docs/maintainer/conda_forge_yml/#timeout-minutes You can also debug locally. Set `TORCH_CUDA_ARCH_LIST` to only one arch to reduce compile times.
Only compiling on Compute Capability 8.0 and above, see https://developer.nvidia.com/cuda-gpus. I.e. NVIDIA Ampere generation devices or newer.
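As a rough illustration of why a shorter architecture list cuts compile time, the sketch below maps a `TORCH_CUDA_ARCH_LIST`-style string to the `-gencode` pairs that nvcc would receive, one pair per architecture. The function `gencode_flags` is hypothetical and deliberately simplified (no `+PTX` suffixes or named architectures); whether flash-attn's own build honours the variable in exactly this way is not established in this thread.

```python
def gencode_flags(arch_list):
    """Turn a string such as "8.0;9.0" into nvcc -gencode flags.

    Simplified on purpose: "+PTX" suffixes and named architectures
    are not handled. This is an illustration, not PyTorch's parser.
    """
    flags = []
    # Accept either ";" or spaces as separators.
    for arch in arch_list.replace(" ", ";").split(";"):
        if not arch:
            continue
        major, minor = arch.split(".")
        num = f"{major}{minor}"
        flags += ["-gencode", f"arch=compute_{num},code=sm_{num}"]
    return flags


# Two architectures mean every kernel is compiled twice...
print(gencode_flags("8.0;9.0"))
# ...while a single architecture halves that work for local debugging.
print(gencode_flags("8.0"))
```

Each extra architecture adds another device-code pass over every `.cu` file, so restricting the list to Compute Capability 8.0 and above keeps the build to the targets the package is meant to support.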
Ah, didn't realize we could add a
Good tip. Actually, I think FlashAttention-2 only works on NVIDIA Ampere generation GPUs or newer according to https://github.com/Dao-AILab/flash-attention/tree/v2.5.8?tab=readme-ov-file#installation-and-features, so I've set
azure:
  timeout_minutes: 360
Hmm, it seems that the Azure builds still time out after 30-40 min (e.g. at https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=927858&view=logs&jobId=67448ffb-e003-5bfa-c062-cee3af60fcba&j=67448ffb-e003-5bfa-c062-cee3af60fcba&t=818ff20d-11b7-59db-6ce1-bb4df921454a). Maybe this only works on the feedstock repo?
🤦🏼 You're right. But also, it looks like the timeout is already set to 360 minutes in staged-recipes. So the builds are probably failing for other reasons. Perhaps the worker crashes from running out of RAM or disk space? Let's try reducing the compute load as much as possible.
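One way to reason about "reducing the compute load" is to cap the number of parallel compile jobs by available memory. The sketch below is only a back-of-the-envelope helper: `pick_parallel_jobs` is a hypothetical name, the 8 GB-per-job default is an assumed figure rather than a measurement, and it presumes the build exposes some way to limit parallel jobs.

```python
def pick_parallel_jobs(mem_gb, cpus, gb_per_job=8.0):
    """Cap parallel compile jobs so their estimated memory use fits in RAM.

    gb_per_job is an assumed per-job estimate, not a measured value.
    Always returns at least 1 so the build can still make progress.
    """
    by_memory = int(mem_gb // gb_per_job)
    return max(1, min(cpus, by_memory))


# A small CI worker (illustrative numbers) ends up with a single job.
print(pick_parallel_jobs(mem_gb=7, cpus=2))
# A larger local machine is limited by whichever bound is tighter.
print(pick_parallel_jobs(mem_gb=16, cpus=8))
```

If the worker really is being killed for memory, dropping to one job trades wall-clock time for a build that finishes, which only helps while the total still fits inside the CI timeout.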
Keep this file because we need to have it in the feedstock.
Ah, my local build on Linux CUDA 12.0 finally completed. Posting the tail end of the logs for reference: [48/49] /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/bin/nvcc -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/csrc/flash_attn -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/csrc/flash_attn/src -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/csrc/cutlass/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/TH -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/THC -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/include 
-I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include/python3.11 -c -c /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/csrc/flash_attn/src/flash_fwd_split_hdim96_bf16_sm80.cu -o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim96_bf16_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1017"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=1 -ccbin /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/bin/x86_64-conda-linux-gnu-cc
nvcc warning : incompatible redefinition for option 'compiler-bindir', the last value of this option was used
[49/49] /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/bin/nvcc -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/csrc/flash_attn -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/csrc/flash_attn/src -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/csrc/cutlass/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/TH -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/THC -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include/python3.11 -c -c 
/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/csrc/flash_attn/src/flash_fwd_split_hdim96_fp16_sm80.cu -o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim96_fp16_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1017"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=1 -ccbin /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/bin/x86_64-conda-linux-gnu-cc
nvcc warning : incompatible redefinition for option 'compiler-bindir', the last value of this option was used
/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/bin/x86_64-conda-linux-gnu-c++ -shared -Wl,--allow-shlib-undefined -Wl,-rpath,/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -Wl,-rpath-link,/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -Wl,--allow-shlib-undefined -Wl,-rpath,/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -Wl,-rpath-link,/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,--allow-shlib-undefined 
-Wl,-rpath,/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -Wl,-rpath-link,/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib/stubs -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/targets/x86_64-linux/lib/stubs -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include 
-fdebug-prefix-map=/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work=/usr/local/src/conda/flash-attn-2.5.8 -fdebug-prefix-map=/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_=/usr/local/src/conda-prefix -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/targets/x86_64-linux/include -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib/stubs -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/targets/x86_64-linux/lib/stubs -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include 
-I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/targets/x86_64-linux/include -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib/stubs -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/targets/x86_64-linux/lib/stubs /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/flash_api.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim160_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim160_fp16_sm80.o 
/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim192_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim192_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim224_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim224_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim256_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim256_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim32_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim32_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim64_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim64_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim96_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim96_fp16_sm80.o 
/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim128_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim128_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim160_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim160_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim192_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim192_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim224_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim224_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim256_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim256_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim32_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim32_fp16_sm80.o 
/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim64_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim64_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim96_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim96_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim128_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim128_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim160_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim160_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim192_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim192_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim224_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim224_fp16_sm80.o 
/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim256_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim256_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim32_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim32_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim64_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim64_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim96_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim96_fp16_sm80.o -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/lib -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-x86_64-cpython-311/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so
/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!
********************************************************************************
Please avoid running ``setup.py`` directly.
Instead, use pypa/build, pypa/installer or other
standards-based tools.
See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
********************************************************************************
!!
self.initialize_options()
installing to build/bdist.linux-x86_64/wheel
running install
running install_lib
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/wheel
creating build/bdist.linux-x86_64/wheel/flash_attn
creating build/bdist.linux-x86_64/wheel/flash_attn/modules
copying build/lib.linux-x86_64-cpython-311/flash_attn/modules/mha.py -> build/bdist.linux-x86_64/wheel/flash_attn/modules
copying build/lib.linux-x86_64-cpython-311/flash_attn/modules/mlp.py -> build/bdist.linux-x86_64/wheel/flash_attn/modules
copying build/lib.linux-x86_64-cpython-311/flash_attn/modules/embedding.py -> build/bdist.linux-x86_64/wheel/flash_attn/modules
copying build/lib.linux-x86_64-cpython-311/flash_attn/modules/__init__.py -> build/bdist.linux-x86_64/wheel/flash_attn/modules
copying build/lib.linux-x86_64-cpython-311/flash_attn/modules/block.py -> build/bdist.linux-x86_64/wheel/flash_attn/modules
creating build/bdist.linux-x86_64/wheel/flash_attn/losses
copying build/lib.linux-x86_64-cpython-311/flash_attn/losses/cross_entropy.py -> build/bdist.linux-x86_64/wheel/flash_attn/losses
copying build/lib.linux-x86_64-cpython-311/flash_attn/losses/__init__.py -> build/bdist.linux-x86_64/wheel/flash_attn/losses
copying build/lib.linux-x86_64-cpython-311/flash_attn/fused_softmax.py -> build/bdist.linux-x86_64/wheel/flash_attn
creating build/bdist.linux-x86_64/wheel/flash_attn/ops
creating build/bdist.linux-x86_64/wheel/flash_attn/ops/triton
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/triton/rotary.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops/triton
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/triton/linear.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops/triton
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/triton/cross_entropy.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops/triton
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/triton/mlp.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops/triton
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/triton/k_activations.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops/triton
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/triton/__init__.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops/triton
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/triton/layer_norm.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops/triton
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/__init__.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/fused_dense.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/activations.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/layer_norm.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/rms_norm.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops
copying build/lib.linux-x86_64-cpython-311/flash_attn/flash_attn_triton_og.py -> build/bdist.linux-x86_64/wheel/flash_attn
copying build/lib.linux-x86_64-cpython-311/flash_attn/bert_padding.py -> build/bdist.linux-x86_64/wheel/flash_attn
copying build/lib.linux-x86_64-cpython-311/flash_attn/flash_blocksparse_attention.py -> build/bdist.linux-x86_64/wheel/flash_attn
copying build/lib.linux-x86_64-cpython-311/flash_attn/__init__.py -> build/bdist.linux-x86_64/wheel/flash_attn
copying build/lib.linux-x86_64-cpython-311/flash_attn/flash_attn_triton.py -> build/bdist.linux-x86_64/wheel/flash_attn
creating build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/btlm.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/gptj.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/baichuan.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/opt.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/gpt_neox.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/bert.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/__init__.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/gpt.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/bigcode.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/falcon.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/llama.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/vit.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/flash_attn_interface.py -> build/bdist.linux-x86_64/wheel/flash_attn
creating build/bdist.linux-x86_64/wheel/flash_attn/layers
copying build/lib.linux-x86_64-cpython-311/flash_attn/layers/rotary.py -> build/bdist.linux-x86_64/wheel/flash_attn/layers
copying build/lib.linux-x86_64-cpython-311/flash_attn/layers/patch_embed.py -> build/bdist.linux-x86_64/wheel/flash_attn/layers
copying build/lib.linux-x86_64-cpython-311/flash_attn/layers/__init__.py -> build/bdist.linux-x86_64/wheel/flash_attn/layers
creating build/bdist.linux-x86_64/wheel/flash_attn/utils
copying build/lib.linux-x86_64-cpython-311/flash_attn/utils/generation.py -> build/bdist.linux-x86_64/wheel/flash_attn/utils
copying build/lib.linux-x86_64-cpython-311/flash_attn/utils/distributed.py -> build/bdist.linux-x86_64/wheel/flash_attn/utils
copying build/lib.linux-x86_64-cpython-311/flash_attn/utils/pretrained.py -> build/bdist.linux-x86_64/wheel/flash_attn/utils
copying build/lib.linux-x86_64-cpython-311/flash_attn/utils/__init__.py -> build/bdist.linux-x86_64/wheel/flash_attn/utils
copying build/lib.linux-x86_64-cpython-311/flash_attn/utils/benchmark.py -> build/bdist.linux-x86_64/wheel/flash_attn/utils
copying build/lib.linux-x86_64-cpython-311/flash_attn/flash_blocksparse_attn_interface.py -> build/bdist.linux-x86_64/wheel/flash_attn
copying build/lib.linux-x86_64-cpython-311/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/wheel
running install_egg_info
running egg_info
writing flash_attn.egg-info/PKG-INFO
writing dependency_links to flash_attn.egg-info/dependency_links.txt
writing requirements to flash_attn.egg-info/requires.txt
writing top-level names to flash_attn.egg-info/top_level.txt
reading manifest file 'flash_attn.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching '*.cu' under directory 'flash_attn'
warning: no files found matching '*.h' under directory 'flash_attn'
warning: no files found matching '*.cuh' under directory 'flash_attn'
warning: no files found matching '*.cpp' under directory 'flash_attn'
warning: no files found matching '*.hpp' under directory 'flash_attn'
adding license file 'LICENSE'
adding license file 'AUTHORS'
writing manifest file 'flash_attn.egg-info/SOURCES.txt'
Copying flash_attn.egg-info to build/bdist.linux-x86_64/wheel/flash_attn-2.5.8-py3.11.egg-info
running install_scripts
creating build/bdist.linux-x86_64/wheel/flash_attn-2.5.8.dist-info/WHEEL
creating '/tmp/pip-wheel-x3dm7157/flash_attn-2.5.8-cp311-cp311-linux_x86_64.whl' and adding 'build/bdist.linux-x86_64/wheel' to it
adding 'flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so'
adding 'flash_attn/__init__.py'
adding 'flash_attn/bert_padding.py'
adding 'flash_attn/flash_attn_interface.py'
adding 'flash_attn/flash_attn_triton.py'
adding 'flash_attn/flash_attn_triton_og.py'
adding 'flash_attn/flash_blocksparse_attention.py'
adding 'flash_attn/flash_blocksparse_attn_interface.py'
adding 'flash_attn/fused_softmax.py'
adding 'flash_attn/layers/__init__.py'
adding 'flash_attn/layers/patch_embed.py'
adding 'flash_attn/layers/rotary.py'
adding 'flash_attn/losses/__init__.py'
adding 'flash_attn/losses/cross_entropy.py'
adding 'flash_attn/models/__init__.py'
adding 'flash_attn/models/baichuan.py'
adding 'flash_attn/models/bert.py'
adding 'flash_attn/models/bigcode.py'
adding 'flash_attn/models/btlm.py'
adding 'flash_attn/models/falcon.py'
adding 'flash_attn/models/gpt.py'
adding 'flash_attn/models/gpt_neox.py'
adding 'flash_attn/models/gptj.py'
adding 'flash_attn/models/llama.py'
adding 'flash_attn/models/opt.py'
adding 'flash_attn/models/vit.py'
adding 'flash_attn/modules/__init__.py'
adding 'flash_attn/modules/block.py'
adding 'flash_attn/modules/embedding.py'
adding 'flash_attn/modules/mha.py'
adding 'flash_attn/modules/mlp.py'
adding 'flash_attn/ops/__init__.py'
adding 'flash_attn/ops/activations.py'
adding 'flash_attn/ops/fused_dense.py'
adding 'flash_attn/ops/layer_norm.py'
adding 'flash_attn/ops/rms_norm.py'
adding 'flash_attn/ops/triton/__init__.py'
adding 'flash_attn/ops/triton/cross_entropy.py'
adding 'flash_attn/ops/triton/k_activations.py'
adding 'flash_attn/ops/triton/layer_norm.py'
adding 'flash_attn/ops/triton/linear.py'
adding 'flash_attn/ops/triton/mlp.py'
adding 'flash_attn/ops/triton/rotary.py'
adding 'flash_attn/utils/__init__.py'
adding 'flash_attn/utils/benchmark.py'
adding 'flash_attn/utils/distributed.py'
adding 'flash_attn/utils/generation.py'
adding 'flash_attn/utils/pretrained.py'
adding 'flash_attn-2.5.8.dist-info/AUTHORS'
adding 'flash_attn-2.5.8.dist-info/LICENSE'
adding 'flash_attn-2.5.8.dist-info/METADATA'
adding 'flash_attn-2.5.8.dist-info/WHEEL'
adding 'flash_attn-2.5.8.dist-info/top_level.txt'
adding 'flash_attn-2.5.8.dist-info/RECORD'
removing build/bdist.linux-x86_64/wheel
Building wheel for flash_attn (setup.py): finished with status 'done'
Created wheel for flash_attn: filename=flash_attn-2.5.8-cp311-cp311-linux_x86_64.whl size=118037430 sha256=98dcb93971fcf325e6e445753d4bde87c83b2f106ae7d9623f41fbaaf887ed32
Stored in directory: /tmp/pip-ephem-wheel-cache-wunavxq9/wheels/f9/2f/9f/b8e4397695654fd6038ef99f6fc6a1e126be3c8b23e8ee6855
Successfully built flash_attn
Installing collected packages: flash_attn
Successfully installed flash_attn-2.5.8
Removed build tracker: '/tmp/pip-build-tracker-fy6htau6'
Resource usage statistics from building flash-attn:
Process count: 20
CPU time: Sys=0:02:37.8, User=4:35:12.0
Memory: 22.5G
Disk usage: 1.1M
Time elapsed: 1:12:55.7
Packaging flash-attn
/opt/conda/lib/python3.10/site-packages/conda_build/environ.py:558: UserWarning: The environment variable 'FLASH_ATTENTION_FORCE_BUILD' is being passed through with value 'TRUE'. If you are splitting build and test phases with --no-test, please ensure that this value is also set similarly at test time.
warnings.warn(
/opt/conda/lib/python3.10/site-packages/conda_build/environ.py:558: UserWarning: The environment variable 'FLASH_ATTENTION_SKIP_CUDA_BUILD' is being passed through with value 'FALSE'. If you are splitting build and test phases with --no-test, please ensure that this value is also set similarly at test time.
warnings.warn(
/opt/conda/lib/python3.10/site-packages/conda_build/environ.py:558: UserWarning: The environment variable 'FLASH_ATTENTION_FORCE_CXX11_ABI' is being passed through with value 'FALSE'. If you are splitting build and test phases with --no-test, please ensure that this value is also set similarly at test time.
warnings.warn(
/opt/conda/lib/python3.10/site-packages/conda_build/environ.py:558: UserWarning: The environment variable 'MAX_JOBS' is being passed through with value '$CPU_COUNT'. If you are splitting build and test phases with --no-test, please ensure that this value is also set similarly at test time.
warnings.warn(
/opt/conda/lib/python3.10/site-packages/conda_build/environ.py:558: UserWarning: The environment variable 'TORCH_CUDA_ARCH_LIST' is being passed through with value '"8.6+PTX"'. If you are splitting build and test phases with --no-test, please ensure that this value is also set similarly at test time.
warnings.warn(
Packaging flash-attn-2.5.8-py311h379968c_0
compiling .pyc files...
number of files: 104
Warning: rpath /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/lib is outside prefix /home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_ (removing it)
INFO: sysroot: '/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/x86_64-conda-linux-gnu/sysroot/' files: '['/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/x86_64-conda-linux-gnu/sysroot/usr/share/zoneinfo/zone1970.tab', '/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/x86_64-conda-linux-gnu/sysroot/usr/share/zoneinfo/zone.tab', '/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/x86_64-conda-linux-gnu/sysroot/usr/share/zoneinfo/tzdata.zi', '/home/conda/staged-recipes/build_artifacts/flash-attn_1714961048474/_build_env/x86_64-conda-linux-gnu/sysroot/usr/share/zoneinfo/right/Zulu']'
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/libc10.so found in conda-forge/linux-64::libtorch==2.1.2=cuda120_h2aa5df7_303
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/libtorch_cpu.so found in conda-forge/linux-64::libtorch==2.1.2=cuda120_h2aa5df7_303
ERROR (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): $RPATH/libtorch_python.so not found in packages, sysroot(s) nor the missing_dso_whitelist.
.. is this binary repackaging?
ERROR (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/libcudart.so.12 found in ['conda-forge/linux-64::cuda-cudart==12.0.107=hd3aeb46_8']
ERROR (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): .. but ['conda-forge/linux-64::cuda-cudart==12.0.107=hd3aeb46_8'] not in reqs/run, (i.e. it is overlinking) (likely) or a missing dependency (less likely)
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/libc10_cuda.so found in conda-forge/linux-64::libtorch==2.1.2=cuda120_h2aa5df7_303
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/libtorch_cuda.so found in conda-forge/linux-64::libtorch==2.1.2=cuda120_h2aa5df7_303
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/libstdc++.so.6 found in conda-forge/linux-64::libstdcxx-ng==13.2.0=hc0a3c3a_6
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/libgcc_s.so.1 found in conda-forge/linux-64::libgcc-ng==13.2.0=h77fa898_6
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO x86_64-conda-linux-gnu/sysroot/lib64/libc.so.6 found in CDT/compiler package conda-forge/noarch::sysroot_linux-64==2.17=h4a8ded7_14
WARNING (flash-attn): dso library package conda-forge/linux-64::libcublas==12.0.1.189=hd3aeb46_3 in requirements/run but it is not used (i.e. it is overdepending or perhaps statically linked? If that is what you want then add it to `build/ignore_run_exports`)
WARNING (flash-attn): run-exports library package conda-forge/linux-64::pytorch==2.1.2=cuda120_py311h25b6552_303 in requirements/run but it is not used (i.e. it is overdepending or perhaps statically linked? If that is what you want then add it to `build/ignore_run_exports`)
WARNING (flash-attn): dso library package conda-forge/linux-64::libcusolver==11.4.2.57=hd3aeb46_2 in requirements/run but it is not used (i.e. it is overdepending or perhaps statically linked? If that is what you want then add it to `build/ignore_run_exports`)
WARNING (flash-attn): interpreter (Python) package conda-forge/linux-64::python==3.11.9=hb806964_0_cpython in requirements/run but it is not used (i.e. it is overdepending or perhaps statically linked? If that is what you want then add it to `build/ignore_run_exports`)
WARNING (flash-attn): dso library package conda-forge/linux-64::libcusparse==12.0.0.76=hd3aeb46_2 in requirements/run but it is not used (i.e. it is overdepending or perhaps statically linked? If that is what you want then add it to `build/ignore_run_exports`)
Traceback (most recent call last):
File "/home/conda/staged-recipes-copy/.ci_support/build_all.py", line 261, in <module>
build_all(os.path.join(root_dir, "recipes"), args.arch)
File "/home/conda/staged-recipes-copy/.ci_support/build_all.py", line 151, in build_all
build_folders(recipes_dir, folders, arch, channel_urls)
File "/home/conda/staged-recipes-copy/.ci_support/build_all.py", line 207, in build_folders
conda_build.api.build([recipe], config=get_config(arch, channel_urls))
File "/opt/conda/lib/python3.10/site-packages/conda_build/api.py", line 250, in build
return build_tree(
File "/opt/conda/lib/python3.10/site-packages/conda_build/build.py", line 3762, in build_tree
packages_from_this = build(
File "/opt/conda/lib/python3.10/site-packages/conda_build/build.py", line 2839, in build
newly_built_packages = bundlers[pkg_type](output_d, m, env, stats)
File "/opt/conda/lib/python3.10/site-packages/conda_build/build.py", line 1974, in bundle_conda
files = post_process_files(metadata, initial_files)
File "/opt/conda/lib/python3.10/site-packages/conda_build/build.py", line 1782, in post_process_files
post_build(m, new_files, build_python=python)
File "/opt/conda/lib/python3.10/site-packages/conda_build/post.py", line 1729, in post_build
check_overlinking(m, files, host_prefix)
File "/opt/conda/lib/python3.10/site-packages/conda_build/post.py", line 1554, in check_overlinking
return check_overlinking_impl(
File "/opt/conda/lib/python3.10/site-packages/conda_build/post.py", line 1531, in check_overlinking_impl
raise OverLinkingError(overlinking_errors)
conda_build.exceptions.OverLinkingError: overlinking check failed
[' ERROR (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): $RPATH/libtorch_python.so not found in packages, sysroot(s) nor the missing_dso_whitelist.\n.. is this binary repackaging?', " ERROR (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): .. but ['conda-forge/linux-64::cuda-cudart==12.0.107=hd3aeb46_8'] not in reqs/run, (i.e. it is overlinking) (likely) or a missing dependency (less likely)"]
Traceback (most recent call last):
File "/home/user/projects/staged-recipes/build-locally.py", line 101, in <module>
main()
File "/home/user/projects/staged-recipes/build-locally.py", line 95, in main
run_docker_build(ns)
File "/home/user/projects/staged-recipes/build-locally.py", line 33, in run_docker_build
subprocess.check_call([script])
File "/home/user/mambaforge/envs/condalock/lib/python3.11/subprocess.py", line 413, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['.scripts/run_docker_build.sh']' returned non-zero exit status 1.
Looks like we might need to add a
- libcublas-dev # [(cuda_compiler_version or "").startswith("12")]
- libcusolver-dev # [(cuda_compiler_version or "").startswith("12")]
- libcusparse-dev # [(cuda_compiler_version or "").startswith("12")]
Why are these deps listed if conda-build cannot detect any links to these libraries? It also doesn't look like the upstream library has any linking flags in the setup script CUDAExtension module. If they are needed, then the recipe needs a patch to switch from static to dynamic linking... but these packages don't contain the static libraries, so I'm not sure how static linking could be happening.
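One way to check which libraries conda-build actually saw linked is to summarize its post-processing messages. The snippet below is only a rough sketch: `summarize_overlinking` and the two regular expressions are hypothetical helpers modelled on the log lines quoted in this thread, not part of conda-build, and the sample lines are abridged from that log.

```python
import re
from collections import defaultdict

# Hypothetical patterns, modelled on the conda-build messages quoted in this thread.
NEEDED_RE = re.compile(r"^\s*(INFO|ERROR) \(.*?\): Needed DSO (\S+) found in (.+)$")
UNUSED_RE = re.compile(
    r"^\s*WARNING \(.*?\): .*? package (\S+) in requirements/run but it is not used"
)


def summarize_overlinking(log_text):
    """Group conda-build linkage messages into three buckets.

    linked:                 DSOs the binary needs and a listed package provides (INFO)
    linked_but_not_in_run:  DSOs the binary needs, reported at ERROR level
    in_run_but_unused:      run requirements that no binary links against (WARNING)
    """
    summary = defaultdict(list)
    for line in log_text.splitlines():
        needed = NEEDED_RE.match(line)
        if needed:
            level, dso, provider = needed.groups()
            key = "linked" if level == "INFO" else "linked_but_not_in_run"
            summary[key].append((dso, provider))
            continue
        unused = UNUSED_RE.match(line)
        if unused:
            summary["in_run_but_unused"].append(unused.group(1))
    return dict(summary)


# Sample lines abridged from the log in this thread.
SAMPLE_LOG = """\
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/libc10.so found in conda-forge/linux-64::libtorch==2.1.2=cuda120_h2aa5df7_303
ERROR (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/libcudart.so.12 found in ['conda-forge/linux-64::cuda-cudart==12.0.107=hd3aeb46_8']
WARNING (flash-attn): dso library package conda-forge/linux-64::libcublas==12.0.1.189=hd3aeb46_3 in requirements/run but it is not used (i.e. it is overdepending)
"""

if __name__ == "__main__":
    for bucket, entries in summarize_overlinking(SAMPLE_LOG).items():
        print(bucket, entries)
```

Run against the full log quoted in this thread, the libcublas, libcusolver and libcusparse packages would land in `in_run_but_unused`, among others. That matches the observation that the binary does not link against them.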
Added these because I was getting errors like the following:
[1/49] /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/bin/nvcc -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn/src -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/cutlass/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/TH -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/THC -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include/python3.11 -c -c 
/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.cu -o /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1017"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=1 -ccbin /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/bin/x86_64-conda-linux-gnu-cc
FAILED: /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.o
/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/bin/nvcc -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn/src -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/cutlass/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/TH -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/THC -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include/python3.11 -c -c 
/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.cu -o /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1017"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=1 -ccbin /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/bin/x86_64-conda-linux-gnu-cc
nvcc warning : incompatible redefinition for option 'compiler-bindir', the last value of this option was used
In file included from /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn/src/flash_bwd_launch_template.h:7,
from /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.cu:5:
/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/ATen/cuda/CUDAContext.h:6:10: fatal error: cusparse.h: No such file or directory
6 | #include <cusparse.h>
| ^~~~~~~~~~~~
compilation terminated.
In file included from /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn/src/flash_bwd_launch_template.h:7,
from /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.cu:5:
/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/ATen/cuda/CUDAContext.h:6:10: fatal error: cusparse.h: No such file or directory
6 | #include <cusparse.h>
| ^~~~~~~~~~~~
compilation terminated.
In file included from /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn/src/flash_bwd_launch_template.h:7,
from /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.cu:5:
/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/ATen/cuda/CUDAContext.h:6:10: fatal error: cusparse.h: No such file or directory
6 | #include <cusparse.h>
| ^~~~~~~~~~~~
compilation terminated.
[2/49] /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/bin/x86_64-conda-linux-gnu-c++ -MMD -MF /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/flash_api.o.d -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include -fPIC -O2 -isystem /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include -fdebug-prefix-map=/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work=/usr/local/src/conda/flash-attn-2.5.8 -fdebug-prefix-map=/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_=/usr/local/src/conda-prefix -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/include 
-I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/targets/x86_64-linux/include -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib/stubs -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/targets/x86_64-linux/lib/stubs -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/targets/x86_64-linux/include -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib 
-L/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib/stubs -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/targets/x86_64-linux/lib/stubs -fPIC -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn/src -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/cutlass/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/TH 
-I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/THC -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include/python3.11 -c -c /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn/flash_api.cpp -o /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/flash_api.o -O3 -std=c++17 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1017"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=1
FAILED: /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/flash_api.o
/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/bin/x86_64-conda-linux-gnu-c++ -MMD -MF /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/flash_api.o.d -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include -fPIC -O2 -isystem /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include -fdebug-prefix-map=/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work=/usr/local/src/conda/flash-attn-2.5.8 -fdebug-prefix-map=/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_=/usr/local/src/conda-prefix -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/include 
-I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/targets/x86_64-linux/include -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib/stubs -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/targets/x86_64-linux/lib/stubs -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/targets/x86_64-linux/include -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib 
-L/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib/stubs -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/targets/x86_64-linux/lib/stubs -fPIC -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn/src -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/cutlass/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/TH 
-I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/THC -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include/python3.11 -c -c /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn/flash_api.cpp -o /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/flash_api.o -O3 -std=c++17 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1017"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=1
In file included from /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/csrc/flash_attn/flash_api.cpp:8:
/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/ATen/cuda/CUDAContext.h:6:10: fatal error: cusparse.h: No such file or directory
6 | #include <cusparse.h>
| ^~~~~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
subprocess.run(
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v', '-j', '2']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/setup.py", line 311, in <module>
setup(
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/__init__.py", line 104, in setup
return distutils.core.setup(**attrs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 184, in setup
return run_commands(dist)
^^^^^^^^^^^^^^^^^^
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
dist.run_commands()
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/dist.py", line 967, in run_command
super().run_command(command)
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/setup.py", line 266, in run
return super().run()
^^^^^^^^^^^^^
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/wheel/bdist_wheel.py", line 368, in run
self.run_command("build")
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
self.distribution.run_command(command)
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/dist.py", line 967, in run_command
super().run_command(command)
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/_distutils/command/build.py", line 132, in run
self.run_command(cmd_name)
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
self.distribution.run_command(command)
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/dist.py", line 967, in run_command
super().run_command(command)
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/command/build_ext.py", line 91, in run
_build_ext.run(self)
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
self.build_extensions()
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 873, in build_extensions
build_ext.build_extensions(self)
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 479, in build_extensions
self._build_extensions_serial()
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 505, in _build_extensions_serial
self.build_extension(ext)
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/command/build_ext.py", line 252, in build_extension
_build_ext.build_extension(self, ext)
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 560, in build_extension
objects = self.compiler.compile(
^^^^^^^^^^^^^^^^^^^^^^
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 686, in unix_wrap_ninja_compile
_write_ninja_file_and_compile_objects(
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1774, in _write_ninja_file_and_compile_objects
_run_ninja_build(
File "/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
full command: /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/bin/python -u -c '
exec(compile('"'"''"'"''"'"'
# This is <pip-setuptools-caller> -- a caller that pip uses to run setup.py
#
# - It imports setuptools before invoking setup.py, to enable projects that directly
# import from `distutils.core` to work with newer packaging standards.
# - It provides a clear error message when setuptools is not installed.
# - It sets `sys.argv[0]` to the underlying `setup.py`, when invoking `setup.py` so
# setuptools doesn'"'"'t think the script is `-c`. This avoids the following warning:
# manifest_maker: standard file '"'"'-c'"'"' not found".
# - It generates a shim setup.py, for handling setup.cfg-only projects.
import os, sys, tokenize
try:
import setuptools
except ImportError as error:
print(
"ERROR: Can not execute `setup.py` since setuptools is not available in "
"the build environment.",
file=sys.stderr,
)
sys.exit(1)
__file__ = %r
sys.argv[0] = __file__
if os.path.exists(__file__):
filename = __file__
with tokenize.open(__file__) as f:
setup_py_code = f.read()
else:
filename = "<auto-generated setuptools caller>"
setup_py_code = "from setuptools import setup; setup()"
exec(compile(setup_py_code, filename, "exec"))
'"'"''"'"''"'"' % ('"'"'/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/setup.py'"'"',), "<pip-setuptools-caller>", "exec"))' bdist_wheel -d /tmp/pip-wheel-7hn5_c75
cwd: /home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/
Building wheel for flash_attn (setup.py): finished with status 'error'
ERROR: Failed building wheel for flash_attn
Running setup.py clean for flash_attn
Running command python setup.py clean
No CUDA runtime is found, using CUDA_HOME='/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_build_env'
error: pathspec 'csrc/cutlass' did not match any file(s) known to git
torch.__version__ = 2.1.2.post303
/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/setuptools/__init__.py:81: _DeprecatedInstaller: setuptools.installer and fetch_build_eggs are deprecated.
!!
********************************************************************************
Requirements should be satisfied by a PEP 517 installer.
If you are using pip, you can try `pip install --use-pep517`.
********************************************************************************
!!
dist.fetch_build_eggs(dist.setup_requires)
running clean
removing 'build/temp.linux-x86_64-cpython-311' (and everything under it)
removing 'build/lib.linux-x86_64-cpython-311' (and everything under it)
'build/bdist.linux-x86_64' does not exist -- can't clean it
'build/scripts-3.11' does not exist -- can't clean it
removing 'build'
Failed to build flash_attn
ERROR: Could not build wheels for flash_attn, which is required to install pyproject.toml-based projects
Exception information:
Traceback (most recent call last):
File "$PREFIX/lib/python3.11/site-packages/pip/_internal/cli/base_command.py", line 180, in exc_logging_wrapper
status = run_func(*args)
^^^^^^^^^^^^^^^
File "$PREFIX/lib/python3.11/site-packages/pip/_internal/cli/req_command.py", line 245, in wrapper
return func(self, options, args)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "$PREFIX/lib/python3.11/site-packages/pip/_internal/commands/install.py", line 429, in run
raise InstallationError(
pip._internal.exceptions.InstallationError: Could not build wheels for flash_attn, which is required to install pyproject.toml-based projects
Removed build tracker: '/tmp/pip-build-tracker-vjr8_wy1'
Traceback (most recent call last):
File "/home/conda/staged-recipes-copy/.ci_support/build_all.py", line 261, in <module>
build_all(os.path.join(root_dir, "recipes"), args.arch)
File "/home/conda/staged-recipes-copy/.ci_support/build_all.py", line 151, in build_all
build_folders(recipes_dir, folders, arch, channel_urls)
File "/home/conda/staged-recipes-copy/.ci_support/build_all.py", line 207, in build_folders
conda_build.api.build([recipe], config=get_config(arch, channel_urls))
File "/opt/conda/lib/python3.10/site-packages/conda_build/api.py", line 250, in build
return build_tree(
File "/opt/conda/lib/python3.10/site-packages/conda_build/build.py", line 3762, in build_tree
packages_from_this = build(
File "/opt/conda/lib/python3.10/site-packages/conda_build/build.py", line 2634, in build
utils.check_call_env(
File "/opt/conda/lib/python3.10/site-packages/conda_build/utils.py", line 408, in check_call_env
return _func_defaulting_env_to_os_environ("call", *popenargs, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/conda_build/utils.py", line 384, in _func_defaulting_env_to_os_environ
raise subprocess.CalledProcessError(proc.returncode, _args)
subprocess.CalledProcessError: Command '['/bin/bash', '-o', 'errexit', '/home/conda/staged-recipes/build_artifacts/flash-attn_1715034607333/work/conda_build.sh']' returned non-zero exit status 1.
Traceback (most recent call last):
File "/home/weiji/Documents/github/staged-recipes/build-locally.py", line 101, in <module>
main()
File "/home/weiji/Documents/github/staged-recipes/build-locally.py", line 95, in main
run_docker_build(ns)
File "/home/weiji/Documents/github/staged-recipes/build-locally.py", line 33, in run_docker_build
subprocess.check_call([script])
File "/home/weiji/mambaforge/envs/condalock/lib/python3.11/subprocess.py", line 413, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['.scripts/run_docker_build.sh']' returned non-zero exit status 1.
Oh yep, we're getting that same `#include <cusparse.h>` ... `compilation terminated` error after removing `libcusparse-dev` from host deps at https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=928515&view=logs&j=67448ffb-e003-5bfa-c062-cee3af60fcba&t=818ff20d-11b7-59db-6ce1-bb4df921454a&l=1016
Ah! It's because PyTorch's ATen includes the cusparse header in its own headers.
OK. Then we need to list these deps in both `requirements/host` and `build/ignore_run_exports_from`, because we are not linking to these libs, but we need to know about them for the ATen API.
OK, added `build/ignore_run_exports_from` in 37e676a, assuming that only `libcublas-dev`, `libcusolver-dev`, and `libcusparse-dev` need to be added, judging from the warnings at #26239 (comment).
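For readers following along, here is a minimal sketch of the recipe layout being discussed. It assumes the three `-dev` packages named above are the only ones needed; the selectors, the remaining host entries, and the exact contents of 37e676a are not shown in this thread and may well differ.

```yaml
build:
  # The headers are needed at compile time (ATen's CUDAContext.h includes
  # cusparse.h, per the error above), but the extension does not link against
  # these libraries, so their run_exports are ignored.
  ignore_run_exports_from:
    - libcublas-dev
    - libcusolver-dev
    - libcusparse-dev

requirements:
  host:
    # Assumed placement: same three packages, so the headers land in $PREFIX.
    - libcublas-dev
    - libcusolver-dev
    - libcusparse-dev
    # Other host dependencies are omitted from this sketch.
```

Listing the packages in both places keeps the headers available during the build while avoiding the "in requirements/run but it is not used" overdepending warnings.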
This simpler script doesn't have unused features and doesn't set -O3 because our channel defaults are -O2
Co-authored-by: Wei Ji <23487320+weiji14@users.noreply.github.com>
test:
  imports:
    - flash_attn
This test may need to be commented out because the test runners don't have a GPU, so imports might fail.
Imports seem to have worked at https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=928722&view=logs&j=4f860608-e5f8-5c9c-4eb0-308a99ecb61e&t=02ef1a5c-d960-5c54-fcea-983775f057bb&l=1352:
done
export PREFIX=/home/conda/staged-recipes/build_artifacts/flash-attn_1715047997366/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place
export SRC_DIR=/home/conda/staged-recipes/build_artifacts/flash-attn_1715047997366/test_tmp
import: 'flash_attn'
import: 'flash_attn'
+ pip check
No broken requirements found.
+ exit 0
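The log above suggests the test phase runs the import check followed by `pip check`. A sketch of a test section that would produce that sequence, assuming the usual conda-forge layout (the `requires` and `commands` entries are inferred from the log, not copied from the recipe):

```yaml
test:
  imports:
    - flash_attn
  commands:
    # Verifies that the installed package metadata has no broken requirements.
    - pip check
  requires:
    - pip
```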
Silence warnings like:
```
WARNING (flash-attn): dso library package conda-forge/linux-64::libcublas==12.0.1.189=hd3aeb46_3 in requirements/run but it is not used (i.e. it is overdepending or perhaps statically linked? If that is what you want then add it to `build/ignore_run_exports`)
WARNING (flash-attn): dso library package conda-forge/linux-64::libcusparse==12.0.0.76=hd3aeb46_2 in requirements/run but it is not used (i.e. it is overdepending or perhaps statically linked? If that is what you want then add it to `build/ignore_run_exports`)
WARNING (flash-attn): dso library package conda-forge/linux-64::libcusolver==11.4.2.57=hd3aeb46_2 in requirements/run but it is not used (i.e. it is overdepending or perhaps statically linked? If that is what you want then add it to `build/ignore_run_exports`)
```
Trying to reduce CPU load on Azure CI to debug build.
recipes/flash-attn/meta.yaml
Outdated
script_env:
  # Temporarily reduce ARCHs and JOBS to debug build
  # - MAX_JOBS=$CPU_COUNT
  - MAX_JOBS=1
  # - TORCH_CUDA_ARCH_LIST=8.0;8.6;8.9;9.0+PTX
  - TORCH_CUDA_ARCH_LIST=8.6+PTX
Set MAX_JOBS=1 again as in ef03f90. Looks like CI can run up to 6 hours now (without crashing due to out of memory), though that's still not enough to finish compiling 😅 See e.g. https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=928208&view=logs&jobId=67448ffb-e003-5bfa-c062-cee3af60fcba&j=4f860608-e5f8-5c9c-4eb0-308a99ecb61e&t=02ef1a5c-d960-5c54-fcea-983775f057bb where the CUDA 11.8 build got as far as 30/49.
Oo, it looks like the compilation finally finished, at least for a single Python version (Python 3.11). The CI check shows that the job was cancelled after 6 hours, but that's because it went on to try building for another Python version (Python 3.10) for some reason (maybe because we're not using noarch).
Logs from https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=928722&view=logs&j=67448ffb-e003-5bfa-c062-cee3af60fcba&t=818ff20d-11b7-59db-6ce1-bb4df921454a&l=1226 showing successful build
[49/49] /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/bin/nvcc -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/csrc/flash_attn -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/csrc/flash_attn/src -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/csrc/cutlass/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/TH -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/include/THC -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include/python3.11 -c -c 
/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/csrc/flash_attn/src/flash_fwd_split_hdim96_fp16_sm80.cu -o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim96_fp16_sm80.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1017"' -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -ccbin /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/bin/x86_64-conda-linux-gnu-cc
2024-05-07T07:10:57.5843417Z nvcc warning : incompatible redefinition for option 'compiler-bindir', the last value of this option was used
2024-05-07T07:10:57.7918559Z /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/bin/x86_64-conda-linux-gnu-c++ -shared -Wl,--allow-shlib-undefined -Wl,-rpath,/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -Wl,-rpath-link,/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -Wl,--allow-shlib-undefined -Wl,-rpath,/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -Wl,-rpath-link,/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,--allow-shlib-undefined 
-Wl,-rpath,/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -Wl,-rpath-link,/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib/stubs -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/targets/x86_64-linux/lib/stubs -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include 
-fdebug-prefix-map=/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work=/usr/local/src/conda/flash-attn-2.5.8 -fdebug-prefix-map=/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_=/usr/local/src/conda-prefix -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/targets/x86_64-linux/include -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib/stubs -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/targets/x86_64-linux/lib/stubs -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/include 
-I/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/include -I/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/targets/x86_64-linux/include -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/targets/x86_64-linux/lib/stubs -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/targets/x86_64-linux/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/targets/x86_64-linux/lib/stubs /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/flash_api.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim160_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim160_fp16_sm80.o 
/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim192_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim192_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim224_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim224_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim256_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim256_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim32_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim32_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim64_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim64_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim96_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_bwd_hdim96_fp16_sm80.o 
/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim128_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim128_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim160_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim160_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim192_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim192_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim224_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim224_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim256_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim256_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim32_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim32_fp16_sm80.o 
/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim64_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim64_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim96_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_hdim96_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim128_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim128_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim160_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim160_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim192_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim192_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim224_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim224_fp16_sm80.o 
/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim256_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim256_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim32_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim32_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim64_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim64_fp16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim96_bf16_sm80.o /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work/build/temp.linux-x86_64-cpython-311/csrc/flash_attn/src/flash_fwd_split_hdim96_fp16_sm80.o -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.11/site-packages/torch/lib -L/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/lib -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-x86_64-cpython-311/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so
installing to build/bdist.linux-x86_64/wheel
running install
running install_lib
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/wheel
copying build/lib.linux-x86_64-cpython-311/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/wheel
creating build/bdist.linux-x86_64/wheel/flash_attn
creating build/bdist.linux-x86_64/wheel/flash_attn/ops
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/activations.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/rms_norm.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/__init__.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/fused_dense.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/layer_norm.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops
creating build/bdist.linux-x86_64/wheel/flash_attn/ops/triton
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/triton/cross_entropy.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops/triton
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/triton/rotary.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops/triton
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/triton/mlp.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops/triton
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/triton/__init__.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops/triton
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/triton/k_activations.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops/triton
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/triton/layer_norm.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops/triton
copying build/lib.linux-x86_64-cpython-311/flash_attn/ops/triton/linear.py -> build/bdist.linux-x86_64/wheel/flash_attn/ops/triton
creating build/bdist.linux-x86_64/wheel/flash_attn/modules
copying build/lib.linux-x86_64-cpython-311/flash_attn/modules/mha.py -> build/bdist.linux-x86_64/wheel/flash_attn/modules
copying build/lib.linux-x86_64-cpython-311/flash_attn/modules/mlp.py -> build/bdist.linux-x86_64/wheel/flash_attn/modules
copying build/lib.linux-x86_64-cpython-311/flash_attn/modules/__init__.py -> build/bdist.linux-x86_64/wheel/flash_attn/modules
copying build/lib.linux-x86_64-cpython-311/flash_attn/modules/embedding.py -> build/bdist.linux-x86_64/wheel/flash_attn/modules
copying build/lib.linux-x86_64-cpython-311/flash_attn/modules/block.py -> build/bdist.linux-x86_64/wheel/flash_attn/modules
copying build/lib.linux-x86_64-cpython-311/flash_attn/flash_blocksparse_attention.py -> build/bdist.linux-x86_64/wheel/flash_attn
copying build/lib.linux-x86_64-cpython-311/flash_attn/flash_attn_interface.py -> build/bdist.linux-x86_64/wheel/flash_attn
copying build/lib.linux-x86_64-cpython-311/flash_attn/flash_attn_triton_og.py -> build/bdist.linux-x86_64/wheel/flash_attn
copying build/lib.linux-x86_64-cpython-311/flash_attn/__init__.py -> build/bdist.linux-x86_64/wheel/flash_attn
creating build/bdist.linux-x86_64/wheel/flash_attn/utils
copying build/lib.linux-x86_64-cpython-311/flash_attn/utils/distributed.py -> build/bdist.linux-x86_64/wheel/flash_attn/utils
copying build/lib.linux-x86_64-cpython-311/flash_attn/utils/benchmark.py -> build/bdist.linux-x86_64/wheel/flash_attn/utils
copying build/lib.linux-x86_64-cpython-311/flash_attn/utils/pretrained.py -> build/bdist.linux-x86_64/wheel/flash_attn/utils
copying build/lib.linux-x86_64-cpython-311/flash_attn/utils/__init__.py -> build/bdist.linux-x86_64/wheel/flash_attn/utils
copying build/lib.linux-x86_64-cpython-311/flash_attn/utils/generation.py -> build/bdist.linux-x86_64/wheel/flash_attn/utils
creating build/bdist.linux-x86_64/wheel/flash_attn/layers
copying build/lib.linux-x86_64-cpython-311/flash_attn/layers/patch_embed.py -> build/bdist.linux-x86_64/wheel/flash_attn/layers
copying build/lib.linux-x86_64-cpython-311/flash_attn/layers/rotary.py -> build/bdist.linux-x86_64/wheel/flash_attn/layers
copying build/lib.linux-x86_64-cpython-311/flash_attn/layers/__init__.py -> build/bdist.linux-x86_64/wheel/flash_attn/layers
copying build/lib.linux-x86_64-cpython-311/flash_attn/flash_blocksparse_attn_interface.py -> build/bdist.linux-x86_64/wheel/flash_attn
creating build/bdist.linux-x86_64/wheel/flash_attn/losses
copying build/lib.linux-x86_64-cpython-311/flash_attn/losses/cross_entropy.py -> build/bdist.linux-x86_64/wheel/flash_attn/losses
copying build/lib.linux-x86_64-cpython-311/flash_attn/losses/__init__.py -> build/bdist.linux-x86_64/wheel/flash_attn/losses
copying build/lib.linux-x86_64-cpython-311/flash_attn/flash_attn_triton.py -> build/bdist.linux-x86_64/wheel/flash_attn
copying build/lib.linux-x86_64-cpython-311/flash_attn/fused_softmax.py -> build/bdist.linux-x86_64/wheel/flash_attn
copying build/lib.linux-x86_64-cpython-311/flash_attn/bert_padding.py -> build/bdist.linux-x86_64/wheel/flash_attn
creating build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/falcon.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/btlm.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/vit.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/__init__.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/bert.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/baichuan.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/bigcode.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/opt.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/gpt_neox.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/gpt.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/llama.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
copying build/lib.linux-x86_64-cpython-311/flash_attn/models/gptj.py -> build/bdist.linux-x86_64/wheel/flash_attn/models
running install_egg_info
Copying flash_attn.egg-info to build/bdist.linux-x86_64/wheel/flash_attn-2.5.8-py3.11.egg-info
running install_scripts
creating build/bdist.linux-x86_64/wheel/flash_attn-2.5.8.dist-info/WHEEL
creating '/tmp/pip-wheel-7_xhpf61/.tmp-4qdnj8g3/flash_attn-2.5.8-cp311-cp311-linux_x86_64.whl' and adding 'build/bdist.linux-x86_64/wheel' to it
adding 'flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so'
adding 'flash_attn/__init__.py'
adding 'flash_attn/bert_padding.py'
adding 'flash_attn/flash_attn_interface.py'
adding 'flash_attn/flash_attn_triton.py'
adding 'flash_attn/flash_attn_triton_og.py'
adding 'flash_attn/flash_blocksparse_attention.py'
adding 'flash_attn/flash_blocksparse_attn_interface.py'
adding 'flash_attn/fused_softmax.py'
adding 'flash_attn/layers/__init__.py'
adding 'flash_attn/layers/patch_embed.py'
adding 'flash_attn/layers/rotary.py'
adding 'flash_attn/losses/__init__.py'
adding 'flash_attn/losses/cross_entropy.py'
adding 'flash_attn/models/__init__.py'
adding 'flash_attn/models/baichuan.py'
adding 'flash_attn/models/bert.py'
adding 'flash_attn/models/bigcode.py'
adding 'flash_attn/models/btlm.py'
adding 'flash_attn/models/falcon.py'
adding 'flash_attn/models/gpt.py'
adding 'flash_attn/models/gpt_neox.py'
adding 'flash_attn/models/gptj.py'
adding 'flash_attn/models/llama.py'
adding 'flash_attn/models/opt.py'
adding 'flash_attn/models/vit.py'
adding 'flash_attn/modules/__init__.py'
adding 'flash_attn/modules/block.py'
adding 'flash_attn/modules/embedding.py'
adding 'flash_attn/modules/mha.py'
adding 'flash_attn/modules/mlp.py'
adding 'flash_attn/ops/__init__.py'
adding 'flash_attn/ops/activations.py'
adding 'flash_attn/ops/fused_dense.py'
adding 'flash_attn/ops/layer_norm.py'
adding 'flash_attn/ops/rms_norm.py'
adding 'flash_attn/ops/triton/__init__.py'
adding 'flash_attn/ops/triton/cross_entropy.py'
adding 'flash_attn/ops/triton/k_activations.py'
adding 'flash_attn/ops/triton/layer_norm.py'
adding 'flash_attn/ops/triton/linear.py'
adding 'flash_attn/ops/triton/mlp.py'
adding 'flash_attn/ops/triton/rotary.py'
adding 'flash_attn/utils/__init__.py'
adding 'flash_attn/utils/benchmark.py'
adding 'flash_attn/utils/distributed.py'
adding 'flash_attn/utils/generation.py'
adding 'flash_attn/utils/pretrained.py'
adding 'flash_attn-2.5.8.dist-info/AUTHORS'
adding 'flash_attn-2.5.8.dist-info/LICENSE'
adding 'flash_attn-2.5.8.dist-info/METADATA'
adding 'flash_attn-2.5.8.dist-info/WHEEL'
adding 'flash_attn-2.5.8.dist-info/top_level.txt'
adding 'flash_attn-2.5.8.dist-info/RECORD'
removing build/bdist.linux-x86_64/wheel
Building wheel for flash_attn (pyproject.toml): finished with status 'done'
Created wheel for flash_attn: filename=flash_attn-2.5.8-cp311-cp311-linux_x86_64.whl size=161615679 sha256=019cabb84a0f37b55ff08e14959ee34184e658cbc5d6edc3622e44779820b595
Stored in directory: /tmp/pip-ephem-wheel-cache-g138790y/wheels/d8/91/a6/6160216e602ad906106939541f06f84e2dbc50fbc04b44036d
Successfully built flash_attn
Installing collected packages: flash_attn
Successfully installed flash_attn-2.5.8
Removed build tracker: '/tmp/pip-build-tracker-r7fvrno4'
Resource usage statistics from building flash-attn:
Process count: 8
CPU time: Sys=0:03:26.0, User=4:47:19.4
Memory: 5.6G
Disk usage: 1.1M
Time elapsed: 4:56:11.4
Packaging flash-attn
/opt/conda/lib/python3.10/site-packages/conda_build/environ.py:558: UserWarning: The environment variable 'MAX_JOBS' is being passed through with value '1'. If you are splitting build and test phases with --no-test, please ensure that this value is also set similarly at test time.
warnings.warn(
/opt/conda/lib/python3.10/site-packages/conda_build/environ.py:558: UserWarning: The environment variable 'TORCH_CUDA_ARCH_LIST' is being passed through with value '8.6+PTX'. If you are splitting build and test phases with --no-test, please ensure that this value is also set similarly at test time.
warnings.warn(
Packaging flash-attn-2.5.8-py311h379968c_0
compiling .pyc files...
number of files: 104
Warning: rpath /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/lib is outside prefix /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_ (removing it)
INFO: sysroot: '/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/x86_64-conda-linux-gnu/sysroot/' files: '['/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/x86_64-conda-linux-gnu/sysroot/usr/share/zoneinfo/zone1970.tab', '/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/x86_64-conda-linux-gnu/sysroot/usr/share/zoneinfo/zone.tab', '/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/x86_64-conda-linux-gnu/sysroot/usr/share/zoneinfo/tzdata.zi', '/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_build_env/x86_64-conda-linux-gnu/sysroot/usr/share/zoneinfo/right/Zulu']'
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/libc10.so found in conda-forge/linux-64::libtorch==2.1.2=cuda120_h2aa5df7_303
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/libtorch_cpu.so found in conda-forge/linux-64::libtorch==2.1.2=cuda120_h2aa5df7_303
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/python3.11/site-packages/torch/lib/libtorch_python.so found in conda-forge/linux-64::pytorch==2.1.2=cuda120_py311h25b6552_303
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/libcudart.so.12 found in conda-forge/linux-64::cuda-cudart==12.0.107=hd3aeb46_8
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/libc10_cuda.so found in conda-forge/linux-64::libtorch==2.1.2=cuda120_h2aa5df7_303
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/libtorch_cuda.so found in conda-forge/linux-64::libtorch==2.1.2=cuda120_h2aa5df7_303
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/libstdc++.so.6 found in conda-forge/linux-64::libstdcxx-ng==13.2.0=hc0a3c3a_7
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO lib/libgcc_s.so.1 found in conda-forge/linux-64::libgcc-ng==13.2.0=h77fa898_7
INFO (flash-attn,lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so): Needed DSO x86_64-conda-linux-gnu/sysroot/lib64/libc.so.6 found in CDT/compiler package conda-forge/noarch::sysroot_linux-64==2.17=h4a8ded7_14
WARNING (flash-attn): interpreter (Python) package conda-forge/linux-64::python==3.11.9=hb806964_0_cpython in requirements/run but it is not used (i.e. it is overdepending or perhaps statically linked? If that is what you want then add it to `build/ignore_run_exports`)
Fixing permissions
Packaged license file/s.
INFO :: Time taken to mark (prefix)
0 replacements in 0 files was 1.18 seconds
Files containing CONDA_PREFIX
-----------------------------
lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so (binary): Patching
WARNING: Importing conda-verify failed. Please be sure to test your packages. conda install conda-verify to make this message go away.
TEST START: /home/conda/staged-recipes/build_artifacts/linux-64/flash-attn-2.5.8-py311h379968c_0.conda
Renaming work directory '/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work' to '/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work_moved_flash-attn-2.5.8-py311h379968c_0_linux-64'
shutil.move(work)=/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work, dest=/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/work_moved_flash-attn-2.5.8-py311h379968c_0_linux-64)
Reloading output folder (local): ...working... done
Solving environment (_test_env): ...working... done
## Package Plan ##
environment location: /home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place
The following NEW packages will be INSTALLED:
_libgcc_mutex: 0.1-conda_forge conda-forge
_openmp_mutex: 4.5-2_kmp_llvm conda-forge
bzip2: 1.0.8-hd590300_5 conda-forge
ca-certificates: 2024.2.2-hbcca054_0 conda-forge
cuda-cudart: 12.4.127-hd3aeb46_0 conda-forge
cuda-cudart_linux-64: 12.4.127-h59595ed_0 conda-forge
cuda-nvrtc: 12.4.127-hd3aeb46_1 conda-forge
cuda-nvtx: 12.4.127-h59595ed_1 conda-forge
cuda-version: 12.4-h3060b56_3 conda-forge
cudnn: 8.9.7.29-h092f7fd_3 conda-forge
einops: 0.8.0-pyhd8ed1ab_0 conda-forge
filelock: 3.14.0-pyhd8ed1ab_0 conda-forge
flash-attn: 2.5.8-py311h379968c_0 local
fsspec: 2024.3.1-pyhca7485f_0 conda-forge
gmp: 6.3.0-h59595ed_1 conda-forge
gmpy2: 2.1.5-py311he48d604_0 conda-forge
icu: 73.2-h59595ed_0 conda-forge
jinja2: 3.1.3-pyhd8ed1ab_0 conda-forge
ld_impl_linux-64: 2.40-h55db66e_0 conda-forge
libabseil: 20230802.1-cxx17_h59595ed_0 conda-forge
libblas: 3.9.0-22_linux64_openblas conda-forge
libcblas: 3.9.0-22_linux64_openblas conda-forge
libcublas: 12.4.5.8-hd3aeb46_1 conda-forge
libcufft: 11.2.1.3-hd3aeb46_1 conda-forge
libcurand: 10.3.5.147-hd3aeb46_1 conda-forge
libcusolver: 11.6.1.9-hd3aeb46_1 conda-forge
libcusparse: 12.3.1.170-hd3aeb46_1 conda-forge
libexpat: 2.6.2-h59595ed_0 conda-forge
libffi: 3.4.2-h7f98852_5 conda-forge
libgcc-ng: 13.2.0-h77fa898_7 conda-forge
libgfortran-ng: 13.2.0-h69a702a_7 conda-forge
libgfortran5: 13.2.0-hca663fb_7 conda-forge
libhwloc: 2.10.0-default_h2fb2949_1000 conda-forge
libiconv: 1.17-hd590300_2 conda-forge
liblapack: 3.9.0-22_linux64_openblas conda-forge
libmagma: 2.7.2-h173bb3b_2 conda-forge
libmagma_sparse: 2.7.2-h173bb3b_3 conda-forge
libnsl: 2.0.1-hd590300_0 conda-forge
libnvjitlink: 12.4.127-hd3aeb46_1 conda-forge
libopenblas: 0.3.27-pthreads_h413a1c8_0 conda-forge
libprotobuf: 4.25.1-hf27288f_2 conda-forge
libsqlite: 3.45.3-h2797004_0 conda-forge
libstdcxx-ng: 13.2.0-hc0a3c3a_7 conda-forge
libtorch: 2.1.2-cuda120_h2aa5df7_303 conda-forge
libuuid: 2.38.1-h0b41bf4_0 conda-forge
libuv: 1.48.0-hd590300_0 conda-forge
libxcrypt: 4.4.36-hd590300_1 conda-forge
libxml2: 2.12.6-h232c23b_2 conda-forge
libzlib: 1.2.13-hd590300_5 conda-forge
llvm-openmp: 18.1.5-ha31de31_0 conda-forge
magma: 2.7.2-h51420fd_3 conda-forge
markupsafe: 2.1.5-py311h459d7ec_0 conda-forge
mkl: 2023.2.0-h84fe81f_50496 conda-forge
mpc: 1.3.1-hfe3b2da_0 conda-forge
mpfr: 4.2.1-h9458935_1 conda-forge
mpmath: 1.3.0-pyhd8ed1ab_0 conda-forge
nccl: 2.21.5.1-h3a97aeb_0 conda-forge
ncurses: 6.4.20240210-h59595ed_0 conda-forge
networkx: 3.3-pyhd8ed1ab_1 conda-forge
numpy: 1.26.4-py311h64a7726_0 conda-forge
openssl: 3.3.0-hd590300_0 conda-forge
pip: 24.0-pyhd8ed1ab_0 conda-forge
python: 3.11.9-hb806964_0_cpython conda-forge
python_abi: 3.11-4_cp311 conda-forge
pytorch: 2.1.2-cuda120_py311h25b6552_303 conda-forge
readline: 8.2-h8228510_1 conda-forge
setuptools: 69.5.1-pyhd8ed1ab_0 conda-forge
sleef: 3.5.1-h9b69904_2 conda-forge
sympy: 1.12-pypyh9d50eac_103 conda-forge
tbb: 2021.12.0-h00ab1b0_0 conda-forge
tk: 8.6.13-noxft_h4845f30_101 conda-forge
typing_extensions: 4.11.0-pyha770c72_0 conda-forge
tzdata: 2024a-h0c530f3_0 conda-forge
wheel: 0.43.0-pyhd8ed1ab_1 conda-forge
xz: 5.2.6-h166bdaf_0 conda-forge
zstd: 1.5.6-ha6fb4c9_0 conda-forge
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... By downloading and using the cuDNN conda packages, you accept the terms and conditions of the NVIDIA cuDNN EULA -
https://docs.nvidia.com/deeplearning/cudnn/sla/index.html
done
export PREFIX=/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place
export SRC_DIR=/home/conda/staged-recipes/build_artifacts/flash-attn_1715047963337/test_tmp
import: 'flash_attn'
import: 'flash_attn'
+ pip check
No broken requirements found.
+ exit 0
Resource usage statistics from testing flash-attn:
Process count: 4
CPU time: Sys=0:00:00.3, User=0:00:01.1
Memory: 337.3M
Disk usage: 24B
Time elapsed: 0:00:06.0
TEST END: /home/conda/staged-recipes/build_artifacts/linux-64/flash-attn-2.5.8-py311h379968c_0.conda
@carterbox, did you want to keep using the simplified `setup.py` and `pyproject.toml` file, or try reverting back to upstream one?
but it's because it continued to try to build for another Python version (Python 3.10) for some reason (maybe because we're not using noarch).
In staged-recipes, there is only one runner per platform, so every Python variant is built on the same runner.
@carterbox, did you want to keep using the simplified setup.py and pyproject.toml file, or try reverting back to upstream one?
I want to keep the simplified scripts for now. The only drawback is that we have to manually update the list of source files, some compile args, and the dependencies. However, I think that is better than working around all of the special operations upstream has added to their build script, which are not compatible with our build environment.
@weiji14, once the feedstock is allocated, please do some experiments to see how much time it takes to run with MAX_JOBS=$CPU_COUNT and how many CUDA archs can be added to the arch list. We want to build for as many of 8.0;8.6;8.9;9.0+PTX as we can.
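A minimal sketch of how such an experiment could be scripted, assuming the build reads `MAX_JOBS` and `TORCH_CUDA_ARCH_LIST` from the environment (both are read by `torch.utils.cpp_extension`; whether the simplified `setup.py` honours the arch list this way is an assumption). The helper names `build_env` and `time_build` are hypothetical and not part of the recipe.

```python
import os
import subprocess
import time

# Architectures named in the review comment, in the order given there.
ARCHS = ["8.0", "8.6", "8.9", "9.0+PTX"]


def build_env(archs, max_jobs=None):
    """Return a copy of the current environment with the build knobs set.

    MAX_JOBS falls back to the number of CPUs when max_jobs is not given.
    """
    env = dict(os.environ)
    env["MAX_JOBS"] = str(max_jobs or os.cpu_count() or 1)
    # Assumption: the build script picks the arch list up from this variable.
    env["TORCH_CUDA_ARCH_LIST"] = ";".join(archs)
    return env


def time_build(cmd, archs, max_jobs=None):
    """Run cmd (a list of arguments) and return the wall-clock time in seconds."""
    start = time.monotonic()
    subprocess.run(cmd, env=build_env(archs, max_jobs), check=True)
    return time.monotonic() - start
```

Calling `time_build` with the feedstock's actual build command once per growing prefix of `ARCHS` (`ARCHS[:1]`, `ARCHS[:2]`, and so on) would show how build time scales with the number of architectures.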
Awesome, thanks so much @carterbox! The initial feedstock commit actually failed due to running out of disk space 😅 But I've opened a PR now at conda-forge/flash-attn-feedstock#1, so we can continue discussion there.
Packaging `flash-attn`, so that I can package `transformer-engine` later (edit: see #26296)
Checklist
`url`) rather than a repo (e.g. `git_url`) is used in your recipe (see here for more details).