
Add flash-attn #26239

Merged · 23 commits · May 7, 2024

Commits on May 4, 2024

  1. Add flash-attn (07ec11e)

    Flash Attention: Fast and Memory-Efficient Exact Attention! Repo at https://github.com/Dao-AILab/flash-attention
    weiji14 committed May 4, 2024
  2. Add cuda compiler (a4def75)

    To try and fix `OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root`
    weiji14 committed May 4, 2024
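In a conda-forge recipe, adding the CUDA compiler is usually done with a `{{ compiler('cuda') }}` entry in the build requirements of `meta.yaml`, which provides `nvcc` and sets `CUDA_HOME` inside the build environment. A minimal sketch — the surrounding keys are illustrative, not copied from this PR:

```yaml
requirements:
  build:
    - {{ compiler('cuda') }}    # pulls in nvcc and exports CUDA_HOME for the build
```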
  3. Add c and cxx compilers (06713e3)

    weiji14 committed May 4, 2024
  4. 51d7d74
  5. Skip build on non-cuda platforms (aa17a2c)

    This flash-attn library only runs on Linux with CUDA GPUs if I'm not mistaken.
    weiji14 committed May 4, 2024
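conda-build recipes commonly express this kind of restriction with preprocessing selectors on a `skip` key. A hedged sketch of skipping non-Linux and non-CUDA builds — exact selector spellings vary between feedstocks:

```yaml
build:
  skip: true  # [not linux]
  skip: true  # [cuda_compiler_version in (undefined, "None")]
```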
  6. Add libcublas-dev, libcusolver-dev, libcusparse-dev to host deps (4d8b37c)

    Needed to compile flash-attn on CUDA 12.0 in conda-forge.
    weiji14 committed May 4, 2024
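In `meta.yaml`, the host dependencies from this commit would look roughly like the following — version pins and selectors omitted; this is a sketch, not the PR's exact diff:

```yaml
requirements:
  host:
    - libcublas-dev
    - libcusolver-dev
    - libcusparse-dev
```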
  7. Remove noarch: python (04a346a)

    weiji14 committed May 4, 2024
  8. 16414ff
  9. 2d2212d

Commits on May 5, 2024

  1. 2cde3c1

Commits on May 6, 2024

  1. 501aa9d
  2. Set TORCH_CUDA_ARCH_LIST to 8.0 and above (a1b1faa)

    Only compiling on Compute Capability 8.0 and above, see https://developer.nvidia.com/cuda-gpus. I.e. NVIDIA Ampere generation devices or newer.
    weiji14 committed May 6, 2024
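PyTorch's extension builder reads the `TORCH_CUDA_ARCH_LIST` environment variable to decide which compute capabilities to compile for. One plausible way to pass it through conda-build is a `script_env` entry — the exact arch string below is an assumption, not taken from the PR:

```yaml
build:
  script_env:
    - TORCH_CUDA_ARCH_LIST=8.0;8.6;9.0+PTX  # Ampere generation and newer
```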
  3. Drop quotes from script_env (5235314)

    weiji14 committed May 6, 2024
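With `script_env`, quotes around a value are passed through literally into the environment variable, so dropping them matters. A sketch of the fix (the arch value is illustrative):

```yaml
build:
  script_env:
    # before: - TORCH_CUDA_ARCH_LIST="8.0+PTX"   (the quotes end up in the value)
    - TORCH_CUDA_ARCH_LIST=8.0+PTX
```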
  4. ef03f90
  5. ebae578
  6. 460eeb2
  7. BLD: Replace setup script with simpler one (317646a)

    This simpler script doesn't have unused features and doesn't set -O3 because our channel defaults are -O2
    carterbox committed May 6, 2024
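A simplified conda build script for a pip-installable CUDA extension often reduces to a single `pip install`, inheriting the channel's default compiler flags (such as `-O2`) rather than overriding them. A hypothetical sketch, not the script from this commit:

```
#!/usr/bin/env bash
set -ex
# Let pip drive the build; no extra -O3, so the channel's -O2 defaults apply.
${PYTHON} -m pip install . -vv --no-deps --no-build-isolation
```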
  8. Update recipes/flash-attn/meta.yaml (0b81f6f)

    Co-authored-by: Wei Ji <23487320+weiji14@users.noreply.github.com>
    carterbox and weiji14 authored May 6, 2024
  9. 96e817a
  10. 0733767

Commits on May 7, 2024

  1. ignore_run_exports_from libcublas-dev, libcusolver-dev, libcusparse-dev (37e676a)

    Silence warnings like:

    ```
    WARNING (flash-attn): dso library package conda-forge/linux-64::libcublas==12.0.1.189=hd3aeb46_3 in requirements/run but it is not used (i.e. it is overdepending or perhaps statically linked? If that is what you want then add it to `build/ignore_run_exports`)
    WARNING (flash-attn): dso library package conda-forge/linux-64::libcusparse==12.0.0.76=hd3aeb46_2 in requirements/run but it is not used (i.e. it is overdepending or perhaps statically linked? If that is what you want then add it to `build/ignore_run_exports`)
    WARNING (flash-attn): dso library package conda-forge/linux-64::libcusolver==11.4.2.57=hd3aeb46_2 in requirements/run but it is not used (i.e. it is overdepending or perhaps statically linked? If that is what you want then add it to `build/ignore_run_exports`)
    ```
    weiji14 committed May 7, 2024
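The matching recipe change would be along these lines — a sketch of the `ignore_run_exports_from` key, whose exact placement may differ from the PR:

```yaml
build:
  ignore_run_exports_from:
    - libcublas-dev
    - libcusolver-dev
    - libcusparse-dev
```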
  2. Temporarily set TORCH_CUDA_ARCH_LIST=8.6+PTX and MAX_JOBS=1 (fc2fc76)

    Trying to reduce CPU load on Azure CI to debug build.
    weiji14 committed May 7, 2024
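Sketched as recipe environment settings — the values come from the commit message, but the `script_env` placement is an assumption:

```yaml
build:
  script_env:
    - TORCH_CUDA_ARCH_LIST=8.6+PTX  # compile a single arch while debugging
    - MAX_JOBS=1                    # cap parallel compile jobs on Azure CI
```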
  3. 63dcb65