
Steer agent to HF kernels instead of pip install flash-attn #204

Merged
akseljoonas merged 2 commits into huggingface:main from DarshanCode2005:feat-kernels
May 1, 2026

Conversation

Contributor

DarshanCode2005 commented May 1, 2026

Resolves #202

The agent kept trying to pip install flash-attn in jobs, which often takes
ages to compile or fails outright on the job's CUDA/torch combination. The HF
kernels library instead pulls a prebuilt flash-attn (and related kernels)
straight from the Hub when a model is loaded with
attn_implementation="kernels-community/flash-attn2".
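
For reference, a minimal sketch of what the agent would write in a job instead of installing flash-attn. The kernel id is the one named above; the helper names `attention_kwargs` and `load_model` are hypothetical, and the sketch assumes `transformers`, `torch`, and the `kernels` package are installed on a CUDA machine.

```python
def attention_kwargs(kernel_id: str = "kernels-community/flash-attn2") -> dict:
    """Keyword arguments asking transformers to fetch attention from a Hub kernel."""
    return {"attn_implementation": kernel_id}


def load_model(model_id: str, kernel_id: str = "kernels-community/flash-attn2"):
    """Load a causal LM with a prebuilt attention kernel (hypothetical helper)."""
    # Imported lazily so attention_kwargs() stays usable without the heavy deps.
    import torch
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_id,
        # Flash attention kernels expect half-precision weights.
        torch_dtype=torch.bfloat16,
        **attention_kwargs(kernel_id),
    )
```

Calling `load_model("<some model id>")` downloads the prebuilt kernel on first use, so nothing is compiled inside the job.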

  • Rewrote the HARDCODED UNAVAILABLE PACKAGES bullet in system_prompt_v3.yaml
    to recommend kernels first and only fall back to pip install when no Hub
    kernel covers the need. Listed the common kernel ids (flash-attn2,
    vllm-flash-attn3, paged-attention) so the agent doesn't have to guess.
  • Fixed the kernels entry in explore_hf_docs. The old description
    ("Lightweight execution environments and notebook-style workflows") was
    describing something else entirely and would have sent the agent the wrong
    way when it actually went looking.
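
The kernels-first, pip-as-fallback rule from the first bullet can be sketched as a small lookup. This is only an illustration of the policy, not code from the PR: `HUB_KERNELS` and `resolve_attention_backend` are hypothetical names, and the `kernels-community/` prefix on the last two ids is an assumption modelled on the flash-attn2 id.

```python
# Short names come from the PR description. The "kernels-community/" prefix on
# the last two entries is an assumption, not something the PR text confirms.
HUB_KERNELS = {
    "flash-attn2": "kernels-community/flash-attn2",
    "vllm-flash-attn3": "kernels-community/vllm-flash-attn3",
    "paged-attention": "kernels-community/paged-attention",
}


def resolve_attention_backend(name: str) -> tuple[str, str]:
    """Return (strategy, target): a Hub kernel if one is listed, else pip."""
    kernel_id = HUB_KERNELS.get(name)
    if kernel_id is not None:
        return ("hub-kernel", kernel_id)
    # No Hub kernel covers this need, so fall back to installing the package.
    return ("pip-install", name)
```

For example, `resolve_attention_backend("flash-attn2")` picks the Hub kernel, while an unlisted package name falls through to the pip path.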

Refs: https://huggingface.co/docs/kernels/index, https://huggingface.co/docs/trl/kernels_hub

akseljoonas merged commit 7599843 into huggingface:main on May 1, 2026
1 check failed

Development

Successfully merging this pull request may close these issues.

Tell the agent to use kernels instead of installing flash-attn
