feat: flash attention v3 #479

Open · wants to merge 16 commits into main

Conversation

@cathalobrien (Contributor) commented on Aug 13, 2025

Description

Adds a flash attention v3 option to models/layers/attention.py

Flash attention v3 is optimised for Hopper and newer GPUs, with up to 2x faster attention kernels; see the FlashAttention-3 announcement for more details.

You can enable flash-attn v3 with the following config entry:

model.processor.attention_implementation=flash_attention_v3
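
For reference, a quick way to check whether the v3 kernels are importable in your environment. This is a minimal sketch, not part of the PR; the `flash_attn_interface` module name is assumed to come from the separate Hopper/FA3 build of flash-attn.

```python
# Illustrative availability check (assumes the FA3 "hopper" build of flash-attn
# exposes the `flash_attn_interface` module).
try:
    import flash_attn_interface  # noqa: F401
    print("flash-attn v3 kernels importable")
except ImportError as err:
    print(f"flash-attn v3 not available: {err}")
```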

I compared the loss (seed=42) for SDPA vs flash-attn v3 and it matched to the 7th decimal place.
[screenshot: loss curves for SDPA vs flash-attn v3]

I also changed the error message reporting in the flash attention v2 wrapper so that it actually prints the underlying error. This improves the user experience when flash-attn is installed but a newer GCC is required: Anemoi will now print the GLIBC error rather than telling you to install flash-attn when it is already installed.
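
A sketch of the reporting pattern described above, assuming the v2 wrapper imports `flash_attn.flash_attn_func`; the actual code in `models/layers/attention.py` may differ.

```python
import logging

LOGGER = logging.getLogger(__name__)

try:
    from flash_attn import flash_attn_func  # flash-attn v2 entry point
except ImportError as err:
    # Log the underlying failure (e.g. a GLIBC/GCC mismatch) instead of only
    # suggesting that flash-attn be installed.
    LOGGER.error("Could not import flash-attn: %s", err)
    raise
```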

What problem does this change solve?

Faster attention.

As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines about those please refer to https://anemoi.readthedocs.io/en/latest/

By opening this pull request, I affirm that all authors agree to the Contributor License Agreement.

@cathalobrien changed the title from "flash attention v3" to "feat: flash attention v3" on Aug 13, 2025
@mchantry added the ATS Approval Not Needed (no approval needed by ATS) label on Aug 13, 2025
@mchantry (Member)

@cathalobrien great contribution. Please could you add some tiny docs or references in the configs to show how a user can control which attention is used?

@cathalobrien (Contributor, Author)

@mchantry I added an explanation of the different attention implementations, and how to select one, to the docs.
[screenshot: docs section describing the attention implementations]

@cathalobrien (Contributor, Author)

TODO: will merge both flash_attention wrappers into one wrapper to minimise config complexity.

@mchantry (Member)

> TODO: will merge both flash_attention wrappers into one wrapper to minimise config complexity.

Would be good to get output saying which version was chosen, for posterity.

@cathalobrien (Contributor, Author)

> TODO: will merge both flash_attention wrappers into one wrapper to minimise config complexity.
>
> Would be good to get output saying which version was chosen, for posterity.

done, good catch
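
A rough sketch of what a unified wrapper with version logging could look like. The helper name `_select_flash_attention` and the `flash_attn_interface` module for the FA3 build are assumptions for illustration, not taken from the PR.

```python
import logging

LOGGER = logging.getLogger(__name__)


def _select_flash_attention():
    """Return a flash-attn kernel and its major version, preferring v3.

    Hypothetical helper: tries the FA3 ("hopper") build first, falls back to
    flash-attn v2, and logs which version was chosen.
    """
    try:
        from flash_attn_interface import flash_attn_func  # assumed FA3 module

        LOGGER.info("Using flash attention v3")
        return flash_attn_func, 3
    except ImportError as err:
        LOGGER.debug("flash-attn v3 unavailable (%s), falling back to v2", err)

    from flash_attn import flash_attn_func  # flash-attn v2

    LOGGER.info("Using flash attention v2")
    return flash_attn_func, 2
```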
