
Conversation

@apoorvtintin (Contributor):

This PR adds support for a flash attention kernel for Neuron, implemented through the Neuron Kernel Interface (NKI).

The flash attention kernel works on both TRN1 and TRN2.
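For context on how such a kernel is typically wired into JAX: the diff below registers `_mha_forward` as the forward rule of a `jax.custom_vjp` function. The sketch that follows is a minimal, plain-JAX illustration of that wiring, not the PR's code; the `_mha_forward` name and signature are taken from the diff, while everything else (the reference math, `_mha_backward`, `flash_attention`) is assumed for illustration.

```python
# Minimal sketch of custom_vjp wiring around a fused attention kernel.
# The signature mirrors `_mha_forward` from the diff; the plain-JAX bodies
# stand in for the actual NKI kernel calls (an assumption, not the PR's code).
from functools import partial

import jax
import jax.numpy as jnp


@partial(jax.custom_vjp, nondiff_argnums=(4, 5))
def flash_attention(query, key, value, bias, causal, softmax_scale):
    out, _ = _mha_forward(query, key, value, bias, causal, softmax_scale)
    return out


def _mha_forward(query, key, value, bias, causal, softmax_scale):
    # Reference math; a real flash attention kernel computes this in one
    # fused, tiled pass without materializing the full logits matrix.
    # Shapes: query (b, q, h, d), key/value (b, k, h, d).
    logits = jnp.einsum("bqhd,bkhd->bhqk", query, key) * softmax_scale
    if bias is not None:
        logits = logits + bias
    if causal:
        q_len, k_len = logits.shape[-2], logits.shape[-1]
        mask = jnp.tril(jnp.ones((q_len, k_len), dtype=bool))
        logits = jnp.where(mask, logits, jnp.finfo(logits.dtype).min)
    probs = jax.nn.softmax(logits, axis=-1)
    out = jnp.einsum("bhqk,bkhd->bqhd", probs, value)
    # Residuals saved for the backward pass.
    return out, (query, key, value, bias, probs)


def _mha_backward(causal, softmax_scale, residuals, d_out):
    query, key, value, bias, probs = residuals
    d_value = jnp.einsum("bhqk,bqhd->bkhd", probs, d_out)
    d_probs = jnp.einsum("bqhd,bkhd->bhqk", d_out, value)
    # Softmax backward: dL/dlogits = p * (dp - sum(dp * p)).
    d_logits = probs * (d_probs - jnp.sum(d_probs * probs, axis=-1, keepdims=True))
    d_query = jnp.einsum("bhqk,bkhd->bqhd", d_logits, key) * softmax_scale
    d_key = jnp.einsum("bhqk,bqhd->bkhd", d_logits, query) * softmax_scale
    # Assumes bias, when given, has the full (b, h, q, k) logits shape.
    d_bias = d_logits if bias is not None else None
    return d_query, d_key, d_value, d_bias


flash_attention.defvjp(_mha_forward, _mha_backward)
```

On Neuron, the forward and backward bodies would instead dispatch to the NKI kernels; the reference math above only pins down the semantics the kernel must match.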

@@ -0,0 +1,129 @@
from absl import logging
Contributor: Copyright?

Contributor: Also please add comments for the file.

return out


def _mha_forward(query, key, value, bias, causal, softmax_scale):
Contributor: Can we get support for segment IDs and dropout as well? Both are widely needed nowadays.

@apoorvtintin (Contributor, Author): Segment ID support is in progress and will be added soon.
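For readers unfamiliar with the request: segment IDs allow several packed sequences to share one attention call, with each query attending only to keys in its own segment. Below is a minimal sketch of the mask such support adds; the names (`segment_mask`, `q_segment_ids`, `kv_segment_ids`) are illustrative, not from the PR.

```python
import jax.numpy as jnp


def segment_mask(q_segment_ids, kv_segment_ids):
    # True where query position i and key position j belong to the same
    # packed segment; attention across segment boundaries is disallowed.
    # Shapes: q_segment_ids (batch, q_len), kv_segment_ids (batch, kv_len).
    mask = q_segment_ids[:, :, None] == kv_segment_ids[:, None, :]
    return mask[:, None, :, :]  # add a broadcast head axis: (batch, 1, q, k)


# Applied to attention logits of shape (batch, heads, q_len, kv_len):
# logits = jnp.where(segment_mask(q_ids, kv_ids), logits,
#                    jnp.finfo(logits.dtype).min)
```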

@ruomingp (Contributor): Will defer to @kelvin-zou for approval.

@apoorvtintin (Contributor, Author): This has been implemented in a new PR, #939. That PR addresses all comments on this PR but comes from a different fork of AXLearn.
