Pass ForwardOptions from top level module and also return any relevant state as output #8186


Merged
merged 1 commit on Feb 6, 2025

Conversation

@sxu sxu (Contributor) commented Feb 4, 2025

Summary: Pass a ForwardOptions argument (introduced by #8128) from the top level transformer, consolidate some existing inputs into it, and return any optional updates from the attention implementation. This is in preparation for a static shape attention implementation.
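A minimal sketch of the resulting attention contract, assuming `ForwardOptions` behaves like a dict of optional forward inputs. Everything here other than the `ForwardOptions` name, the "input_pos" key, and the `Tuple[torch.Tensor, Optional[Any]]` return type is an illustrative stand-in, not the executorch code:

from typing import Any, Dict, Optional, Tuple

import torch
import torch.nn as nn

# Stand-in: in executorch, ForwardOptions is the type introduced in pytorch#8128;
# a plain dict is used here only to illustrate the calling convention.
ForwardOptions = Dict[str, Any]


class ToyAttention(nn.Module):
    """Illustrative attention layer following the new interface (not the real class)."""

    def __init__(self, dim: int) -> None:
        super().__init__()
        self.wo = nn.Linear(dim, dim)

    def forward(
        self, x: torch.Tensor, opts: ForwardOptions
    ) -> Tuple[torch.Tensor, Optional[Any]]:
        input_pos = opts.get("input_pos")  # formerly passed as its own argument
        _ = input_pos  # real KV-cache/SDPA math is elided in this sketch
        # The second element is optional per-layer state; None means nothing to report.
        return self.wo(x), None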

Differential Revision: D69080123

cc @mergennachin @cccclai @helunwencser @dvorjackz

pytorch-bot bot commented Feb 4, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8186

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 867a885 with merge base dd31d93:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Feb 4, 2025
@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D69080123

@iseeyuan iseeyuan (Contributor) commented Feb 4, 2025

@pytorchbot label "release notes: llama_transformer: Pass ForwardOptions from top level module and also return any relevant state as output"

pytorch-bot bot commented Feb 4, 2025

Didn't find following labels among repository labels: release notes: llama_transformer: Pass ForwardOptions from top level module and also return any relevant state as output

sxu added a commit to sxu/executorch that referenced this pull request Feb 4, 2025
Pass ForwardOptions from top level module and also return any relevant state as output (pytorch#8186)

Summary:

Pass a `ForwardOptions` argument (introduced by pytorch#8128) from the top level transformer, consolidate some existing inputs into it, and return any optional updates from the attention implementation.

Reviewed By: iseeyuan

Differential Revision: D69080123
@sxu sxu force-pushed the export-D69080123 branch from cbe146d to d9a5d21 Compare February 4, 2025 19:19
@sxu sxu force-pushed the export-D69080123 branch from d9a5d21 to aca7245 Compare February 4, 2025 21:09
@sxu sxu force-pushed the export-D69080123 branch from aca7245 to d921670 Compare February 4, 2025 21:59
@sxu sxu force-pushed the export-D69080123 branch from d921670 to 045eef6 Compare February 4, 2025 22:22
@sxu sxu force-pushed the export-D69080123 branch from 045eef6 to 5f79682 Compare February 4, 2025 22:49
@sxu sxu force-pushed the export-D69080123 branch from 5f79682 to ec982a9 Compare February 4, 2025 23:03
@sxu sxu force-pushed the export-D69080123 branch from ec982a9 to ccf7082 Compare February 4, 2025 23:17
@sxu sxu force-pushed the export-D69080123 branch from ccf7082 to 04a6fb2 Compare February 5, 2025 00:25
@sxu sxu force-pushed the export-D69080123 branch from 4351436 to c514fa8 Compare February 5, 2025 04:50
@sxu sxu force-pushed the export-D69080123 branch from c514fa8 to b6e26b7 Compare February 5, 2025 16:27
@sxu sxu force-pushed the export-D69080123 branch from b6e26b7 to 8a78aba Compare February 5, 2025 16:46
@sxu sxu requested a review from cccclai February 5, 2025 18:52
@cccclai cccclai (Contributor) left a comment


Generally seems fine

@@ -236,7 +236,7 @@ def forward(
     assert input_pos is not None
     k, v = self.kv_cache.update(input_pos, k, v)
     output = self.SDPA(input_pos, q, k, v, bsz, seqlen, self.mask)
-    return self.wo(output)
+    return self.wo(output), None
Contributor

Is `None` a placeholder or something else?

Contributor Author (@sxu)

It's to support additional output from each attention layer that will be aggregated and returned by the top-level transformer. To avoid disruption, if this is `None` the top-level transformer still returns a single logits output; otherwise it returns a tuple.

This was actually added in the previous PR that introduced the abstract attention class:

) -> Tuple[torch.Tensor, Optional[Any]]:
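A hedged sketch of the aggregation described above, reusing the `ForwardOptions` and `ToyAttention` stand-ins from the sketch under the PR description; this is not the actual `Transformer.forward`, only the shape of the logic:

class ToyTransformer(nn.Module):
    """Illustrative top-level module that aggregates optional attention state."""

    def __init__(self, dim: int, n_layers: int) -> None:
        super().__init__()
        self.layers = nn.ModuleList([ToyAttention(dim) for _ in range(n_layers)])
        self.output = nn.Linear(dim, dim)

    def forward(self, h: torch.Tensor, opts: ForwardOptions):
        attn_states = []
        for layer in self.layers:
            h, state = layer(h, opts)
            if state is not None:
                attn_states.append(state)
        logits = self.output(h)
        # Backward compatible: bare logits when no layer produced extra state,
        # otherwise a (logits, aggregated_state) tuple.
        return logits if not attn_states else (logits, attn_states)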

                [0], dtype=torch.long
            ),  # start_pos, what token of output are we on.
            {
                "input_pos": torch.tensor(
Contributor

Is it because we are plugging in the `ForwardOptions` down the line?

Contributor Author (@sxu)

Yeah, "input_pos" is the only thing the regular attention needs; other implementations can take additional or different parameters.
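Put together, the example inputs for export roughly become a (tokens, options-dict) pair along these lines. This is a sketch based on the snippet above; the tokens entry is an assumption, since the excerpt only shows the start_pos tensor being wrapped under "input_pos":

import torch

example_inputs = (
    torch.tensor([[1]], dtype=torch.long),  # tokens (illustrative value)
    {
        # Only "input_pos" is needed by the regular KV-cache attention; other
        # attention implementations may add their own keys to this dict.
        "input_pos": torch.tensor([0], dtype=torch.long),  # start_pos
    },
)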

@sxu sxu force-pushed the export-D69080123 branch from 8a78aba to e25d751 Compare February 5, 2025 23:55
sxu added a commit to sxu/executorch that referenced this pull request Feb 5, 2025
Pass ForwardOptions from top level module and also return any relevant state as output (pytorch#8186)

Summary:

Pass a `ForwardOptions` argument (introduced by pytorch#8128) from the top level transformer, consolidate some existing inputs into it, and return any optional updates from the attention implementation.

Reviewed By: iseeyuan, cccclai

Differential Revision: D69080123
@sxu sxu force-pushed the export-D69080123 branch from e25d751 to ff88fe5 Compare February 6, 2025 02:55
@sxu sxu force-pushed the export-D69080123 branch from ff88fe5 to 39b9f86 Compare February 6, 2025 16:30

@iseeyuan iseeyuan (Contributor) commented Feb 6, 2025

Thanks @sxu! Not sure how to connect this to a discussion, but this PR is related to #8228

@sxu sxu force-pushed the export-D69080123 branch from 39b9f86 to 867a885 Compare February 6, 2025 19:03

Labels
CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
fb-exported
module: llm (Issues related to LLM examples and apps, and to the extensions/llm/ code)
topic: not user facing
Projects
None yet
Development


5 participants