Correct length in sharding #2262
Description
Due to a change in #2023, the definition of "activation_length" changed: the "expert" axis was added to it, which causes some conflicts:
1. For the conflicting combination "activation_batch" + "activation_length", update "activation_length" to either "activation_length_no_exp" or "activation_norm_length".
2. For the conflicting combination "activation_embed_and_logits_batch" + "activation_length", we can restore the old definition by using "activation_length_no_exp" (see the sketch after the FIXES line below).
FIXES: b/441547754
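
For illustration, here is a minimal sketch of the kind of logical-axis annotation affected, assuming Flax logical partitioning as used in MaxText. Only the logical axis names come from this PR; the function and tensor names are hypothetical:

```python
import jax.numpy as jnp
from flax import linen as nn

def annotate_activations(x: jnp.ndarray) -> jnp.ndarray:
    # Before #2023, "activation_length" covered only sequence sharding.
    # After #2023 it also carries the "expert" axis, so pairing it with
    # "activation_batch" conflicts. Rename to "activation_length_no_exp"
    # (the old behavior) or "activation_norm_length":
    return nn.with_logical_constraint(
        x, ("activation_batch", "activation_length_no_exp", "activation_embed")
    )
```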
1. Changes related to "activation_batch"
TSP should use "activation_norm_length" for the decoder block.
TSP should use "activation_length_no_exp" for MLP and MoE (see the sketch below).
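
A hedged sketch of the two TSP renames, again assuming Flax's `with_logical_constraint`; the function names are illustrative, not the MaxText source:

```python
from flax import linen as nn

def shard_decoder_block_activation(x):
    # Decoder block: "activation_norm_length" replaces "activation_length".
    return nn.with_logical_constraint(
        x, ("activation_batch", "activation_norm_length", "activation_embed")
    )

def shard_mlp_or_moe_activation(x):
    # MLP and MoE: "activation_length_no_exp" replaces "activation_length".
    return nn.with_logical_constraint(
        x, ("activation_batch", "activation_length_no_exp", "activation_mlp")
    )
```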
Additional changes
2. Changes related to "activation_embed_and_logits_batch"
train.py, decoder.py, embedding.py: restore "activation_length_no_exp" (see the sketch below).
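
A sketch of the restore in those files, under the same Flax logical-constraint assumption; everything except the two logical axis names from this PR is illustrative:

```python
from flax import linen as nn

def shard_logits(logits):
    # Pairing "activation_embed_and_logits_batch" with
    # "activation_length_no_exp" restores the pre-#2023 meaning of the
    # length axis (no "expert" dimension folded in).
    return nn.with_logical_constraint(
        logits,
        ("activation_embed_and_logits_batch", "activation_length_no_exp", "activation_vocab"),
    )
```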
Tests
N/A
Checklist
Before submitting this PR, please make sure (put X in square brackets):