Updates to LM PR #891
base: lm_workload
Conversation
MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅
```python
loss = -jnp.sum(targets * jax.nn.log_softmax(logits, axis=-1))
return loss

# TODO(kasimbeg): add weights?
metrics = self.compute_weighted_cross_entropy(logits, batch['targets'], batch['weights'])
```
For this workload I don't think we need weights for the cross-entropy calculation. Maybe we should explicitly del any weights?
The weights are used to identify padded elements in the validation split and correctly calculate the number of tokens returned in the eval dict.
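For context, here is a minimal sketch of how a single weights mask can serve both purposes, masking padded tokens out of the loss and counting the valid tokens for the eval dict. The function name matches the call in the diff, but the body and return signature are assumptions for illustration, not the repository's implementation:

```python
import jax
import jax.numpy as jnp

def compute_weighted_cross_entropy(logits, targets, weights):
  # Hypothetical sketch: `weights` is assumed to be a 0/1 mask that is
  # 0 at padded positions and 1 at real tokens; `targets` is one-hot.
  per_token_loss = -jnp.sum(
      targets * jax.nn.log_softmax(logits, axis=-1), axis=-1)
  # Padded positions contribute neither to the loss nor to the count.
  summed_loss = jnp.sum(per_token_loss * weights)
  n_valid_tokens = jnp.sum(weights)
  return summed_loss, n_valid_tokens
```

Returning the token count alongside the summed loss lets the eval loop normalize correctly without recounting padding.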
```python
    ds,
)

return iter(it)
```
Should we use itertools.cycle here?
No, I don't think so. The input_pipeline already calls .repeat() on the train split, and we don't want cycle on the validation split.
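For reference, a minimal sketch of the distinction being discussed, assuming a tf.data-style pipeline; the helper name make_iterator and the split argument are illustrative assumptions, not the repository's input_pipeline:

```python
import tensorflow as tf

def make_iterator(ds: tf.data.Dataset, split: str):
  # Only the train split should loop forever. Wrapping the validation
  # split in itertools.cycle would re-serve eval batches indefinitely,
  # which is why cycle is not wanted here.
  if split == 'train':
    ds = ds.repeat()  # infinite stream; no cycle needed downstream
  return iter(ds)
```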
…efficiency into lm_workload_priya
No description provided.