CI: 05/28/25 upstream sync #436

Open
wants to merge 1,706 commits into base: rocm-main

Conversation

rocm-repo-management-api-2[bot]

Daily sync with upstream

hawkinsp and others added 30 commits May 14, 2025 20:41
A lot of this logic was confusingly phrased as conditions over both CPU
and GPU build flags. But we can decompose it into:
* dependencies we add for CPU tests, and
* additional dependencies we add for GPU tests.

While we are here, also add the necessary pypi dependency for TPU tests.
Hold references to raw buffers instead of PjRtBuffers.

This fixes an issue where the buffers could be deleted before
the transfer completes, but introduces another problem: if the
buffers are donated, it will now silently read from the donated arrays.

Once the underlying runtime exposes usage holds properly, this
new codepath should take a usage hold and the old PjRtBuffer
path should be removed.

PiperOrigin-RevId: 758819621
These had been accidentally broken at some point in the plugin
switchover.
`slice` is not hashable before Python 3.12. This change works around that by
converting the slice into a hash value.
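
A minimal illustration of the workaround (not the actual JAX code; the cache-key logic around it is omitted):

```python
import sys

s = slice(1, 10, 2)
if sys.version_info >= (3, 12):
    key = hash(s)  # `slice` objects are hashable starting with Python 3.12
else:
    # On older Pythons, hash a tuple of the slice's fields instead.
    key = hash((s.start, s.stop, s.step))
```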

PiperOrigin-RevId: 758905560
We must not depend on the nvidia_nvshmem_cu12 pip package directly since it does not exist on Windows and Mac platforms.

PiperOrigin-RevId: 758917499
The errors are too verbose and mostly not very useful.

PiperOrigin-RevId: 759025165
We weren't handling them correctly, meaning you couldn't use a `shard_map`/`ManualComputationOp` that has callbacks inside.
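
For context, a minimal sketch of the pattern this enables, assuming a single-host setup; the `check_rep=False` argument is a conservative assumption here, not something prescribed by this change:

```python
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, PartitionSpec as P
from jax.experimental.shard_map import shard_map

mesh = Mesh(jax.devices()[:1], axis_names=("x",))

def body(x):
    # A host callback running inside the manual computation.
    jax.debug.callback(lambda v: print("shard:", v), x)
    return x * 2

f = shard_map(body, mesh=mesh, in_specs=P("x"), out_specs=P("x"), check_rep=False)
print(f(jnp.arange(4.0)))
```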

PiperOrigin-RevId: 759072597
The "add a token" part of the `callback` primitive's MLIR lowering was incorrectly adding a ranked sharding by using the sharding of a ranked tensor. So instead create an unranked sharding explicitly

PiperOrigin-RevId: 759135477
This shouldn't affect existing behaviors or trace time.

The main implementation ideas:
* each Trace is tagged with a `requires_low: bool`
* each Jaxpr
  * is tagged with an `is_high: bool`, default False but set True while tracing
    if any hijax primitives are encountered
  * includes a `mut_types: dict[Var, HijaxType]` indicating final types for
    type-changing mutable hijax types
* each AbstractValue is tagged by a `mutable: bool` which is read to populate
  `mut_types`
* each Primitive
  * has an `is_high(**params) -> bool` method (depends on params for HOPs)
  * has a `to_lojax(*args, **params)` method taking and returning
    hijaxtypes-wrapping-lowtracers
* in `Primitive.bind`, we check if `prim.is_high(**params) and
  trace.requires_low`, and if so we call `prim.to_lojax` (see the sketch below)
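
A rough, non-authoritative sketch of that dispatch (method names follow the description above; the bodies are illustrative placeholders, not the actual JAX internals):

```python
class Primitive:
    def is_high(self, **params) -> bool:
        return False                      # ordinary ("lojax") primitives

    def to_lojax(self, *args, **params):
        raise NotImplementedError         # overridden by hijax primitives

    def bind(self, trace, *args, **params):
        # Simplified: the real bind locates the current trace itself.
        if self.is_high(**params) and trace.requires_low:
            # The trace only understands lojax, so expand the hijax primitive
            # into lojax ops on the low-level tracers wrapped in the hijax values.
            return self.to_lojax(*args, **params)
        return trace.process_primitive(self, args, params)
```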

Co-authored-by: Dougal Maclaurin <dougalm@google.com>
PiperOrigin-RevId: 759336328
…tly it looks like this.

```
ValueError: Pytree for `in_specs` and inputs do not match. There are 1 mismatches, including:
    * `in_specs` is a tuple of length 1 but inputs is a tuple of length 4, so the lengths do not match

```

PiperOrigin-RevId: 759499528
…t_dict_merge

PiperOrigin-RevId: 759579563
The implementation currently forces the optimization level to `O=0` due to a
suspected bug in the NVPTX backend.

To get source information (a Python sketch follows the list):

* Set MOSAIC_GPU_LINE_INFO=1
* Run with --jax_include_full_tracebacks_in_locations=true
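
A hedged Python sketch of the same setup, assuming the flag is also exposed through `jax.config` (otherwise pass the command-line flag shown above):

```python
import os
os.environ["MOSAIC_GPU_LINE_INFO"] = "1"  # must be set before kernels are compiled

import jax
# Equivalent to --jax_include_full_tracebacks_in_locations=true, assuming the
# flag is exposed as a jax.config option.
jax.config.update("jax_include_full_tracebacks_in_locations", True)
```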

PiperOrigin-RevId: 759608368
Google-ML-Automation and others added 28 commits May 26, 2025 04:16
The C128 matmuls will be routed to cuBLAS rather than being handled by the loop emitter, causing a very slight numerical difference.
Therefore, relax the strictness of the comparison.

PiperOrigin-RevId: 763397887
…om-ptxas-and-llvm

PiperOrigin-RevId: 763701410
…yout in some ops

I can't explain it, but if we don't do it then the verifier sometimes fails...
I'm not even sure how to properly trigger this in a test right now, but worst case it
would result in more verifier failures to fix, so I think it's fine to merge as is.

PiperOrigin-RevId: 763711454
I thought this didn't work, but it does! Still, adding a test to make sure
we don't regress it.

PiperOrigin-RevId: 763717665
If we don't synchronize the warps, some of them can go on and schedule
e.g. async copies without waiting for the memory transactions of other
warps in the warpgroup to complete.

PiperOrigin-RevId: 763721411
Creating smaller build rules enforces better organized dependency graphs in the JAX project, helps pytype propagate annotations correctly, and leads to improved build and iteration times.

This was unblocked by moving ad, batching, and custom_transpose to their own rules in prior changes. It required one small code refactoring: moving an effects registration to the location where the effect is defined.

PiperOrigin-RevId: 763736189
…TPU interpret mode.

Since dimensions with parallel semantics must now appear as the leading dimensions of the grid, this CL also makes the sequential iteration over cores in the simulation never re-visit a core after the simulation has moved on to the next core. This enables the simulation to correctly omit loads and stores of kernel buffers if the same (slice of a) buffer is processed by multiple kernel invocations on the same core.

PiperOrigin-RevId: 763737647
We already call `xla::sdy::addSdyRoundTripExportPipeline` in `xla::SerializeUsingVersionedStablehlo`, so this is no longer needed.

PiperOrigin-RevId: 763762358
Just to give us extra confidence while we make changes.

PiperOrigin-RevId: 763767275
We sometimes access NVSHMEM functions from the host code too, which means
we should include the NVSHMEM host library in the context of the ExecutionEngine.

PiperOrigin-RevId: 763777731
This will make it much simpler to make the kernel persistent.

PiperOrigin-RevId: 763782577
Before this fix, the test would finish before execution was done, and profiling would thus yield nothing.
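
An illustrative (non-authoritative) version of the pattern: block on the result before the profiling context exits, so the trace is not empty.

```python
import jax
import jax.numpy as jnp

x = jnp.ones((1024, 1024))
with jax.profiler.trace("/tmp/jax-trace"):  # output directory is an arbitrary example
    y = x @ x                # dispatched asynchronously
    y.block_until_ready()    # wait for execution to finish before the trace closes
```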

PiperOrigin-RevId: 763783695
…ToXlaComputation`.

PiperOrigin-RevId: 763837933
…nsertion

Enabling this flag can introduce races into certain kernels, which is why it's
False by default. Still, there are plenty of kernels where it's unnecessary, and
a few of those suffer performance regressions when it is on, so it makes sense
to at least allow users to opt out.

PiperOrigin-RevId: 763853668
PiperOrigin-RevId: 763862020
Previously, the result of vmapped RA2A was a concatenation of the flattened results.

PiperOrigin-RevId: 763958632
@rocm-repo-management-api-2 rocm-repo-management-api-2 bot requested a review from a team as a code owner May 28, 2025 06:02
@rocm-repo-management-api-2 rocm-repo-management-api-2 bot enabled auto-merge (rebase) May 28, 2025 06:02