
Conversation

Collaborator

@IvyZX IvyZX commented Oct 31, 2025

No description provided.

@IvyZX IvyZX requested review from justinjfu and sharadmv October 31, 2025 21:26
@gemini-code-assist
Contributor

Summary of Changes

Hello @IvyZX, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds a comprehensive guide to Pallas core-specific programming using pl.core_map for TPUs. The guide covers fundamental concepts like per-core programming and inter-core communication, progresses to advanced pipelining and scalar prefetch techniques, and concludes with an example of mapping operations onto SparseCores, providing developers with detailed instructions and examples for fine-grained control over TPU hardware.

Highlights

  • Introduction to pl.core_map: The guide introduces pl.core_map for writing Pallas kernels, emphasizing its benefits over pallas_call for per-core programming, flexible pipelining, and inter-core collectives on TPUs.
  • Core-level Programming Examples: It provides practical examples, starting with a simple per-core kernel demonstrating VMEM and semaphore allocations, and inter-core communication using barriers and remote copies.
  • Pipelining and Work Splitting: The guide explains how to implement custom pipelining with pltpu.emit_pipeline and manually parallelize work across cores using index_map and pl.BoundedSlice.
  • Advanced Techniques: It covers advanced topics such as scalar prefetch and dynamic block indexing, showcasing how to use SMEM buffers and sync_copy for optimized data access.
  • SparseCore Integration: The guide demonstrates how to map operations over SparseCores, detailing the setup of VectorSubcoreMesh and handling work distribution across subcores for sparse memory access.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This PR adds a new guide for Pallas core_map on TPU. The guide is well-structured and covers important concepts from basic per-core kernels to more advanced topics like pipelining, scalar prefetch, and SparseCores. The examples are clear and helpful.

I've found a few issues, mainly minor typos in the documentation and a more significant issue with the barrier implementation in the first example. The barrier is incorrect and inefficient, which could be misleading for users. I've suggested a correct and efficient implementation.

Most of the typos are present in both the Jupyter notebook and the generated Markdown file. It would be best to fix them in the source notebook and regenerate the Markdown file.

{
"cell_type": "markdown",
"source": [
"# Pallas Core-specifc Programming"
Contributor

medium

There's a typo in 'specific'.

Suggested change
"# Pallas Core-specifc Programming"
"# Pallas Core-specific Programming"

{
"cell_type": "markdown",
"source": [
"In addition to the typical TPU device mesh, you need to make a mesh of cores. Consider this as an addition dimension called \"core\", with length 2, in addition to the 4-device mesh you work with. That is 8 cores in total."
Contributor

medium

Typo: 'addition' should be 'additional'.

Suggested change
"In addition to the typical TPU device mesh, you need to make a mesh of cores. Consider this as an addition dimension called \"core\", with length 2, in addition to the 4-device mesh you work with. That is 8 cores in total."
"In addition to the typical TPU device mesh, you need to make a mesh of cores. Consider this as an additional dimension called \"core\", with length 2, in addition to the 4-device mesh you work with. That is 8 cores in total."

"\n",
"**Parallelize work per core**\n",
"\n",
"Since you are programming on the core level, you get to customize exactly how the work is splitted amongst cores. To do that, you need to:\n",
Contributor

medium

Typo: 'splitted' should be 'split'.

Suggested change
"Since you are programming on the core level, you get to customize exactly how the work is splitted amongst cores. To do that, you need to:\n",
"Since you are programming on the core level, you get to customize exactly how the work is split amongst cores. To do that, you need to:\n",

"source": [
"## Scalar prefetch\n",
"\n",
"The code below extended the kernel above but uses [scalar prefetch and dynamic block indexing](https://docs.jax.dev/en/latest/pallas/tpu/sparse.html) to select a specific sub-slice of the input.\n",
Contributor

medium

Typo: 'extended' should be 'extends'.

Suggested change
"The code below extended the kernel above but uses [scalar prefetch and dynamic block indexing](https://docs.jax.dev/en/latest/pallas/tpu/sparse.html) to select a specific sub-slice of the input.\n",
"The code below extends the kernel above but uses [scalar prefetch and dynamic block indexing](https://docs.jax.dev/en/latest/pallas/tpu/sparse.html) to select a specific sub-slice of the input.\n",

"source": [
"## Mapping over SparseCores\n",
"\n",
"TPU v4 and above includes a [SparseCore](https://openxla.org/xla/sparsecore), which is specialized in sparse memory access and operations. This guide will not dive into the capabilities of SparseCore, but rather show how to run a program on SparseCore with same semantics and minimal changes from the TensorCore code.\n",
Contributor

medium

Grammar: 'with same semantics' should be 'with the same semantics'.

Suggested change
"TPU v4 and above includes a [SparseCore](https://openxla.org/xla/sparsecore), which is specialized in sparse memory access and operations. This guide will not dive into the capabilities of SparseCore, but rather show how to run a program on SparseCore with same semantics and minimal changes from the TensorCore code.\n",
"TPU v4 and above includes a [SparseCore](https://openxla.org/xla/sparsecore), which is specialized in sparse memory access and operations. This guide will not dive into the capabilities of SparseCore, but rather show how to run a program on SparseCore with the same semantics and minimal changes from the TensorCore code.\n",

{
"cell_type": "markdown",
"source": [
"The code below is very similar from the `add_one_kernel` we wrote earlier, except for a few differences:\n",
Contributor

medium

Typo: 'similar from' should be 'similar to'.

Suggested change
"The code below is very similar from the `add_one_kernel` we wrote earlier, except for a few differences:\n",
"The code below is very similar to the `add_one_kernel` we wrote earlier, except for a few differences:\n",


* **Flexible pipelining**: You have the option to write pipelining communications on your own, instead of relying on Pallas grids and specs. This is helpful if your pipeline diverges from the standard "copy-in, compute & copy-out" pattern.

* **Collectives**: Since `core_map` allows inter-core communications, it is especially helpful when writing collectives on the core level.
Collaborator

I thought pallas_call allows this too? Maybe it would help to mention that the way core-specific code in pallas is done is quite indirect and not user-friendly. You have to set the grid=(num_cores,) and mark that dimension as PARALLEL.
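For thread readers, a minimal hedged sketch of that indirect route, assuming a 2-core Megacore; the kernel, names, and shapes below are illustrative, and `pltpu.CompilerParams` is spelled `TPUCompilerParams` in older JAX releases:

import jax
import jax.numpy as jnp
from jax.experimental import pallas as pl
from jax.experimental.pallas import tpu as pltpu

def per_core_add_one(x_ref, o_ref):
  # Each grid iteration handles the block owned by one core.
  o_ref[...] = x_ref[...] + 1.0

num_cores = 2
x = jnp.zeros((num_cores * 8, 128), jnp.float32)

out = pl.pallas_call(
    per_core_add_one,
    grid=(num_cores,),  # one grid dimension, one step per core
    in_specs=[pl.BlockSpec((8, 128), lambda c: (c, 0))],
    out_specs=pl.BlockSpec((8, 128), lambda c: (c, 0)),
    out_shape=jax.ShapeDtypeStruct(x.shape, x.dtype),
    compiler_params=pltpu.CompilerParams(dimension_semantics=("parallel",)),
)(x)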


## Environment setup

Modern accelerators often have multiple cores under a device. For TPU chips higher than v4, every JAX device by default contains two TensorCores (aka. a [Megacore](https://cloud.google.com/tpu/docs/system-architecture-tpu-vm#chips)). They also contain a [SparseCore](https://cloud.google.com/tpu/docs/system-architecture-tpu-vm#sparsecore), consisting of many subcores.
Collaborator

Only v4 and v5p have megacore. v7 lacks megacore and instead comes in pairs of chips with separate HBM. Also I think only v5p/v6/v7 have sparsecore.

for i in range(num_devices):
  for j in range(num_cores):
    pltpu.semaphore_signal(sem0, 1, device_id={'device': i, 'core': j})
pltpu.semaphore_wait(sem0, num_devices * num_cores)
Collaborator

Why is it necessary to barrier with everything - can you just barrier with the cores you are computing with?
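A hedged sketch of one narrower option, assuming the remote copies that follow stay on the local device (so only the sibling cores need to have initialized their semaphores), and reusing the `sem0`, `num_cores`, and axis names from the snippet above:

my_device = jax.lax.axis_index("device")   # this device's coordinate
for j in range(num_cores):
  pltpu.semaphore_signal(sem0, 1, device_id={'device': my_device, 'core': j})
pltpu.semaphore_wait(sem0, num_cores)      # one signal from each local core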


**Parallelize work per core**

Since you are programming on the core level, you get to customize exactly how the work is splitted amongst cores. To do that, you need to:
Collaborator

how the work is splitted -> how the work is split


1. Provide an `index_map` function that, given the iteration indices, return *the slice* of the input data that shall be passed in.

1. On `BlockSpec`, wrap the corresponding dimension with `pl.BoundedSlice`, indicating the `index_map` function would return a slice instead of a iteration index on that dimension.
Collaborator

Could you have done the same thing with index_map = (core_idx * core_slc_size // 8 + i, j) and just use a normal (8, 128) block shape?
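Spelled out as a hedged sketch (the sizes are illustrative, `core_slc_size` follows the naming in this comment, and the index map closes over the core index computed in the core-local body):

num_cores, total_rows = 2, 512            # illustrative sizes
core_slc_size = total_rows // num_cores   # rows handled by each core
core_idx = jax.lax.axis_index("core")     # computed inside the core_map body

def x_index_map(i, j):
  # Offset the row-block index by this core's band; no pl.BoundedSlice needed.
  return (core_idx * core_slc_size // 8 + i, j)

x_spec = pl.BlockSpec((8, 128), x_index_map)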


You could make a shortcut `kernel()` that wraps all the `shard_map`, `core_map` and `run_scoped` boilerplates.

Some similar APIs are currently available in Pallas package, such as `plgpu.kernel` and `plsc.kernel`. A unified API may be released soon.
Collaborator

Let's just add this to Pallas now? Any thoughts @sharadmv?

@@ -0,0 +1,358 @@
# Pallas Core-specifc Programming
Collaborator

typo: specifc -> specific


* **Per-core level programming**: You write code for an TPU/GPU core, not for a JAX device. This is crucial if you want to specifically control a core, or how cores communicate and distribute work among one another.

* **Flexible pipelining**: You have the option to write pipelining communications on your own, instead of relying on Pallas grids and specs. This is helpful if your pipeline diverges from the standard "copy-in, compute & copy-out" pattern.
Collaborator

We can already do this in pallas_call, by simply not using the grid.

"timestamp": 1761945248463
}
],
"last_runtime": {
Collaborator

this should not be in the ipynb


## Environment setup

Modern accelerators often have multiple cores under a device. For TPU chips higher than v4, every JAX device by default contains two TensorCores (aka. a [Megacore](https://cloud.google.com/tpu/docs/system-architecture-tpu-vm#chips)). They also contain a [SparseCore](https://cloud.google.com/tpu/docs/system-architecture-tpu-vm#sparsecore), consisting of many subcores.
Collaborator

I'd say "For chips such as TPU v5p" to be precise.


Modern accelerators often have multiple cores under a device. For TPU chips higher than v4, every JAX device by default contains two TensorCores (aka. a [Megacore](https://cloud.google.com/tpu/docs/system-architecture-tpu-vm#chips)). They also contain a [SparseCore](https://cloud.google.com/tpu/docs/system-architecture-tpu-vm#sparsecore), consisting of many subcores.

This guide was written on a v5p chip, which contains 4 devices (2 TensorCores each) and a SparseCore of 16 subcores.
Collaborator

"SparseCore with 16 subcores"

Collaborator

@superbobry superbobry Nov 3, 2025

4 SparseCores, each with 16 (vector) subcores. We can link to https://openxla.org/xla/sparsecore#specifications_at_a_glance.


`pl.core_map` allows you to write per-core local code, just as `jax.shard_map` allows you to write per-device code.

In the example kernel below, each core has its own VMEM and semaphore allocations. As with normal kernel, you can initiate copy between HBM and VMEM refs using `async_copy`.
Collaborator

"can initiate copy" -> "can initiate copies"
"async_copy" -> "pl.async_copy"


In the example kernel below, each core has its own VMEM and semaphore allocations. As with normal kernel, you can initiate copy between HBM and VMEM refs using `async_copy`.

**Communication amongst cores**
Collaborator

"amongst" -> "between"


* Call it inside a `pl.core_map`, which takes the TensorCore mesh.

* You would need `collective_id` if there exists inter-core communications.
Collaborator

You will need collective_id for the barrier semaphore


## Pipelining with `core_map`

Note that the kernel above only does simple copies and computes, without automatic pipelining via Pallas `grid` and `BlockSpec`. To do pipelining inside `core_map`, use `pltpu.emit_pipeline` inside the core-local kernel.
Collaborator

"computes" -> "compute"


**Parallelize work per core**

Since you are programming on the core level, you get to customize exactly how the work is splitted amongst cores. To do that, you need to:
Collaborator

"on the core level" -> "at the core level"

"splitted amongst cores" -> "split between cores"


1. Provide an `index_map` function that, given the iteration indices, return *the slice* of the input data that shall be passed in.

1. On `BlockSpec`, wrap the corresponding dimension with `pl.BoundedSlice`, indicating the `index_map` function would return a slice instead of a iteration index on that dimension.
Collaborator

Note that we don't strictly need BoundedSlice here. We can also use half of the original block size and offset the index map. Also, emit_pipeline with core_axis_name automatically partitions the grid.
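A hedged sketch of that second route, taking the `core_axis_name` argument from this comment (the exact `emit_pipeline` signature is not verified against the guide); `inner`, the sizes, and the HBM ref names are illustrative, and the call sits inside the core-local kernel body using the guide's `pl`/`pltpu` imports:

total_rows, total_cols = 512, 1024   # illustrative sizes

def inner(x_vmem, o_vmem):
  o_vmem[...] = x_vmem[...] + 1

pipeline = pltpu.emit_pipeline(
    inner,
    grid=(total_rows // 8, total_cols // 128),
    in_specs=[pl.BlockSpec((8, 128), lambda i, j: (i, j))],
    out_specs=[pl.BlockSpec((8, 128), lambda i, j: (i, j))],
    core_axis_name="core",  # emit_pipeline partitions the grid across cores
)
pipeline(x_hbm_ref, o_hbm_ref)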

Collaborator

I think for this, the first example should be emit_pipeline with core_axis_name.

The second one could be a more custom splitting across cores.


## Mapping over SparseCores

TPU v4 and above includes a [SparseCore](https://openxla.org/xla/sparsecore), which is specialized in sparse memory access and operations. This guide will not dive into the capabilities of SparseCore, but rather show how to run a program on SparseCore with same semantics and minimal changes from the TensorCore code.
Collaborator

Note that TPU v5e does not have SparseCore.


## Environment setup

Modern accelerators often have multiple cores under a device. For TPU chips higher than v4, every JAX device by default contains two TensorCores (aka. a [Megacore](https://cloud.google.com/tpu/docs/system-architecture-tpu-vm#chips)). They also contain a [SparseCore](https://cloud.google.com/tpu/docs/system-architecture-tpu-vm#sparsecore), consisting of many subcores.
Collaborator

Should we say "SparseCores", since every chip has at least 2?

from functools import partial

import jax
from jax.sharding import NamedSharding, PartitionSpec as P
Collaborator

Nit: I think we have jax.P now.


**Communication amongst cores**

Before making a inter-core communication, you may need to do a global barrier signal (`pltpu.semaphore_signal`), to make sure all the destination semaphores have been properly initialized.
Collaborator

@superbobry superbobry Nov 3, 2025

Is it "may" or "must"? If "may", it will be useful to explain when this is in fact required.


## Mapping over SparseCores

TPU v4 and above includes a [SparseCore](https://openxla.org/xla/sparsecore), which is specialized in sparse memory access and operations. This guide will not dive into the capabilities of SparseCore, but rather show how to run a program on SparseCore with same semantics and minimal changes from the TensorCore code.
Collaborator

Nit: should we say "SparseCores" here as well, instead of "a SparseCore" to highlight that it's >1 per chip.


sc_mesh = plsc.VectorSubcoreMesh(
    core_axis_name="core", subcore_axis_name="subcore",
    num_cores=sc_info.num_cores
Collaborator

I wonder if we should default to sc_info.num_cores instead of requiring users to always query SC info?


1. You need to split the work amongst all subcores, so a few lines to compute the specific slice for each subcore.

1. SparseCore register computation allows smaller slices (`4x16` max for int32), so you need nested loops to iterate the slice during computation phase.
Collaborator

Nit: 2.

Also, should we reference sc_info.num_lanes here and have a single loop reading out (num_lanes,) vectors? 4x16 relies on unrolling in the SC compiler, which only really works for a handful of datatypes atm.
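A hedged sketch of that single-loop shape, with hypothetical names (`subcore_slc_size` for the number of elements owned by this subcore, `x_vmem`/`o_vmem` for the subcore-local refs):

num_lanes = sc_info.num_lanes             # vector width of one subcore
for k in range(subcore_slc_size // num_lanes):
  sl = pl.ds(k * num_lanes, num_lanes)    # one (num_lanes,)-wide vector
  o_vmem[sl] = x_vmem[sl] + 1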
