Commit 6485c7f

[Pallas][Mosaic GPU] Add GPU pipelining docs
1 parent 0e1b341 commit 6485c7f

9 files changed, with 753 additions and 2 deletions.

docs/_static/pallas/gpu/pipeline_matmul.svg

Lines changed: 1 addition & 0 deletions

docs/_static/pallas/gpu/pipeline_matmul_ws.svg

Lines changed: 1 addition & 0 deletions

docs/_static/pallas/gpu/warp_specialization.svg

Lines changed: 1 addition & 0 deletions

docs/conf.py

Lines changed: 2 additions & 0 deletions
@@ -134,6 +134,7 @@ def _do_not_evaluate_in_jax(
  'notebooks/*.md',
  'pallas/quickstart.md',
  'pallas/pipelining.md',
+ 'pallas/gpu/pipelining.md',
  'pallas/tpu/pipelining.md',
  'pallas/tpu/distributed.md',
  'pallas/tpu/sparse.md',
@@ -230,6 +231,7 @@ def _do_not_evaluate_in_jax(
  # Requires accelerators
  'pallas/quickstart.*',
  'pallas/pipelining.*',
+ 'pallas/gpu/pipelining.*',
  'pallas/tpu/pipelining.*',
  'pallas/tpu/distributed.*',
  'pallas/tpu/sparse.*',

docs/pallas/gpu/index.rst

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@ Backend specific documentation for the Mosaic GPU backend.
  :maxdepth: 2

  reference
+ pipelining

  .. toctree::
  :caption: Guides

docs/pallas/gpu/pipelining.ipynb

Lines changed: 417 additions & 0 deletions
Large diffs are not rendered by default.

docs/pallas/gpu/pipelining.md

Lines changed: 328 additions & 0 deletions
Large diffs are not rendered by default.

docs/pallas/pipelining.ipynb

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@
  "\n",
  "Software pipelining is an important technique in performance optimization by overlapping multiple asynchronous operations even if there are data dependencies between them. In the context of kernel writing, the most common form of pipelining involves overlapping communication and memory transfers with compute such that the hardware accelerator never stalls while waiting for data to arrive. Therefore, we will solely focus on the problem of communication-compute pipelining in this tutorial. We will begin by covering the problem conceptually, outlining the Pallas API for writing pipelines, and going over some realistic examples using the API.\n",
  "\n",
- "This tutorial only covers the conceptual foundations of pipelining. For platform-specific references, please see the [TPU](https://docs.jax.dev/en/latest/pallas/tpu/pipelining.html), or GPU (coming soon!) specific pipelining references.\n"
+ "This tutorial only covers the conceptual foundations of pipelining. For platform-specific references, please see the [TPU](https://docs.jax.dev/en/latest/pallas/tpu/pipelining.html), or [GPU](https://docs.jax.dev/en/latest/pallas/gpu/pipelining.html) specific pipelining references.\n"
  ]
  },
  {

docs/pallas/pipelining.md

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ jupyter:

  Software pipelining is an important technique in performance optimization by overlapping multiple asynchronous operations even if there are data dependencies between them. In the context of kernel writing, the most common form of pipelining involves overlapping communication and memory transfers with compute such that the hardware accelerator never stalls while waiting for data to arrive. Therefore, we will solely focus on the problem of communication-compute pipelining in this tutorial. We will begin by covering the problem conceptually, outlining the Pallas API for writing pipelines, and going over some realistic examples using the API.

- This tutorial only covers the conceptual foundations of pipelining. For platform-specific references, please see the [TPU](https://docs.jax.dev/en/latest/pallas/tpu/pipelining.html), or GPU (coming soon!) specific pipelining references.
+ This tutorial only covers the conceptual foundations of pipelining. For platform-specific references, please see the [TPU](https://docs.jax.dev/en/latest/pallas/tpu/pipelining.html), or [GPU](https://docs.jax.dev/en/latest/pallas/gpu/pipelining.html) specific pipelining references.

  <!-- #endregion -->
