
Commit d10655c

Update README.md
1 parent 7b00088 commit d10655c

File tree

1 file changed: +11, −1 lines


README.md

Lines changed: 11 additions & 1 deletion
@@ -27,7 +27,15 @@ Requires: Linux. If using multiple machines: SSH & shared filesystem.
 
 <h4>Example: simple training loop</h4>
 
-Suppose we have some distributed training function (which needs to run on every GPU):
+Suppose we have some distributed training function (needs to run on every GPU):
+
+```python
+def distributed_training(output_dir: str, num_steps: int = 10) -> str | None:
+    # returns path to model checkpoint
+```
+
+<details>
+<summary><b>Click to expand (implementation)</b></summary>
 
 ```python
 from __future__ import annotations
@@ -63,6 +71,8 @@ def distributed_training(output_dir: str, num_steps: int = 10) -> str | None:
     return None
 ```
 
+</details>
+
 We can distribute and run this function (e.g. on 2 machines x 2 GPUs) using **`torchrunx`**!
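The function body is collapsed under the `<details>` tag above, and the diff view elides it. For orientation, here is a rough, hypothetical sketch of what such a DDP training function typically contains; it is not the actual torchrunx README code, and it assumes the launcher sets torchrun-style environment variables (`RANK`, `LOCAL_RANK`, `WORLD_SIZE`, `MASTER_ADDR`, `MASTER_PORT`):

```python
# Hypothetical sketch, not the actual torchrunx README implementation.
# Assumes torchrun-style env vars (RANK, LOCAL_RANK, WORLD_SIZE, ...) are set.
from __future__ import annotations

import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def distributed_training(output_dir: str, num_steps: int = 10) -> str | None:
    dist.init_process_group(backend="nccl")  # reads the env vars above
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy model and data; a real loop would load an actual model and dataset.
    model = DDP(torch.nn.Linear(128, 1).cuda(), device_ids=[local_rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

    for _ in range(num_steps):
        inputs = torch.randn(32, 128, device=f"cuda:{local_rank}")
        loss = model(inputs).square().mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    dist.barrier()
    if dist.get_rank() == 0:
        # Only rank 0 writes the checkpoint and returns its path;
        # every other rank returns None (matching the signature above).
        os.makedirs(output_dir, exist_ok=True)
        checkpoint_path = os.path.join(output_dir, "model.pt")
        torch.save(model.module.state_dict(), checkpoint_path)
        return checkpoint_path
    return None
```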

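As a minimal, hedged sketch of such a launch call (`torchrunx.launch`, `func_kwargs`, `hostnames`, and `workers_per_host` are assumed parameter names, not confirmed by this page; consult the torchrunx documentation for the exact API):

```python
# Hedged sketch: launch(), func_kwargs=, hostnames=, and workers_per_host=
# are assumed names for the torchrunx interface, not verified here.
import torchrunx

if __name__ == "__main__":
    # 2 machines x 2 GPUs: two (hypothetical) hostnames, two workers per host.
    results = torchrunx.launch(
        func=distributed_training,  # the function defined above
        func_kwargs={"output_dir": "outputs"},
        hostnames=["node1", "node2"],
        workers_per_host=2,
    )
    # Rank 0 returns the checkpoint path; all other ranks return None.
```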