Commit 8f2b99d

Add Goodput documentation (apple#989)
* Temporarily change checkpointing to every 5 steps
* revert local changes
* Add example command for goodput usage
1 parent 31e8da0 commit 8f2b99d

1 file changed: +15 -1 lines changed

axlearn/cloud/gcp/measurement.py

Lines changed: 15 additions & 1 deletion
@@ -1,6 +1,20 @@
 # Copyright © 2024 Apple Inc.
 
-"""Measurement utils for GCP."""
+"""Measurement utils for GCP.
+
+Example:
+
+# Enable GoodPut when launching an AXLearn training job
+axlearn gcp gke start --instance_type=tpu-v5litepod-16 \
+--bundler_type=artifactregistry --bundler_spec=image=tpu \
+--bundler_spec=dockerfile=Dockerfile \
+-- python3 -m my_training_job \
+--recorder_type=axlearn.cloud.gcp.measurement:goodput \
+--recorder_spec=name=my-run-with-goodput \
+--recorder_spec=upload_dir=my-output-directory/summaries \
+--recorder_spec=upload_interval=30
+
+"""
 
 import jax
 from absl import flags, logging
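
Note on the example command: the three `--recorder_spec=key=value` flags read as one set of key/value settings for the recorder. The sketch below illustrates one way such repeated flags could be collected into a dict. `parse_spec_flags` is a hypothetical helper written for this note, not AXLearn's actual parsing code, and the handling of `upload_interval` as an integer is an assumption.

```python
from typing import Dict, List


def parse_spec_flags(spec: List[str]) -> Dict[str, str]:
    """Hypothetical helper: turn repeated `key=value` strings into a dict.

    This is an illustration only, not the parser AXLearn uses.
    """
    parsed: Dict[str, str] = {}
    for item in spec:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"Expected key=value, got {item!r}")
        parsed[key] = value
    return parsed


# The values passed via --recorder_spec in the docstring example.
recorder_spec = parse_spec_flags(
    [
        "name=my-run-with-goodput",
        "upload_dir=my-output-directory/summaries",
        "upload_interval=30",
    ]
)

# Assumption: the interval is consumed as an integer by the recorder.
upload_interval = int(recorder_spec["upload_interval"])
```

Splitting on the first `=` only means a value that itself contains `=` is kept intact.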
