Commit 83f653a
Enable logging for the plan() function, ShardEstimators and TrainingPipeline class constructors (meta-pytorch#3565)
Summary:
This diff enables the static logging functionality to collect data for:
plan() - This will allow us to log the inputs and outputs to the planner to help with use issue debugging
ShardEstimators - This will allow us to log the inputs and outputs to the ShardEstimators, which gives us the bandwidth inputs to verify if the planner is generating expected values as well as help with debugging OOMs
TrainingPipeline - The class type here will be an indicator of which pipeline was used by the training job. The training pipeline has implications on the memory usage and is an important data point to collect to investigate OOMs.
Reviewed By: kausv
Differential Revision: D874880151 parent 217889e commit 83f653a
File tree
6 files changed
+16
-0
lines changed- torchrec
- distributed
- planner
- train_pipeline
- modules
6 files changed
+16
-0
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
55 | 55 | | |
56 | 56 | | |
57 | 57 | | |
| 58 | + | |
58 | 59 | | |
59 | 60 | | |
60 | 61 | | |
| |||
466 | 467 | | |
467 | 468 | | |
468 | 469 | | |
| 470 | + | |
469 | 471 | | |
470 | 472 | | |
471 | 473 | | |
| |||
2021 | 2023 | | |
2022 | 2024 | | |
2023 | 2025 | | |
| 2026 | + | |
2024 | 2027 | | |
2025 | 2028 | | |
2026 | 2029 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
18 | 18 | | |
19 | 19 | | |
20 | 20 | | |
| 21 | + | |
21 | 22 | | |
22 | 23 | | |
23 | 24 | | |
| |||
498 | 499 | | |
499 | 500 | | |
500 | 501 | | |
| 502 | + | |
501 | 503 | | |
502 | 504 | | |
503 | 505 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
| 19 | + | |
19 | 20 | | |
20 | 21 | | |
21 | 22 | | |
| |||
955 | 956 | | |
956 | 957 | | |
957 | 958 | | |
| 959 | + | |
958 | 960 | | |
959 | 961 | | |
960 | 962 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
15 | 15 | | |
16 | 16 | | |
17 | 17 | | |
| 18 | + | |
18 | 19 | | |
19 | 20 | | |
20 | 21 | | |
| |||
146 | 147 | | |
147 | 148 | | |
148 | 149 | | |
| 150 | + | |
149 | 151 | | |
150 | 152 | | |
151 | 153 | | |
| |||
194 | 196 | | |
195 | 197 | | |
196 | 198 | | |
| 199 | + | |
197 | 200 | | |
198 | 201 | | |
199 | 202 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
32 | 32 | | |
33 | 33 | | |
34 | 34 | | |
| 35 | + | |
35 | 36 | | |
36 | 37 | | |
37 | 38 | | |
| |||
106 | 107 | | |
107 | 108 | | |
108 | 109 | | |
| 110 | + | |
| 111 | + | |
109 | 112 | | |
110 | 113 | | |
111 | 114 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
12 | 12 | | |
13 | 13 | | |
14 | 14 | | |
| 15 | + | |
15 | 16 | | |
16 | 17 | | |
17 | 18 | | |
| |||
125 | 126 | | |
126 | 127 | | |
127 | 128 | | |
| 129 | + | |
128 | 130 | | |
129 | 131 | | |
130 | 132 | | |
| |||
164 | 166 | | |
165 | 167 | | |
166 | 168 | | |
| 169 | + | |
167 | 170 | | |
168 | 171 | | |
169 | 172 | | |
| |||
0 commit comments