Use torch._logging._internal.trace_structured_artifact to save TraceCtxs when TORCH_TRACE is set (#2751)
base: main
Conversation
Pull Request Overview
This PR enables dumping of TraceCtx objects using PyTorch's torch._logging._internal.trace_structured_artifact API when the TORCH_TRACE environment variable is set. This provides enhanced observability for Thunder's compilation process when using thunder.dynamo.compiler.thunderfx.
- Adds support for logging compilation traces and graph modules to TORCH_TRACE artifacts
- Implements a backward compatibility wrapper for trace_structured_artifact to handle different PyTorch versions
- Extends test coverage to validate the TORCH_TRACE functionality
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| thunder/dynamo/utils.py | Adds log_trace_or_graphmodule_to_torch_trace helper function and backward compatibility wrapper for trace_structured_artifact |
| thunder/dynamo/compiler.py | Integrates trace logging for graph modules and split reasons in the ThunderCompiler |
| thunder/__init__.py | Adds trace logging at various compilation stages (prologue, computation, epilogue, transforms, backward) |
| thunder/tests/test_dynamo.py | Adds parametrized test to validate functionality with and without TORCH_TRACE enabled |
Comments suppressed due to low confidence (1)
thunder/dynamo/compiler.py:144
- This assignment to 'thunder_options' is unnecessary as it is redefined before this value is used.
thunder_options = _with_prologue_pruning_transform(
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
shino16
left a comment
It's a very cool feature!
Signed-off-by: Masaki Kozuki <mkozuki@nvidia.com>
so that the fallback path can also correctly work with torch-trace
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Masato Shinokawa <40788005+shino16@users.noreply.github.com>
Force-pushed eb5bf55 to 3eeb517
@shino16 @kshitij12345 could you give this another look?
shino16
left a comment
LGTM!
kshitij12345
left a comment
LGTM, thanks @crcrpar
@KaelanDt could you review this?
What does this PR do?
Taking over #2182.
This enables us to dump TraceCtx objects by setting the TORCH_TRACE env var when using thunder.dynamo.compiler.thunderfx.
What is stored?
Ref: https://docs.pytorch.org/docs/stable/compile/programming_model.observability.html
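A TORCH_TRACE directory contains log files of newline-delimited JSON records, with raw payload lines interleaved after their metadata record (tools like tlparse render these into a report). As a rough illustration of inspecting a dump, the small reader below lists which artifact names were logged; the exact record layout assumed here is a simplification:

```python
import json
import pathlib


def list_artifacts(trace_dir: str) -> list[str]:
    """Return the names of 'artifact' records found in a TORCH_TRACE dump."""
    names = []
    for log in pathlib.Path(trace_dir).glob("*"):
        if not log.is_file():
            continue
        for line in log.read_text().splitlines():
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # raw payload lines are not JSON; skip them
            if isinstance(record, dict) and record.get("artifact"):
                names.append(record["artifact"]["name"])
    return names
```

For example, `list_artifacts("./trace-tests/003")` would enumerate the artifacts produced by the benchmark run below.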
I got the following as an artifact of `TORCH_TRACE="./trace-tests/003" python thunder/benchmarks/benchmark_litgpt.py --compile "thunder_dynamo" --n_layers 2`.

For `TORCH_TRACE="./trace-tests/012" python thunder/benchmarks/benchmark_inference.py --mode thunder`,