Multi Pin Bumps across PT/AO/tune/ET #1367
```diff
@@ -122,7 +122,7 @@ def export_for_server(
     from executorch.exir.tracer import Value

     from torch._export import capture_pre_autograd_graph
-    from torch.export import export, ExportedProgram
+    from torch.export import export_for_training, ExportedProgram

     from torchchat.model import apply_rotary_emb, Attention
     from torchchat.utils.build_utils import get_precision
```
```diff
@@ -238,7 +238,7 @@ def _to_core_aten(
         raise ValueError(
             f"Expected passed in model to be an instance of fx.GraphModule, got {type(model)}"
         )
-    core_aten_ep = export(model, example_inputs, dynamic_shapes=dynamic_shapes)
+    core_aten_ep = export_for_training(model, example_inputs, dynamic_shapes=dynamic_shapes)
     if verbose:
         logging.info(f"Core ATen graph:\n{core_aten_ep.graph}")
     return core_aten_ep
```

Review thread on the `export_for_training` call:

Reviewer: Not sure what we are doing here, but shouldn't TorchChat be exporting for inference?

Author: This was picked up from @tugsbayasgalan's PR migrating away from export(), but export_for_inference does sound more in line with what we want. @tugsbayasgalan, can you share info on the new APIs?

@tugsbayasgalan: Yep, the intended use for the inference IR is that the user exports to a training IR and then calls run_decompositions() to lower it to the inference IR. In this flow, after core_aten_ep there is a to_edge call which lowers to inference. The export team is moving the IR to a non-functional training IR, so export_for_training will exist as an alias of the official export. Once we actually migrate the official export, we will replace this call with export.
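A minimal sketch of the two-step flow described above, assuming a toy module invented for illustration (the model and inputs are not from this PR):

```python
# Sketch: export to the training IR first, then lower to the inference
# (core ATen) IR via run_decompositions(). In torchchat this second step
# happens later, inside the to_edge() call.
import torch
from torch.export import export_for_training


class TinyModel(torch.nn.Module):  # hypothetical model for illustration
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return torch.relu(self.linear(x))


example_inputs = (torch.randn(2, 4),)

# Step 1: capture a training-IR ExportedProgram (what this PR switches to).
training_ep = export_for_training(TinyModel(), example_inputs)

# Step 2: decompose down to the inference IR.
inference_ep = training_ep.run_decompositions()
print(inference_ep.graph)
```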
Review thread on a `torch.load` call:

Reviewer: Why does it need False? All LLMs should be loadable with weights_only, shouldn't they? (Also, there is no such option as weight_only, or so I hope :P)

Author: Good catch on the typo. As for setting it to False: I'd rather keep the behavior consistent in a pin bump PR; we can flip it in a separate PR.
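For context, a sketch of the weights_only distinction under discussion; the checkpoint path and state_dict are invented for this example:

```python
import torch

# Invented checkpoint for illustration; most LLM checkpoints are plain
# state_dicts of tensors, which weights_only=True can load.
state_dict = {"linear.weight": torch.randn(4, 4)}
torch.save(state_dict, "checkpoint.pt")

# weights_only=True restricts unpickling to tensors and other safe types.
safe = torch.load("checkpoint.pt", weights_only=True)

# weights_only=False permits arbitrary pickled objects; this PR keeps the
# existing behavior and defers flipping the flag to a follow-up PR.
legacy = torch.load("checkpoint.pt", weights_only=False)
```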