Add doc for serialization/deserialization of torchao optimized models #524
Conversation
Force-pushed from 14ebc3e to d25f9dc.
Review comment on docs/source/ser_deser.rst (outdated):

> What happens when serializing an optimized model?
> =================================================
> To serialize an optimized model, we just need to call `torch.save(m.state_dict(), f)`. torchao uses tensor subclasses to represent different dtypes and to support optimization techniques such as quantization and sparsity, so after optimization the only thing that changes is the weight tensors; the model structure is not changed at all. For example:
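A minimal sketch of what that flow could look like. The `quantize_` / `int8_weight_only` API usage reflects torchao's quantization entry point, but the toy model, file name, and exact call are assumptions for illustration, not text from this doc:

```python
import torch
from torchao.quantization import quantize_, int8_weight_only

class ToyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(64, 64)

    def forward(self, x):
        return self.linear(x)

m = ToyModel().eval()

# quantize_ swaps the weight tensors for tensor subclasses in place;
# the module structure itself is untouched.
quantize_(m, int8_weight_only())

# Because only the weights changed, a plain state_dict save is enough.
torch.save(m.state_dict(), "quantized_model.pt")
```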
The subclass point is not well explained. I think what you're trying to say, more plainly, is that at model save/load time we swap in the quantized weights.
Which means we're instantiating the full-precision model, so we probably also want to explain why people might want to instantiate a model on CPU to later transfer to GPU.
We recommend people initialize the model on the meta device; this is explained in the deserialization section.
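For concreteness, a hedged sketch of that deserialization flow, reusing the hypothetical `ToyModel` and file name from the example above:

```python
import torch

# Build the model on the meta device: no real memory is allocated for the
# float weights we are about to throw away.
with torch.device("meta"):
    m_loaded = ToyModel()

# Load the quantized state dict; assign=True replaces the meta tensors with
# the deserialized tensor subclasses instead of copying into them (copying
# into meta tensors would fail anyway).
state_dict = torch.load("quantized_model.pt", map_location="cpu")
m_loaded.load_state_dict(state_dict, assign=True)
```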
Does the example help with the explanation? Otherwise, let me know what else I can change.
Summary: Addressing the following questions:
1. What happens if I save a quantized model?
2. What happens if I load a quantized model? (describing details like assign=True)

Specifically:
1. Do you need ao as a dependency when you're loading a quantized model?
2. Is the saved quantized model smaller on disk than the unquantized one?

Test Plan: .
Reviewers:
Subscribers:
Tasks:
Tags:
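To make the two "Specifically" questions concrete, a small check along these lines could work. The `float_model.pt` file name is hypothetical, and importing torchao before `torch.load` is presumably needed so the pickled tensor-subclass types in the checkpoint can be resolved:

```python
import os
import torch
import torchao  # assumed necessary: the checkpoint pickles torchao tensor subclasses

# Loading should fail without torchao installed, since unpickling needs
# the subclass definitions it provides.
state_dict = torch.load("quantized_model.pt", map_location="cpu")

# Compare on-disk sizes of the unquantized vs. quantized checkpoints.
print("float:", os.path.getsize("float_model.pt"))
print("quant:", os.path.getsize("quantized_model.pt"))
```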
Force-pushed from d25f9dc to 4eec577.
very nice!