-
-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Safetensors support. #381
Safetensors support. #381
Conversation
Great idea! This'll be great for getting pretrained weights since I had no idea how to load them from pickle files |
Nice, I'll open this PR up then. Biggest caveats I've seen: There's a copy on write because safetensors expects a flat It's usually not that bad on save, since it's not the biggest drawback, but still something to take into account when training large models. In general I think we could aim for zero-copy loads with memory mapping on aligned files: https://github.com/coreylowman/dfdx/pull/381/files#diff-a0868f528cbb396ebf73e5726654153f4ba44c8820003f936d809e8ae45e6cdaR124 |
Yeah I think given the tensors may be stored on other devices, getting a |
Yes, and training is most likely going to occur on GPU most of the time. Still extra copies in general are usually becoming bottleneck really fast so I'm acutely aware of them. Loading is happening much more than saving in practice (just like inference >> training overall) so as long as those are reduced to a minimum it's Ok. |
fa4d016
to
4ea24b9
Compare
Any new thoughts on this PR ? @coreylowman |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. After #460 it will be straightforward to impl this for nn layer as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TensorCollection api was just merged, which greatly simplified nn::SaveToNpz/LoadFromNpz. Should be really easy to add a safetensors loader/saver for nn layer now as well!
4ea24b9
to
c76e417
Compare
Great addition ! I updated everything ! We can now simply load/save entire modules. Nice ! |
Cargo.toml
Outdated
# safetensors = { version = "0.2", default-features = false, optional = true } | ||
safetensors = { git = "https://github.com/huggingface/safetensors", default-features = false, optional = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this a holdover from developing? Which should be used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to release 0.3
. Should happen this week. (I will release python release candidate first, there has been a relatively big change in how things are saved, everythng backward compatible, but files on disk will be different now, to get alignement properly done).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔥 will be good to merge after version fix in Cargo.toml
512433a
to
9b408dd
Compare
@coreylowman I fixed with the released version. |
@Narsil looks good. didn't this used to have an impl for tensor collections? |
I'm as confused as you, I simply rebased. Doing the work again since I can't find that work anymore. I'm guessing I screwed my rebase but I can't figure out where the original code lives. |
9b408dd
to
ee6d197
Compare
I added the test with the feature as I'm seeing a super weird trait issue in the Booleans module, which doesn't seem linked to my changes ? Any ideas: https://github.com/coreylowman/dfdx/actions/runs/4364586004/jobs/7632084283 |
gpt2.weight | ||
.load_safetensor(&tensors, "wte.weight") | ||
.expect("Could not load tensor"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Don't know why the boolean test is failing, its passing locally for me. Will resolve separately. Thanks for the contribution!
I'm using this to do experiments feel free to close or ignore.
Main goal is to be able to use weights contained on https://hf.co
which doesn't include NPZ format.
All models supporting
safetensors
: https://huggingface.co/models?library=safetensors&sort=downloads