-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Workaround Flux#1027 #4
Conversation
Codecov Report
@@ Coverage Diff @@
## main #4 +/- ##
==========================================
+ Coverage 90.90% 94.23% +3.32%
==========================================
Files 1 2 +1
Lines 33 52 +19
==========================================
+ Hits 30 49 +19
Misses 3 3
Continue to review full report at Codecov.
|
This works around FluxML/Flux.jl#1027 by essentially following the strategy of FluxML/Flux.jl#1509 (comment), where here I've called @ToucheSir's This adds a Flux dependency here which is a little unfortunate because downstream code might want to use the LegolasFlux schema without actually touching weights/flux stuff (e.g. looking at a bunch of losses or something), and now such code will pull in the Flux dependency. But when upstream fixes #1027 we can probably remove the Thanks to @hannahilea for some pair debugging on this! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the ping! I've left a couple of comments that might be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Other than the renaming issue below, this seems good to go. :)
Thanks for the reviews! Thanks especially for the pointers to Also, since we now have |
Nice catch for |
Ok good to hear this is likely a bug! I've filed an issue, FluxML/Functors.jl#16. |
Co-authored-by: Alex Arslan <ararslan@comcast.net>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...so I wrote out a whole big comment and was sure it was attached to my previous review, but now I don't see it anywhere? The gist is that weights
has a very specific meaning in the context of ML, so our use of weights
(in load_weights(...)
, weights(...)
, etc) should probably actually be renamed to something more like parameters
or values
. (Too bad params
is already taken by Flux!)
Basically, while all weights are parameters, not all parameters are weights (some are biases, etc).
Not only that, the name |
Ok, we've got some bikeshedding to do! Some ideas...
We're leaning towards |
...pytorch uses
...tensorflow uses
so maybe I sent us down this bikeshed prematurely, and it is okay to stick with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...in light of the renaming discussion above, I'm going to leave the renaming (or not) to your discretion and let you merge at your leisure.
Ok cool! Thanks for looking up how others do it, that was a great idea. I think then since But I think I'd like to use weights = fetch_weights(model)
row = ModelRow(; weights, ...) I'll add an explanation to the readme to say exactly what we mean by |
This example very much does not work:
The goal of this PR is to fix things so that we can correctly (de)-serialize this model so the test passes.