Use `from_preset` to load architecture and weights #1438
Conversation
/gcbrun
I did a first pass --- overall I really like the abstractions around presets/weight loading.
My only real concern revolves around the readability of `ResNet50V2(...)` vs `ResNetV2Backbone.from_preset(...)`.
keras_cv/models/backbone.py (outdated)

```
... times smaller in width and height than the input image.

Args:
    min_level: optional int, the lowest level of feature to be included
```
This min_level / max_level API is very mysterious. I doubt anyone not already familiar with the implementation will figure out what it means. Can we find better argument names and descriptions?
Good point! Will file an Issue.
Yeah, I really do not like this API. Realistically it should be a list --- either of level names or layer names.
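For illustration only, a list-based selection along the lines suggested above could look like the sketch below. The function name `select_feature_levels` and the level-keyed outputs dict are hypothetical stand-ins, not the actual KerasCV API:

```python
# Hypothetical sketch: select backbone feature maps by explicit level names
# instead of an opaque (min_level, max_level) integer range.

def select_feature_levels(pyramid_outputs, level_names):
    """pyramid_outputs: dict mapping level name (e.g. "P3") to a feature tensor.
    level_names: explicit list of the levels the caller wants, in order."""
    missing = [name for name in level_names if name not in pyramid_outputs]
    if missing:
        raise ValueError(f"Unknown feature levels: {missing}")
    return {name: pyramid_outputs[name] for name in level_names}

# Example with placeholder strings standing in for feature tensors:
outputs = {"P2": "t2", "P3": "t3", "P4": "t4", "P5": "t5"}
selected = select_feature_levels(outputs, ["P3", "P4", "P5"])
```

A list makes the caller's intent explicit and lets the error message name exactly which requested levels the backbone does not provide.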
Filed #1447
/gcbrun
Thanks! I mainly took a pass to educate myself, but a few thoughts on this.
Overall I think this is great at bringing the two libraries into a similar style and making it so that, to an outside observer, they look like they were written by the same folks. Which I think is the big win to have here.
I think the biggest thing I noticed was that

```python
model = keras_cv.models.RetinaNet(
    classes=10,
    bounding_box_format="xywh",
    backbone=keras_cv.models.ResNetV2Backbone.from_preset(
        "resnet50_v2",
    ).get_feature_extractor(),
)
```
feels a bit different than the approach we took in KerasNLP. In KerasNLP our high-level models often do some surgery on our backbones, but that logic stays in the class that needs to extract certain features, rather than the backbone having an "export" function. The move more similar to KerasNLP seems like it would be
```python
backbone = keras_cv.models.ResNetV2Backbone.from_preset("resnet50_v2")
model = keras_cv.models.RetinaNet(
    classes=10,
    bounding_box_format="xywh",
    backbone=backbone,
    min_backbone_level=xx,  # None can still be the default here.
    max_backbone_level=yy,  # None can still be the default here.
)
```
Or even just
```python
model = keras_cv.models.RetinaNet.from_preset(
    "some_id",
    classes=10,
    bounding_box_format="xywh",
)
```
I suspect this is more something to think about (or it already has been) than to change on this PR, but what are the big reasons for the discrepancy here?
keras_cv/models/backbones/resnet_v2/resnet_v2_backbone_presets.py (outdated; resolved)
The approach you describe is cleaner, in fact. I believe the reason for the current discrepancy is that each backbone might require custom handling. If there is no universal way to do feature extraction, then it's more practical for each backbone implementation to specify how to do it for that particular backbone. But perhaps this is not a correct assumption.
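As a rough illustration of how per-backbone handling could still live behind a shared interface: each backbone subclass could declare which of its own layers feed the feature extractor, while the generic logic stays in the base class. The class names and the `pyramid_layer_names` attribute below are hypothetical, not KerasCV's actual implementation:

```python
# Hypothetical sketch: each backbone declares its own feature-bearing layers,
# so a generic get_feature_extractor() can live once in the base class.

class Backbone:
    # Subclasses override with backbone-specific layer names.
    pyramid_layer_names = []

    def get_feature_extractor(self):
        # Real code would build a model over these layers' outputs;
        # here we just return the names to show the per-subclass dispatch.
        return list(self.pyramid_layer_names)

class ResNetV2Backbone(Backbone):
    pyramid_layer_names = ["conv3_out", "conv4_out", "conv5_out"]

class MobileNetBackbone(Backbone):
    pyramid_layer_names = ["block_6_out", "block_13_out", "block_16_out"]
```

Under this pattern the task model never needs to know backbone internals; it only consumes whatever the backbone says its feature layers are.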
Thanks for the PR! Looking great.
I'm comfortable merging this --- but it might make more sense to wait until the Classifier task model is done. Up to you ---
Another note: when we do merge this we should ctrl-f ResNet50V2 in keras-io and any other r
Also: we should probably get a universal agreement
(Apologies for any typos in my comments - my GitHub UI is bugged and my comments don't display as I type. The box simply remains empty, so no way to edit typos out.)
Thanks @LukeWood, I am working on a cleaner implementation. Will update Monday.
I think this looks good - I'm also glad to see that the `Sequential([ResNet50V2Backbone(), layers.GlobalAveragePooling2D()])` API works as expected (per the SimCLR training test).
Thanks for the updates. Looking good, I think we can ship it.
Thanks @LukeWood, I also replicated our quickstart using the following model:

```python
resnet_classifier = keras.Sequential(
    [
        keras_cv.models.ResNet18V2Backbone(),
        keras.layers.GlobalAveragePooling2D(name="avg_pool"),
        keras.layers.Dense(3, activation="softmax", name="predictions"),
    ],
)
```

The model accuracy improved from 0.33 to 0.51 over 10 epochs. See the attached colab for details.
/gcbrun
I do quite like the idea of having the thoughts?
/gcbrun
@jbischof Are these modifications required for ResNetV1?
Yes @IMvision12 but not quite yet! I'm planning some followup PRs to refine the design somewhat and will start filing Issues later this week. Thank you for all your amazing contributions 🚀
Commits:

* Introduce `Backbone` class and presets
* Attach presets
* Restore code we still need
* First passing tests
* Finish backbone tests
* Get preset tests working
* Remove unused marker
* Remove dangling TPU reference
* Add variable input channels test
* Improve preset names
* Better documentation for presets with weights
* Fix import
* Fix broken tests
* Respond to comments
* Respond to comments 2
* Add `__init__.py` files
* Fix docstring
* format
* Respond to comments
* Export new symbols
* Fix bug in `ResNet50V2Backbone`
* Respond to more comments
* format
* Inline error message
* Change applications alias to subclass
* Fix broken test
* Remove unneeded overrides
* Fix inheritance structure
* Respond to comments
* format
* format2
* Add docstring examples
This PR pilots the biggest single step to unifying the KerasCV and KerasNLP APIs. It sits downstream of the pilot for functional subclasses #1401 and will be followed by a PR introducing `Task` models for classification.

Highlights of this PR:

* `from_preset` constructor to load weights instead of `weights` arg
* Backbone classes (e.g. `ResNet50V2Backbone`) with `from_preset` constructor
* Remove `pooling` and `include_top` args, to be handled by `Task` models
* `Backbone` class to hold generic methods and properties
* Rename `as_backbone` to `get_feature_extractor`
* `pytest` `conftest.py` to allow control of weight RCP testing

API preview
See gist for a preview of the new API.
Quick summary:
We will rely on the docstring and keras.io to communicate preset usage.
Example docstring:
Example usage:
keras.applications aliases

We've also reintroduced versions of the `keras.applications` config-in-code classes, now powered by `from_preset`.
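The config-in-code alias pattern can be sketched roughly as follows. This is a simplified stand-in, not KerasCV's actual implementation: the preset registry contents, the `depth` config key, and the `load_weights` flag are all illustrative assumptions.

```python
# Simplified sketch of a from_preset constructor backed by a preset registry,
# with a keras.applications-style alias as a thin wrapper on top.

class Backbone:
    presets = {}  # name -> config dict; real presets would also carry weight URLs

    def __init__(self, depth):
        self.depth = depth

    @classmethod
    def from_preset(cls, name, load_weights=True):
        if name not in cls.presets:
            raise ValueError(
                f"Unknown preset {name!r}. Available: {sorted(cls.presets)}"
            )
        model = cls(**cls.presets[name])
        # Real code would download and load pretrained weights here
        # when load_weights is True.
        return model

class ResNetV2Backbone(Backbone):
    presets = {
        "resnet50_v2": {"depth": 50},
        "resnet101_v2": {"depth": 101},
    }

def ResNet50V2Backbone():
    # Config-in-code alias: the architecture lives in the preset registry,
    # and the alias just forwards to from_preset.
    return ResNetV2Backbone.from_preset("resnet50_v2", load_weights=False)
```

The design keeps a single source of truth for each architecture (the preset config) while preserving the familiar one-call-per-model aliases.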