
Conversation

jbischof
Contributor

@jbischof jbischof commented Feb 22, 2023

This PR pilots the biggest single step to unifying the KerasCV and KerasNLP APIs. It sits downstream of the pilot for functional subclasses #1401 and will be followed by a PR introducing Task models for classification.

Highlights of this PR:

  • Use from_preset constructor to load weights instead of weights arg
  • Reimplement config-in-code classes (e.g., ResNet50V2Backbone) with from_preset constructor
  • Remove pooling and include_top args to be handled by Task models
  • Add docstring examples to show basic usage
  • Introduce Backbone class to hold generic methods and properties
  • Rename as_backbone to get_feature_extractor
  • Decouple testing per model to improve readability
  • Test weight loading with pytest
  • Introduce conftest.py to allow control of weight RCP testing
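
As a rough, framework-free sketch of the mechanism behind these bullets (this is not the actual KerasCV implementation; the `presets` registry layout and `Backbone` internals here are illustrative assumptions), a `from_preset` classmethod can resolve a registered config, apply user overrides, and then optionally load weights:

```python
# Hypothetical sketch of the from_preset pattern; names are illustrative.
class Backbone:
    """Base class holding generic preset-loading machinery."""

    # Subclasses register their presets: preset name -> config dict.
    presets = {}

    def __init__(self, **kwargs):
        self.config = kwargs

    @classmethod
    def from_preset(cls, preset, load_weights=None, **overrides):
        if preset not in cls.presets:
            raise ValueError(f"Unknown preset: {preset!r}")
        # User-supplied kwargs override the preset's stored config.
        config = {**cls.presets[preset]["config"], **overrides}
        model = cls(**config)
        # A real implementation would download and load pretrained weights
        # here when load_weights is True, or by default when the preset
        # ships weights and load_weights is None.
        return model


class ResNetV2BackboneSketch(Backbone):
    presets = {
        "resnet18_v2": {"config": {"stackwise_blocks": [2, 2, 2, 2]}},
    }


model = ResNetV2BackboneSketch.from_preset(
    "resnet18_v2", include_rescaling=False
)
```

The key design point is that the constructor stays a plain config-driven constructor, while `from_preset` is a thin classmethod layered on top of a registry.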

API preview

See gist for a preview of the new API.

Quick summary:

We will rely on the docstring and keras.io to communicate preset usage.

Example docstring:

from_preset(*args, **kwargs) method of builtins.type instance
    Instantiate ResNetV2Backbone model from preset architecture and weights.
    Args:
        preset: string. Must be one of "resnet18_v2", "resnet34_v2", "resnet50_v2", "resnet101_v2", "resnet152_v2", "resnet50_v2_imagenet".
            If looking for a preset with pretrained weights, choose one of
            "resnet50_v2_imagenet".
        load_weights: Whether to load pre-trained weights into model.
            Defaults to `None`, which follows whether the preset has
            pretrained weights available.
    
    Examples:
    ```python
    # Load architecture and weights from preset
    model = keras_cv.models.ResNetV2Backbone.from_preset(
        "resnet50_v2_imagenet",
    )
    
    # Load randomly initialized model from preset architecture
    model = keras_cv.models.ResNetV2Backbone.from_preset(
        "resnet50_v2_imagenet",
        load_weights=False,
    )
    ```

Example usage:

# Generic constructor
model = ResNetV2Backbone(
    stackwise_filters=[64, 128, 256, 512],
    stackwise_blocks=[2, 2, 2, 2],
    stackwise_strides=[1, 2, 2, 2],
    include_rescaling=False,
    input_shape=[256, 256, 3],
)

# Load preset architecture without weights
# Can also include overrides
model = ResNetV2Backbone.from_preset(
    "resnet18_v2",
    include_rescaling=False,
)

# Load preset architecture with weights
model = ResNetV2Backbone.from_preset("resnet50_v2_imagenet")

# Use as backbone
model = RetinaNet(
    classes=20,
    bounding_box_format="xywh",
    backbone=ResNetV2Backbone.from_preset(
        "resnet50_v2_imagenet").get_feature_extractor(),
)

keras.applications aliases

We've also reintroduced versions of the `keras.applications` config-in-code classes, now powered by `from_preset`.

# Load a preset without weights
model = ResNet50V2Backbone(
    include_rescaling=False,
)

# Load a preset with weights
model = ResNet50V2Backbone.from_preset("resnet50_v2_imagenet")

# Use as backbone
model = RetinaNet(
    classes=20,
    bounding_box_format="xywh",
    backbone=ResNet50V2Backbone().get_feature_extractor(),
)
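
The alias pattern can be sketched in a few lines (purely illustrative, not the KerasCV source; the class bodies here are simplified stand-ins): the config-in-code class is a thin subclass that pins the architecture config and forwards everything else.

```python
# Hypothetical sketch of a config-in-code alias class.
class ResNetV2BackboneSketch:
    """Stands in for the generic, fully configurable backbone."""

    def __init__(self, stackwise_blocks, include_rescaling=True, **kwargs):
        self.stackwise_blocks = stackwise_blocks
        self.include_rescaling = include_rescaling


class ResNet50V2BackboneSketch(ResNetV2BackboneSketch):
    """Alias that fixes the ResNet50V2 architecture."""

    def __init__(self, **kwargs):
        # Pin the 50-layer stack configuration; pass everything else through.
        super().__init__(stackwise_blocks=[3, 4, 6, 3], **kwargs)


model = ResNet50V2BackboneSketch(include_rescaling=False)
```

Because the alias is a real subclass rather than a factory function, it still participates in `from_preset`, serialization, and `isinstance` checks like any other backbone.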

@jbischof
Contributor Author

jbischof commented Feb 23, 2023

/gcbrun

Contributor

@LukeWood LukeWood left a comment


I did a first pass --- overall I really like the abstractions around presets/weight loadings.

My only real concern revolves around the readability of `ResNet50V2(...)` vs `ResNetV2Backbone.from_preset(...)`.

times smaller in width and height than the input image.

Args:
min_level: optional int, the lowest level of feature to be included
Contributor


This min_level / max_level API is very mysterious. Doubt anyone not already familiar with the implementation will figure out what it means. Can we find better argument names and descriptions?

Contributor Author


Good point! Will file an Issue.

Contributor


Yeah, I really do not like this API. Realistically it should be a list, either of level names or layer names.

Contributor Author


Filed #1447

@jbischof
Contributor Author

/gcbrun

Member

@mattdangerw mattdangerw left a comment


Thanks! Mainly took a pass to educate myself, but thoughts on this.

Overall I think this is great at bringing the two libraries into a similar style, making it so that to an outside observer they look like they were written by the same folks. Which I think is the big win to have here.

I think the biggest thing I noticed was that

model = keras_cv.models.RetinaNet(
    classes=10,
    bounding_box_format="xywh",
    backbone=keras_cv.models.ResNetV2Backbone.from_preset(
        "resnet50_v2",
    ).get_feature_extractor(),
)

feels a bit different than the approach we took in KerasNLP. In KerasNLP our high-level models often do some surgery on our backbones, but that logic stays in the class that needs to extract certain features, rather than the backbone having an "export" function. The more similar move to KerasNLP seems like it would be

backbone = keras_cv.models.ResNetV2Backbone.from_preset("resnet50_v2")
model = keras_cv.models.RetinaNet(
    classes=10,
    bounding_box_format="xywh",
    backbone=backbone,
    min_backbone_level=xx,  # None can still be the default here.
    max_backbone_level=yy,  # None can still be the default here.
)

Or even just

model = keras_cv.models.RetinaNet.from_preset(
    "some_id",
    classes=10,
    bounding_box_format="xywh",
)

I suspect this is more something to think about (or it already has been) than to change on this PR, but what are the big reasons for the discrepancy here?

@fchollet
Contributor

I suspect this is more something to think about (or it already has been) than to change on this PR, but what are the big reasons for the discrepancy here?

The approach you describe is cleaner, in fact. I believe the reason for the current discrepancy is that each backbone might require custom handling. If there is no universal way to do feature extraction, then it's more practical for each backbone implementation to specify how to do it for that particular backbone.

But perhaps this is not a correct assumption.
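
The alternative under discussion, where the task model performs the feature "surgery" itself, can be sketched without any framework code (a toy illustration under stated assumptions: `ToyBackbone`, `ToyRetinaNet`, and the `pyramid_outputs` attribute are all hypothetical names, not KerasCV API):

```python
# Toy sketch: the task model, not the backbone, selects the pyramid levels
# it needs, so the backbone carries no export method.
class ToyBackbone:
    """Stands in for a real backbone; maps level names to feature tensors."""

    def __init__(self):
        # In a real backbone these values would be intermediate layer outputs.
        self.pyramid_outputs = {
            "P2": "feat_p2", "P3": "feat_p3",
            "P4": "feat_p4", "P5": "feat_p5",
        }


class ToyRetinaNet:
    def __init__(self, backbone, min_level="P3", max_level="P5"):
        levels = sorted(backbone.pyramid_outputs)
        lo, hi = levels.index(min_level), levels.index(max_level)
        # The surgery lives here, in the class that needs the features.
        self.features = {
            level: backbone.pyramid_outputs[level]
            for level in levels[lo:hi + 1]
        }


model = ToyRetinaNet(ToyBackbone())
print(sorted(model.features))  # ['P3', 'P4', 'P5']
```

If every backbone exposes its pyramid outputs through a common attribute like this, the extraction logic generalizes; fchollet's caveat above is exactly that such a universal contract may not hold for all backbones.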

Contributor

@fchollet fchollet left a comment


Thanks for the PR! Looking great.

Contributor

@LukeWood LukeWood left a comment


I'm comfortable merging this --- but it might make more sense to wait until the Classifier task model is done. Up to you ---

another note: when we do merge this we should ctrl-f ResNet50V2 in keras-io and any other r

Also: we should probably get a universal agreement

@LukeWood
Contributor

(apologies for any typos in my comments - my GitHub UI is bugged and my comments don't display as I type. The box simply remains empty - so no way to edit typos out.)

@jbischof
Contributor Author

Thanks @LukeWood, I'm working on a cleaner implementation. Will update Monday.

Contributor

@LukeWood LukeWood left a comment


I think this looks good - I also am glad to see that the Sequential([ResNet50V2Backbone(), layers.GlobalAveragePooling2D()]) API works as expected (per the Simclr training test)

Contributor

@fchollet fchollet left a comment


Thanks for the updates. Looking good, I think we can ship it.

@jbischof
Contributor Author

Thanks @LukeWood, I also replicated our quickstart using the following model:

resnet_classifier = keras.Sequential(
    [
        keras_cv.models.ResNet18V2Backbone(),
        keras.layers.GlobalAveragePooling2D(name="avg_pool"),
        keras.layers.Dense(3, activation="softmax", name="predictions"),
    ],
)

The model accuracy improved from 0.33 -> 0.51 over 10 epochs. See the attached colab for details.

@jbischof
Contributor Author

/gcbrun

@LukeWood
Contributor

I suspect this is more something to think about (or it already has been) than to change on this PR, but what are the big reasons for the discrepancy here?

I do quite like the idea of having the get_feature_extractor() logic live in the task models -- but I also think that can be a follow up PR.

thoughts?

@jbischof
Copy link
Contributor Author

jbischof commented Mar 1, 2023

/gcbrun

@jbischof jbischof merged commit f3d8582 into keras-team:master Mar 1, 2023
@jbischof jbischof deleted the backbone branch March 1, 2023 00:37
@IMvision12
Contributor

@jbischof Are these modifications required for ResNetV1?

@jbischof
Copy link
Contributor Author

jbischof commented Mar 6, 2023

Yes @IMvision12 but not quite yet! I'm planning some followup PRs to refine the design somewhat and will start filing Issues later this week. Thank you for all your amazing contributions 🚀

ghost pushed a commit to y-vectorfield/keras-cv that referenced this pull request Nov 16, 2023
* Introduce `Backbone` class and presets

* Attach presets

* Restore code we still need

* First passing tests

* Finish backbone tests

* Get preset tests working

* Remove unused marker

* Remove dangling TPU reference

* Add variable input channels test

* Improve preset names

* Better documentation for presets with weights

* Fix import

* Fix broken tests

* Respond to comments

* Respond to comments 2

* Add __init__.py files

* Fix docstring

* format

* Respond to comments

* Export new symbols

* Fix bug in ResNet50V2Backbone

* Respond to more comments

* format

* Inline error message

* Change applications alias to subclass

* Fix broken test

* Remove unneeded overrides

* Fix inheritence structure

* Respond to comments

* format

* format2

* Add docstring examples