add image featurizer to AutoFeaturizer #6261
Codecov Report

@@            Coverage Diff             @@
##             main    #6261      +/-   ##
==========================================
- Coverage   68.39%   68.38%   -0.02%
==========================================
  Files        1141     1144       +3
  Lines      244820   244885      +65
  Branches    25405    25405
==========================================
+ Hits       167444   167460      +16
- Misses      70722    70772      +50
+ Partials     6654     6653       -1
@@ -587,6 +587,44 @@ internal SweepableEstimator[] CatalogFeaturizer(string[] outputColumnNames, stri
            return new SweepableEstimator[] { SweepableEstimatorFactory.CreateOneHotEncoding(option), SweepableEstimatorFactory.CreateOneHotHashEncoding(option) };
        }

        internal MultiModelPipeline ImagePathFeaturizer(string outputColumnName, string inputColumnName)
Are you planning on adding support for having a folder and not just an image column?
Not for now; this only supports full image paths.
LGTM.
This PR adds a featurizer for image paths. When one or more columns are marked as image paths, a set of estimators with a search space is added for those columns. These estimators featurize each image using one of the DNN featurizers (ResNet18, ResNet50, AlexNet...).
The initial idea comes from @justinormont. It is a great cross-platform way to leverage AutoML for image classification, and it can be more efficient than deep learning, especially on small datasets.
The estimators used to featurize images convert each image into a numeric feature array for classifiers to learn from and transform.
While training, the search space from those estimators is added to the global search space and optimized by the selected tuner.
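For readers unfamiliar with DNN image featurization in ML.NET, here is a minimal sketch of one fixed candidate from that kind of search: load an image from a path column, resize it, extract pixels, and featurize with ResNet18. It is not the code added in this PR. The column names, the `ImageFeaturizerSketch` class, and the `imageFolder` parameter are illustrative assumptions, and the 224x224 size follows the usual ResNet input size rather than anything stated above. It assumes the `Microsoft.ML`, `Microsoft.ML.ImageAnalytics`, `Microsoft.ML.OnnxTransformer`, and `Microsoft.ML.DnnImageFeaturizer.ResNet18` packages are referenced.

```csharp
using Microsoft.ML;
using Microsoft.ML.Transforms;

// Hypothetical helper, for illustration only; not part of the PR.
public static class ImageFeaturizerSketch
{
    // Builds a fixed pipeline that turns an image path column into a numeric
    // feature vector. AutoFeaturizer would instead sweep over several such
    // pipelines (ResNet18, ResNet50, AlexNet...) via its search space.
    public static IEstimator<ITransformer> Build(
        MLContext mlContext,
        string imageFolder,                         // assumption: base folder for the paths in the input column
        string inputColumnName = "ImagePath",       // assumed name of the image path column
        string outputColumnName = "ImageFeatures")  // assumed name of the feature column
    {
        return mlContext.Transforms
            // Read the image file referenced by each row.
            .LoadImages("ImageObject", imageFolder, inputColumnName)
            // Resize to the input size the pretrained network expects.
            .Append(mlContext.Transforms.ResizeImages(
                "ImageObject", imageWidth: 224, imageHeight: 224))
            // Convert the image into a pixel vector.
            .Append(mlContext.Transforms.ExtractPixels("Pixels", "ImageObject"))
            // Run the pretrained ResNet18 model and keep its output as features.
            .Append(mlContext.Transforms.DnnFeaturizeImage(
                outputColumnName,
                m => m.ModelSelector.ResNet18(mlContext, m.OutputColumn, m.InputColumn),
                "Pixels"));
    }
}
```

The resulting feature column can then be passed to any trainer that accepts a numeric vector, which is what lets a tuner treat the choice of featurizer as one more dimension of the search space.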