-
Notifications
You must be signed in to change notification settings - Fork 705
Initialize XGBoost codegen #765
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initialize XGBoost codegen #765
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
sql/codegen_ant_xgboost.go
Outdated
| } | ||
|
|
||
| func genXG(w io.Writer, pr *extendedSelect, ds *trainAndValDataset, fts fieldTypes, db *DB) error { | ||
| func genAntXGboost(w io.Writer, pr *extendedSelect, ds *trainAndValDataset, fts fieldTypes, db *DB) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about renaming it to genAntXGBoost?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, that's a typo, thx.
| ValidationDatasetSQL string | ||
| TrainCfg *xgbTrainConfig | ||
| Features []*featureMeta | ||
| Label *featureMeta |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found our filler for tensorflow in codegen.go,
Lines 67 to 69 in 9a0dc86
| X []*featureMeta | |
| FeatureColumnsCode map[string][]string | |
| Y *featureMeta |
Features named to X and Label to Y.I would suggest such consistent naming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a better way is reuse the filler in codegen, would optimize it in the next PR.
sql/expression_resolver_xgb.go
Outdated
| ParamsAttr map[string]*attribute | ||
| } | ||
|
|
||
| func resolveXGBTrainClause(tc *trainClause) (*resolvedXGBTrainClause, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try put attribute extraction into the codegen for each submitter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used to resolve XGBoost parameters, the current expression_resover can only resolve the Tensorlfow parameters, maybe we can refactor the expression_resolver to extract getBoolAttr, getIntAttr as the common function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we only have these two parameters currently, try all put it in expression_resolver.go and refine this later. We should let each codegen to deal with it's only attributes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
sql/expression_resolver_xgb.go
Outdated
| ParamsAttr map[string]*attribute | ||
| } | ||
|
|
||
| func resolveXGBTrainClause(tc *trainClause) (*resolvedXGBTrainClause, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we only have these two parameters currently, try all put it in expression_resolver.go and refine this later. We should let each codegen to deal with it's only attributes.
@typhoonzero , there is not only two parameters, and I removed the xgb resolver from this PR, would refactor the |
Initliaze XGBoost codegen, this PR reuse
db_generatorto generate XGBoost DMatrix object, part work of #749 .