Update xgboost learner to latest CRAN version #31

Merged
merged 26 commits into from
Nov 20, 2019
Conversation

pat-s
Member

@pat-s pat-s commented Sep 7, 2019

  • add the already existing (new) parameters of the latest CRAN version of xgboost
  • add parameter dependencies

In addition, some small fixes, including #30.

fixes #45

@mb706
Contributor

mb706 commented Oct 10, 2019

I believe the tree_method default should be "auto": https://xgboost.readthedocs.io/en/latest/parameter.html

Also the inequality condition could be done with CondAnyOf.
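For example, a rough sketch with the paradox API at the time (parameter names and levels abbreviated, not the PR's actual code):

    library(paradox)

    ps = ParamSet$new(list(
      ParamFct$new("booster", levels = c("gbtree", "gblinear", "dart")),
      ParamFct$new("tree_method", levels = c("auto", "exact", "approx", "hist"))
    ))
    # tree_method is only consulted for the tree boosters
    ps$add_dep("tree_method", on = "booster",
      cond = CondAnyOf$new(c("gbtree", "dart")))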

@codecov-io

codecov-io commented Oct 16, 2019

Codecov Report

Merging #31 into master will increase coverage by 0.2%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master      #31     +/-   ##
=========================================
+ Coverage   97.53%   97.74%   +0.2%     
=========================================
  Files          18       18             
  Lines         691      753     +62     
=========================================
+ Hits          674      736     +62     
  Misses         17       17
Impacted Files Coverage Δ
R/LearnerRegrXgboost.R 98.94% <100%> (+0.5%) ⬆️
R/LearnerClassifXgboost.R 98.36% <100%> (+0.55%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e5fd62a...e4f44e2. Read the comment docs.

@be-marc
Member

be-marc commented Oct 16, 2019

All missing parameters have been added. Some parameters have a complex dependency on the updater parameter. This might be impossible to check with paradox at the moment. However, xgboost prints the possible combinations if something is wrong.

@be-marc
Member

be-marc commented Oct 16, 2019

Closes #45

@pat-s
Member Author

pat-s commented Oct 17, 2019

👍

> This might be impossible to check with paradox at the moment.

Did you open an issue or similar? I couldn't find anything in the repo.

In addition, once accepted, could you please apply those changes to mlr2 as well?

@pat-s pat-s requested a review from mllg October 17, 2019 20:02
Member Author

@pat-s pat-s left a comment

Thanks! 👍

@be-marc
Member

be-marc commented Oct 18, 2019

If the booster parameter is set to gbtree or dart, updater must be a comma-separated string that may contain the words "grow_colmaker", "distcol", "grow_histmaker", "grow_local_histmaker", "grow_skmaker", "sync", "refresh", "prune". If the booster parameter is set to gblinear, updater must be a string containing "shotgun" or "coord_descent".

We can check this with the following code:

    if (x$booster == "gblinear") {
      if (!x$updater %in% c("shotgun", "coord_descent")) {
        # error or replace by default value?
      }
    } else if (x$booster %in% c("gbtree", "dart")) {
      # strsplit() returns a list; take the first element and trim whitespace
      split_parameter = trimws(strsplit(x$updater, ",")[[1]])
      if (!all(split_parameter %in% c("grow_colmaker", "distcol", "grow_histmaker",
                                      "grow_local_histmaker", "grow_skmaker",
                                      "sync", "refresh", "prune"))) {
        # error or replace by default value?
      }
    }

So updater is not really a tunable parameter. This is more about preventing the user from specifying an invalid combination. Am I right that this is not possible with add_dep, since it depends on the values of both parameters? We could include it as a trafo function. How do I apply the trafo before invoking the training?
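A trafo could look roughly like this (hypothetical sketch only; note that trafos in paradox are applied when a tuner samples a configuration, not automatically at training time):

    # assuming ps is the learner's ParamSet and "shotgun" is the gblinear default
    ps$trafo = function(x, param_set) {
      if (!is.null(x$updater) && identical(x$booster, "gblinear") &&
          !all(x$updater %in% c("shotgun", "coord_descent"))) {
        x$updater = "shotgun"  # fall back to a valid default
      }
      x
    }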

@pat-s
Member Author

pat-s commented Nov 9, 2019

AFAIR we agreed to keep the updater argument simple, since it won't be used that often and users who set it should be aware of what's required.
Is that correct?

@be-marc Maybe we can finish up this PR in the next days.

@pat-s
Member Author

pat-s commented Nov 11, 2019

@be-marc some tests are failing. Could you take a look?

@be-marc
Member

be-marc commented Nov 11, 2019

@pat-s The failing tests are resolved in #46

@pat-s
Member Author

pat-s commented Nov 20, 2019

Thanks again @be-marc!

@pat-s pat-s merged commit d56c74c into master Nov 20, 2019
@pat-s pat-s deleted the xgboost branch April 14, 2020 18:51
Development

Successfully merging this pull request may close these issues.

Enable more xgboost parameters
4 participants