Description
I wanted to use one of the many objectives in xgboost described here. So I went on and wrote this code to set the objective in set_engine(), as ... are passed to the engine as described in the docs for ... in help(set_engine, package = 'parsnip'):
Any optional arguments associated with the chosen computational engine. [...]
However, I think the objective is not respected, as it is always derived from mode, which can only be regression or classification.
library(parsnip)
library(magrittr)
# boost_tree
model <- parsnip::boost_tree(mtry = NULL, trees = 5, mode = 'regression', learn_rate = .1)
model <- parsnip::set_engine(model, "xgboost", objective = "reg:squaredlogerror")
set.seed(4)
data <- matrix(rnorm(1000), ncol = 4) %>%
  tibble::as_tibble() %>%
  dplyr::mutate(y = sample(1000/4))
fit <- model %>%
  parsnip::fit(y ~ ., data = data)
predict(fit, data)
This problem actually came up in curso-r/treesnip#24, where I wanted to use the quantile objective for {lightgbm} but could not, and the answer was: "We just copied that code from parsnip". In the interest of a consistent solution to this problem, I am creating an issue here. As described in curso-r/treesnip#24, the maintainers of {treesnip} are happy to let the user override the objective, but I thought this topic merits a discussion here, as the same problem applies to {xgboost} and other engines too.
I think it should be possible to override the objective if it is consistent with the mode argument, because it can in some instances be orthogonal to the mode (e.g. in my example, I want to use the squared log error) or specify the mode more precisely, e.g. survival:cox is a Cox regression, which I think is conceptually consistent with mode = 'regression'.
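The proposed behavior could be sketched roughly as follows. This is only an illustration in base R, not parsnip's actual implementation: resolve_objective() is a hypothetical helper, the per-mode defaults are assumed, and the consistency rule (matching the objective's prefix before the colon against a list of families allowed for each mode) is just one possible way to decide what "consistent with the mode" means.

```r
# Hypothetical helper, NOT part of parsnip: pick the objective for an
# xgboost-style engine, letting a user-supplied value override the default
# that would otherwise be derived from `mode`.
resolve_objective <- function(mode, engine_args = list()) {
  # Assumed per-mode defaults, for illustration only.
  defaults <- c(
    regression     = "reg:squarederror",
    classification = "binary:logistic"
  )
  if (!mode %in% names(defaults)) {
    stop("Unknown mode: ", mode)
  }

  # No user-supplied objective: fall back to the mode default.
  objective <- engine_args[["objective"]]
  if (is.null(objective)) {
    return(defaults[[mode]])
  }

  # Assumed consistency rule: the objective family (the text before ':')
  # must be one of the families allowed for this mode.
  allowed <- list(
    regression     = c("reg", "count", "survival"),
    classification = c("binary", "multi")
  )
  family <- sub(":.*$", "", objective)
  if (!family %in% allowed[[mode]]) {
    stop("Objective '", objective, "' is not consistent with mode '", mode, "'")
  }
  objective
}

# Default derived from the mode.
resolve_objective("regression")
# User override that is orthogonal to the mode.
resolve_objective("regression", list(objective = "reg:squaredlogerror"))
# User override that specifies the mode more precisely.
resolve_objective("regression", list(objective = "survival:cox"))
```

With a rule like this, an objective such as binary:logistic combined with mode = 'regression' would raise an error instead of being silently replaced.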