
Support early stopping of prediction in CLI #565

Merged
merged 13 commits into master from prediction on May 30, 2017

Conversation

guolinke
Collaborator

@cbecker
I pushed many changes.
I think the type of the early_stop_instance can be auto-inferred from the objective function.

Another question: do users really need to create the early_stop_instance themselves? Isn't passing the parameters round_period and threshold_margin to the predict function enough?

@Laurae2
Contributor

Laurae2 commented May 29, 2017

Just some questions:

pred_early_stop: is this for prediction early stopping, or did the regular early stopping method change?

pred_early_stop_freq: is it good to set it to 1?

pred_early_stop_margin: what is the margin?

@guolinke
Collaborator Author

@Laurae2 this is only for prediction.
@cbecker can you answer the questions about pred_early_stop_freq and pred_early_stop_margin?

@cbecker
Contributor

cbecker commented May 29, 2017

Hello, I cannot take a look right now, but I'll do it first thing tomorrow morning. Thanks for the PR!

@cbecker
Contributor

cbecker commented May 30, 2017

@cbecker can you answer the questions about pred_early_stop_freq and pred_early_stop_margin?

Early stopping at prediction time for classification does the following:

  • For each sample, every pred_early_stop_freq iterations, look at the classifier's score and compute the margin (see the sketch below). For binary classification, the margin is simply the absolute value of the score; for multiclass, it is the highest score minus the second-highest score.
  • If margin > pred_early_stop_margin, stop iterating and return the current classifier score. This means we are confident enough about the label we are assigning to this sample.
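
For reference, here is a minimal C++ sketch of that margin check; the function names (PredictionMargin, ShouldStopEarly) are illustrative only, not the actual code of this PR:

```cpp
#include <algorithm>
#include <cmath>
#include <functional>
#include <vector>

// Margin for one sample, as described above:
// binary classification -> absolute value of the single raw score,
// multiclass -> highest score minus the second-highest score.
double PredictionMargin(const std::vector<double>& scores) {
  if (scores.size() == 1) {
    return std::fabs(scores[0]);
  }
  std::vector<double> sorted(scores);
  std::partial_sort(sorted.begin(), sorted.begin() + 2, sorted.end(),
                    std::greater<double>());
  return sorted[0] - sorted[1];
}

// Checked only every pred_early_stop_freq iterations: stop adding trees
// for this sample once its margin exceeds pred_early_stop_margin.
bool ShouldStopEarly(const std::vector<double>& scores,
                     double pred_early_stop_margin) {
  return PredictionMargin(scores) > pred_early_stop_margin;
}
```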

Choosing appropriate values for the parameters:

  • pred_early_stop_freq: the lowest value is 1, but then the margin is computed at every iteration, which can slow down prediction. I typically use `pred_early_stop_freq = 25` in my experiments, but it depends on the classifier and how it was trained.
  • pred_early_stop_margin: the lowest value is zero, which means prediction stops as soon as possible and the classification is likely to be very poor. pred_early_stop_margin = infinity means that early stopping is never applied. I typically use pred_early_stop_margin = 1.5. Again, this depends on the classifier: a classifier trained with a low shrinkage (e.g. 0.0001) will need a much lower value here than a classifier trained with shrinkage 0.5 (see the example config below).
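
On the CLI side that this PR adds, these settings would go into the prediction config. A rough sketch of such a config file, assuming the parameter names discussed in this thread (the file names are placeholders):

```
task = predict
data = test.txt
input_model = LightGBM_model.txt
# enable early stopping during prediction
pred_early_stop = true
# check the margin every 25 iterations
pred_early_stop_freq = 25
# stop for a sample once its margin exceeds 1.5
pred_early_stop_margin = 1.5
```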

Let me know if there's anything unclear, thanks for the CLI integration :)

@guolinke
Collaborator Author

@cbecker
I want to delete these two c_api functions: https://github.com/Microsoft/LightGBM/blob/master/src/c_api.cpp#L1118-L1140,
and pass three parameters (pred_early_stop, pred_early_stop_freq, pred_early_stop_margin) into the predict functions in the c_api directly.
Do you think that is okay?

@cbecker
Contributor

cbecker commented May 30, 2017

I want to delete these two c_api functions: https://github.com/Microsoft/LightGBM/blob/master/src/c_api.cpp#L1118-L1140,
and pass three parameters (pred_early_stop, pred_early_stop_freq, pred_early_stop_margin) into the predict functions in the c_api directly.
Do you think that is okay?

Yes, that works for me. I use the pure C++ API whenever I can and avoid the C API. About the latter, one thing I'm afraid of is that the predict() parameter list keeps growing, and if we need to change something in the prediction early-stopping code, we would have to update all of the Python and R APIs as well, because otherwise they would crash due to missing C function arguments. Right now, if we were to change something about early stopping, we would just need to modify LGBM_PredictionEarlyStopInstanceCreate and the changes would propagate automatically.

Therefore I think it is safer and cleaner the way it is now, but if you think it's advantageous to have it in a single function, that works for me too.

@guolinke
Collaborator Author

@cbecker
I want to use only one additional parameter, named "parameter", with string type, and parse what we need from it. As a result, we don't need to worry about the parameter list growing.
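
As an illustration of the idea, here is a simplified stand-in for that parsing step (ParseParameters is hypothetical; it is not LightGBM's actual ConfigBase::Str2Map):

```cpp
#include <sstream>
#include <string>
#include <unordered_map>

// Parse "key1=value1 key2=value2 ..." into a map, in the spirit of
// Str2Map. Simplified: no quoting, aliases, or error handling.
std::unordered_map<std::string, std::string> ParseParameters(const std::string& parameter) {
  std::unordered_map<std::string, std::string> result;
  std::istringstream iss(parameter);
  std::string token;
  while (iss >> token) {
    const auto pos = token.find('=');
    if (pos != std::string::npos) {
      result[token.substr(0, pos)] = token.substr(pos + 1);
    }
  }
  return result;
}

// The predict call could then read the early-stopping settings from the map, e.g.
//   auto params = ParseParameters(
//       "pred_early_stop=true pred_early_stop_freq=25 pred_early_stop_margin=1.5");
//   bool pred_early_stop = (params["pred_early_stop"] == "true");
```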

@cbecker
Contributor

cbecker commented May 30, 2017

@cbecker
I want to use only one additional parameter, named "parameter", with string type, and parse what we need from it. As a result, we don't need to worry about the parameter list growing.

I see, that makes sense. I don't know about the speed penalty though. We'd be doing this for every sample we want to classify, right?

@guolinke
Collaborator Author

@cbecker
No, it is a one-time init.
But it will create a new instance of EarlyStopInstance (and of Predictor) every time prediction is called.

@@ -175,8 +175,11 @@ class Booster {
} else {
is_raw_score = false;
}
auto param = ConfigBase::Str2Map(parameter);
Contributor

I'm really concerned about this. Do you know how long Str2Map() takes? We'll be doing this for each instance we want to classify. IMO it would be worth benchmarking before/after to see how much this hurts performance when only a few trees are being used (with early stopping disabled).

Contributor

I just saw your comment about the one-time init; I will have a second look now.

Collaborator Author

We won't call the c_api once per sample, right?
Prediction is called once for many instances.

@cbecker
Contributor

cbecker commented May 30, 2017

No, it is a one-time init.
But it will create a new instance of EarlyStopInstance (and of Predictor) every time prediction is called.

I agree this makes sense when many samples are classified in a single call, which I guess is the goal of this function. For a moment I was confused by the PredictRaw() function; thanks for the clarification.

@cbecker
Contributor

cbecker commented May 30, 2017

We won't call the c_api once per sample, right?
Prediction is called once for many instances.

Yes, it makes sense.

guolinke merged commit 6d4c7b0 into master on May 30, 2017
guolinke deleted the prediction branch on May 31, 2017 at 03:15
guolinke added a commit that referenced this pull request Oct 9, 2017
* fix multi-threading.

* fix name style.

* support in CLI version.

* remove warnings.

* Not default parameters.

* fix if...else... .

* fix bug.

* fix warning.

* refine c_api.

* fix R-package.

* fix R's warning.

* fix tests.

* fix pep8 .
lock bot locked as resolved and limited conversation to collaborators on Mar 12, 2020