-
Notifications
You must be signed in to change notification settings - Fork 1.9k
LightGBM Crash issue #4918
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LightGBM Crash issue #4918
Conversation
@@ -76,7 +75,6 @@ public void LightGBMBinaryEstimatorUnbalanced() | |||
var trainer = ML.BinaryClassification.Trainers.LightGbm(new LightGbmBinaryTrainer.Options | |||
{ | |||
NumberOfLeaves = 10, | |||
NumberOfThreads = 1, | |||
MinimumExampleCountPerLeaf = 2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a detailed note somewhere in code visibly about why this is done?
Can we also add a cross reference to that comment wherever we have removed the NumberOfThreads = 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I can add some comments with // Attention tags and put explain on comments
In reply to: 388442580 [](ancestors = 388442580)
Yes, these are the crashing tests, they not disabled because I has check in a mitigation before (disable test parallelization). This time I have remove the mitigation that remove the AssemblyInfo.cs from ML.Tests assembly In reply to: 595344964 [](ancestors = 595344964) Refers to: test/Microsoft.ML.Tests/TrainerEstimators/TreeEstimators.cs:51 in cf16c7d. [](commit_id = cf16c7d, deletion_comment = False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
microsoft/LightGBM#2820
LightGBM has dependency on OpenMP multi-threading library. In our tests we are setting number of threads to be 1 for LightGBM and LightGBM also sets OpenMP to use only 1 thread. While OpenMP is also used by other native libraries and they also tend to set number of threads for OpenMP to use (by default this is number of cores, in our case it is 2 as we are using https://docs.microsoft.com/en-us/azure/virtual-machines/dv2-dsv2-series#dsv2-series). This setting(number of threads) is global in process and cause LightGBM referencing OpenMP from single threads to multi-threads. LightGBM is using number of threads for indexing and thus cause index out of range in native code and crash the process.
By this fix, we are not force single threading when run LightGBM related tests thus default behavior is applied and all the libraries use same number of threads for OpenMP.
Idealy LightGBM better not to rely number of threads to do indexing as this setting is global and likely be override by other library.