LightGBM Crash issue #4918

Merged
merged 5 commits into from
Mar 5, 2020
4 changes: 4 additions & 0 deletions test/Microsoft.ML.Tests/Microsoft.ML.Tests.csproj
@@ -64,4 +64,8 @@
       <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
     </None>
   </ItemGroup>
+
+  <ItemGroup>
+    <Folder Include="Properties\" />
+  </ItemGroup>
 </Project>
9 changes: 0 additions & 9 deletions test/Microsoft.ML.Tests/Properties/AssemblyInfo.cs

This file was deleted.

@@ -811,12 +811,13 @@ private void ExecuteTFTransformMNISTConvTrainingTest(bool shuffle, int? shuffleS
                     batchSize: 20))
                 .Append(mlContext.Transforms.Concatenate("Features", "Prediction"))
                 .AppendCacheCheckpoint(mlContext)
+                // Attention: Do not set NumberOfThreads here; leave it at its default value to avoid a test crash.
+                // Details can be found here: https://github.com/dotnet/machinelearning/pull/4918
                 .Append(mlContext.MulticlassClassification.Trainers.LightGbm(new Trainers.LightGbm.LightGbmMulticlassTrainer.Options()
                 {
                     LabelColumnName = "Label",
                     FeatureColumnName = "Features",
                     Seed = 1,
-                    NumberOfThreads = 1,
                     NumberOfIterations = 1
                 }));

32 changes: 26 additions & 6 deletions test/Microsoft.ML.Tests/TrainerEstimators/TreeEstimators.cs
@@ -52,10 +52,11 @@ public void LightGBMBinaryEstimator()
         {
             var (pipe, dataView) = GetBinaryClassificationPipeline();

+            // Attention: Do not set NumberOfThreads here; leave it at its default value to avoid a test crash.
+            // Details can be found here: https://github.com/dotnet/machinelearning/pull/4918
             var trainer = ML.BinaryClassification.Trainers.LightGbm(new LightGbmBinaryTrainer.Options
             {
                 NumberOfLeaves = 10,
-                NumberOfThreads = 1,
                 MinimumExampleCountPerLeaf = 2,
                 UnbalancedSets = false, // default value
             });
@@ -73,10 +74,11 @@ public void LightGBMBinaryEstimatorUnbalanced()
         {
             var (pipe, dataView) = GetBinaryClassificationPipeline();

+            // Attention: Do not set NumberOfThreads here; leave it at its default value to avoid a test crash.
+            // Details can be found here: https://github.com/dotnet/machinelearning/pull/4918
             var trainer = ML.BinaryClassification.Trainers.LightGbm(new LightGbmBinaryTrainer.Options
             {
                 NumberOfLeaves = 10,
-                NumberOfThreads = 1,
                 MinimumExampleCountPerLeaf = 2,
Contributor:

Can we add a detailed, clearly visible note in the code explaining why this is done?
Can we also add a cross-reference to that note wherever NumberOfThreads = 1 has been removed?

Contributor Author:

Yes, I can add // Attention comments that explain this.

                 UnbalancedSets = true,
             });
@@ -98,10 +100,11 @@ public void LightGBMBinaryEstimatorCorrectSigmoid()
             var (pipe, dataView) = GetBinaryClassificationPipeline();
             var sigmoid = .789;

+            // Attention: Do not set NumberOfThreads here; leave it at its default value to avoid a test crash.
+            // Details can be found here: https://github.com/dotnet/machinelearning/pull/4918
             var trainer = ML.BinaryClassification.Trainers.LightGbm(new LightGbmBinaryTrainer.Options
             {
                 NumberOfLeaves = 10,
-                NumberOfThreads = 1,
                 MinimumExampleCountPerLeaf = 2,
                 Sigmoid = sigmoid
             });
@@ -218,9 +221,11 @@ public void FastTreeRegressorEstimator()
         public void LightGBMRegressorEstimator()
         {
             var dataView = GetRegressionPipeline();
+
+            // Attention: Do not set NumberOfThreads here; leave it at its default value to avoid a test crash.
+            // Details can be found here: https://github.com/dotnet/machinelearning/pull/4918
             var trainer = ML.Regression.Trainers.LightGbm(new LightGbmRegressionTrainer.Options
             {
-                NumberOfThreads = 1,
                 NormalizeFeatures = NormalizeOption.Warn,
                 L2CategoricalRegularization = 5,
             });
@@ -930,8 +935,15 @@ public void FastTreeTweedieRegressorTestSummary()
         public void LightGbmRegressorTestSummary()
         {
             var dataView = GetRegressionPipeline();
+
+            // Attention: Do not set NumberOfThreads here; leave it at its default value to avoid a test crash.
+            // Details can be found here: https://github.com/dotnet/machinelearning/pull/4918
             var trainer = ML.Regression.Trainers.LightGbm(
-                new LightGbmRegressionTrainer.Options { NumberOfIterations = 10, NumberOfThreads = 1, NumberOfLeaves = 5});
+                new LightGbmRegressionTrainer.Options
+                {
+                    NumberOfIterations = 10,
+                    NumberOfLeaves = 5
+                });

             var transformer = trainer.Fit(dataView);

@@ -984,8 +996,16 @@ public void FastForestBinaryClassificationTestSummary()
         public void LightGbmBinaryClassificationTestSummary()
         {
             var (pipeline, dataView) = GetOneHotBinaryClassificationPipeline();
+
+            // Attention: Do not set NumberOfThreads here; leave it at its default value to avoid a test crash.
+            // Details can be found here: https://github.com/dotnet/machinelearning/pull/4918
             var trainer = pipeline.Append(ML.BinaryClassification.Trainers.LightGbm(
-                new LightGbmBinaryTrainer.Options { NumberOfIterations = 10, NumberOfThreads = 1, NumberOfLeaves = 5, UseCategoricalSplit = true }));
+                new LightGbmBinaryTrainer.Options
+                {
+                    NumberOfIterations = 10,
+                    NumberOfLeaves = 5,
+                    UseCategoricalSplit = true
+                }));

             var transformer = trainer.Fit(dataView);