Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Testing light gbm bad allocation #6968

Merged
merged 1 commit into from
Jan 23, 2024

Conversation

michaelgsharp
Copy link
Member

@michaelgsharp michaelgsharp commented Jan 19, 2024

Changes some of the LightGBM tests to run sequentially to try and fix the bad allocation issue.

This is in regards to issue #6961. I'll leave that issue open so we can continue to monitor after this goes in.

@ghost ghost assigned michaelgsharp Jan 19, 2024
Copy link

codecov bot commented Jan 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (5e28578) 68.80% compared to head (b9e48ef) 68.81%.
Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #6968   +/-   ##
=======================================
  Coverage   68.80%   68.81%           
=======================================
  Files        1249     1249           
  Lines      249686   249686           
  Branches    25485    25485           
=======================================
+ Hits       171806   171814    +8     
+ Misses      71286    71282    -4     
+ Partials     6594     6590    -4     
Flag Coverage Δ
Debug 68.81% <100.00%> (+<0.01%) ⬆️
production 63.28% <ø> (+<0.01%) ⬆️
test 88.41% <100.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
test/Microsoft.ML.TestFramework/BaseTestClass.cs 88.23% <100.00%> (ø)
...osoft.ML.Tests/TrainerEstimators/TreeEstimators.cs 97.84% <ø> (ø)

... and 3 files with indirect coverage changes

@michaelgsharp
Copy link
Member Author

/azp run

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@michaelgsharp
Copy link
Member Author

/azp run

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@michaelgsharp
Copy link
Member Author

/azp run

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@michaelgsharp
Copy link
Member Author

/azp run

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

[CollectionDefinition(nameof(NoParallelizationDefinition), DisableParallelization = true)]
public class NoParallelizationDefinition { }

[Collection(nameof(NoParallelizationDefinition))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there something specific about these LightGBM tests that differs from others?
https://github.com/search?q=repo%3Adotnet%2Fmachinelearning%20LightGBMFact&type=code

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly I'm not sure. I just put this tag around the test I saw failing and it seems to have worked. So its possible there is? Or its possible it just has to do with test ordering? I'm not sure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to at comment that we've observed concurrency issues in LightGBM when running these in parallel. Consider keeping an issue open to track that and link to the issue since we still don't understand why it's failing.

@michaelgsharp michaelgsharp marked this pull request as ready for review January 22, 2024 22:56
Copy link
Member

@ericstj ericstj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems reasonable since it decreases CI flakiness - but make sure we have this commented with an issue linking.

@michaelgsharp michaelgsharp merged commit 54fa44f into dotnet:main Jan 23, 2024
25 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Feb 23, 2024
@ericstj
Copy link
Member

ericstj commented Mar 13, 2024

Just saw this in 3.0 branch Code Coverage build. Maybe we should backport this test reliability fix?

@ericstj
Copy link
Member

ericstj commented Mar 13, 2024

/backport to release/3.0

Copy link

Started backporting to release/3.0: https://github.com/dotnet/machinelearning/actions/runs/8272862595

@github-actions github-actions bot unlocked this conversation Mar 13, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 13, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants