Fix incorrect DataFrame min max computation with NULL #6734

Merged
merged 5 commits into from
Jul 7, 2023

Conversation

asmirnov82
Contributor

Fixes #6733

@codecov

codecov bot commented Jun 14, 2023

Codecov Report

Merging #6734 (c28e1f0) into main (184e661) will increase coverage by 0.02%.
The diff coverage is 100.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6734      +/-   ##
==========================================
+ Coverage   68.87%   68.89%   +0.02%     
==========================================
  Files        1216     1216              
  Lines      250915   251125     +210     
  Branches    26259    26311      +52     
==========================================
+ Hits       172825   173021     +196     
+ Misses      71265    71242      -23     
- Partials     6825     6862      +37     
| Flag | Coverage Δ |
|---|---|
| Debug | 68.89% <100.00%> (+0.02%) ⬆️ |
| production | 63.39% <100.00%> (+0.02%) ⬆️ |
| test | 88.91% <100.00%> (+<0.01%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
|---|---|
| ...a.Analysis/PrimitiveDataFrameColumnComputations.cs | 50.37% <ø> (+0.56%) ⬆️ |
| src/Microsoft.Data.Analysis/DateTimeComputation.cs | 39.16% <100.00%> (+2.64%) ⬆️ |
| ...icrosoft.Data.Analysis/PrimitiveColumnContainer.cs | 83.04% <100.00%> (ø) |
| ....Analysis/PrimitiveDataFrameColumn.Computations.cs | 77.77% <100.00%> (ø) |
| ...oft.Data.Analysis/PrimitiveDataFrameColumn.Sort.cs | 87.34% <100.00%> (ø) |
| ...icrosoft.Data.Analysis/PrimitiveDataFrameColumn.cs | 81.37% <100.00%> (ø) |
| ...st/Microsoft.Data.Analysis.Tests/DataFrameTests.cs | 99.27% <100.00%> (+0.01%) ⬆️ |

... and 10 files with indirect coverage changes

@JakeRadMSFT
Contributor

JakeRadMSFT commented Jun 23, 2023

Thanks @asmirnov82! Can you merge this into https://github.com/JakeRadMSFT/machinelearning/tree/u/jakerad/generic-math as well?

It's possible this was fixed by other changes in the generic math branch. Can you make sure it meets your needs, and add or update tests as needed?

Contributor

@JakeRadMSFT JakeRadMSFT left a comment

I'd like the null check to stay consistent everywhere. We can analyze perf later and see if inlining would help.

@ghost ghost added the needs-author-action label Jun 23, 2023
@ghost ghost removed the needs-author-action label Jul 6, 2023
@asmirnov82
Contributor Author

> Thanks @asmirnov82! Can you merge this into https://github.com/JakeRadMSFT/machinelearning/tree/u/jakerad/generic-math as well?
>
> It's possible this was fixed by other changes in the generic math branch. Can you make sure it meets your needs, and add or update tests as needed?

This wasn't fixed by the latest changes in the generic-math branch. I merged my changes; however, the new generic implementation (the CalculateReduction method) still needs additional work. Until that is done, 2 new unit tests fail: TestIntComputations_MaxMin_WithNulls and TestIntComputations_MaxMin_OnEmptyColumn.
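
For readers following along, here is a minimal, self-contained sketch of the behavior those two tests appear to target: Min and Max that skip null entries and return null when the column is empty or contains only nulls. It is not the library's CalculateReduction implementation. The class name `NullAwareReduction` and its `Min`/`Max` helpers are hypothetical and use only the .NET base class library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of Microsoft.Data.Analysis.
public static class NullAwareReduction
{
    // Returns the largest non-null value, or null if there is none.
    public static T? Max<T>(IEnumerable<T?> values) where T : struct, IComparable<T>
    {
        T? result = null;
        foreach (T? value in values)
        {
            if (!value.HasValue)
            {
                continue; // null entries never take part in the comparison
            }
            if (!result.HasValue || value.Value.CompareTo(result.Value) > 0)
            {
                result = value;
            }
        }
        return result;
    }

    // Returns the smallest non-null value, or null if there is none.
    public static T? Min<T>(IEnumerable<T?> values) where T : struct, IComparable<T>
    {
        T? result = null;
        foreach (T? value in values)
        {
            if (!value.HasValue)
            {
                continue; // null entries never take part in the comparison
            }
            if (!result.HasValue || value.Value.CompareTo(result.Value) < 0)
            {
                result = value;
            }
        }
        return result;
    }
}
```

With this sketch, `NullAwareReduction.Max<int>(new int?[] { -5, null, -2 })` yields `-2` and `NullAwareReduction.Min<int>(new int?[] { -5, null, -2 })` yields `-5`, while an empty input yields `null`. The all-negative input is chosen deliberately: a reduction that treated a null slot as `default(int)` would wrongly report `0` as the maximum.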

@JakeRadMSFT
Contributor

JakeRadMSFT commented Jul 6, 2023

> > Thanks @asmirnov82! Can you merge this into https://github.com/JakeRadMSFT/machinelearning/tree/u/jakerad/generic-math as well?
> > It's possible this was fixed by other changes in the generic math branch. Can you make sure it meets your needs, and add or update tests as needed?
>
> This wasn't fixed by the latest changes in the generic-math branch. I merged my changes; however, the new generic implementation (the CalculateReduction method) still needs additional work. Until that is done, 2 new unit tests fail: TestIntComputations_MaxMin_WithNulls and TestIntComputations_MaxMin_OnEmptyColumn.

Awesome @asmirnov82 - I've made those changes and merged. Can you take a look at them when you have a chance?

Changes for Reduction method
JakeRadMSFT@2856d3a

If you're up for it - I'd love for you to review my generic math PR once I have the .NET 8 branch set up.

It should be ready for review after next week.

@JakeRadMSFT
Contributor

Merged into Generic Math branch!

@JakeRadMSFT
Contributor

/azp run

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

@JakeRadMSFT JakeRadMSFT merged commit 69cc4bc into dotnet:main Jul 7, 2023
@asmirnov82 asmirnov82 deleted the 6733_fix_incorrect_min_max_computation branch July 8, 2023 06:42
@ghost ghost locked as resolved and limited conversation to collaborators Aug 7, 2023
Development

Successfully merging this pull request may close these issues.

DataFrame DateTime column Min and Max calculation doesn't support NULL
2 participants