Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve DataFrame Arithmetics implementation #6763

Merged
merged 6 commits into from
Jul 28, 2023
Merged

Improve DataFrame Arithmetics implementation #6763

merged 6 commits into from
Jul 28, 2023

Conversation

asmirnov82
Copy link
Contributor

@asmirnov82 asmirnov82 commented Jul 13, 2023

Fixes #6762

From the issue:

The main goals of this Issue are:

To simplify the code related to all Element Wise comparison operations
Make consistent return types for all PrimitiveColumnContainer.BinaryOperations
To avoid unnecessary memory copying and type conversion
Currently all Elementwise comparison operations are defined to have a return type as one of method parameters:

void Operation ( ... , PrimitiveColumnContainer ret);

to correctly create this parameter the caller has to the container by himself. In current implementation cloning of the existing column with changing it's type is widely used:

PrimitiveDataFrameColumn retboolColumn = CloneAsBooleanColumn();
_columnContainer.Operation( ..., retboolColumn._columnContainer);

Cloning of column of any type as Boolean Column is error-prone and potentialy may lead to unnecessary memory copying and speed decreasing.

Possible solution is to change the signature of all Element Wise internal methods to correctly return required results and to create required objects inside the method, also make consistent retun type of PrimitiveColumnContainer for all PrimitiveColumnContainer.BinaryOperations (currently comparison operations return DataFrameColumn as other operation returns ColumnContainer)

Only internal or private method's signatures are proposed to be changed. Public API should not be affected

@codecov
Copy link

codecov bot commented Jul 13, 2023

Codecov Report

Merging #6763 (d98d15f) into main (caee3c2) will increase coverage by 0.08%.
The diff coverage is 61.91%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6763      +/-   ##
==========================================
+ Coverage   68.88%   68.96%   +0.08%     
==========================================
  Files        1216     1216              
  Lines      251137   251245     +108     
  Branches    26269    26322      +53     
==========================================
+ Hits       172997   173277     +280     
+ Misses      71313    71101     -212     
- Partials     6827     6867      +40     
Flag Coverage Δ
Debug 68.96% <61.91%> (+0.08%) ⬆️
production 63.47% <51.71%> (+0.09%) ⬆️
test 88.93% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/Microsoft.Data.Analysis/DataFrame.IO.cs 80.60% <ø> (-0.17%) ⬇️
....Data.Analysis/DataFrameColumn.BinaryOperations.cs 0.00% <0.00%> (ø)
...alysis/PrimitiveDataFrameColumn.BinaryOperators.cs 9.50% <ø> (ø)
...ata.Analysis/PrimitiveDataFrameColumnArithmetic.cs 49.67% <ø> (+0.78%) ⬆️
...a.Analysis/PrimitiveDataFrameColumnComputations.cs 50.37% <ø> (+0.56%) ⬆️
...eColumn.BinaryOperationImplementations.Exploded.cs 52.27% <31.41%> (+4.61%) ⬆️
...lysis/PrimitiveDataFrameColumn.BinaryOperations.cs 42.00% <37.14%> (+0.53%) ⬆️
src/Microsoft.Data.Analysis/DateTimeComputation.cs 39.16% <100.00%> (+2.64%) ⬆️
...lysis/PrimitiveColumnContainer.BinaryOperations.cs 82.85% <100.00%> (-1.36%) ⬇️
...icrosoft.Data.Analysis/PrimitiveColumnContainer.cs 86.53% <100.00%> (+2.91%) ⬆️
... and 5 more

... and 7 files with indirect coverage changes

@JakeRadMSFT JakeRadMSFT merged commit a823199 into dotnet:main Jul 28, 2023
23 checks passed
@asmirnov82 asmirnov82 deleted the refactor_dataframe_arithmetics branch July 28, 2023 15:16
@ghost ghost locked as resolved and limited conversation to collaborators Aug 27, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve DataFrame arithmetic implementation
2 participants