Skip to content

Conversation

@Jolanrensen
Copy link
Collaborator

@Jolanrensen Jolanrensen commented Mar 10, 2025

Helps and follows #961

To be merged after #1078

  • Redid implementation of mean. It's now based on TwoStepNumbersAggregator, such that mixed numbers are unified first.
  • Big number support is dropped.
  • Created generic rowX() function aggregateOfRow(), used by rowMean() and rowMeanOf<T>()
  • Removed big numbers from describe()
  • Mean statistics fixes #1091 (comment)

@Jolanrensen Jolanrensen marked this pull request as ready for review March 11, 2025 15:25
@Jolanrensen Jolanrensen mentioned this pull request Mar 12, 2025
9 tasks
Copy link
Collaborator

@AndreiKingsley AndreiKingsley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some notes in comments for public mean about new behavior, so that we don't forget to mention it there when writing KDocs in the future.

# Conflicts:
#	core/api/core.api
…lue. This simplifies logic in a lot of places. It can still be nullable for aggregators that require it (like min/max).
@Jolanrensen
Copy link
Collaborator Author

Jolanrensen commented Mar 17, 2025

Fixed feedback, but I also found some other important stuff:

  • We still had a lot of internal mean functions that we didn't use anymore, because we calculate mean as double.
  • Aggregators had a mandatory nullable return type, however this adds unnecessary logic for aggregators that never return null, such as mean, so I rewrote some of that.
  • Also, aggregators were allowed nulls for iterables, but for columns they were filtered out. They are now filtered out in all cases.
  • More will follow in later aggregator/statistics-PRs

@Jolanrensen Jolanrensen requested review from AndreiKingsley and removed request for AndreiKingsley March 17, 2025 15:29
…s of aggregateOf. Fixed nullability in lambda return types. Made sure all lambdas are crossinline. Added test for medianOf to check everything still works as expected.
Copy link
Collaborator

@AndreiKingsley AndreiKingsley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! However, I wonder - are there any aggregators that take null into account?

@Jolanrensen
Copy link
Collaborator Author

Great! However, I wonder - are there any aggregators that take null into account?

They all filter nulls out before aggregating. This was always done already for columns and it's now always the case.

There doesn't seem to be a statistic function that does something special with nulls, so I think this is the best choice.

@Jolanrensen Jolanrensen merged commit 3a71ce3 into master Mar 18, 2025
6 checks passed
@Jolanrensen Jolanrensen added this to the 0.16 milestone Mar 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants