Skip to content

Conversation

@zanmato1984
Copy link

@zanmato1984 zanmato1984 commented Jan 17, 2025

This refinement is based on facts that:

  1. "Sort and mark duplicate" ONLY concerns:
    1. The shape of the input (array vs chunked array);
    2. The input data type and physical type;
    3. If duplicates are needed (can be simply deduced from the function options).
  2. "Create rankings" ONLY concerns:
    1. The NullPartitionResult generated by "sort and mark duplicate";
    2. The ranking type (ordinal vs percentile).

Therefore we can decouple these two operations from the current Ranker, and only combine them in the concrete RankXXXMetaFunctions, which themselves can be CRTP-ed.

This refinement:

  1. Maintains the functionality and performance;
  2. Totally avoids virtual functions;
  3. Doesn't introduce more code explosion than before;
  4. Doesn't increase the complexity (IMHO it might actually reduce the complexity a bit).

@github-actions
Copy link

Thanks for opening a pull request!

If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

See also:

Copy link
Owner

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is neat, thank you. Can you just address the two comments I posted?

@pitrou
Copy link
Owner

pitrou commented Jan 20, 2025

Or I can just do so myself, actually.

@zanmato1984
Copy link
Author

Thank you. Let me do this right away.

@zanmato1984
Copy link
Author

Comments addressed. @pitrou

@pitrou pitrou merged this pull request into pitrou:percentile-rank-kernel Jan 20, 2025
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants