[C++][Compute] Add percentile rank function #45190

@pitrou

Describe the enhancement requested

Arrow C++ already offers a rank function: https://arrow.apache.org/docs/cpp/compute.html#sorts-and-partitions

It would be useful to add a percentile rank function according to this definition: https://en.wikipedia.org/wiki/Percentile_rank

Proposed API:

/// \brief Percentile rank options
class ARROW_EXPORT PercentileRankOptions : public FunctionOptions {
 public:
  explicit PercentileRankOptions(std::vector<SortKey> sort_keys = {},
                                 NullPlacement null_placement = NullPlacement::AtEnd,
                                 double factor = 1.0);
  /// Convenience constructor for array inputs
  explicit PercentileRankOptions(SortOrder order,
                                 NullPlacement null_placement = NullPlacement::AtEnd,
                                 double factor = 1.0)
      : PercentileRankOptions({SortKey("", order)}, null_placement, factor) {}

  static constexpr char const kTypeName[] = "PercentileRankOptions";
  static PercentileRankOptions Defaults() { return PercentileRankOptions(); }

  /// Column key(s) to order by and how to order by these sort keys.
  std::vector<SortKey> sort_keys;
  /// Whether nulls and NaNs are placed at the start or at the end
  NullPlacement null_placement;
  /// Factor to apply to the output.
  /// Use 1.0 for results in (0, 1), 100.0 for percentages, etc.
  double factor;
};
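To make the intended semantics concrete, here is a minimal standalone sketch of the linked percentile rank definition, where each value's rank is (count of smaller values + half the count of equal values) / N, scaled by `factor`. It is not Arrow code: `PercentileRanks` is a hypothetical helper name, and the sketch assumes plain `double` input with no nulls or NaNs and ascending order only, so `sort_keys` and `null_placement` are not modeled.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Arrow: returns the percentile rank of each
// input value, computed as (count_less + 0.5 * count_equal) / N * factor.
// Assumes the input contains no NaNs (std::sort requires a strict weak order).
std::vector<double> PercentileRanks(const std::vector<double>& values,
                                    double factor = 1.0) {
  const std::size_t n = values.size();

  // Work on a sorted copy so counts can be found by binary search.
  std::vector<double> sorted(values);
  std::sort(sorted.begin(), sorted.end());

  std::vector<double> out;
  out.reserve(n);
  for (double v : values) {
    // [first, second) is the run of elements equal to v in the sorted copy.
    auto range = std::equal_range(sorted.begin(), sorted.end(), v);
    const double less = static_cast<double>(range.first - sorted.begin());
    const double equal = static_cast<double>(range.second - range.first);
    out.push_back((less + 0.5 * equal) / static_cast<double>(n) * factor);
  }
  return out;
}
```

With this sketch, `PercentileRanks({10.0, 20.0, 20.0, 30.0}, 100.0)` yields 12.5, 50, 50 and 87.5: tied values share one rank, and with `factor = 1.0` every result for a non-empty input falls strictly between 0 and 1, matching the range described in the `factor` comment above.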

Component(s)

C++
