Contributor

@ritvibhatt ritvibhatt commented Sep 19, 2025

Description

Add support for min/max statistical eval functions, allowing users to find maximum and minimum values among multiple arguments within a single row.

  • Updated parser to add max and min syntax for eval command
  • Added MaxFunction and MinFunction classes with Calcite UDF implementation
  • Updated PPLBuiltinOperators and PPLFuncImpTable to register the new functions

Usage Examples

-- Returns the larger value between age field and 30 for each row
source=accounts | eval max_age = MAX(age, 30) | fields age, max_age

-- Returns 'John' or the value of firstname, whichever is lexicographically larger (age never wins because strings rank above numbers)
source=accounts | eval result = MAX(age, 'John', firstname) | fields age, firstname, result

-- Returns the smaller of age and 35 (firstname is never the minimum because strings rank above numbers)
source=accounts | eval result = MIN(age, 35, firstname) | fields age, firstname, result

Related Issues

Resolves #4341

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
Collaborator

@RyanL1997 RyanL1997 left a comment

Hi @ritvibhatt, thanks for the change. I left some comments to better understand it.

@RyanL1997 RyanL1997 added the calcite, PPL, backport 2.19-dev, and feature labels Sep 22, 2025
Collaborator

@dai-chen dai-chen left a comment

A high level QQ: is supporting max(int, string) a hard requirement for now?

I'm just thinking otherwise:

  1. We probably can reuse the existing array_min/max function;
  2. The implicit conversion (string -> numeric) can be supported in #4349. cc: @penghuo

:local:
:depth: 1

.. versionadded:: 3.3.0
Collaborator

Contributor Author

Updated the doc

boolean aIsNumeric = isNumeric(a);
boolean bIsNumeric = isNumeric(b);

if (aIsNumeric != bIsNumeric) {
Collaborator

What is the expectation for max(4, "2")? It should be 4, right?

Contributor Author

Max(4, "2") should result in "2" since strings are always considered larger than numeric values
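
As a rough illustration of the rule described in this reply, here is a minimal Java sketch. It is not the comparator merged in this PR: the class name `MixedTypeOrderSketch` and the `max` helper are hypothetical, and it only assumes what the thread states, namely that a value counts as numeric when it is a `Number` and that any string ranks above any number.

```java
import java.util.Comparator;

/** Hypothetical sketch, not the merged implementation. */
final class MixedTypeOrderSketch {
    // Assumed rule from the thread: numbers always sort below strings.
    static final Comparator<Object> ORDER = (a, b) -> {
        boolean aIsNumeric = a instanceof Number;
        boolean bIsNumeric = b instanceof Number;
        if (aIsNumeric != bIsNumeric) {
            // Mixed kinds: the numeric operand is always the smaller one.
            return aIsNumeric ? -1 : 1;
        }
        if (aIsNumeric) {
            // Both numeric: compare by value.
            return Double.compare(((Number) a).doubleValue(), ((Number) b).doubleValue());
        }
        // Both non-numeric: compare lexicographically.
        return a.toString().compareTo(b.toString());
    };

    static Object max(Object a, Object b) {
        return ORDER.compare(a, b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(max(4, "2")); // prints 2 (the string wins)
        System.out.println(max(4, 30));  // prints 30
    }
}
```

Under this rule the numeric value of the string is never inspected, which is why "2" beats 4.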

@penghuo
Collaborator

penghuo commented Sep 23, 2025

> A high level QQ: is supporting max(int, string) a hard requirement for now?
>
> I'm just thinking otherwise:
>
>   1. We probably can reuse the existing array_min/max function;
>   2. The implicit conversion (string -> numeric) can be supported in [RFC] Support permissive mode in PPL #4349. cc: @penghuo

array_min/max requires all arguments to have the same type.

max/min requires special handling, and an implicit cast may not work. For example, the expectation is:

max(20, "4")    should return 20
max(20, "4a")   should return 4a
  • Converting fields to int does not work:
array_max(20, 4)      should return 20
array_max(20, null)   should return 20
  • Converting fields to string does not work:
array_max("20", "4")    should return "4"
array_max("20", "4a")   should return "4a"
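
The two bullets above come down to how the same values order under numeric versus string comparison. A minimal plain-Java sketch of that difference follows; the class and method names are hypothetical, and it does not call or reproduce array_max itself.

```java
/** Hypothetical sketch contrasting numeric and lexicographic ordering. */
final class StringVsNumericOrderSketch {
    // Maximum when both operands are treated as integers.
    static int maxAsInts(int a, int b) {
        return Math.max(a, b);
    }

    // Maximum when both operands are treated as strings (character by character).
    static String maxAsStrings(String a, String b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(maxAsInts(20, 4));         // prints 20
        System.out.println(maxAsStrings("20", "4"));  // prints 4, because '4' > '2'
        System.out.println(maxAsStrings("20", "4a")); // prints 4a
    }
}
```

Casting everything to one type therefore cannot produce both expected results at once, which is the motivation given here for special handling.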

@dai-chen
Collaborator

@ritvibhatt I synced with @penghuo offline. For Q2, it doesn’t seem to be a type conversion issue as I originally thought, but rather a data sorting one. For example, we could define a custom comparator in Java to handle the new sorting rule for numeric and string values, and then apply it in min/max/sort APIs. Perhaps we can do something similar within the Calcite accumulator?

ritvibhatt and others added 3 commits September 29, 2025 11:27
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
Signed-off-by: ritvibhatt <53196324+ritvibhatt@users.noreply.github.com>
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
import java.util.Comparator;

/** Comparator for MAX operations where strings have higher precedence than numbers. */
public class MaxTypeComparator implements Comparator<Object> {
Collaborator

Is MinTypeComparator simply the inverse of MaxTypeComparator? I'm thinking of keeping only one comparator and using it in Min/MaxFunction below via Java Stream's min/max(comparator) API.

Also, it may be worth adding a dedicated unit test to show its behavior for num-num, num-string, string-string, etc.
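
A sketch of what this suggestion could look like, assuming a single ordering in which numbers rank below strings. The class `SingleComparatorSketch` and its helpers are hypothetical and are not the PR's MinFunction or MaxFunction; only `Comparator.comparing`, `thenComparing`, `Arrays.stream`, and `Stream.min`/`Stream.max` are standard Java APIs.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical sketch: one comparator shared by min and max. */
final class SingleComparatorSketch {
    // Key is false for numbers and true for everything else, so numbers sort first;
    // values of the same kind fall through to a same-kind comparison.
    static final Comparator<Object> ORDER =
        Comparator.<Object, Boolean>comparing(v -> !(v instanceof Number))
            .thenComparing((a, b) -> a instanceof Number
                ? Double.compare(((Number) a).doubleValue(), ((Number) b).doubleValue())
                : a.toString().compareTo(b.toString()));

    static Object max(Object... values) {
        return Arrays.stream(values).max(ORDER).orElse(null);
    }

    static Object min(Object... values) {
        return Arrays.stream(values).min(ORDER).orElse(null);
    }

    public static void main(String[] args) {
        System.out.println(max(3, 12));         // num-num: prints 12
        System.out.println(max(3, "apple"));    // num-string: prints apple
        System.out.println(min("pear", "fig")); // string-string: prints fig
    }
}
```

Because min and max differ only in which end of the ordering they pick, a second inverted comparator is not needed in this sketch.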

Contributor Author

Updated and added tests, thank you!

Collaborator

@dai-chen dai-chen left a comment

Thanks for the changes!

if (aIsNumeric) {
return Double.compare(((Number) a).doubleValue(), ((Number) b).doubleValue());
} else {
return Integer.compare(a.toString().compareTo(b.toString()), 0);
Collaborator

What does this Integer.compare with 0 mean?

Contributor Author

It normalizes the result: if the string comparison returns a negative value it becomes -1, and if it returns a positive value it becomes 1. I don't think that is necessary, so I can remove it and just leave the string comparison.
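
For readers unfamiliar with the idiom, a small Java sketch of the normalization being discussed. The class and method names are hypothetical; `String.compareTo` and `Integer.compare` are standard library calls.

```java
/** Hypothetical sketch of sign normalization on a compareTo result. */
final class CompareSignSketch {
    // String.compareTo may return any negative or positive int, not just -1 or 1.
    static int raw(String a, String b) {
        return a.compareTo(b);
    }

    // Comparing the raw result against 0 collapses it to -1, 0, or 1.
    static int normalized(String a, String b) {
        return Integer.compare(a.compareTo(b), 0);
    }

    public static void main(String[] args) {
        System.out.println(raw("a", "c"));        // prints -2
        System.out.println(normalized("a", "c")); // prints -1
    }
}
```

The `Comparator` contract only depends on the sign of the result, so both forms order values identically.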

}

private static boolean isNumeric(Object obj) {
return obj instanceof Number;
Collaborator

Could you confirm whether a comparison between "1" and "2" is also treated as a numerical comparison?

Contributor Author

No, they will be compared as strings, so max("9", "21") will return "9".

@Swiddis Swiddis merged commit fae0687 into opensearch-project:main Sep 30, 2025
33 checks passed
@opensearch-trigger-bot
Contributor

The backport to 2.19-dev failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4333-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 fae06873a057f3bcdea1a7a113c74932fc801deb
# Push it to GitHub
git push --set-upstream origin backport/backport-4333-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-dev

Then, create a pull request where the base branch is 2.19-dev and the compare/head branch is backport/backport-4333-to-2.19-dev.

ritvibhatt added a commit to ritvibhatt/sql that referenced this pull request Sep 30, 2025
ritvibhatt added a commit to ritvibhatt/sql that referenced this pull request Oct 7, 2025
(cherry picked from commit fae0687)
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
penghuo pushed a commit that referenced this pull request Oct 7, 2025
(cherry picked from commit fae0687)

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@LantaoJin LantaoJin added the backport-manually label Oct 9, 2025
asifabashar added a commit to asifabashar/sql that referenced this pull request Oct 10, 2025
* main-apple: (218 commits)
  Add ignorePrometheus Flag for integTest and docTest (opensearch-project#4442)
  Create fab-radar.yml
  PPL `fillnull` command enhancement (opensearch-project#4421)
  reverting to _doc + _id (opensearch-project#4435)
  Support `multisearch` command in calcite (opensearch-project#4332)
  Add 3.3 release notes (opensearch-project#4422) (opensearch-project#4423)
  [SQL/PPL] Fix the `count(*)` and `dc(field)` to be capped at MAX_INTEGER opensearch-project#4416 (opensearch-project#4418)
  Change the default search sort tiebreaker to `_shard_doc` for PIT search (opensearch-project#4378)
  [Enhancement] Add error handling for known limitation of sql `JOIN` (opensearch-project#4344)
  Bugfix: SQL type mapping for legacy JDBC output (opensearch-project#3613)
  Version bump: 3.3 (opensearch-project#4417)
  Add max/min eval functions (opensearch-project#4333)
  Support time modifiers in search command  (opensearch-project#4224)
  Fix numbered token bug and make it optional output in patterns command (opensearch-project#4402)
  refactor span (opensearch-project#4334)
  Move release notes categories (opensearch-project#3818)
  [Doc] Enable doctest with Calcite (opensearch-project#4379)
  Mod function should return decimal instead of float when handle the operands are decimal literal (opensearch-project#4407)
  Scale of decimal literal should always be positive in Calcite (opensearch-project#4401)
  Enable Calcite by default and implicit fallback the unsupported commands (opensearch-project#4372)
  ...
Labels

backport 2.19-dev, backport-failed, backport-manually (Filed a PR to backport manually.), calcite (calcite migration related), feature, PPL (Piped processing language)

Development

Successfully merging this pull request may close these issues.

[FEATURE] Add max/min functions for eval command

6 participants