Skip to content

Conversation

@yuancu
Copy link
Collaborator

@yuancu yuancu commented Apr 3, 2025

Problems

  1. Division by 0 will raise the following exception:

    org.opensearch.sql.exception.CalciteUnsupportedException
        at __randomizedtesting.SeedInfo.seed([F8186AD466FD7B69:78AF71CCE4027497]:0)
        at org.opensearch.sql.executor.QueryService.execute(QueryService.java:103)
        at org.opensearch.sql.executor.execution.QueryPlan.execute(QueryPlan.java:68)
        at org.opensearch.sql.util.ExecuteOnCallerThreadQueryManager.submit(ExecuteOnCallerThreadQueryManager.java:20)
    
  2. Mod does not return wider types for BYTE & SHORT. I.e. MOD(BYTE, SHORT) -> INTEGER, but is expected to return SHORT (a wider type between operands, as what is returned in opensearch-sql v2.x)

Fix

  1. Fix / function with a UDF. It returns null if the divisor is 0.

    The rest behaviors is aligned with original implementation with SqlStdOperatorTable.DIVIDE & opensearch-sql v2.x in terms of return types and integral division.
    I.e.

    • It will return wider types between the dividend and the divisor. E.g. (BYTE, SHORT) -> SHORT
    • For integer operands, the result is also integer; the fractional part is discarded. e.g. (INTEGER, LONG) -> LONG no matter it is dividable or not.
  2. Fixed the return types of MODfunction such that it returns a value with the least restrictive type between its operands. E.g. (SHORT, BYTE) -> SHORT

Additionally, this PR did a refactor for TimeAddSub UDF -- I moved a util function from UDFUtils to TimeAddSubFunction.java, as the util function is only used by TimeAdd & TimeSub. This also aligns with DateAddSubFunction -- it has a similar util function placed within the UDF definition.

Related Issues

Relevant to #3390

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • [] New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
@yuancu yuancu marked this pull request as ready for review April 3, 2025 10:14
Copy link
Collaborator

@dai-chen dai-chen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not all changes are not covered in the PR description?

3.142857142857143,
3.0,
2.4333334,
0.8225806704669051));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you confirm this won't be flaky due to floating precision? I see error = 1e-10 in `closeTo.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's likely. I think I'd better replace it with a calculated value, e.g. 22.0 / 7

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out that the returned representation from calcite is already truncated. See OpenSearchExecutionEngine.java#L147 and JdbcOpenSearchDataTypeConvertor.java#L67.

In the line 67

Object value = rs.getObject(i);

The value for e.g. 7.3 / 6.2 is returned as Float(1.1774194). Therefore, it will fail tests if the expected value is set to the precise computed results (around 1.1774193548387095 in this case).

That was why I used the literal values which exactly match its actual result in tests.

This is also how previous versions before introducing calcite behaves (opensearch-sql 2.19.1.0).

IMO, it might not to the user's interests whether they are shown a more precise value with an error smaller to 1e-10, or a truncated value with an error around 1e-5.

Admittedly, the current implementation is not resilient to changes in underlying implementations. But I can not come up with a better solution than using literal values in tests. Please feel free to leave your thoughts!

@yuancu
Copy link
Collaborator Author

yuancu commented Apr 4, 2025

I think not all changes are not covered in the PR description?

I updated the description to explain what else I modified. Thanks for reminding me!

Copy link
Collaborator

@Swiddis Swiddis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thoughts, one suggestion I'd like to see either revised or documented

@LantaoJin
Copy link
Member

LantaoJin commented Apr 7, 2025

CalcitePPLDateTimeBuiltinFunctionIT > testDateFormatAndDatetimeAndFromDays FAILED

Could you check this? @yuancu

@yuancu
Copy link
Collaborator Author

yuancu commented Apr 7, 2025

CalcitePPLDateTimeBuiltinFunctionIT > testDateFormatAndDatetimeAndFromDays FAILED

Could you check this? @yuancu

I think it has been resolved by @xinyual in issue #3516 . I'll double check.

Update: I have confirmed that #3517 solved the flaky test.

yuancu added 2 commits April 7, 2025 14:04
- additionally update expressions.rst to clarify how DIVIDE handles division by zero

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
@yuancu yuancu changed the title Fix divide function with a UDF Fix return types of MOD and DIVIDE UDFs Apr 7, 2025
@Swiddis Swiddis merged commit b4e8b2e into opensearch-project:main Apr 7, 2025
22 checks passed
penghuo pushed a commit that referenced this pull request Jun 16, 2025
* Fix divide function with a UDF

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Coerce return types for DIVIDE and MOD UDFs

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

* Remove the magic threshold for DIVIDE and MOD
- additionally update expressions.rst to clarify how DIVIDE handles division by zero

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>

---------

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
@yuancu yuancu deleted the fix-divide branch September 23, 2025 07:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants