Skip to content

Conversation

@xinyual
Copy link
Contributor

@xinyual xinyual commented Mar 4, 2025

Description

This pr is main:

  1. Add UDF/UDAF interface
  2. add some examples:
    a. Directly map to calcite built-in function, but need modification for argument (optional). e.g. Atan/Atan2/pow
    b. cannot find suitable built-in function, need to write our own logic. E.g. sqrt function, take aggregation function
  3. add UT/IT for them.
  4. Fix join data table from here

./gradlew :integ-test:integTest --tests 'CalciteIT' succeed locally

Related Issues

Resolves #[Issue number to be closed when this PR is merged]
#3310

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

xinyual added 5 commits March 4, 2025 13:58
Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
}
}

static List<RexNode> translateArgument(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a framework work, we will change this method frequently, could you add some comments to explain this method for developers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, already added.

package org.opensearch.sql.calcite.udf;

public interface Accumulator {
Object result();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. some comments for this method.
  2. could it be moved to class UserDefinedAggFunction.java since it is only used for UDAF
  3. how about name to value()
  4. as a accumulator, do we need another method merge() to merge two Acc objects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Already done.
  2. Already done.
  3. Already done.
  4. Can you elaborate more in which scenario we may need to do so? Accumulator is something like AggregationState, do we need to consider merge here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4

The merge method in an Accumulator is used to combine two accumulators into one. It's particularly important in distributed computing scenarios where partial results from different workers/nodes need to be combined into a final result. In parallel processing, different partitions of data might be processed by different workers.

Example:

class CountAccumulator implements Accumulator<Integer> {
    private int count = 0;
    
    public void add(Integer value) {
        count++;
    }
    
    public void merge(Accumulator other) {
        // Combine the counts from both accumulators
        count += ((CountAccumulator)other).count;
    }
}

Seems it's not required in current framework since SQL plugin is running in coordinator node only.


Object result(S accumulator);

// Add values to the accumulator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use javadoc /** */ before signature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

public interface UserDefinedAggFunction<S extends Accumulator> {
S init();

Object result(S accumulator);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in UDF, the method name is eval, should we change to eval either?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the function name of agg function, it's defined in calcite here https://github.com/apache/calcite/blob/1793ba79a328c61fb42842f443334cc1353c985f/core/src/main/java/org/apache/calcite/schema/impl/AggregateFunctionImpl.java#L91. We cannot modify them. I will left comment.

package org.opensearch.sql.calcite.udf;

public interface UserDefinedFunction {
Object eval(Object... args);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comments.

@@ -0,0 +1,55 @@
package org.opensearch.sql.calcite.udf.udaf;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copyright header missing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

return hits;
}

public void add(Object value) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should add be a part of interface method signature?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just compare the interface AggregationState. I think we could just make sure we have add in UserDefinedAggFunction. For Accumulator, they can implement their own functions.

@LantaoJin LantaoJin added the calcite calcite migration releated label Mar 4, 2025
@LantaoJin
Copy link
Member

Add ./gradlew :integ-test:integTest --tests '*Calcite*IT' succeed locally. in your description before the CI issue fixed. @xinyual

@LantaoJin
Copy link
Member

Related Issues should be addressed

Signed-off-by: xinyual <xinyual@amazon.com>
*/
interface Accumulator {
/**
* @return the final aggregation value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you run ./gradlew spotlessApply before pushing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forget to run. Just rerun the push.

S init();

/**
*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format issue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 47 to 50
@Override
public Object result() {
public Object value() {
return hits;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code indentation problem, please run ./gradlew spotlessApply

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Signed-off-by: xinyual <xinyual@amazon.com>
@LantaoJin LantaoJin merged commit 6668aa8 into opensearch-project:feature/calcite-engine Mar 4, 2025
5 of 13 checks passed
penghuo pushed a commit to penghuo/os-sql that referenced this pull request Mar 12, 2025
* add udf/udaf interface and take/sqrt function

Signed-off-by: xinyual <xinyual@amazon.com>

* add UT

Signed-off-by: xinyual <xinyual@amazon.com>

* add POW, Atan, Atan2 and corresponding UT

Signed-off-by: xinyual <xinyual@amazon.com>

* apply spotless

Signed-off-by: xinyual <xinyual@amazon.com>

* fix table for join it

Signed-off-by: xinyual <xinyual@amazon.com>

* add java doc

Signed-off-by: xinyual <xinyual@amazon.com>

* apply spotless

Signed-off-by: xinyual <xinyual@amazon.com>

---------

Signed-off-by: xinyual <xinyual@amazon.com>
penghuo pushed a commit that referenced this pull request Jun 16, 2025
* add udf/udaf interface and take/sqrt function

Signed-off-by: xinyual <xinyual@amazon.com>

* add UT

Signed-off-by: xinyual <xinyual@amazon.com>

* add POW, Atan, Atan2 and corresponding UT

Signed-off-by: xinyual <xinyual@amazon.com>

* apply spotless

Signed-off-by: xinyual <xinyual@amazon.com>

* fix table for join it

Signed-off-by: xinyual <xinyual@amazon.com>

* add java doc

Signed-off-by: xinyual <xinyual@amazon.com>

* apply spotless

Signed-off-by: xinyual <xinyual@amazon.com>

---------

Signed-off-by: xinyual <xinyual@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

calcite calcite migration releated

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants