-
Notifications
You must be signed in to change notification settings - Fork 181
Implement Parameter Validation for PPL functions on Calcite #3626
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Heng Qian <qianheng@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
…hecking for non-operator registrated udf Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
…ions - additionally fixed span type checker Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
… composition field of CompositeOperandTypeChecker via relection. May cause fatal errors if IllegalAccessException is thrown. Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
… checker) Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
| composition.name())); | ||
| } | ||
| } catch (IllegalAccessException e) { | ||
| LogManager.getLogger(PPLTypeChecker.class).error(e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using @log4j2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it was in an interface, I was hesitated in adding a static member logger to it. Therefore, I used LogManageer.getLogger().error(). It's also part of log4j.
Now this line is removed.
| @Test | ||
| public void testLowerWithIntegerType() { | ||
| String ppl = "source=EMP | eval lower_name = lower(EMPNO) | fields lower_name"; | ||
| Assert.assertThrows(Exception.class, () -> getRelNode(ppl)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also see VARCHAR in other test cases. for instance, testTimestampWithWrongArg, does this issue all solved?
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
LantaoJin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could investigate how to leverage SqlNode validation later 3.1.0. Approved.
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
* Support type checker For PPL functions Signed-off-by: Heng Qian <qianheng@amazon.com> * Implement type checkers for UDF Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Implement composite type checker Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Define type checkers for UDFs Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Display expected signatures when type checking fails & support type checking for non-operator registrated udf Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Add ITs for UDF type checking Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Fix: correct CONV's operand types Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Fix test parse IT Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Remove unused code in UserDefinedFunctionUtils Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Refactor: improve ITs, javadocs, type checkers Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Improve javadoc for PPLTypeChecker Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Return false at validation for mismatched operand count with expectations - additionally fixed span type checker Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Add type checker for cidrmatch Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Correct cidrmatch type checker Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Fix compile error in CalcitePPLAbstractTest.java Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Experiment: support type checking for built-in operators by accessing composition field of CompositeOperandTypeChecker via relection. May cause fatal errors if IllegalAccessException is thrown. Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Check composition type for only calcite's built-in operators Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Support parameter validation for comparabale operators Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Support parameter validation for coalesce (by reusing comparable type checker) Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Support parameter validation for substring Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Support parameter validation for if Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Support parameter validation for nullif, isempty, isblank Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Align types in type check error information to actual types in PPL Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Display PPL types when failing to resolve a function Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Create type checker for geoip function Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Define type checker for grok and item operators Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> --------- Signed-off-by: Heng Qian <qianheng@amazon.com> Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> Co-authored-by: Heng Qian <qianheng@amazon.com> Signed-off-by: xinyual <xinyual@amazon.com>
Description
Implement parameter validations. It supports validations for
registerOperator(ABS, SqlStdOperatorTable.ABS)registerOperator(SHA2, PPLBuiltinOperators.SHA2);registerOperator). E.g.Besides, it support functions with signature overriddings. E.g.
timestamphas multiple signatures:This implementation is based on the PoC in #3562, where it explains why we can not directly adopt Calcite's built-in type checking mechanism for SQL and how our type checking works.
Issues Resolved
Resolves #3608
Implementation Notes
Classes related to type checking & their roles
UDFOperandMetadata: Created for compatibility withSqlUserDefinedFunction, which serves as the basis for implementing our UDFs. SinceSqlUserDefinedFunctiononly acceptsSqlOperandMetadataas its type checker, andSqlOperandMetadatais incompatible with Calcite's built-in operators,UDFOperandMetadatawas introduced. It acts as a wrapper around Calcite's operand type checkers, allowing us to reuse them. This ensures that type checkers can be utilized in the same way as Calcite's built-in operators for type checking.PPLTypeChecker: The type checker that does the real work. It retrieved acceptable types from Calcite's type checkers, and compare them with the actual ones.SqlOperandTypeChecker,ImplicitCastOperandTypeChecker,FamilyOperandTypeChecker, ...: Calcite's builtin type checkers. Cannot be directly used since it requires aSqlCallBinding:boolean checkOperandTypes(SqlCallBinding callBinding, boolean throwOnFailure).classDiagram class SqlOperandTypeChecker { <<interface>> +checkOperandTypes(SqlCallBinding, boolean) } class CompositeOperandTypeChecker { } class FamilyOperandTypeChecker { } class PPLTypeChecker { <<interface>> +checkOperandTypes(actualTypes) : boolean } class PPLFamilyTypeChecker { - families: List[SqlTypeFamily] } class PPLCompositeTypeChecker { - rules: List[FamilyOperandTypeChecker] } class UDFOperandMetadata { <<interface>> <<extends>> SqlOperandMetadata <<extends>> ImplicitCastOperandTypeChecker +getInnerTypeChecker() : SqlOperandTypeChecker } class SqlOperator { <<abstract>> - operandTypeChecker: SqlOperandTypeChecker + getOperandTypeChecker() : SqlOperandTypeChecker } class SqlUserDefinedFunction { +SqlUserDefinedFunction(SqlOperandMetadata, ...) } %% Relationships SqlOperandTypeChecker <|.. FamilyOperandTypeChecker SqlOperandTypeChecker <|.. CompositeOperandTypeChecker SqlOperator <|-- SqlUserDefinedFunction PPLTypeChecker <|.. PPLFamilyTypeChecker PPLTypeChecker <|.. PPLCompositeTypeChecker UDFOperandMetadata --> SqlUserDefinedFunction : feeds to constructor SqlOperator *-- SqlOperandTypeChecker: contains %% UDFOperandMetadata *-- SqlOperandTypeChecker PPLFamilyTypeChecker ..> FamilyOperandTypeChecker: wraps PPLCompositeTypeChecker ..> CompositeOperandTypeChecker: wrapsType checking for Calcite's built-in operators & for self-implemented UDFs
SqlOperandTypeChecker, then wrapped withPPLTypeCheckerwhich does the actual type checking with the expected param types in theSqlOperandTypeChecker.UDFOperandMetadatato create aSqlUderDefinedFunction, which implementsSqlOperator. Afterwards, the actual Calcite type checker can be retrieved and used in the same way as Calcite's built-in operators.register(funcName, FuncImp)interface, since there is noSqlOperatorinvolved, we directly pass aPPLTypeCheckerwhen creating aFuncImpinstance.Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.