Skip to content

Epic: Simplify functions signature with LogicalType #13301

Open
@jayzhan211

Description

@jayzhan211

Is your feature request related to a problem or challenge?

Some functions signature is quite verbose, like

impl RegexpLikeFunc {
    pub fn new() -> Self {
        Self {
            signature: Signature::one_of(
                vec![
                    TypeSignature::Exact(vec![Utf8View, Utf8]),
                    TypeSignature::Exact(vec![Utf8View, Utf8View]),
                    TypeSignature::Exact(vec![Utf8View, LargeUtf8]),
                    TypeSignature::Exact(vec![Utf8, Utf8]),
                    TypeSignature::Exact(vec![Utf8, Utf8View]),
                    TypeSignature::Exact(vec![Utf8, LargeUtf8]),
                    TypeSignature::Exact(vec![LargeUtf8, Utf8]),
                    TypeSignature::Exact(vec![LargeUtf8, Utf8View]),
                    TypeSignature::Exact(vec![LargeUtf8, LargeUtf8]),
                    TypeSignature::Exact(vec![Utf8View, Utf8, Utf8]),
                    TypeSignature::Exact(vec![Utf8View, Utf8View, Utf8]),
                    TypeSignature::Exact(vec![Utf8View, LargeUtf8, Utf8]),
                    TypeSignature::Exact(vec![Utf8, Utf8, Utf8]),
                    TypeSignature::Exact(vec![Utf8, Utf8View, Utf8]),
                    TypeSignature::Exact(vec![Utf8, LargeUtf8, Utf8]),
                    TypeSignature::Exact(vec![LargeUtf8, Utf8, Utf8]),
                    TypeSignature::Exact(vec![LargeUtf8, Utf8View, Utf8]),
                    TypeSignature::Exact(vec![LargeUtf8, LargeUtf8, Utf8]),
                ],
                Volatility::Immutable,
            ),
        }
    }
}

Can replace it with `Signature::string(2, Volatility::Immutable)`
impl LPadFunc {
    pub fn new() -> Self {
        use DataType::*;
        Self {
            signature: Signature::one_of(
                vec![
                    Exact(vec![Utf8View, Int64]),
                    Exact(vec![Utf8View, Int64, Utf8View]),
                    Exact(vec![Utf8View, Int64, Utf8]),
                    Exact(vec![Utf8View, Int64, LargeUtf8]),
                    Exact(vec![Utf8, Int64]),
                    Exact(vec![Utf8, Int64, Utf8View]),
                    Exact(vec![Utf8, Int64, Utf8]),
                    Exact(vec![Utf8, Int64, LargeUtf8]),
                    Exact(vec![LargeUtf8, Int64]),
                    Exact(vec![LargeUtf8, Int64, Utf8View]),
                    Exact(vec![LargeUtf8, Int64, Utf8]),
                    Exact(vec![LargeUtf8, Int64, LargeUtf8]),
                ],
                Volatility::Immutable,
            ),
        }
    }
}

Can replace it with one of `Signature::coercible(string, int)` and `Signature::coercible(string, int, string)`

Describe the solution you'd like

#13240 starts an attempt to bring logical type to function signature

There are more functions to be cleanup.
The example above can be replaced with TypeSiganture::String, TypeSiganture::Numeric, TypeSiganture::Coercible.

We might also need time related signature for time function.

Improve test coverage with these functions especially with different kinds of types would be great 👍

Describe alternatives you've considered

No response

The role of TypeSignature

TypeSignature used in function which is responsible for handling

  1. length check
  2. type checking and casting

The functions behaviour follows Postgres, DuckDB or other well-designed database. Can check whether the result and coercion is consistent with them.

If the result is consistent in both Postgres, DuckDB, we should follow them. Otherwise, we follow either of them.

For the casting rule, I think we can follow DuckDB's casting rule described here

TypeSignature should handle implicit casting

Implicit Casting
In many situations, the system will add casts by itself. This is called implicit casting. This happens for example when a function is called with an argument that does not match the type of the function, but can be casted to the desired type.

Consider the function sin(DOUBLE). This function takes as input argument a column of type DOUBLE, however, it can be called with an integer as well: sin(1). The integer is converted into a double before being passed to the sin function.

Implicit casts can only be added for a number of type combinations, and is generally only possible when the cast cannot fail. For example, an implicit cast can be added from INTEGER to DOUBLE – but not from DOUBLE to INTEGER.

Screenshot 2024-11-08 at 1 44 56 PM

Good first issue list (non-completed)

  • RegexpLikeFunc / String(2)
  • RegexpMatchFunc / String(2)
  • RegexpReplaceFunc / One of String(3) and String(4)
    ...

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions