Skip to content

Adding support for PostgreSQL's CustomOperator(String) #1299

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Adding support for PostgreSQL's CustomOperator(String) #1299

wants to merge 1 commit into from

Conversation

amrutadotorg
Copy link

This code adds the following PGroonga operators to the BinaryOperator enum:

PGroongaFullTextSearch for the &@ operator
PGroongaEasyQuerySearch for the &@~ operator
PGroongaSimilarSearch for the &@* operator
PGroongaAdvancedSearch for the `&`` operator
PGroongaArrayKeywordSearch for the &@| operator
PGroongaArrayQuerySearch for the &@~| operator

Additionally, the fmt::Display implementation has been updated to handle these new operators appropriately.

This code adds the following PGroonga operators to the BinaryOperator enum:

PGroongaFullTextSearch for the &@ operator
PGroongaEasyQuerySearch for the &@~ operator
PGroongaSimilarSearch for the &@* operator
PGroongaAdvancedSearch for the `&`` operator
PGroongaArrayKeywordSearch for the &@| operator
PGroongaArrayQuerySearch for the &@~| operator

Additionally, the fmt::Display implementation has been updated to handle these new operators appropriately.
@amrutadotorg amrutadotorg changed the title Update operator.rs adding PGroonga operators Adding support for PGroonga operators Jun 3, 2024
Copy link
Contributor

@lovasoa lovasoa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this works. You didn't write any parsing code for these new operators.

And I think a better approach would be to have a CustomOperator(String), because anyone can define any operator, and it would be great if we were able to parse them independently of the particular case of pgroonga.

I think we need to work on this in the tokenizer.

@lovasoa
Copy link
Contributor

lovasoa commented Jun 3, 2024

The documentation for allowed characters in custom operators is here: https://www.postgresql.org/docs/current/sql-createoperator.html

@amrutadotorg
Copy link
Author

amrutadotorg commented Jun 3, 2024

@lovasoa what would you of having something like this in the tokenizer.rs ?

pub enum Token {
    EOF,
    Word(Word),
    Number(String, bool),
    Char(char),
    SingleQuotedString(String),
    DoubleQuotedString(String),
    CustomOperator(String),
}
impl<'a> Tokenizer<'a> {
    fn next_token(&mut self) -> Token {
        // Existing logic...

        // Detect and return a `CustomOperator` token if a custom operator is found.
        let start_pos = self.chars.clone();  // Save the starting position
        let mut operator = String::new();
        while let Some(&ch) = self.chars.peek() {
            // Check if the character is valid for a custom operator
            if "+-*/<>~=!@#%^&|`?".contains(ch) {
                operator.push(ch);
                self.chars.next();
            } else {
                break;
            }
        }

        if operator.len() > 1 && (operator.ends_with('+') || operator.ends_with('-')) {
            if !operator.contains('~') && !operator.contains('!') && !operator.contains('@') &&
               !operator.contains('#') && !operator.contains('%') && !operator.contains('^') &&
               !operator.contains('&') && !operator.contains('|') && !operator.contains('`') && 
               !operator.contains('?') {
                // Invalid operator according to the rules <https://www.postgresql.org/docs/current/sql-createoperator.html>
                self.chars = start_pos; // Rewind to the start position
                return Token::Char(self.chars.next().unwrap());  // Fallback to a single char token
            }
        }

        if !operator.is_empty() {
            return Token::CustomOperator(operator);
        }

        // Fallback to other token parsing logic
        // ...
    }
}

@amrutadotorg amrutadotorg changed the title Adding support for PGroonga operators Adding support for PostgreSQL's CustomOperator(String) Jun 3, 2024
@lovasoa
Copy link
Contributor

lovasoa commented Jun 3, 2024

No, you would have to integrate with the existing operator tokenization logic not to break it for other existing operators.

It's a little bit technical, so if what you are interested in is just getting full text search to work for you, the easiest is probably for you to declare your own function that wraps pgroonga's custom operator, and can that function instead of the operator in your code to work around the current operator parsing bug.

@lovasoa
Copy link
Contributor

lovasoa commented Jun 3, 2024

@alamb , currently, the tokenizer is dialect-independent.

However, the same expression a&@b would be tokenized as identifier(a) operator(&@) identifier(b) in postgres and identifier(a) operator(&) identifier(@b) in Microsoft SQL Server.

Do you think it would be okay to add partial support for custom operators, ignoring the ones that end in @ and are not followed by a whitespace? Or would you require full support for differentiated tokenization before merging support for custom operators?

EDIT: Sorry, my bad. The tokenizer actually already has access to a dialect object.

lovasoa added a commit to lovasoa/sqlparser-rs that referenced this pull request Jun 6, 2024
Copy link
Contributor

@lovasoa lovasoa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR can be closed in favor of #1302

@amrutadotorg amrutadotorg deleted the patch-1 branch June 6, 2024 22:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants