Skip to content

[SPARK-26976][SQL] Forbid reserved keywords as identifiers when ANSI mode is on #23880

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -736,7 +736,6 @@ qualifiedName

identifier
: strictIdentifier
| {ansi}? ansiReserved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only code change I expect. What's the rationale of the other changes? I think select current_date() should not be supported in ansi mode.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or, we can find some more examples to advocate it. AFAIK presto supports select current_date() although it's not SQL standard. Do you know of any other systems that support it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, it seems current_timestamp() is not a topic of the ANSI standard. So, I'll drop it in this pr.

But, some databases support current_timestamp() and this is implementation-specific.
For example, postgresql/oracle support current_timestamp(precision) as follows;

postgres=# select CURRENT_TIMESTAMP;
              now              
-------------------------------
 2019-03-12 20:22:52.065108+09
(1 row)

postgres=# select CURRENT_TIMESTAMP(1);
       timestamptz        
--------------------------
 2019-03-12 20:22:56.2+09
(1 row)

postgres=# select CURRENT_TIMESTAMP();
ERROR:  syntax error at or near ")" at character 26
STATEMENT:  select CURRENT_TIMESTAMP();
ERROR:  syntax error at or near ")"
LINE 1: select CURRENT_TIMESTAMP();
                                 ^

So, it might be worth supporting this function for better portability even when ansi mode enabled (this is future work though...).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the ANSI starndard, we need to support these functions below for datetime, too;

postgres=# select CURRENT_TIME;
       timetz       
--------------------
 20:33:08.179954+09
(1 row)

postgres=# select LOCALTIME;
      time       
-----------------
 20:33:54.281054
(1 row)

postgres=# select LOCALTIMESTAMP;
         timestamp         
---------------------------
 2019-03-12 20:33:57.85737
(1 row)

I'll file a jira later.

| {!ansi}? defaultReserved
;

Expand All @@ -761,89 +760,6 @@ number
| MINUS? BIGDECIMAL_LITERAL #bigDecimalLiteral
;

// NOTE: You must follow a rule below when you add a new ANTLR token in this file:
// - All the ANTLR tokens = UNION(`ansiReserved`, `ansiNonReserved`) = UNION(`defaultReserved`, `nonReserved`)
//
// Let's say you add a new token `NEWTOKEN` and this is not reserved regardless of a `spark.sql.parser.ansi.enabled`
// value. In this case, you must add a token `NEWTOKEN` in both `ansiNonReserved` and `nonReserved`.
//
// It is recommended to list them in alphabetical order.

// The list of the reserved keywords when `spark.sql.parser.ansi.enabled` is true. Currently, we only reserve
// the ANSI keywords that almost all the ANSI SQL standards (SQL-92, SQL-99, SQL-2003, SQL-2008, SQL-2011,
// and SQL-2016) and PostgreSQL reserve.
ansiReserved
: ALL
| AND
| ANTI
| ANY
| AS
| AUTHORIZATION
| BOTH
| CASE
| CAST
| CHECK
| COLLATE
| COLUMN
| CONSTRAINT
| CREATE
| CROSS
| CURRENT_DATE
| CURRENT_TIME
| CURRENT_TIMESTAMP
| CURRENT_USER
| DISTINCT
| ELSE
| END
| EXCEPT
| FALSE
| FETCH
| FOR
| FOREIGN
| FROM
| FULL
| GRANT
| GROUP
| HAVING
| IN
| INNER
| INTERSECT
| INTO
| IS
| JOIN
| LEADING
| LEFT
| NATURAL
| NOT
| NULL
| ON
| ONLY
| OR
| ORDER
| OUTER
| OVERLAPS
| PRIMARY
| REFERENCES
| RIGHT
| SELECT
| SEMI
| SESSION_USER
| SETMINUS
| SOME
| TABLE
| THEN
| TO
| TRAILING
| UNION
| UNIQUE
| USER
| USING
| WHEN
| WHERE
| WITH
;


// The list of the non-reserved keywords when `spark.sql.parser.ansi.enabled` is true.
ansiNonReserved
: ADD
Expand Down
Loading