Change SQLFlow extended SQL syntax from TRAIN into TO TRAIN #473

@wangkuiyi

As mentioned in #450 (comment), SQL syntax allows aliases for table names. (Thanks to Jian Zhang from PingCAP, who helped me realize this.)

The following SQL statements are all legal:

SELECT * FROM t1
SELECT * FROM t1 t2 // t2 is an alias of t1
SELECT * FROM t1, t2 // t1 and t2 are table names
SELECT * FROM t1 t2, t3 // t1 and t3 are table names, t2 is an alias of t1
SELECT * FROM t1, t2, t3 t4 // t1, t2, and t3 are table names, t4 is an alias of t3

So, if SQLFlow extends SQL SELECT statements by letting users append a TRAIN clause, the external parser (Calcite parser, TiDB parser, or Hive SQL parser) would accept the following statement up to the token "DNNClassifier", because it would treat TRAIN as an alias of table t1.

SELECT * FROM t1 TRAIN DNNClassifier
                       ^

This is not the behavior we want; we want the external parsers to report an error at TRAIN instead of at DNNClassifier.

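A possible solution is to change the way SQLFlow extends SELECT statements from appending TRAIN to appending TO TRAIN. Because TO is a reserved keyword in standard SQL, it should not be accepted as a table alias, so the external parsers would stop at TO.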

SELECT * FROM t1 TO TRAIN DNNClassifier
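
To illustrate the difference, here is a minimal sketch of a toy parser that handles only `SELECT * FROM table [alias] {, table [alias]}`. It is not the grammar of Calcite, TiDB, or Hive; the `RESERVED` set and the function names are hypothetical and exist only for this example. It reports the first token it cannot consume.

```python
import re

# Assumption: a tiny subset of reserved words, enough for this illustration.
RESERVED = {"SELECT", "FROM", "TO"}
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*$")


def is_identifier(token):
    """A token can be a table name or alias if it is a non-reserved word."""
    return bool(IDENT.match(token)) and token.upper() not in RESERVED


def first_rejected_token(statement):
    """Return the first token the toy parser cannot consume, or None."""
    # Split on whitespace, treating commas as separate tokens,
    # and append an end marker so lookups never run off the list.
    tokens = statement.replace(",", " , ").split() + ["<EOF>"]
    pos = 0
    for expected in ("SELECT", "*", "FROM"):
        if tokens[pos].upper() != expected:
            return tokens[pos]
        pos += 1
    while True:
        if not is_identifier(tokens[pos]):  # a table name is required
            return tokens[pos]
        pos += 1
        if is_identifier(tokens[pos]):      # an alias is optional
            pos += 1
        if tokens[pos] != ",":              # a comma continues the table list
            break
        pos += 1
    return None if tokens[pos] == "<EOF>" else tokens[pos]


print(first_rejected_token("SELECT * FROM t1 TRAIN DNNClassifier"))
print(first_rejected_token("SELECT * FROM t1 TO TRAIN DNNClassifier"))
```

With TRAIN alone, the toy parser consumes TRAIN as an alias and stops at DNNClassifier; with TO TRAIN, it stops at TO, which is where SQLFlow's extended parser would want to take over.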
