Turn on expand identifier in SqlValidator config #11457
Jackie-Jiang merged 1 commit into apache:master
Codecov Report
@@ Coverage Diff @@
## master #11457 +/- ##
============================================
+ Coverage 62.99% 63.00% +0.01%
- Complexity 1094 1098 +4
============================================
Files 2302 2302
Lines 124025 124041 +16
Branches 18901 18903 +2
============================================
+ Hits 78126 78156 +30
+ Misses 40351 40333 -18
- Partials 5548 5552 +4
... and 18 files with indirect coverage changes
This PR seems to produce a regression where
I was wondering if there's a better way to handle it. Expanding both into a CAST seems like overkill because we might not actually want to perform the cast operation. Is there any rule in Calcite at parsing time that converts the group-by into just an index reference (e.g.
// Test select with group by order by limit
sqlQuery = "SELECT toBase64(toUtf8(AirlineID)) "
    + "FROM mytable "
    + "GROUP BY toBase64(toUtf8(AirlineID)) "
    + "ORDER BY toBase64(toUtf8(AirlineID)) DESC "
    + "LIMIT 10";
There's a problem with this. The following query:
SELECT toBase64(toUtf8(AirlineID)) AS AirlineID
FROM mytable
GROUP BY toBase64(toUtf8(AirlineID))
ORDER BY toBase64(toUtf8(AirlineID)) DESC
LIMIT 10
will fail because AirlineID is referenced both inside the toBase64 function and as the AS alias.
Did this query ever succeed before identifier expansion was turned on?
Fix for #11447
For sql:
The issue is that Calcite expands the GROUP BY expression toBase64(toUtf8(AirlineID)) to toBase64(toUtf8(CAST(mytable.AirlineID AS VARCHAR CHARACTER SET ISO-8859-1))) but does not expand the same expression in SELECT.
Enabling identifier expansion is useful when you want to be explicit about the source of each column or table. It reduces ambiguity and makes the query more self-describing.
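To illustrate why expanding only one side matters, here is a minimal, self-contained Java sketch. It is a toy model, not Calcite or Pinot code: the class ExpansionSketch and its methods expand and selectMatchesGroupBy are hypothetical names, the "expansion" is a plain string rewrite that only qualifies the column with its table (it omits the CAST), and matching is assumed to be exact string equality. In Calcite itself the flag is normally set with SqlValidator.Config's withIdentifierExpansion(true); the exact change made in this PR is not shown here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model (hypothetical, not Calcite code): shows how a SELECT expression
// stops matching its GROUP BY key when only the GROUP BY side is expanded.
class ExpansionSketch {

  // Qualify a bare column name with its table, e.g. AirlineID -> mytable.AirlineID.
  // Already-qualified occurrences (preceded by '.') are left untouched.
  static String expand(String expr, String table, String column) {
    Pattern bareColumn =
        Pattern.compile("(?<![\\w.])" + Pattern.quote(column) + "(?!\\w)");
    return bareColumn.matcher(expr)
        .replaceAll(Matcher.quoteReplacement(table + "." + column));
  }

  // Assumption: a SELECT expression is valid only if it equals a GROUP BY key exactly.
  static boolean selectMatchesGroupBy(String selectExpr, String groupByExpr) {
    return selectExpr.equals(groupByExpr);
  }

  public static void main(String[] args) {
    String expr = "toBase64(toUtf8(AirlineID))";

    // Only GROUP BY is expanded: the two sides no longer line up.
    String expandedGroupBy = expand(expr, "mytable", "AirlineID");
    System.out.println(selectMatchesGroupBy(expr, expandedGroupBy)); // false

    // Both sides expanded the same way: they match again.
    String expandedSelect = expand(expr, "mytable", "AirlineID");
    System.out.println(selectMatchesGroupBy(expandedSelect, expandedGroupBy)); // true
  }
}
```

Under these assumptions, expanding both clauses consistently restores the match, which is the effect the PR description attributes to turning identifier expansion on.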