query option to use parser.jj extension instead of regex#8880
query option to use parser.jj extension instead of regex#8880walterddr wants to merge 4 commits intoapache:masterfrom
Conversation
pinot-common/src/test/java/org/apache/pinot/sql/parsers/CalciteSqlCompilerTest.java
Outdated
Show resolved
Hide resolved
670d91a to
ea41297
Compare
pinot-common/src/main/java/org/apache/pinot/sql/parsers/parser/SqlOptions.java
Outdated
Show resolved
Hide resolved
pinot-common/src/main/java/org/apache/pinot/sql/parsers/parser/SqlOptions.java
Show resolved
Hide resolved
pinot-common/src/main/java/org/apache/pinot/sql/parsers/CalciteSqlParser.java
Outdated
Show resolved
Hide resolved
| sql = removeOptionsFromSql(sql); | ||
| } | ||
|
|
||
| try (StringReader inStream = new StringReader(sql)) { |
There was a problem hiding this comment.
Not related to this PR --
Looks like this function was added when INSERT grammar extension was done and is being called from broker query endpoint in PinotClientRequest.java.
It seems like for DQL, the caller will parse the SQL statement twice. Once at line 153 when it calls this method in the parser to retrieve SqlNodeAndOptions and then if the type is not DML, it sends the query to requestHandler.handleRequest(sql) which will call the parser again (compileToPinotQuery(sql)) and the statement will be parsed again.
Not sure if this is expected. I just noticed it
There was a problem hiding this comment.
i dont think so but yeah I will take a look to address this as well
There was a problem hiding this comment.
actually the logic here is a bit more complex than simple reduce the compilation. I will create another PR to follow up.
|
|
||
| pinotQuery = CalciteSqlParser.compileToPinotQuery( | ||
| "select * from vegetables where name <> 'Brussels sprouts' OPTION (delicious=yes)"); | ||
| "OPTION (delicious='yes'); select * from vegetables where name <> 'Brussels sprouts'"); |
There was a problem hiding this comment.
Can we also add tests using examples called out in https://blog.doyensec.com/2022/06/09/apache-pinot-sqli-rce.html#query-options. Since they now don't fit in the grammar, we can check if they fail ?
There was a problem hiding this comment.
oh yes absolutely. I will add to the test set
pinot-common/src/main/java/org/apache/pinot/sql/parsers/CalciteSqlParser.java
Show resolved
Hide resolved
|
I will spend sometime examining the production logs to see how OPTION is being used in production (if at all) and how it will break with this change. Will get back with the info by early next week. |
005f6e3 to
8b6cab3
Compare
8b6cab3 to
90ff274
Compare
Codecov Report
@@ Coverage Diff @@
## master #8880 +/- ##
============================================
- Coverage 69.78% 69.67% -0.12%
- Complexity 4672 4680 +8
============================================
Files 1803 1807 +4
Lines 93752 94212 +460
Branches 13935 14053 +118
============================================
+ Hits 65426 65643 +217
- Misses 23790 24011 +221
- Partials 4536 4558 +22
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
|
This is awesome! Should we consider also supporting identifier as value? That way the current way of putting options can work |
|
+1 thanks @walterddr for adding this |
|
Actually after a second thought, I think it is okay to keep the current way without affecting current query options. The only query option value that is not number/boolean is There is a major issue not handled though. In |
|
Just noticed that seems it requires an extra |
adding
This is a good catch, will fix; but also do we know if there's any other places that adds the OPTION regex to the query?
Yes this is the standard SQL statement separator. however the |
|
closing this in favor of #8906 for a more systematic fix. |
Background
Currently Query Options are parsed via regex match and replaced with empty string. This causes SQLi issue.
Solution
Release Note:
This is a backward incompatible change for utilizing
OPTIONkeyword.OPTION(key=value)ANYWHERE in the SQL query string; with this change only allow at the beginning or the end of the SQL string payload.OPTION(foo=bar, alice=123, bob='potato', complex.key=valueLiteral)is now:
OPTION(foo='bar', alice='123', bob='''potato''', "complex.key"='valueLiteral');;must be added between statements. both among multiple OPTION definitions; and between OPTION definitions and SQL DQL/DML statement.Additional Remark
This new syntax can theoretically support other literal types for example:
OPTION(foo='bar', alice=123)-->Map<String, Object>.of("foo", (String) bar", "alice", (Integer) 123);OPTION(foo=true)-->Map<String, Object>.of("foo", (boolean) true);However since currently all k-v options support by Pinot is a string/string mapping, we will defer this to a new feature if necessary
Appendix
Full SQL syntax algebra