-
Couldn't load subscription status.
- Fork 176
Clean up syntax error reporting #3278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clean up syntax error reporting #3278
Conversation
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
31649f1 to
a47be17
Compare
| POST /_plugins/_ppl | ||
| { | ||
| "query" : "SELECT * FROM sample:data" | ||
| "query" : "SOURCE = test_index | where a > 0)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This SQL error no longer outputs the same error message (new parsing engine?). I couldn't hit the ANTLR exception with a new SQL query, so I updated it to a PPL one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the improvement of this PR is truncate the long error message? If so, is it possible to simplify the changes especially line 60-80?
common/src/main/java/org/opensearch/sql/common/antlr/SyntaxAnalysisErrorListener.java
Show resolved
Hide resolved
common/src/main/java/org/opensearch/sql/common/antlr/SyntaxAnalysisErrorListener.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Should be better now, refactored it to only do the mapping for the first few tokens so we can just use |
common/src/main/java/org/opensearch/sql/common/antlr/SyntaxAnalysisErrorListener.java
Outdated
Show resolved
Hide resolved
common/src/main/java/org/opensearch/sql/common/antlr/SyntaxAnalysisErrorListener.java
Outdated
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/ppl/QueryAnalysisIT.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
common/src/main/java/org/opensearch/sql/common/antlr/SyntaxAnalysisErrorListener.java
Show resolved
Hide resolved
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
f09a985 to
7fcb50e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes!
Description
While working on fixing PPL AST bugs (#3273), it stood out to me that our error reporting for syntax errors really isn't that good. This PR cleans up the handling of our errors.
Example query that's currently returning syntax errors:
Before:
{ "error": { "reason": "Invalid Query", "details": "Failed to parse query due to offending symbol [OR] at: 'SOURCE = test_19a673e2 | WHERE x OR' <--- HERE... More details: Expecting tokens in {'SEARCH', 'DESCRIBE', 'SHOW', 'FROM', 'WHERE', 'FIELDS', 'RENAME', 'STATS', 'DEDUP', 'SORT', 'EVAL', 'HEAD', 'TOP', 'RARE', 'PARSE', 'METHOD', 'REGEX', 'PUNCT', 'GROK', 'PATTERN', 'PATTERNS', 'NEW_FIELD', 'KMEANS', 'AD', 'ML', 'FILLNULL', 'TRENDLINE', 'SOURCE', 'INDEX', 'D', 'DESC', 'DATASOURCES', 'SORTBY', 'AUTO', 'STR', 'NUM', 'KEEPEMPTY', 'CONSECUTIVE', 'DEDUP_SPLITVALUES', 'PARTITIONS', 'ALLNUM', 'DELIM', 'CENTROIDS', 'ITERATIONS', 'DISTANCE_TYPE', 'NUMBER_OF_TREES', 'SHINGLE_SIZE', 'SAMPLE_SIZE', 'OUTPUT_AFTER', 'TIME_DECAY', 'ANOMALY_RATE', 'CATEGORY_FIELD', 'TIME_FIELD', 'TIME_ZONE', 'TRAINING_DATA_SIZE', 'ANOMALY_SCORE_THRESHOLD', 'TRUE', 'FALSE', 'CONVERT_TZ', 'DATETIME', 'DAY', 'DAY_HOUR', 'DAY_MICROSECOND', 'DAY_MINUTE', 'DAY_OF_YEAR', 'DAY_SECOND', 'HOUR', 'HOUR_MICROSECOND', 'HOUR_MINUTE', 'HOUR_OF_DAY', 'HOUR_SECOND', 'INTERVAL', 'MICROSECOND', 'MILLISECOND', 'MINUTE', 'MINUTE_MICROSECOND', 'MINUTE_OF_DAY', 'MINUTE_OF_HOUR', 'MINUTE_SECOND', 'MONTH', 'MONTH_OF_YEAR', 'QUARTER', 'SECOND', 'SECOND_MICROSECOND', 'SECOND_OF_MINUTE', 'WEEK', 'WEEK_OF_YEAR', 'YEAR', 'YEAR_MONTH', 'IP', '.', '+', '-', '(', '`', 'AVG', 'COUNT', 'DISTINCT_COUNT', 'ESTDC', 'ESTDC_ERROR', 'MAX', 'MEAN', 'MEDIAN', 'MIN', 'MODE', 'RANGE', 'STDEV', 'STDEVP', 'SUM', 'SUMSQ', 'VAR_SAMP', 'VAR_POP', 'STDDEV_SAMP', 'STDDEV_POP', 'PERCENTILE', 'TAKE', 'FIRST', 'LAST', 'LIST', 'VALUES', 'EARLIEST', 'EARLIEST_TIME', 'LATEST', 'LATEST_TIME', 'PER_DAY', 'PER_HOUR', 'PER_MINUTE', 'PER_SECOND', 'RATE', 'SPARKLINE', 'C', 'DC', 'ABS', 'CBRT', 'CEIL', 'CEILING', 'CONV', 'CRC32', 'E', 'EXP', 'FLOOR', 'LN', 'LOG', 'LOG10', 'LOG2', 'MOD', 'PI', 'POSITION', 'POW', 'POWER', 'RAND', 'ROUND', 'SIGN', 'SQRT', 'TRUNCATE', 'ACOS', 'ASIN', 'ATAN', 'ATAN2', 'COS', 'COT', 'DEGREES', 'RADIANS', 'SIN', 'TAN', 'ADDDATE', 'ADDTIME', 'CURDATE', 'CURRENT_DATE', 'CURRENT_TIME', 'CURRENT_TIMESTAMP', 'CURTIME', 'DATE', 'DATEDIFF', 'DATE_ADD', 'DATE_FORMAT', 'DATE_SUB', 'DAYNAME', 'DAYOFMONTH', 'DAYOFWEEK', 'DAYOFYEAR', 'DAY_OF_MONTH', 'DAY_OF_WEEK', 'EXTRACT', 'FROM_DAYS', 'FROM_UNIXTIME', 'GET_FORMAT', 'LAST_DAY', 'LOCALTIME', 'LOCALTIMESTAMP', 'MAKEDATE', 'MAKETIME', 'MONTHNAME', 'NOW', 'PERIOD_ADD', 'PERIOD_DIFF', 'SEC_TO_TIME', 'STR_TO_DATE', 'SUBDATE', 'SUBTIME', 'SYSDATE', 'TIME', 'TIMEDIFF', 'TIMESTAMP', 'TIMESTAMPADD', 'TIMESTAMPDIFF', 'TIME_FORMAT', 'TIME_TO_SEC', 'TO_DAYS', 'TO_SECONDS', 'UNIX_TIMESTAMP', 'UTC_DATE', 'UTC_TIME', 'UTC_TIMESTAMP', 'WEEKDAY', 'YEARWEEK', 'SUBSTR', 'SUBSTRING', 'LTRIM', 'RTRIM', 'TRIM', 'LOWER', 'UPPER', 'CONCAT', 'CONCAT_WS', 'LENGTH', 'STRCMP', 'RIGHT', 'LEFT', 'ASCII', 'LOCATE', 'REPLACE', 'REVERSE', 'CAST', 'LIKE', 'ISNULL', 'ISNOTNULL', 'CIDRMATCH', 'IFNULL', 'NULLIF', 'IF', 'TYPEOF', 'ALLOW_LEADING_WILDCARD', 'ANALYZE_WILDCARD', 'ANALYZER', 'AUTO_GENERATE_SYNONYMS_PHRASE_QUERY', 'BOOST', 'CUTOFF_FREQUENCY', 'DEFAULT_FIELD', 'DEFAULT_OPERATOR', 'ENABLE_POSITION_INCREMENTS', 'ESCAPE', 'FLAGS', 'FUZZY_MAX_EXPANSIONS', 'FUZZY_PREFIX_LENGTH', 'FUZZY_TRANSPOSITIONS', 'FUZZY_REWRITE', 'FUZZINESS', 'LENIENT', 'LOW_FREQ_OPERATOR', 'MAX_DETERMINIZED_STATES', 'MAX_EXPANSIONS', 'MINIMUM_SHOULD_MATCH', 'OPERATOR', 'PHRASE_SLOP', 'PREFIX_LENGTH', 'QUOTE_ANALYZER', 'QUOTE_FIELD_SUFFIX', 'REWRITE', 'SLOP', 'TIE_BREAKER', 'TYPE', 'ZERO_TERMS_QUERY', 'SPAN', 'MS', 'S', 'M', 'H', 'W', 'Q', 'Y', ID, INTEGER_LITERAL, DECIMAL_LITERAL, DQUOTA_STRING, SQUOTA_STRING, BQUOTA_STRING}", "type": "SyntaxCheckException" }, "status": 400 }After:
{ "error": { "reason": "Invalid Query", "details": "[OR] is not a valid term at this part of the query: '..._19a673e2 | WHERE x OR' <-- HERE. Expecting one of 284 possible tokens. Some examples: 'SEARCH', 'DESCRIBE', 'SHOW', 'FROM', 'WHERE', ...", "type": "SyntaxCheckException" }, "status": 400 }Even though this particular query should be valid, I think it's much clearer from this what the parser is mad about, and easier to put in a bug report too.
Related Issues
N/A
Check List
--signoff.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.