[SPARK-39255][SQL] Improve error messages #36635

Closed
wants to merge 15 commits into from

Member

@MaxGekk MaxGekk commented May 23, 2022

What changes were proposed in this pull request?

In this PR, I propose improving the error messages of the following error classes:

  1. NON_PARTITION_COLUMN - a non-partition column name -> the non-partition column
  2. UNSUPPORTED_SAVE_MODE - a not existent path -> a non existent path.
  3. INVALID_FIELD_NAME. Quote ids to follow the rules [SPARK-39243][SQL][DOCS] Rules of quoting elements in error messages #36621.
  4. FAILED_SET_ORIGINAL_PERMISSION_BACK. It is renamed to FAILED_PERMISSION_RESET_ORIGINAL.
  5. NON_LITERAL_PIVOT_VALUES - Wrap the error's expression in double quotes. The PR adds a new helper method toSQLExpr() for that.
  6. CAST_INVALID_INPUT - Add the recommendation: ... Correct the syntax for the value before casting it, or change the type to one appropriate for the value.
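
As a rough illustration of items 3 and 5, the quoting conventions might be sketched as below. This is a Python sketch, not Spark's Scala code: the names `to_sql_id` and `to_sql_expr` are hypothetical stand-ins loosely modeled on the `toSQLExpr()` helper mentioned above, and the backtick-doubling escape rule is an assumption rather than something stated in this PR.

```python
def to_sql_id(parts):
    """Quote each part of a (possibly nested) identifier with backticks.

    Hypothetical helper: doubling embedded backticks is an assumed
    escaping rule, not taken from the PR.
    """
    return ".".join("`" + part.replace("`", "``") + "`" for part in parts)


def to_sql_expr(expr_sql):
    """Wrap the SQL text of an expression in double quotes."""
    return '"' + expr_sql + '"'


# Usage: a nested field name and a non-literal pivot expression.
print(to_sql_id(["m", "n"]))
print(to_sql_expr("CAST(a AS INT)"))
```

Keeping the quoting in small helpers like these means every error message formats identifiers and expressions the same way, instead of each call site concatenating quotes by hand.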

Why are the changes needed?

To improve the user experience with Spark SQL by making error messages clearer.

Does this PR introduce any user-facing change?

Yes, it changes user-facing error messages.

How was this patch tested?

By running the affected test suites:

$ build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"
$ build/sbt "sql/testOnly *QueryCompilationErrorsDSv2Suite"
$ build/sbt "sql/testOnly *QueryCompilationErrorsSuite"
$ build/sbt "sql/testOnly *QueryExecutionAnsiErrorsSuite"
$ build/sbt "sql/testOnly *QueryExecutionErrorsSuite"
$ build/sbt "sql/testOnly *QueryParsingErrorsSuite*"

@MaxGekk MaxGekk changed the title [WIP][SPARK-39255][SQL] Improve error messages [SPARK-39255][SQL] Improve error messages May 23, 2022
@MaxGekk MaxGekk marked this pull request as ready for review May 23, 2022 14:19
@MaxGekk
Member Author

MaxGekk commented May 23, 2022

The test failure:

[info] - Hide credentials in show create table *** FAILED *** (31 milliseconds)
[info]   "[0,10000000d5,5420455441455243,62617420454c4241,414e20200a282031,4e4952545320454d,45485420200a2c47,a29544e49204449,726f20474e495355,6568636170612e67,732e6b726170732e,a6362646a2e6c71,20534e4f4954504f,7462642720200a28,203d2027656c6261,45502e5453455427,200a2c27454c504f,6f77737361702720,2a27203d20276472,2a2a2a2a2a2a2a2a,6574636164657228,2720200a2c272964,27203d20276c7275,2a2a2a2a2a2a2a2a,746361646572282a,20200a2c27296465,3d20277265737527,7355747365742720,a29277265]" did not contain "TEST.PEOPLE" (JDBCSuite.scala:1146)

is not related to the PR, I believe. @gengliangwang @HyukjinKwon @cloud-fan Could you have a look at this, please?

@@ -23,7 +23,7 @@
"message" : [ "Cannot up cast <value> from <sourceType> to <targetType>.\n<details>" ]
},
"CAST_INVALID_INPUT" : {
- "message" : [ "The value <value> of the type <sourceType> cannot be cast to <targetType> because it is malformed. To return NULL instead, use `try_cast`. If necessary set <config> to \"false\" to bypass this error." ],
+ "message" : [ "The value <value> of the type <sourceType> cannot be cast to <targetType> because it is malformed. Correct the syntax for the value before casting it, or change the type to one appropriate for the value. To return NULL instead, use `try_cast`. If necessary set <config> to \"false\" to bypass this error." ],
Member

What do you mean by "Correct the syntax for the value"? Do you mean "Correct the values as per the syntax"?

Member

nit: "change the type to one appropriate for the value" => "change an appropriate type for the value"?

Member Author

> Do you mean "Correct the values as per the syntax"?

Yep

> nit: "change the type to one appropriate for the value" => "change an appropriate type for the value"?

Maybe just, "change the target type"?

How about:
`Correct the value as per the syntax, or change its target type.`?
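
For context on how wording changes like this reach users, here is a minimal Python sketch of filling the `<placeholder>` parameters in the CAST_INVALID_INPUT template from the diff above. It is not Spark's implementation: the `render` function is hypothetical, and the parameter values (including the config name) are made up for illustration.

```python
import re

# Template text copied from the "+" side of the diff above.
CAST_INVALID_INPUT = (
    "The value <value> of the type <sourceType> cannot be cast to <targetType> "
    "because it is malformed. Correct the syntax for the value before casting it, "
    "or change the type to one appropriate for the value. To return NULL instead, "
    "use `try_cast`. If necessary set <config> to \"false\" to bypass this error."
)


def render(template, params):
    """Replace every <name> placeholder with params[name].

    Hypothetical helper: raises KeyError when a parameter is missing,
    so an incomplete message is never shown.
    """
    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing message parameter: {name}")
        return params[name]

    return re.sub(r"<(\w+)>", substitute, template)


# Illustrative values only; the config name here is an assumption.
message = render(CAST_INVALID_INPUT, {
    "value": "'1.23'",
    "sourceType": '"STRING"',
    "targetType": '"INT"',
    "config": '"spark.sql.ansi.enabled"',
})
print(message)
```

Because the recommendation sentence lives in the template rather than at the call sites, rewording it as discussed in this thread only requires touching the JSON entry and the affected test expectations.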

@@ -48,13 +48,13 @@
"FAILED_EXECUTE_UDF" : {
"message" : [ "Failed to execute user defined function (<functionName>: (<signature>) => <result>)" ]
},
"FAILED_PERMISSION_RESET_ORIGINAL" : {
Member

How about FAILED_TO_SET_ORIGINAL_PERMISSION_BACK?
I didn't follow the error message improvement project closely. Do we have a naming pattern?

Member Author

> Do we have a naming pattern?

No, we don't, as far as I know.

Member Author

@MaxGekk MaxGekk May 23, 2022

> How about FAILED_TO_SET_ORIGINAL_PERMISSION_BACK?

It would be nice to have shorter names for error classes since we are going to use them as tags in docs headers.

Contributor

Yeah, it's hard to come up with a pattern that really works.
Some thoughts:

  • The error class is typically followed by the message. No need for a sentence. That's what the message does.
  • Short is beautiful.
  • Try to avoid stating that an error is an error in the name. It provides no new information. It's an error class! ;-)

How about
RESET_PERMISSION_TO_ORIGINAL
(again we KNOW it's failed)

Member Author

SET_PERMISSION_BACK is shorter than RESET_PERMISSION_TO_ORIGINAL ;-)

@gengliangwang
Member

LGTM except for some minor comments. Thanks for the work!

@MaxGekk
Member Author

MaxGekk commented May 23, 2022

cc @srielau

@MaxGekk
Member Author

MaxGekk commented May 24, 2022

Merging to master. Thank you, @gengliangwang, @cloud-fan, and @srielau, for the review.

@MaxGekk MaxGekk closed this in 625afb4 May 24, 2022
MaxGekk added a commit to MaxGekk/spark that referenced this pull request May 24, 2022
In the PR, I propose to improve errors of the following error classes:
1. NON_PARTITION_COLUMN - `a non-partition column name` -> `the non-partition column`
2. UNSUPPORTED_SAVE_MODE - `a not existent path` -> `a non existent path`.
3. INVALID_FIELD_NAME. Quote ids to follow the rules apache#36621.
4. FAILED_SET_ORIGINAL_PERMISSION_BACK. It is renamed to FAILED_PERMISSION_RESET_ORIGINAL.
5. NON_LITERAL_PIVOT_VALUES - Wrap error's expression by double quotes. The PR adds new helper method `toSQLExpr()` for that.
6. CAST_INVALID_INPUT - Add the recommendation: `... Correct the syntax for the value before casting it, or change the type to one appropriate for the value.`

To improve user experience with Spark SQL by making error message more clear.

Yes, it changes user-facing error messages.

By running the affected test suites:
```
$ build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"
$ build/sbt "sql/testOnly *QueryCompilationErrorsDSv2Suite"
$ build/sbt "sql/testOnly *QueryCompilationErrorsSuite"
$ build/sbt "sql/testOnly *QueryExecutionAnsiErrorsSuite"
$ build/sbt "sql/testOnly *QueryExecutionErrorsSuite"
$ build/sbt "sql/testOnly *QueryParsingErrorsSuite*"
```

Closes apache#36635 from MaxGekk/error-class-improve-msg-3.

Lead-authored-by: Max Gekk <max.gekk@gmail.com>
Co-authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
(cherry picked from commit 625afb4)
Signed-off-by: Max Gekk <max.gekk@gmail.com>
MaxGekk added a commit that referenced this pull request May 25, 2022
### What changes were proposed in this pull request?
In the PR, I propose to improve errors of the following error classes:
1. NON_PARTITION_COLUMN - `a non-partition column name` -> `the non-partition column`
2. UNSUPPORTED_SAVE_MODE - `a not existent path` -> `a non existent path`.
3. INVALID_FIELD_NAME. Quote ids to follow the rules #36621.
4. FAILED_SET_ORIGINAL_PERMISSION_BACK. It is renamed to RESET_PERMISSION_TO_ORIGINAL.
5. NON_LITERAL_PIVOT_VALUES - Wrap error's expression by double quotes. The PR adds new helper method `toSQLExpr()` for that.
6. CAST_INVALID_INPUT - Add the recommendation: `... Correct the syntax for the value before casting it, or change the type to one appropriate for the value.`

This is a backport of #36635.

### Why are the changes needed?
To improve user experience with Spark SQL by making error message more clear.

### Does this PR introduce _any_ user-facing change?
Yes, it changes user-facing error messages.

### How was this patch tested?
By running the affected test suites:
```
$ build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"
$ build/sbt "sql/testOnly *QueryCompilationErrorsDSv2Suite"
$ build/sbt "sql/testOnly *QueryCompilationErrorsSuite"
$ build/sbt "sql/testOnly *QueryExecutionAnsiErrorsSuite"
$ build/sbt "sql/testOnly *QueryExecutionErrorsSuite"
$ build/sbt "sql/testOnly *QueryParsingErrorsSuite*"
```

Lead-authored-by: Max Gekk <max.gekk@gmail.com>
Co-authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
(cherry picked from commit 625afb4)
Signed-off-by: Max Gekk <max.gekk@gmail.com>

Closes #36655 from MaxGekk/error-class-improve-msg-3-3.3.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>