Skip to content

Conversation

@ueshin
Copy link
Member

@ueshin ueshin commented Feb 9, 2023

What changes were proposed in this pull request?

Adds details to non-fatal errors to raise a proper exception in the Python client, which makes df.sample raise IllegalArgumentException as same as PySpark, except for the timing that is delayed to call actions.

Why are the changes needed?

Currently SparkConnectService does not add details for NonFatal exceptions to the PRCStatus, so the Python client can't detect the exception properly and raises SparkConnectGrpcException instead.

It also should have the details for the Python client.

Does this PR introduce any user-facing change?

Users will see a proper exception when they call df.sample with illegal arguments, but in a different timing.

How was this patch tested?

Enabled DataFrameParityTests.test_sample.

@ueshin
Copy link
Member Author

ueshin commented Feb 9, 2023

@HyukjinKwon
Copy link
Member

Connect tests passed. Let me merge this one (so I can update my Pr).

Merged to master and branch-3.4.

HyukjinKwon pushed a commit that referenced this pull request Feb 9, 2023
…per exception in the Python client

### What changes were proposed in this pull request?

Adds details to non-fatal errors to raise a proper exception in the Python client, which makes `df.sample` raise `IllegalArgumentException` as same as PySpark, except for the timing that is delayed to call actions.

### Why are the changes needed?

Currently `SparkConnectService` does not add details for `NonFatal` exceptions to the `PRCStatus`, so the Python client can't detect the exception properly and raises `SparkConnectGrpcException` instead.

It also should have the details for the Python client.

### Does this PR introduce _any_ user-facing change?

Users will see a proper exception when they call `df.sample` with illegal arguments, but in a different timing.

### How was this patch tested?

Enabled `DataFrameParityTests.test_sample`.

Closes #39957 from ueshin/issues/SPARK-42338/sample.

Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit ced6750)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
…per exception in the Python client

### What changes were proposed in this pull request?

Adds details to non-fatal errors to raise a proper exception in the Python client, which makes `df.sample` raise `IllegalArgumentException` as same as PySpark, except for the timing that is delayed to call actions.

### Why are the changes needed?

Currently `SparkConnectService` does not add details for `NonFatal` exceptions to the `PRCStatus`, so the Python client can't detect the exception properly and raises `SparkConnectGrpcException` instead.

It also should have the details for the Python client.

### Does this PR introduce _any_ user-facing change?

Users will see a proper exception when they call `df.sample` with illegal arguments, but in a different timing.

### How was this patch tested?

Enabled `DataFrameParityTests.test_sample`.

Closes apache#39957 from ueshin/issues/SPARK-42338/sample.

Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit ced6750)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants