[SPARK-35343][PYTHON] Make the conversion from/to pandas data-type-based for non-ExtensionDtypes #32592
@@ -87,7 +92,6 @@ def __init__(self, dtype: Dtype, spark_type: DataType):
        self.spark_type = spark_type

    @property
    @abstractmethod
@abstractmethod is removed so that the mypy checks pass.
Otherwise, the mypy checks fail with:
python/pyspark/pandas/internal.py:1050: error: Cannot instantiate abstract class 'DataTypeOps' with abstract attribute 'pretty_name'
python/pyspark/pandas/internal.py:1441: error: Cannot instantiate abstract class 'DataTypeOps' with abstract attribute 'pretty_name'
Reference mypy issue: python/mypy#1843.
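For readers unfamiliar with this kind of error, here is a minimal sketch of how it can arise. It assumes the base class picks a concrete subclass in `__new__`, so calling the base class works at runtime while a static checker such as mypy may still report that an abstract class is being instantiated. The names `BaseOps`, `IntOps`, and `StrOps` are hypothetical and are not taken from the Spark code base.

```python
from abc import ABCMeta, abstractmethod


class BaseOps(metaclass=ABCMeta):
    """Hypothetical base class that dispatches to a subclass in __new__."""

    def __new__(cls, kind: str) -> "BaseOps":
        if cls is BaseOps:
            # Choose a concrete subclass from the argument; the abstract
            # base itself is never actually instantiated at runtime.
            target = {"int": IntOps, "str": StrOps}[kind]
            return object.__new__(target)
        return object.__new__(cls)

    def __init__(self, kind: str) -> None:
        self.kind = kind

    @property
    @abstractmethod
    def pretty_name(self) -> str:
        ...


class IntOps(BaseOps):
    @property
    def pretty_name(self) -> str:
        return "integrals"


class StrOps(BaseOps):
    @property
    def pretty_name(self) -> str:
        return "strings"


# Works at runtime because __new__ returns an IntOps instance, but a static
# checker only sees a call to the abstract BaseOps and may reject this line.
ops = BaseOps("int")
print(type(ops).__name__, ops.pretty_name)
```

Dropping `@abstractmethod` from the property, as the diff does, is one way to silence the checker; the trade-off is that a subclass forgetting to override the property is no longer caught at instantiation time.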
LGTM.
Thanks! Merging to master.
What changes were proposed in this pull request?
Make the conversion from/to pandas (for non-ExtensionDtype) data-type-based.
NOTE: The Ops class per ExtensionDtype and its data-type-based conversion from/to pandas will be implemented in a separate PR, tracked as https://issues.apache.org/jira/browse/SPARK-35614.
Why are the changes needed?
The conversion from/to pandas currently includes logic that checks data types and behaves differently for each one.
That makes the code hard to change and maintain.
Since we have introduced the Ops class per non-ExtensionDtype data type, we ought to make the conversion from/to pandas data-type-based for non-ExtensionDtypes.
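As a rough illustration of what "data-type-based" means here, the sketch below moves per-type conversion logic out of a central chain of type checks and into one Ops class per data type. It is a simplified stand-in, not the code in this PR: plain Python lists play the role of columns, and the names `Ops`, `DateOps`, `OPS_BY_DTYPE`, `prepare`, `restore`, `to_storage`, and `from_storage` are all assumptions made for the example.

```python
import datetime
from typing import Any, Dict, List


class Ops:
    """Hypothetical base: by default, values pass through unchanged."""

    def prepare(self, column: List[Any]) -> List[Any]:
        # Convert incoming values to the internal representation.
        return list(column)

    def restore(self, column: List[Any]) -> List[Any]:
        # Convert internal values back to the user-facing representation.
        return list(column)


class DateOps(Ops):
    """Hypothetical Ops for dates, stored internally as ordinals."""

    def prepare(self, column: List[Any]) -> List[Any]:
        return [d.toordinal() for d in column]

    def restore(self, column: List[Any]) -> List[Any]:
        return [datetime.date.fromordinal(n) for n in column]


# One Ops instance per data type replaces an if/elif chain over types.
OPS_BY_DTYPE: Dict[str, Ops] = {"int": Ops(), "date": DateOps()}


def to_storage(dtype: str, column: List[Any]) -> List[Any]:
    return OPS_BY_DTYPE[dtype].prepare(column)


def from_storage(dtype: str, column: List[Any]) -> List[Any]:
    return OPS_BY_DTYPE[dtype].restore(column)


dates = [datetime.date(2021, 5, 19), datetime.date(2021, 6, 1)]
stored = to_storage("date", dates)
print(from_storage("date", stored) == dates)
```

With this shape, supporting a new data type means adding one class and one registry entry, and the conversion functions themselves do not change.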
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Unit tests.
Keyword: SPARK-35337