@asl3 asl3 commented Dec 3, 2025

What changes were proposed in this pull request?

After #53299, this PR explicitly sets the conf spark.sql.execution.pandas.structHandlingMode to row. This is needed because, when Arrow optimization was previously disabled, struct values were converted to Row objects by default, whereas with Arrow optimization enabled they are converted to dict, or an exception is raised if there are duplicated nested field names.

To match the behavior shown in the docs now that Arrow is enabled by default, we explicitly set this conf to row.
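
For illustration only, a minimal sketch of what setting this conf looks like from Python. The helper name use_row_struct_handling and the constant STRUCT_HANDLING_CONF are hypothetical and are not taken from this patch; the sketch assumes spark is a pyspark.sql.SparkSession and relies only on its conf.set and conf.get methods.

```python
# Conf key named in this PR description.
STRUCT_HANDLING_CONF = "spark.sql.execution.pandas.structHandlingMode"


def use_row_struct_handling(spark):
    """Ask Spark to convert struct values to Row objects in pandas conversions.

    `spark` is assumed to be a pyspark.sql.SparkSession; only its runtime
    config interface (`conf.set` / `conf.get`) is used here.
    Hypothetical helper, not part of the patch.
    """
    spark.conf.set(STRUCT_HANDLING_CONF, "row")
    # Read the value back so callers can confirm what is in effect.
    return spark.conf.get(STRUCT_HANDLING_CONF)


# Hypothetical usage in a doctest setup (requires pyspark to be installed):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.master("local[1]").getOrCreate()
#   use_row_struct_handling(spark)
#   spark.sql("SELECT named_struct('a', 1) AS s").toPandas()
```

With the conf set to row, struct columns collected through toPandas are expected to hold Row objects whether or not Arrow optimization is enabled, which is what the existing doctest output assumes.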

Why are the changes needed?

Fixes the pyspark-pandas doctests and removes the doctest skip.

Does this PR introduce any user-facing change?

No

How was this patch tested?

CI running pyspark-pandas doctest

Was this patch authored or co-authored using generative AI tooling?

No

@asl3 asl3 changed the title from "[SPARK-54555][PYTHON][TESTS][FOLLOW-UP] Fix pyspark-pandas doctest" to "[SPARK-54555][PYTHON][TESTS] Set spark.sql.execution.pandas.structHandlingMode in pyspark pandas doctest" on Dec 3, 2025

@ueshin ueshin left a comment

LGTM, pending tests.

@asl3 asl3 force-pushed the pysparkpandasdoctest branch from c776daf to b5b9675 Compare December 3, 2025 15:17
@asl3 asl3 requested a review from zhengruifeng December 3, 2025 15:35
@zhengruifeng

merged to master
