
[SPARK-55662][PS] Implementation of idxmin Axis argument#54455

Open
devin-petersohn wants to merge 2 commits into apache:master from devin-petersohn:devin/idxmin_axis_v2

Conversation

@devin-petersohn
Contributor

What changes were proposed in this pull request?

Add axis=1 support for DataFrame.idxmin, matching the existing idxmax axis=1 implementation.
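Since pandas-on-Spark mirrors plain pandas semantics, the new `axis=1` behavior can be previewed with pandas itself. This is an illustrative sketch, not code from the PR:

```python
import pandas as pd

df = pd.DataFrame({"a": [3, 1], "b": [2, 4]}, index=["x", "y"])

# axis=0 (already supported): index label of the minimum in each column
col_mins = df.idxmin(axis=0)

# axis=1 (what this PR adds for pandas-on-Spark):
# column label of the minimum in each row
row_mins = df.idxmin(axis=1)
# row "x": min(3, 2) is in column "b"; row "y": min(1, 4) is in column "a"
```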

Why are the changes needed?

Implements the missing `axis` parameter of `DataFrame.idxmin`, bringing it to parity with the pandas API and with the existing `idxmax` implementation.

Does this PR introduce any user-facing change?

Yes. `DataFrame.idxmin` now accepts `axis=1`.

How was this patch tested?

CI

Was this patch authored or co-authored using generative AI tooling?

Co-authored-by: Claude Opus 4

Signed-off-by: Devin Petersohn <devin.petersohn@gmail.com>
Co-authored-by: Devin Petersohn <devin.petersohn@snowflake.com>
@devin-petersohn devin-petersohn changed the title [SPARK-46168][PS] Implementation of idxmin Axis argument [SPARK-55662][PS] Implementation of idxmin Axis argument Feb 24, 2026
@HyukjinKwon
Member

cc @ueshin @gaogaotiantian FYI

column_labels = self._internal.column_labels

if len(column_labels) == 0:
return ps.Series([], dtype=np.int64)
Member


This seems different from the idxmax implementation. Could you confirm whether the empty-columns case should behave the same way there?

)

result = None
for label in reversed(column_labels):
Member


Could you also add a comment here explaining why the iteration is in reverse?
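For context, a plain-Python sketch of why reverse iteration matters here: when a chained conditional is built from the last label backwards, the first matching label ends up taking precedence, so ties resolve to the leftmost column as pandas requires. The helper below is hypothetical and only illustrates the tie-breaking logic, not the actual Spark `Column` expression the PR builds:

```python
def idxmin_row(labels_and_values):
    """Return the first label holding the row minimum.

    labels_and_values: list of (label, value) pairs for one row.
    """
    min_val = min(v for _, v in labels_and_values)
    result = None
    # Iterating in reverse means the LAST assignment wins, which is the
    # FIRST (leftmost) column that attains the minimum -- the same
    # precedence a when(...).otherwise(prev) chain built in reverse gives.
    for label, value in reversed(labels_and_values):
        if value == min_val:
            result = label
    return result
```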

Comment on lines +12569 to +12572
sdf = self._internal.spark_frame.select(
*self._internal.index_spark_columns,
result.alias(SPARK_DEFAULT_SERIES_NAME),
)
Member


Shall we also use self._internal.with_new_columns here, to avoid creating a new Spark DataFrame?


3 participants