forked from apache/spark
Commit
[SPARK-49713][PYTHON][CONNECT] Make function `count_min_sketch` accept number arguments

### What changes were proposed in this pull request?

1. Make function `count_min_sketch` accept number arguments.
2. Make argument `seed` optional.
3. Fix the type hints of `eps/confidence/seed` from `ColumnOrName` to `Column`, because they require a foldable value and do not actually accept a column name:

```
In [3]: from pyspark.sql import functions as sf

In [4]: df = spark.range(10000).withColumn("seed", sf.lit(1).cast("int"))

In [5]: df.select(sf.hex(sf.count_min_sketch("id", sf.lit(0.5), sf.lit(0.5), "seed")))
...
AnalysisException: [DATATYPE_MISMATCH.NON_FOLDABLE_INPUT] Cannot resolve "count_min_sketch(id, 0.5, 0.5, seed)" due to data type mismatch: the input `seed` should be a foldable "INT" expression; however, got "seed". SQLSTATE: 42K09;
'Aggregate [unresolvedalias('hex(count_min_sketch(id#1L, 0.5, 0.5, seed#2, 0, 0)))]
+- Project [id#1L, cast(1 as int) AS seed#2]
   +- Range (0, 10000, step=1, splits=Some(12))
...
```

### Why are the changes needed?

1. `seed` is optional in other similar functions.
2. The existing type hint is `ColumnOrName`, which is misleading because a column name is not actually supported.

### Does this PR introduce _any_ user-facing change?

Yes, it supports number arguments.

### How was this patch tested?

Updated doctests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#48157 from zhengruifeng/py_fix_count_min_sketch.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
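The argument handling described in this commit message can be illustrated with a small, PySpark-free sketch. It is not the code from this commit: `Literal`, `as_foldable`, and `sketch_args` are hypothetical names, with `Literal` standing in for a foldable `Column` such as `sf.lit(0.5)`. The sketch assumes that plain numbers are wrapped as literals, that an omitted `seed` is simply left out, and that a column name (a `str`) is rejected up front.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

Number = Union[int, float]


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for a foldable Column such as sf.lit(0.5)."""
    value: Number


def as_foldable(arg: Union[Literal, Number], name: str) -> Literal:
    """Wrap a plain number in a Literal; pass an existing Literal through."""
    if isinstance(arg, Literal):
        return arg
    if isinstance(arg, (int, float)):
        return Literal(arg)
    # A column name (str) cannot be folded to a constant, so reject it early.
    raise TypeError(
        f"{name} must be a number or a literal, got {type(arg).__name__}"
    )


def sketch_args(
    eps: Union[Literal, Number],
    confidence: Union[Literal, Number],
    seed: Optional[Union[Literal, Number]] = None,
) -> List[Literal]:
    """Return the foldable arguments; seed is left out when it is omitted."""
    args = [as_foldable(eps, "eps"), as_foldable(confidence, "confidence")]
    if seed is not None:
        args.append(as_foldable(seed, "seed"))
    return args


if __name__ == "__main__":
    print(sketch_args(0.5, 0.5))              # numbers only, no seed
    print(sketch_args(Literal(0.5), 0.5, 1))  # mixed literal and numbers
```

Based on the description above, a call in PySpark after this change would presumably look like `sf.count_min_sketch("id", 0.5, 0.5)`, with `seed` passed only when a reproducible sketch is needed. How the real function chooses a seed when none is given is not stated in the commit message.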
1 parent ca726c1, commit a5ac80a
Showing 3 changed files with 77 additions and 16 deletions.