[SPARK-55713][PYTHON][TESTS] Add benchmark for long type conversions by zhengruifeng · Pull Request #54513 · apache/spark

zhengruifeng · 2026-02-26T11:11:00Z

What changes were proposed in this pull request?

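Add ASV benchmarks for Arrow-to-pandas conversion of long (int64) data: `LongArrowToPandasBenchmark` and `NullableLongArrowToPandasBenchmark` in `bench_arrow`.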

Why are the changes needed?

To check the performance of a critical code path.

Does this PR introduce any user-facing change?

No, test-only.

How was this patch tested?

Checked manually for now, since ASV is not yet set up in CI:

(spark-dev-313) ➜  benchmarks git:(update_benchmark_null_int) asv run --python=same --quick -b 'bench_arrow.LongArrowToPandasBenchmark'
· Discovering benchmarks
· Running 2 total benchmarks (1 commits * 1 environments * 2 benchmarks)
[ 0.00%] ·· Benchmarking existing-py_Users_ruifeng.zheng_.dev_miniconda3_envs_spark-dev-313_bin_python3.13
[25.00%] ··· bench_arrow.LongArrowToPandasBenchmark.peakmem_long_to_pandas                                                 ok
[25.00%] ··· ========= ======== ==================== ===========
             --                          method                 
             --------- -----------------------------------------
               n_rows   simple   arrow_types_mapper   pd.Series 
             ========= ======== ==================== ===========
               10000     109M           110M             111M   
               100000    117M           115M             110M   
              1000000    164M           165M             165M   
             ========= ======== ==================== ===========

[50.00%] ··· bench_arrow.LongArrowToPandasBenchmark.time_long_to_pandas                                                    ok
[50.00%] ··· ========= ========= ==================== ===========
             --                          method                  
             --------- ------------------------------------------
               n_rows    simple   arrow_types_mapper   pd.Series 
             ========= ========= ==================== ===========
               10000    131±0μs        310±0μs          162±0μs  
               100000   134±0μs        482±0μs          173±0μs  
              1000000   155±0μs        1.35±0ms         273±0μs  
             ========= ========= ==================== ===========

(spark-dev-313) ➜  benchmarks git:(update_benchmark_null_int) asv run --python=same --quick -b 'bench_arrow.NullableLongArrowToPandasBenchmark'
· Discovering benchmarks
· Running 2 total benchmarks (1 commits * 1 environments * 2 benchmarks)
[ 0.00%] ·· Benchmarking existing-py_Users_ruifeng.zheng_.dev_miniconda3_envs_spark-dev-313_bin_python3.13
[25.00%] ··· bench_arrow.NullableLongArrowToPandasBenchmark.peakmem_long_with_nulls_to_pandas_ext                          ok
[25.00%] ··· ========= ====================== ==================== ===========
             --                                 method                        
             --------- -------------------------------------------------------
               n_rows   integer_object_nulls   arrow_types_mapper   pd.Series 
             ========= ====================== ==================== ===========
               10000            110M                  110M             108M   
               100000           132M                  115M             113M   
              1000000           246M                  201M             201M   
             ========= ====================== ==================== ===========

[50.00%] ··· bench_arrow.NullableLongArrowToPandasBenchmark.time_long_with_nulls_to_pandas_ext                             ok
[50.00%] ··· ========= ====================== ==================== ===========
             --                                 method                        
             --------- -------------------------------------------------------
               n_rows   integer_object_nulls   arrow_types_mapper   pd.Series 
             ========= ====================== ==================== ===========
               10000          1.49±0ms              1.81±0ms         3.19±0ms 
               100000         13.2±0ms              12.2±0ms         30.0±0ms 
              1000000         158±0ms               123±0ms          296±0ms  
             ========= ====================== ==================== ===========

Was this patch authored or co-authored using generative AI tooling?

No.

zhengruifeng · 2026-02-26T11:13:45Z

also cc @fangchenli

zhengruifeng added 2 commits February 26, 2026 19:04

test

b8768ed

test

eb41ee8

zhengruifeng requested review from HyukjinKwon and dongjoon-hyun February 26, 2026 11:13

zhengruifeng wants to merge 2 commits into apache:master from zhengruifeng:update_benchmark_null_int
