Skip to content

Conversation

@bdraco
Copy link

@bdraco bdraco commented Feb 21, 2025

main 192,792,792ns
this PR 168,216,041ns

                            Benchmark Results                             
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┓
┃             Benchmark ┃   Time (best) ┃ Rel. StdDev ┃ Run time ┃ Iters ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━┩
│ test_large_parse_yaml │ 192,792,792ns │        0.9% │    2.94s │    15 │
└───────────────────────┴───────────────┴─────────────┴──────────┴───────┘
                            Benchmark Results                             
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┓
┃             Benchmark ┃   Time (best) ┃ Rel. StdDev ┃ Run time ┃ Iters ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━┩
│ test_large_parse_yaml │ 168,216,041ns │        1.3% │    2.93s │    17 │
└───────────────────────┴───────────────┴─────────────┴──────────┴───────┘

Benchmark https://github.com/bdraco/pyyaml/blob/bench/tests_bench/benchmarks/test_simple_load.py (execute with pytest --codspeed tests_bench/benchmarks/test_simple_load.py)

Cython analysis via cythonize -X language_level=3 -a -i yaml/_yaml.pyx

Before
Screenshot 2025-02-21 at 12 07 34 PM

After
Screenshot 2025-02-21 at 12 07 56 PM

@nitzmahone
Copy link
Member

Thanks, this looks like some low-hanging fruit for a nice speed boost! I don't love changing all the Cython callsites to use the two-call pattern, though... I might try playing with a cdef factory method that could accomplish the same thing with a single callsite (or feel free to do so yourself if you're interested!)- IIRC we've got a lot of flexibility there since we're not worried about non-Cython subclasses of this Mark.

@bdraco
Copy link
Author

bdraco commented May 16, 2025

I didn't make a helper because Cython would end up ref counting the Mark object being passed around in the helper.. It might not make much difference though. Will test

@bdraco
Copy link
Author

bdraco commented May 16, 2025

Still a nice speed up even with the helper. Note that 3.1 got a bit faster already since I opened this PR

With Cython 3.1 main (new baseline)

                           Benchmark Results                             
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┓
┃             Benchmark ┃   Time (best) ┃ Rel. StdDev ┃ Run time ┃ Iters ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━┩
│ test_large_parse_yaml │ 177,877,584ns │        0.7% │    2.88s │    16 │
└───────────────────────┴───────────────┴─────────────┴──────────┴───────┘

With Cython 3.1 64e40d89ddb54b192d84132e6540fcc6553a8050 (original commit)

                          Benchmark Results                             
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┓
┃             Benchmark ┃   Time (best) ┃ Rel. StdDev ┃ Run time ┃ Iters ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━┩
│ test_large_parse_yaml │ 159,839,750ns │        1.4% │    2.94s │    18 │
└───────────────────────┴───────────────┴─────────────┴──────────┴───────┘

With Cython 3.1 (added _create_mark) 964c641aad18b5df3ef92b9202a7b82283939d9e (original commit)

                            Benchmark Results                             
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┓
┃             Benchmark ┃   Time (best) ┃ Rel. StdDev ┃ Run time ┃ Iters ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━┩
│ test_large_parse_yaml │ 162,833,125ns │        0.8% │    2.97s │    18 │
└───────────────────────┴───────────────┴─────────────┴──────────┴───────┘

@bdraco
Copy link
Author

bdraco commented May 16, 2025

Pushed another one that a bit different. Single call point but it does have a bit more maintenance burden. 688ebe7

                            Benchmark Results                             
┏━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┓
┃             Benchmark ┃   Time (best) ┃ Rel. StdDev ┃ Run time ┃ Iters ┃
┡━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━┩
│ test_large_parse_yaml │ 155,708,875ns │        0.7% │    2.84s │    18 │
└───────────────────────┴───────────────┴─────────────┴──────────┴───────┘

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

2 participants