Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-127936: convert marshal module to use import/export API for ints (PEP 757) #128530

Merged
merged 14 commits into from
Jan 23, 2025

Conversation

skirpichev
Copy link
Member

@skirpichev skirpichev commented Jan 6, 2025

Benchmark ref-dumps patch-dumps
dumps 1<<38 628 ns 640 ns: 1.02x slower
dumps 1<<300 635 ns 696 ns: 1.10x slower
dumps 1<<3000 2.02 us 2.28 us: 1.13x slower
Geometric mean (ref) 1.06x slower

Benchmark hidden because not significant (1): dumps 1<<7

Benchmark ref-loads patch-loads
loads 1<<7 303 ns 311 ns: 1.03x slower
loads 1<<38 334 ns 388 ns: 1.16x slower
loads 1<<300 516 ns 605 ns: 1.17x slower
loads 1<<3000 2.08 us 2.56 us: 1.23x slower
Geometric mean (ref) 1.15x slower
scripts
# bench-dumps.py

from marshal import dumps
import pyperf

values = ['1<<7', '1<<38', '1<<300', '1<<3000']
runner = pyperf.Runner()
for v in values:
    i = eval(v)
    bn = 'dumps '+v
    runner.bench_func(bn, dumps, i)
# bench-loads.py

from marshal import loads, dumps
import pyperf

values = ['1<<7', '1<<38', '1<<300', '1<<3000']
runner = pyperf.Runner()
for v in values:
    d = dumps(eval(v))
    bn = 'loads '+v
    runner.bench_func(bn, loads, d)

@skirpichev skirpichev force-pushed the port-marshal-to-pep757/127936 branch from 1ab0c30 to 4bd6b0c Compare January 7, 2025 05:06
@skirpichev skirpichev marked this pull request as ready for review January 7, 2025 05:34
@skirpichev
Copy link
Member Author

CC @vstinner

I suspect that major slowdown for marshal.loads() benchmarks is due to normalization&singletonization in the PyLongWriter_Finish(). In case of marshal.dumps() I suspect things were better without the value field.

Python/marshal.c Show resolved Hide resolved
Python/marshal.c Outdated Show resolved Hide resolved
Python/marshal.c Outdated Show resolved Hide resolved
Python/marshal.c Outdated Show resolved Hide resolved
Python/marshal.c Outdated Show resolved Hide resolved
Python/marshal.c Show resolved Hide resolved
@skirpichev skirpichev requested a review from vstinner January 8, 2025 07:53
Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'm (just) a bit unfortunate that the using PEP 757 makes the code slower.

@serhiy-storchaka @picnixz: Would you mind to review this change?

@vstinner
Copy link
Member

cc @picnixz @serhiy-storchaka

Python/marshal.c Show resolved Hide resolved
Python/marshal.c Outdated Show resolved Hide resolved
Python/marshal.c Outdated Show resolved Hide resolved
Python/marshal.c Show resolved Hide resolved
Python/marshal.c Show resolved Hide resolved
Python/marshal.c Outdated Show resolved Hide resolved
Python/marshal.c Show resolved Hide resolved
Python/marshal.c Show resolved Hide resolved
Python/marshal.c Show resolved Hide resolved
Python/marshal.c Show resolved Hide resolved
Python/marshal.c Outdated Show resolved Hide resolved
Python/marshal.c Show resolved Hide resolved
skirpichev and others added 2 commits January 14, 2025 17:31
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Python/marshal.c Show resolved Hide resolved
Python/marshal.c Outdated Show resolved Hide resolved
skirpichev and others added 3 commits January 14, 2025 17:43
@skirpichev skirpichev requested a review from picnixz January 14, 2025 15:23
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of assertions to add and we're good I think. The most important assertion is assert(layout->bits_per_digit >= PyLong_MARSHAL_SHIFT); because marshal_ratio is 0 otherwise.

Python/marshal.c Show resolved Hide resolved
Python/marshal.c Show resolved Hide resolved
Python/marshal.c Show resolved Hide resolved
Python/marshal.c Show resolved Hide resolved
Python/marshal.c Show resolved Hide resolved
Python/marshal.c Show resolved Hide resolved
@skirpichev skirpichev requested a review from picnixz January 15, 2025 03:52
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still appreciate an assert on any T that does digits[T - 1] for those who only read the macros (or for reviewers) even though you're asserting that at the caller site. The assertions cost nothing and it's semantically better to have assertions verified once you've called the function before calling it (for debugging it's easier to put them before the call so that you know which caller was the bad caller, however I think that assertions should be logically placed in the callee; thus I would put them at both places).

Now, I won't block on this nitpick since it's a matter of taste.

@skirpichev
Copy link
Member Author

I would still appreciate an assert on any T that does digits[T - 1] for those who only read the macros (or for reviewers) even though you're asserting that at the caller site.

Of course, assert's costs nothing, but I doubt we need this redundancy. Macros code not too big to see obvious assumptions on n/size arguments.

So, I'll add suggestions and postpone decision to other reviewers.

Python/marshal.c Show resolved Hide resolved
Python/marshal.c Show resolved Hide resolved
Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
@vstinner
Copy link
Member

@skirpichev: Can you just add assert(size >= 1);? I don't want to apply your suggestion because of a missing space: size>= 1.

@vstinner
Copy link
Member

Of course, assert's costs nothing, but I doubt we need this redundancy. Macros code not too big to see obvious assumptions on n/size arguments.

While the code is obvious to you, assertions can help reviewers, who are not used to the code, to understand it. It's not only to check for correctness at runtime.

@skirpichev
Copy link
Member Author

Can you just add assert(size >= 1);?

Done. Feel free just to edit my pr/suggestions in similar cases.

While the code is obvious to you, assertions can help reviewers

Well, just the next "for" line, IMO, revival code assumptions for most code readers.

Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
@vstinner vstinner enabled auto-merge (squash) January 23, 2025 02:07
@vstinner vstinner merged commit d7d066c into python:main Jan 23, 2025
40 checks passed
@skirpichev skirpichev deleted the port-marshal-to-pep757/127936 branch January 23, 2025 02:57
@vstinner
Copy link
Member

Merged. Congrats @skirpichev, this one was tricky :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants