-
-
Notifications
You must be signed in to change notification settings - Fork 31.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-37812: Expand confusing CHECK_SMALL_INT so return is explicit. #15216
bpo-37812: Expand confusing CHECK_SMALL_INT so return is explicit. #15216
Conversation
It's not obvious when looking at a call site of this macro that it affects the control flow. Move the `return` into the callers so it's explicit there, and factor out the fiddly bit as IS_SMALL_INT. As a bonus, a couple of other spots can also use IS_SMALL_INT with a benefit to clarity.
Objects/longobject.c
Outdated
@@ -41,6 +41,9 @@ PyObject *_PyLong_One = NULL; | |||
-NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive). | |||
*/ | |||
static PyLongObject small_ints[NSMALLNEGINTS + NSMALLPOSINTS]; | |||
|
|||
#define IS_SMALL_INT(ival) (-NSMALLNEGINTS <= (ival) && (ival) < NSMALLPOSINTS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not make this a function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually have a patch for that too 🙂 : gnprice@cfc91e6ca
I left that out of this commit in order to change the fewest possible aspects at once, and figured I'd send it as a separate followup provided this one is merged. But I'd also be happy to add it to this PR, if there's agreement that it's desirable; I think it'd fit well as part of the same change.
(Note the "before" side of that commit looks slightly different from the "after" of this one, as there are a couple of other commits in between on that development branch. But it'd be no trouble for me to rebase it to directly on top of this one in order to add to this PR.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, if making a function is incoming, this PR lgtm.
I'll let Raymond weight in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go ahead and make it a static inlined function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a somewhat minor change, but let's add a blurb NEWS entry in case anyone cases about these details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! Thanks for the review.
Objects/longobject.c
Outdated
@@ -41,6 +41,9 @@ PyObject *_PyLong_One = NULL; | |||
-NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive). | |||
*/ | |||
static PyLongObject small_ints[NSMALLNEGINTS + NSMALLPOSINTS]; | |||
|
|||
#define IS_SMALL_INT(ival) (-NSMALLNEGINTS <= (ival) && (ival) < NSMALLPOSINTS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go ahead and make it a static inlined function.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Thanks for the reviews! I have made the requested changes; please review again. |
Thanks for making the requested changes! @rhettinger: please review the changes made to this pull request. |
Using inlined function generates unnecessary machine code than using macro. The use inlined function: _PyLong_FromLong PROC ; COMDAT
; 317 : {
push ebp
mov ebp, esp
and esp, -8 ; fffffff8H
push ecx
mov ecx, DWORD PTR _ival$[ebp]
; 318 : PyLongObject *v;
; 319 : unsigned long abs_ival;
; 320 : unsigned long t; /* unsigned so >> doesn't propagate sign bit */
; 321 : int ndigits = 0;
; 322 : int sign;
; 323 :
; 324 : if (is_small_int(ival)) {
mov eax, ecx
push ebx
push esi
cdq
push edi
xor edi, edi
; 48 : return -NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS;
add eax, 5
adc edx, edi
test edx, edx
ja SHORT $LN6@PyLong_Fro
jb SHORT $LN32@PyLong_Fro
cmp eax, 261 ; 00000105H
ja SHORT $LN6@PyLong_Fro
$LN32@PyLong_Fro:
; 60 : v = (PyObject *)&small_ints[ival + NSMALLNEGINTS];
movsx eax, cx
shl eax, 4
add eax, OFFSET _small_ints+80
; File d:\dev\cpython\include\object.h
; 459 : op->ob_refcnt++;
inc DWORD PTR [eax]
; File d:\dev\cpython\objects\longobject.c
; 383 : }
pop edi
pop esi
pop ebx
mov esp, ebp
pop ebp
ret 0 use macro: _PyLong_FromLong PROC ; COMDAT
; 318 : {
push ebp
mov ebp, esp
mov eax, DWORD PTR _ival$[ebp]
push edi
; 319 : PyLongObject *v;
; 320 : unsigned long abs_ival;
; 321 : unsigned long t; /* unsigned so >> doesn't propagate sign bit */
; 322 : int ndigits = 0;
xor edi, edi
; 323 : int sign;
; 324 :
; 325 : if (IS_SMALL_INT(ival)) {
cmp eax, -5 ; fffffffbH
jl SHORT $LN6@PyLong_Fro
cmp eax, 257 ; 00000101H
jge SHORT $LN6@PyLong_Fro
; 60 : v = (PyObject *)&small_ints[ival + NSMALLNEGINTS];
cwde
shl eax, 4
add eax, OFFSET _small_ints+80
pop edi
; File d:\dev\cpython\include\object.h
; 459 : op->ob_refcnt++;
inc DWORD PTR [eax]
; File d:\dev\cpython\objects\longobject.c
; 384 : }
pop ebp
ret 0 |
Interesting, thanks! It looks like several compilers produce code much like this when compiling for x86-32. Here's a Compiler Explorer example I just created: On the other hand, as you can see in the same Compiler Explorer example (in the upper-right pane), VC2017 compiling for x86-64 produces essentially identical code for the macro version and the static-inline-function version. So does Clang; and in fact, because in my example both are present, GCC notices they're identical and makes the macro version just jump immediately to the function version 😄 I believe the longer code represents a missed optimization by the compilers; I don't think there's any semantic reason why (on any architecture) they have to emit anything different for the function version vs. the macro version. In any case, stepping back: we're talking about a handful of additional instructions, of the kinds a CPU is very fast at (no random reads from memory, which can take thousands of cycles and which the interpreter by its nature is full of); also only in 32-bit builds, not 64-bit. It's interesting to see the code that's generated and study it... but convincing the compiler to generate one of these versions over another is very much a micro-optimization, the kind that generally isn't worth more than a very small delta in the simplicity of the code. For more fun, I just went and added to the Compiler Explorer example a version with With So, switching from |
If build a 32-bit version with gcc 9.2, it also generates unnecessary machine code (convert to Same with ARM gcc 8.2: https://godbolt.org/z/qycDzR
|
In this case, the
I know why, on 64-bit Windows, |
No, compilers have a lot more freedom than that. The code just needs to compute the right answer. The types in the source are part of defining what the right answer is, but there doesn't need to be anything at runtime that physically looks like them (in a case like this where they aren't part of any external interface.) Here's a quick demo of that: |
If one day I have no opinion on the use of inlined function or macro here. |
That's true (provided |
It seems that all the major compilers have generated unnecessary code. |
It's not obvious when looking at a call site of this macro that it
affects the control flow. Move the
return
into the callers soit's explicit there, and factor out the fiddly bit as IS_SMALL_INT.
As a bonus, a couple of other spots can also use IS_SMALL_INT with a
benefit to clarity.
https://bugs.python.org/issue37812