-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
GH-105229: Replace some superinstructions with single instruction equivalent. #105230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if these need a name, but suggestions for a better name are welcome.
"Combined instructions", maybe?
@@ -222,6 +222,9 @@ def pseudo_op(name, op, real_ops): | |||
def_op('DICT_MERGE', 164) | |||
def_op('DICT_UPDATE', 165) | |||
|
|||
def_op('LOAD_FAST_LOAD_FAST', 168) | |||
def_op('STORE_FAST_LOAD_FAST', 169) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, how are the stats on this? Is it worth keeping?
I would expect most of these pairs to span multiple lines. Although maybe comprehensions and generator expressions make up for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know, as you say it might be quite low.
I think we should keep it for this PR anyway, as we are replacing two-codeunit superinstructions with the one-codeunit equivalent. We can change the choice of superinstructions in another PR.
@@ -0,0 +1 @@ | |||
Replace some dynamic superinstructions will single instruction equivalents. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace some dynamic superinstructions will single instruction equivalents. | |
Replace some dynamic superinstructions with single instruction equivalents. |
@@ -1586,6 +1586,54 @@ optimize_cfg(cfg_builder *g, PyObject *consts, PyObject *const_cache) | |||
return SUCCESS; | |||
} | |||
|
|||
static void | |||
make_super_instruction(cfg_instr *inst1, cfg_instr *inst2, int super_op) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make_super_instruction(cfg_instr *inst1, cfg_instr *inst2, int super_op) | |
make_combined_instruction(cfg_instr *inst1, cfg_instr *inst2, int combined_op) |
@@ -0,0 +1 @@ | |||
Replace some dynamic superinstructions will single instruction equivalents. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace some dynamic superinstructions will single instruction equivalents. | |
Replace some dynamic superinstructions with single instruction equivalents. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh. Crossed posts.
Python/flowgraph.c
Outdated
/* Skip if instructions are on different lines */ | ||
if (inst1->i_loc.lineno != inst2->i_loc.lineno) { | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw a comment from @brandtbucher that this requirement eliminated most super-instructions (faster-cpython/ideas#585 (comment)). And yet your pyperformance results on this PR are quite good. I wonder how these two things are both true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For instructions without observable side effects (e.g. LOAD_FAST
, or maybe POP_TOP
), can't we still do a super instruction and keep the NOP
for the extra lineno?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw a comment
Oh, I realize now that this comment was specific to STORE_FAST__LOAD_FAST
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder how these two things are both true.
Oh, I guess the most likely answer is in faster-cpython/ideas#585 (comment) :
it appears that the reduction in code size more than compensates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is just my hypothesis.
The goal is to remove superinstructions that cross block and line boundaries, as they are problematic.
I was just trying to not slow things down by removing them. A speedup was just a pleasant surprise.
Even if the speedup is mainly from layout changes, or cache effects, the main motivation of this PR remains valid.
Shows a surprising, but pleasing, 1.7% speedup.
These are common instructions, so the reduction in code size and memory reads is considerable.
In faster-cpython/ideas#585 I refer these to as "single instruction superinstructions", which is a rather clumsy name. I'm not sure if these need a name, but suggestions for a better name are welcome.
I've left the superinstructions involving
LOAD_CONST
alone, as we are looking at other approaches to speed up constants.