-
-
Notifications
You must be signed in to change notification settings - Fork 31.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support comparing two sets of pystats #98816
Conversation
This adds support for comparing pystats collected from two different builds. - The `--json-output` can be used to load in a set of raw stats and output a JSON file. - Two of these JSON files can be provided on the next run, and then comparative results between the two are output. The refactor required is basically to: - Separate out the building of table contents from emitting the table - Call these new functions from functions designed for either single results or comparative results Part of the work for: faster-cpython/tools#115
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that you've checked that this produces the same output.
return [] | ||
|
||
if len(a_rows): | ||
a_ncols = list(set(len(x) for x in a_rows)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the list(set(...))
, wouldn't set(...)
be sufficient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Further down, I want to get the single value out of the set. (a_ncols[0]
)
Tools/scripts/summarize_stats.py
Outdated
ncols = b_ncols[0] | ||
|
||
default = [""] * (ncols - 1) | ||
a_data = dict((x[0], x[1:]) for x in a_rows) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a_data = { x[0]: x[1:] for x in x in a_rows}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense.
Tools/scripts/summarize_stats.py
Outdated
|
||
default = [""] * (ncols - 1) | ||
a_data = dict((x[0], x[1:]) for x in a_rows) | ||
b_data = dict((x[0], x[1:]) for x in b_rows) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth adding a check for duplicate keys? len(a_data) == len(a_rows)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea.
@@ -377,8 +552,7 @@ def emit_pair_counts(opcode_stats, total): | |||
succ_rows | |||
) | |||
|
|||
def main(): | |||
stats = gather_stats() | |||
def output_single_stats(stats): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you using the "emit_" and "output_" prefixes interchangeably, or is there a difference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's keeping the naming from the original code (which is @markshannon's), which has output_stats
as a top-level function (which I split into three). I guess the difference is that output_
is these top-level functions, whereas each of the emit_
functions emits a single section. But we certainly could use emit_
everywhere.
Yep. And you can see the comparative output, and the single output on my prototype PR. |
This adds support for comparing pystats collected from two different builds.
--json-output
can be used to load in a set of raw stats and output a JSON file.The refactor required is basically to:
Part of the work for: faster-cpython/tools#115
See mdboom#3 for a prototype of where this is possibly headed in a Github Action.