Skip to content

Introduce -Zmacro-stats #142069

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Conversation

nnethercote
Copy link
Contributor

Introduce -Zmacro-stats.

It collects data about macro expansions and prints them in a table after expansion finishes. It's very useful for detecting macro bloat, especially for proc macros.

r? @petrochenkov

They will be used in a subsequent commit.
It currently only inserts separators into `usize`s, because that's all
that has been needed so far. `-Zmacro-stats` will need `isize` and `f64`
handling, so this commit adds that.
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 5, 2025
@nnethercote
Copy link
Contributor Author

I used this recently on a non-open-source codebase with ~100,000 lines of Rust code. cargo expand produced ~325,000 lines, due to heavy proc macro usage. Avoiding some unnecessary proc macro usage got this down to ~205,000 lines, reducing compile times by ~20%.

@lqd
Copy link
Member

lqd commented Jun 5, 2025

-Zself-profile has the expand_proc_macro activity, didn't that work?

@Kobzol
Copy link
Contributor

Kobzol commented Jun 5, 2025

This sounds awesome!

@lqd recently let me know about the fact that -Zself-profile can actually be used to record how many times was a specific proc macro invoked (#95739), and I wanted to use that in a cargo command for profiling builds, but the thing that was missing was how many lines of code did the macro generate. With -Zmacro-stats that information would be available, which is awesome (although for programmatic usage, I'd probably want a JSON based output, but that can be added later if needed).

@nnethercote
Copy link
Contributor Author

-Zself-profile has the expand_proc_macro activity, didn't that work?

I hadn't heard about that...

@Kobzol
Copy link
Contributor

Kobzol commented Jun 5, 2025

FWIW I tried to use it, and it gives me the durations of proc macro executions for individual macros (e.g. #[derive(Serialize, Deserialize)] would give me time for expanding Serialize and time for expanding Deserialize. However, it only seemed to work for actual proc macros, not things like #[derive(Hash)], and it doesn't contain information about how much code was generated. It also requires using -Zself-profile, which can make builds way slower than they normally are.

@lqd
Copy link
Member

lqd commented Jun 5, 2025

builtin derives are not proc-macros, but you're more than welcome to add their expansion to the self-profiler :)

code-size is an interesting metric but it's not always correlated with compile times which you're interested in optimizing: you still want expansion times (example: cranelift-codegen has two huge functions of the same size, one of which takes 8ms in dataflow, the other 2.5s)

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@Kobzol
Copy link
Contributor

Kobzol commented Jun 5, 2025

builtin derives are not proc-macros, but you're more than welcome to add their expansion to the self-profiler :)

Right, that was my first question on this PR, if we can just add it to -Zself-profile without a new flag :) But I think that the richness of the potentially useful metadata that we could produce here might not be a good fit for the quite limited data model of the self-profile.

code-size is an interesting metric but it's not always correlated with compile times which you're interested in optimizing: you still want expansion times (example: cranelift-codegen has two huge functions of the same size, one of which takes 8ms in dataflow, the other 2.5s)

It's a very rough approximation, of course. Ideally, I would like to see "this macro generated code that took the compiler 2.5s to typecheck", but I don't suppose that's really feasible easily today, so I'll take what I can get :)

@lqd
Copy link
Member

lqd commented Jun 5, 2025

AFAIK there's nothing inherent to the data model that would prevent adding information to correlate between events of different queries, if that's what was needed for nick's project or your query (e.g. if compilation time was not in expansion but elsewhere). But this is getting offtopic so I'll stop here.

Ty(P<ast::Ty>) { "type"; one fn visit_ty; fn visit_ty; fn make_ty; }
Expr(P<ast::Expr>) {
"expression";
one fn visit_expr; fn visit_expr; fn pprust::expr_to_string;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
one fn visit_expr; fn visit_expr; fn pprust::expr_to_string;
one fn visit_expr; fn visit_expr; pprust::expr_to_string;

fn is used because this macro generates definitions for all these functions and it's convenient if the definitions can be found by searching for fn foo, but this macro doesn't define pprust::expr_to_string.

Style nit: thus it would look better on the same line with "expression";.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fn is used because this macro generates definitions for all these functions

No, it doesn't, it just calls them. We could get rid of all the fns, they aren't necessary.

As for the position, I put these three on the same line because they're the three that get grouped with the one or the many.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, if there are no definitions then the other fns can also be removed.


// This code is useful for debugging `-Zmacro-stats`. For every
// invocation it prints the full input and output.
if false {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Todo: not forget to remove this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to keep it. I used it a lot while writing and testing this PR, and there's a good chance it'll be needed in the future. I don't want to have to recreate it.

@petrochenkov petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 5, 2025
It collects data about macro expansions and prints them in a table after
expansion finishes. It's very useful for detecting macro bloat,
especially for proc macros.

Details:
- It measures code snippets by pretty-printing them and then measuring
  lines and bytes. This required a bunch of additional pretty-printing
  plumbing, in `rustc_ast_pretty` and `rustc_expand`.
- The measurement is done in `MacroExpander::expand_invoc`.
- The measurements are stored in `ExtCtxt::macro_stats`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants