Skip to content

rustdoc: Add DocVisitor and use it where possible #90475

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Nov 4, 2021

Conversation

camelid
Copy link
Member

@camelid camelid commented Nov 1, 2021

DocFolder allows transforming the docs, accomplished by making its methods take and return types by-value. However, several of the rustdoc DocFolder impls only visit the docs; they don't change anything. Passing around types by-value is thus unnecessary, confusing, and potentially inefficient for those impls.

DocVisitor is very similar to DocFolder, except that its methods take shared references and return nothing (i.e., the unit type). This should both be more efficient and make the code clearer.

There is an additional reason to add DocVisitor, too. As part of my cleanup of external_traits, I'm planning to add a fn cache(&mut self) -> &mut Cache method to DocFolder so that external_traits can be retrieved explicitly from the Cache, rather than implicitly via Crate.external_traits (which is an Rc<RefCell<...>>). However, some of the DocFolder impls that could be turned into DocVisitor impls only have a shared reference to the Cache, because they are used during rendering. (They have to access the Cache via html::render::Context.shared.cache, which involves an Rc.)

Since DocVisitor does not mutate any of the types it's visiting, its equivalent cache() method will only need a shared reference to the Cache, avoiding the problem described above.

r? @GuillaumeGomez
cc @jyn514

Also, contrary to the comment, the clone is not that small, since
`Variant` contains `Item`s, which are quite large when you factor in
both stack- and heap-allocated memory.
This simplifies the code and future-proofs it against changes to
`Variant`.
`DocFolder` allows transforming the docs, accomplished by making its
methods take and return types by-value. However, several of the rustdoc
`DocFolder` impls only *visit* the docs; they don't change anything.
Passing around types by-value is thus unnecessary, confusing, and
potentially inefficient for those impls.

`DocVisitor` is very similar to `DocFolder`, except that its methods
take shared references and return nothing (i.e., the unit type). This
should both be more efficient and make the code clearer.

There is an additional reason to add `DocVisitor`, too. As part of my
cleanup of `external_traits`, I'm planning to add a `fn cache(&mut self)
-> &mut Cache` method to `DocFolder` so that `external_traits` can be
retrieved explicitly from the `Cache`, rather than implicitly via
`Crate.external_traits` (which is an `Rc<RefCell<...>>`). However, some
of the `DocFolder` impls that could be turned into `DocVisitor` impls
only have a shared reference to the `Cache`, because they are used
during rendering. (They have to access the `Cache` via
`html::render::Context.shared.cache`, which involves an `Rc`.)

Since `DocVisitor` does not mutate any of the types it's visiting, its
equivalent `cache()` method will only need a shared reference to the
`Cache`, avoiding the problem described above.
One of the FIXMEs is irrelevant since that code is only run if
`include_sources` is set. I fixed the other FIXME.
* Flip conjuncts of `&&` in rustdoc

  The `CrateNum` comparison should be very cheap, while
  `span.filename()` fetches and clones a `FileName`.

* Use `into_local_path()` instead of `local_path().clone()`
Many of `DocFolder`'s impls didn't actually transform the syntax tree,
so they can be visitors instead.
I think these are the last of the impls that can be easily converted to visitors.
Until `external_traits` is cleaned up (i.e., no longer behind a
`RefCell`), `DocVisitor` will have to `take` `external_traits` -- just
like `DocFolder` -- to prevent `RefCell` runtime errors.
@camelid camelid added C-cleanup Category: PRs that clean code up or issues documenting cleanup. I-compilemem Issue: Problems and improvements with respect to memory usage during compilation. I-compiletime Issue: Problems and improvements with respect to compile times. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Nov 1, 2021
@rust-highfive
Copy link
Contributor

Some changes occurred in intra-doc-links.

cc @jyn514

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 1, 2021
@camelid
Copy link
Member Author

camelid commented Nov 1, 2021

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Nov 1, 2021
@bors
Copy link
Collaborator

bors commented Nov 1, 2021

⌛ Trying commit c03cab3 with merge d9a63bc4a59bd86b29b0070be1dbfe3ace5c67b1...

/// don't override!
fn visit_item_recur(&mut self, item: &Item) {
match &*item.kind {
StrippedItem(i) => self.visit_inner_recur(i),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be moved into visit_inner_recur instead of having the unreachable!. (This is also the case in DocFolder.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason for visit_inner_recur to not be merged into visit_item_recur ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know of one, but it could be some of the code relies on them being separate.

@bors
Copy link
Collaborator

bors commented Nov 1, 2021

☀️ Try build successful - checks-actions
Build commit: d9a63bc4a59bd86b29b0070be1dbfe3ace5c67b1 (d9a63bc4a59bd86b29b0070be1dbfe3ace5c67b1)

@rust-timer
Copy link
Collaborator

Queued d9a63bc4a59bd86b29b0070be1dbfe3ace5c67b1 with parent db14a17, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (d9a63bc4a59bd86b29b0070be1dbfe3ace5c67b1): comparison url.

Summary: This change led to small relevant mixed results 🤷 in compiler performance.

  • Small improvement in instruction counts (up to -0.8% on full builds of serde)
  • Small regression in instruction counts (up to 0.5% on full builds of webrender)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Nov 1, 2021
@camelid
Copy link
Member Author

camelid commented Nov 1, 2021

Hmm, not sure why there are regressions, but the improvements outweigh the regressions, and this is a good cleanup regardless. Perhaps I can look into if there's a way to avoid the regression; it's a bit surprising, but it could just be the optimizer behaving differently.

@camelid camelid removed the I-compilemem Issue: Problems and improvements with respect to memory usage during compilation. label Nov 1, 2021
@camelid
Copy link
Member Author

camelid commented Nov 2, 2021

@notriddle Guillaume is busy right now so can't do an in-depth review. Would you be interested in reviewing this? You don't have to do a full review, but I'd appreciate some feedback if you have time :)

@camelid
Copy link
Member Author

camelid commented Nov 2, 2021

It looks like the largest regression (webrender) only regressed by 0.01 seconds of wall-clock time.

let num_fields = j.fields.len();
j.fields = j.fields.into_iter().filter_map(|x| self.fold_item(x)).collect();
j.fields_stripped |=
num_fields != j.fields.len() || j.fields.iter().any(|f| f.is_stripped());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it was done this way before, but any idea why it isn't using a conditional, or a short-circuiting OR, for this?

                     if !j.fields_stripped {
                         j.fields_stripped =
                             num_fields != j.fields.len() || j.fields.iter().any(|f| f.is_stripped());
                     }

The length comparison probably isn't a problem, but unconditionally iterating over all of the fields seems like leaving performance on the table (or gambling on a compiler optimization pass that may or may not fire).

From a dive through Blame, this use of the |= operator dates back to a version of the code that was only doing the length comparison:

d6d31d7#diff-3ba434e74829d360308543dff0f880edbd86bdc9fd90f025d990d91319b7002cR82

The code that started doing the any() loop doesn't have any documentation that justifies why it isn't a perf problem, either. It looks like they just added the any() loop without reconsidering the use of a non-short-circuiting OR.

b1543a1#diff-3ba434e74829d360308543dff0f880edbd86bdc9fd90f025d990d91319b7002cR48-R49

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably the reason it doesn't use a conditional is that no one thought of it ;)

It's a good idea; do you want to open a followup PR for it? Honestly, I think we should just remove the fields_stripped fields altogether and compute this information on-demand, but that'll take some time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll do that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can assign me on it if you like.

impl DocVisitor for SourceCollector<'_, '_> {
fn visit_item(&mut self, item: &clean::Item) {
if !self.cx.include_sources {
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line of code cries out for a more explicit way to do control flow than this. This conditional seems to stop it from trying to recurse into child nodes, but it can't stop the parent from trying to visit all of the siblings (and that seems wasteful), and I really only figured that out by carefully reading the implementation.

The API used by TypeVisitor seems like a good source for plagiarizing, since ControlFlow::Break provides a way to "just bail out" like we want here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, ControlFlow makes everything better 🙃

I'd be happy to try out switching DocVisitor to use ControlFlow, but that'd make this change even bigger, so I think let's save it for a future PR.

self.visit_item_recur(item)
}

/// don't override!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like how TypeFoldable provides this method for types, could an extension trait be attached to ItemKind to take care of this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, could you elaborate? I'm not sure why that would be better. I'm generally not a fan of too many extension traits since they add complexity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't like the idea, then it's fine, let's not do that. I brought it up because it would enforce the "don't override" rule instead of just making it a comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I see what you mean. We could do this without an extension trait since ItemKind is local. But I'm not sure if the extra complexity of not having all the visitor code in one place is worth it. Also, the current way DocVisitor (and DocFolder) works is consistent with similar visitors in rustc (like mir::Visitor). But I'd still be open to us trying it.

@@ -16,23 +16,25 @@ use std::ffi::OsStr;
use std::fs;
use std::path::{Component, Path, PathBuf};

crate fn render(cx: &mut Context<'_>, krate: clean::Crate) -> Result<clean::Crate, Error> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 That's way cleaner. I like this change; the other comments really are just nitpicking, and the PR could easily be merged as-is.

@camelid
Copy link
Member Author

camelid commented Nov 3, 2021

👍 That's way cleaner. I like this change; the other comments really are just nitpicking, and the PR could easily be merged as-is.

@notriddle does that mean it's okay to r=notriddle? Or would you like Guillaume to review this too?

And thanks for the review! ❤️

@notriddle
Copy link
Contributor

does that mean it's okay to r=notriddle? Or would you like Guillaume to review this too?

Yeah, that's what it means.

@camelid
Copy link
Member Author

camelid commented Nov 3, 2021

Thanks again! @bors r=notriddle rollup=never

@bors
Copy link
Collaborator

bors commented Nov 3, 2021

📌 Commit 8e4bcdf has been approved by notriddle

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 3, 2021
@camelid
Copy link
Member Author

camelid commented Nov 3, 2021

Marking the perf regression as triaged:

  • The improvements outweigh the regressions.
  • It's possible the regression was spurious or out of our control due to different machine code layout.
  • This is a good cleanup; the old code was more complex and harder to understand, while the new version makes it clear that no transformations are being performed.
  • This cleanup will allow future performance improvements.

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Nov 3, 2021
@bors
Copy link
Collaborator

bors commented Nov 3, 2021

⌛ Testing commit 8e4bcdf with merge 4ff9023...

@bors
Copy link
Collaborator

bors commented Nov 4, 2021

☀️ Test successful - checks-actions
Approved by: notriddle
Pushing 4ff9023 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Nov 4, 2021
@bors bors merged commit 4ff9023 into rust-lang:master Nov 4, 2021
@rustbot rustbot added this to the 1.58.0 milestone Nov 4, 2021
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (4ff9023): comparison url.

Summary: This change led to moderate relevant mixed results 🤷 in compiler performance.

  • Moderate improvement in instruction counts (up to -0.8% on full builds of serde)
  • Small regression in instruction counts (up to 0.5% on full builds of webrender)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression

@camelid camelid deleted the docvisitor branch November 4, 2021 16:29
notriddle added a commit to notriddle/rust that referenced this pull request Nov 4, 2021
bors added a commit to rust-lang-ci/rust that referenced this pull request Nov 5, 2021
…-conditional, r=camelid,GuillaumeGomez

rustdoc: Use conditional for _stripped fold

Followup: rust-lang#90475 (comment)
@rylev
Copy link
Member

rylev commented Nov 9, 2021

Going to mark this as triaged as the regressions are way smaller than the improvements and confined to the deeply-nested-async benchmark which has been acting up lately. This is really an improvement and not a mixed result. 🎉

@rustbot label: +perf-regression-triaged

@camelid
Copy link
Member Author

camelid commented Nov 9, 2021

Going to mark this as triaged as the regressions are way smaller than the improvements and confined to the deeply-nested-async benchmark which has been acting up lately. This is really an improvement and not a mixed result. 🎉

I agree that this is really an improvement, but AFAICT deeply-nested-async didn't have any changes here; only webrender and derive regressed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-cleanup Category: PRs that clean code up or issues documenting cleanup. I-compiletime Issue: Problems and improvements with respect to compile times. merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants