Mergeable rustdoc cross-crate info #3662

Merged
merged 28 commits into from
Sep 21, 2024
Commits
542dcbb
add `mergable_rustdoc_cross_crate_info`
EtomicBomb Jun 19, 2024
fdc1b70
generates no files by default
EtomicBomb Jun 20, 2024
8c78a93
fix typos
EtomicBomb Jun 20, 2024
08ac8a2
merged -> rendered in flags
EtomicBomb Jun 20, 2024
ef10ed5
fix typos
EtomicBomb Jun 20, 2024
4994251
whitespace and clear lines
EtomicBomb Jun 20, 2024
0f01c9b
typo
EtomicBomb Jun 20, 2024
7db0a4a
remove work in progress
EtomicBomb Jun 20, 2024
23e34e0
address comments from P1n3appl3
EtomicBomb Jun 21, 2024
97cf6c6
changes in response to jsha and notriddle
EtomicBomb Jun 24, 2024
3326b10
typos
EtomicBomb Jun 24, 2024
cbd62bb
nits
EtomicBomb Jun 25, 2024
eba7fc4
--include-info-json and --write-info-json take path directly to crate…
EtomicBomb Jun 28, 2024
7b82ab9
typos, --mode=auto -> --mode=read-write
EtomicBomb Jul 16, 2024
63107f9
add versioning, --include-rendered-docs flag
EtomicBomb Jul 16, 2024
3d0dd8c
version number is prefixed with V
EtomicBomb Jul 17, 2024
0ddbcf6
remove detail of crate-info, suggested workflow
EtomicBomb Jul 18, 2024
f639365
type impl,buck2 link,extern-html-root-url,period
EtomicBomb Jul 31, 2024
627a0ba
rename flags per camelid, clarify workflow
EtomicBomb Jul 31, 2024
b0dc37d
move extern-html-root-url, remove no_emit_shared
EtomicBomb Jul 31, 2024
4b076f6
manishearth nits + stabilize crate-info.json
EtomicBomb Aug 2, 2024
d7ea3b5
del --include-rendered-docs, doc.parts suggestions
EtomicBomb Aug 3, 2024
628c7a9
fix typo in merge sentence
EtomicBomb Aug 3, 2024
c765097
editorial changes in response to @jsha
EtomicBomb Aug 13, 2024
ac9208c
resolve in favor of no index crate
EtomicBomb Aug 13, 2024
e66bb63
fix typo identified by jsha
EtomicBomb Aug 20, 2024
665a58f
editorial fixes from @aDotInTheVoid's suggestions
EtomicBomb Aug 21, 2024
e1b5e5b
replace WIP -> implementation, per camelid
EtomicBomb Aug 25, 2024
editorial fixes from @aDotInTheVoid's suggestions
* full sentences in intro
* Clarify Cargo's expected use of `--merge=none|shared|finalize`
* Explain why you would use multiple rustdoc invocations with the same --out-dir
  and merge=none
* Remove justification for why rustdoc output can be unstable in general
* Permalink specific fuchsia commit
EtomicBomb committed Aug 21, 2024
commit 665a58f7392572fe715a53bbc8bf048fca4d40a3
18 changes: 9 additions & 9 deletions text/0000-mergeable-rustdoc-cross-crate-info.md
@@ -5,7 +5,7 @@

# Summary

Mergeable cross-crate information in rustdoc. Facilitates the generation of documentation indexes in workspaces with many crates by allowing each crate to write to an independent output directory. Final documentation is rendered with a lightweight merge step. Configurable with command-line flags, this proposal writes a `doc.parts` directory to hold pre-merge cross-crate information. Currently, rustdoc requires global mutable access to a single output directory to generate cross-crate information, which is an obstacle to integrating rustdoc in build systems that enforce the independence of build actions.
This RFC discusses mergeable cross-crate information in rustdoc. It facilitates the generation of documentation indexes in workspaces with many crates by allowing each crate to write to an independent output directory. The final documentation is rendered by combining these independent directories with a lightweight merge step. When provided with `--parts-out-dir`, this proposal writes a `doc.parts` directory to hold pre-merge cross-crate information. Currently, rustdoc requires global mutable access to a single output directory to generate cross-crate information, which is an obstacle to integrating rustdoc in build systems that enforce the independence of build actions.

# Motivation

@@ -17,9 +17,7 @@ There are some files in the rustdoc output directory that are read and overwritt

Build systems may run build actions in a distributed environment across separate logical filesystems. It might also be desirable to run rustdoc in a lock-free parallel mode, where every rustdoc process writes to a disjoint set of files.

Cargo fully supports cross-crate information, at the cost of requiring global read-write access to the doc root (`target/doc`). There are significant scalability issues with this approach.

Rustdoc needing global mutable access to the files that encode this cross-crate information has implications for caching, reproducible builds, and content hashing. By adding an option to avoid this mutation, rustdoc will serve as a first-class citizen in non-cargo build systems.
Cross-crate information is supported in Cargo. It calls rustdoc with a single `--out-dir`, which requires global read-write access to the doc root (e.g. `target/doc`). There are significant scalability issues with this approach. Global mutable access to the files that encode this cross-crate information has implications for caching, reproducible builds, and content hashing. By adding an option to avoid this mutation, rustdoc will serve as a first-class citizen in non-cargo build systems.

These considerations motivate adding an option for outputting partial CCI (parts), which are merged (linked) with a later step.

@@ -31,9 +29,9 @@ This RFC has the goal of enabling the future deprecation of the default (called

More details are in the Reference-level explanation.

* `--merge=none`: Do not write cross-crate information to the `--out-dir`. The flag `--parts-out-dir` may instead be provided with the destination of the current crate's cross-crate information parts.
* `--parts-out-dir=path/to/doc.parts/<crate-name>`: Write cross-crate linking information to the given directory (only usable with the `--merge=none` mode). This information allows linking the current crate's documentation with other documentation at a later rustdoc invocation.
* `--include-parts-dir=path/to/doc.parts/<crate-name>`: Include cross-crate information from a previously written `doc.parts` directory in the collection that will be written by the current invocation of rustdoc. May only be provided with `--merge=finalize`. May be provided any number of times.
* `--merge=none`: Do not write cross-crate information to the `--out-dir`. The flag `--parts-out-dir` may instead be provided with the destination of the current crate's cross-crate information parts.
* `--merge=shared` (default): Append information from the current crate to any info files found in the `--out-dir`.
Member

Since we plan to deprecate --merge=shared anyway, what do you think about not adding it in the first place? --merge already has to be an optional flag to avoid breaking changes.

Contributor

Uniformity and ease of documentation. It's the same reason you're allowed to write edition = 2015, even though that's the default if you don't specify anything.

Member

Fair enough. I guess we won't ever be able to remove the shared implementation anyway, so it doesn't matter. And if we somehow did find a way to remove it while maintaining the outward behavior, that wouldn't be a breaking change anyway since it's an implementation detail.

Why do you say we won't ever be able to remove the shared implementation? I've been assuming that after updating Cargo and docs.rs, and providing a decently long deprecation period we could remove it.

Member

Hmm, I'm trying to remember what I was thinking ^^. I think I was worried that there might be existing build systems or other tools that were depending on the shared representation, but we never guaranteed stability of it. So perhaps it would be fine to remove after all.

* `--merge=finalize`: Write cross-crate information from the current crate and any crates included via `--include-parts-dir` to the `--out-dir`, overwriting conflicting files. This flag may be used with or without an input crate root; without one, it only links crates included via `--include-parts-dir`.
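
As a rough illustration of how a build system might assemble the flags listed above, the Python sketch below builds argument lists for a per-crate `--merge=none` invocation and for a final `--merge=finalize` invocation with no input crate. The helper names, crate names, paths, and the use of `--crate-name` are hypothetical choices made for illustration; a real invocation will likely need further arguments that are not covered here.

```python
from pathlib import PurePosixPath


def merge_none_argv(crate, crate_root, out_dir, parts_root):
    """Build argv for documenting one crate without writing cross-crate info.

    The parts directory is laid out as <parts_root>/<crate>, mirroring the
    `path/to/doc.parts/<crate-name>` shape used in the flag descriptions.
    """
    parts_dir = PurePosixPath(parts_root) / crate
    return [
        "rustdoc", str(crate_root),
        "--crate-name", crate,
        "--out-dir", str(out_dir),
        "--merge=none",
        f"--parts-out-dir={parts_dir}",
    ]


def merge_finalize_argv(crates, out_dir, parts_root):
    """Build argv for the final merge step, with no input crate root.

    One --include-parts-dir is emitted per previously documented crate.
    """
    argv = ["rustdoc", "--out-dir", str(out_dir), "--merge=finalize"]
    for crate in crates:
        argv.append(f"--include-parts-dir={PurePosixPath(parts_root) / crate}")
    return argv


if __name__ == "__main__":
    # Hypothetical two-crate workspace.
    for name in ("t", "s"):
        print(merge_none_argv(name, f"{name}/src/lib.rs", f"out/{name}", "doc.parts"))
    print(merge_finalize_argv(["t", "s"], "out/index", "doc.parts"))
```

Each `--merge=none` command line depends only on its own crate, so the per-crate steps can run as independent build actions; only the finalize step needs to see every parts directory.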

@@ -224,7 +222,9 @@ This mode only renders the HTML item documentation for the current crate. It doe

In this mode, a user will provide `--parts-out-dir=<path to crate-specific directory>` and `--merge=none` to each crate's rustdoc invocation. The user should provide `--extern-html-root-url` and specify an absolute final destination for the docs, as a URL. The `--extern-html-root-url` flag should be provided for each crate's rustdoc invocation, for every dependency.

The same `--out-dir` may be used for multiple parallel rustdoc invocations, as rustdoc will continue to acquire an flock on the `--out-dir` to address conflicts. A user may select a different `--out-dir` for each crate's rustdoc invocation.
A user may select a different `--out-dir` for each crate's rustdoc invocation.

The same `--out-dir` may also be used for multiple parallel rustdoc invocations, as rustdoc will continue to acquire an flock on the `--out-dir` to address conflicts. This is in anticipation of the possibility of deprecating `--merge=shared`, and Cargo adopting a `--merge=none` + `--merge=finalize` workflow. Cargo is expected to continue using the same `--out-dir` for all crates in a workspace, as this eliminates the operations needed to merge multiple `--out-dir`s.

### Link documentation: `--merge=finalize`

@@ -300,13 +300,13 @@ This proposal is capable of addressing two primary use cases. It allows develope

CCI is not automatically enabled in either situation. A combination of the `--include-parts-dir`, `--merge`, and `--parts-out-dir` flags is needed to produce this behavior. This RFC provides a minimal set of tools that allow developers of build systems, like Bazel and Buck2, to create rules for these scenarios.

With separate `--out-dir`s, copying item docs to an output destination is needed. Rustdoc will never support the entire breadth of workflows needed to merge arbitrary directories, and will rely on users to run external commands like `mv`, `cp`, `rsync`, `scp`, etc. for these purposes. Most users are expected to use a single `--out-dir` for all crates, in which case these external tools are not needed.
With separate `--out-dir`s, copying item docs to an output destination is needed. Rustdoc will never support the entire breadth of workflows needed to merge arbitrary directories, and will rely on users to run external commands like `mv`, `cp`, `rsync`, `scp`, etc. for these purposes. Most users are expected to continue to use a single `--out-dir` for all crates, in which case these external tools are not needed. It is expected that build systems with the need to be hermetic will use separate `--out-dir`s for `--merge=none`, while Cargo will continue to use the same `--out-dir` for every rustdoc invocation.
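
The copying step mentioned above is left to external tooling. As one possible sketch, assuming each per-crate `--out-dir` holds a disjoint subtree of item docs, a build rule could overlay the directories into a single destination with Python's standard library. The function name and directory layout here are hypothetical, and this is not something rustdoc itself provides.

```python
import shutil
from pathlib import Path


def collect_out_dirs(out_dirs, destination):
    """Overlay several rustdoc output directories into one destination tree.

    Directories are copied in the order given. If two inputs contain a file
    at the same relative path, the later one overwrites the earlier one, so
    the output of the finalize step would normally be listed last.
    """
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    for out_dir in out_dirs:
        # dirs_exist_ok=True (Python 3.8+) lets each tree merge into
        # directories that earlier copies already created.
        shutil.copytree(out_dir, destination, dirs_exist_ok=True)
    return destination
```

This plays the same role as the `cp` or `rsync` commands named in the text; a hermetic build system would typically declare the per-crate directories as inputs and the destination as the only output of this action.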

## Compatibility

This RFC does not alter previous compatibility guarantees made about the output of rustdoc. In particular it does not stabilize the presence of the rendered cross-crate information files, their content, or the HTML generated by rustdoc.

In the same way that the [rustdoc HTML output is unstable](https://rust-lang.github.io/rfcs/2963-rustdoc-json.html#:~:text=The%20HTML%20output%20of%20rustdoc,into%20a%20different%20format%20impractical), the content of `doc.parts` will be considered unstable. Between versions of rustdoc, breaking changes to the content of `doc.parts` should be expected. Only the presence of a `doc.parts` directory is promised, under `--parts-out-dir`. Merging cross-crate information generated by disparate versions of rustdoc is not supported. To detect whether `doc.parts` is compatible, rustdoc includes a version number in these files (see New directory: `doc.parts`).
The content of `doc.parts` will be considered unstable. Between versions of rustdoc, breaking changes to the content of `doc.parts` should be expected. Only the presence of a `doc.parts` directory is promised, under `--parts-out-dir`. Merging cross-crate information generated by disparate versions of rustdoc is not supported. To detect whether `doc.parts` is compatible, rustdoc includes a version number in these files (see New directory: `doc.parts`).

The implementation of the RFC itself is designed to produce only minimal changes to cross-crate info files and the HTML output of rustdoc. Exhaustively, the implementation is allowed to
Member

Editorial: What is this talking about? There was a linked draft implementation in the PR, but that was on the master branch of your fork, which now doesn't have it. It's worth clarifying that this is your WIP implementation (unless it isn't), and linking to it.

Contributor Author

@EtomicBomb EtomicBomb Aug 21, 2024

I removed the reference. It was to a different RFC (#2963), but I don't think the reference is relevant anymore.

* Change the sorting order of trait implementations, type implementations, and other cross-crate info in the HTML output of rustdoc.
@@ -363,7 +363,7 @@ buck2 does not natively merge rustdoc from separate targets. The buck2 maintaine

## Ninja [(GN)](https://fuchsia.dev/fuchsia-src/development/build/build_system/intro) + Fuchsia

Currently, the Fuchsia project runs rustdoc on all of their crates to generate a [documentation index](https://fuchsia-docs.firebaseapp.com/rust/rustdoc_index/). This index is effectively generated as an [atomic step](https://cs.opensource.google/fuchsia/fuchsia/+/main:tools/devshell/contrib/lib/rust/rustdoc.py) in the build system. It takes [3 hours](https://ci.chromium.org/ui/p/fuchsia/builders/global.ci/firebase-docs/b8744777376580022225/overview) to document the ~2700 crates in the environment. With this proposal, building each crate's documentation could be done as separate build actions, which would have a number of benefits. These include parallelism, caching (avoid rebuilding docs unnecessarily), and robustness (automatically reject pull requests that break documentation).
Currently, the Fuchsia project runs rustdoc on all of their crates to generate a [documentation index](https://fuchsia-docs.firebaseapp.com/rust/rustdoc_index/). This index is effectively generated as an [atomic step](https://cs.opensource.google/fuchsia/fuchsia/+/4eefc272d36835959f2e44be6e06a6fbb504e418:tools/devshell/contrib/lib/rust/rustdoc.py) in the build system. It takes [3 hours](https://ci.chromium.org/ui/p/fuchsia/builders/global.ci/firebase-docs/b8744777376580022225/overview) to document the ~2700 crates in the environment. With this proposal, building each crate's documentation could be done as separate build actions, which would have a number of benefits. These include parallelism, caching (avoid rebuilding docs unnecessarily), and robustness (automatically reject pull requests that break documentation).

# Unresolved questions
