Description
rustdoc has a number of static files that should really be long-cached by the browser for best loading performance: the fonts, the CSS, storage.js, main.js, and so on. We should add a hash to their names so that services can be more confident in setting long cache headers for them.
Right now we categorize things into Unversioned, ToolchainSpecific, and InvocationSpecific:
rust/src/librustdoc/html/render/write_shared.rs, lines 44 to 57 in 10f4ce3
Unversioned is used just for the font files. ToolchainSpecific is used for the CSS, the images, and most of the JS. InvocationSpecific is used for search-indexN.NN.N.js, source-filesN.NN.N.js, cratesN.NN.N.js, the JS that contains the list of implementors on trait pages, and the JS that contains the list of additional sidebar items (siblings in a module).
Unversioned gets no infix. ToolchainSpecific gets a version infix between the file stem and the extension, like main1.63.0.js (from main.js). InvocationSpecific gets the same version infix.
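To make the current naming concrete, here is a minimal sketch of the rule described above. It is not the code in write_shared.rs: the `ResourceKind` enum, the `emitted_name` function, and the `font.woff2` file name in the usage note are hypothetical names chosen for illustration.

```rust
/// Simplified, hypothetical stand-in for the three categories above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ResourceKind {
    Unversioned,
    ToolchainSpecific,
    InvocationSpecific,
}

/// Returns the emitted file name: unversioned names are left untouched,
/// the other two kinds get `version` inserted before the extension.
pub fn emitted_name(kind: ResourceKind, name: &str, version: &str) -> String {
    match kind {
        ResourceKind::Unversioned => name.to_string(),
        ResourceKind::ToolchainSpecific | ResourceKind::InvocationSpecific => {
            match name.rsplit_once('.') {
                // "main.js" + "1.63.0" becomes "main1.63.0.js"
                Some((stem, ext)) => format!("{stem}{version}.{ext}"),
                // No extension: just append the version.
                None => format!("{name}{version}"),
            }
        }
    }
}
```

With this sketch, `emitted_name(ResourceKind::ToolchainSpecific, "main.js", "1.63.0")` yields `main1.63.0.js`, while `emitted_name(ResourceKind::Unversioned, "font.woff2", "1.63.0")` stays `font.woff2`. Nothing in the name depends on the file contents, which is the root of the caching problem.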
Unversioned and ToolchainSpecific files should be infinitely cacheable. Right now, that's not the case for ToolchainSpecific, because multiple toolchains have the same version infix. For instance, every nightly build right now creates a main1.63.0.js, but it's potentially different each night. That means https://doc.rust-lang.org/nightly/main1.63.0.js potentially changes every night, and can't be long-cached. Since docs.rs uses the nightly toolchain, the main1.63.0.js it produces for a crate today may be different from the one it produces for a crate it builds tomorrow.
docs.rs has special code to recognize the ToolchainSpecific files and rename them to contain a date and a hash, like https://docs.rs/main-20220517-1.63.0-nightly-4c5f6e627.js. But doc.rust-lang.org doesn't have that code, and as a result is less able to cache things that should be cached. And anyone who self-hosts docs is on their own.
I propose that we change our file naming scheme. All Unversioned and ToolchainSpecific files should be emitted to a subdirectory s/<hash>/, where <hash> is calculated over the contents of all of those files together. This makes it easy to configure a web server to set Cache-Control headers for everything under that subdirectory.
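A rough sketch of how the s/<hash>/ directory name could be derived, assuming the files are available in memory as (name, contents) pairs. The function name `static_dir` and the choice of 64-bit FNV-1a are assumptions made only to keep the example free of external crates; a real implementation would likely pick a different hash.

```rust
/// FNV-1a offset basis and one folding step over a byte slice.
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;

fn fnv1a(state: u64, bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(state, |h, &b| (h ^ u64::from(b)).wrapping_mul(0x0000_0100_0000_01b3))
}

/// Computes one directory name over all static files together.
/// Files are sorted by name so the result does not depend on input order.
pub fn static_dir(files: &[(&str, &[u8])]) -> String {
    let mut sorted: Vec<&(&str, &[u8])> = files.iter().collect();
    sorted.sort_by(|a, b| a.0.cmp(b.0));

    let mut h = FNV_OFFSET;
    for (name, contents) in sorted {
        // A zero byte after each field keeps field boundaries distinct.
        h = fnv1a(h, name.as_bytes());
        h = fnv1a(h, &[0]);
        h = fnv1a(h, contents);
        h = fnv1a(h, &[0]);
    }
    format!("s/{h:016x}/")
}
```

Because every file feeds the same hash, a change to any single file moves the whole set to a new directory, so a server can mark everything under s/ as immutable.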
Advantage: this makes calculating URLs for such resources easy, especially when the calculation is done in JS. Disadvantage: if one file changes, the whole hash changes, potentially requiring the user to load more files when navigating between crates generated with different rustdoc versions.
Alternatively, we could add a hash of each individual file to that file's name. That makes calculating URLs harder, but means better reuse of cached data across different nightly versions.
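For comparison, a sketch of the per-file alternative, under the same assumptions as before: `hashed_name` and `content_hash` are hypothetical names, the `-<hash>` separator is one possible format, and FNV-1a stands in for whatever hash would actually be chosen.

```rust
/// 64-bit FNV-1a over a byte slice (illustrative stand-in for a real hash).
fn content_hash(bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(0xcbf2_9ce4_8422_2325u64, |h, &b| {
            (h ^ u64::from(b)).wrapping_mul(0x0000_0100_0000_01b3)
        })
}

/// Embeds the hash of a file's own contents in its name,
/// e.g. "main.js" becomes "main-<hash>.js".
pub fn hashed_name(name: &str, contents: &[u8]) -> String {
    let hash = content_hash(contents);
    match name.rsplit_once('.') {
        Some((stem, ext)) => format!("{stem}-{hash:016x}.{ext}"),
        None => format!("{name}-{hash:016x}"),
    }
}
```

Here an unchanged file keeps its name across toolchains and stays cached, but every page (and any JS that loads further resources) has to know each file's hash rather than a single directory prefix.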
/cc @rust-lang/rustdoc @rust-lang/docs-rs