
Discussion PR: bundle analysis #10064

Closed
wants to merge 68 commits into from

Conversation

@connorjclark (Collaborator) commented Dec 3, 2019

I've been working on a few new audits re: bundle analysis. There are a few good breakpoints for dividing this up into smaller PRs, but I wanted to throw this up here to start discussion on some implementation details.

This branch has:

  1. core(source-maps): workaround CORS for fetching maps #9459 merged in
  2. bundle-duplication audit
  3. valid-source-maps audit
  4. new detail type for rendering multiple values
  5. a build step that compiles the CDT source map implementation to a CommonJS library


run:

```sh
SIZE_MODE=cdt node lighthouse-cli https://web.dev/ --only-audits=bundle-duplication,valid-source-maps --view # passes

# or use https://www.coursehero.com to see a failing audit

# SIZE_MODE=moz (or don't set at all) will use a different counting impl.
```

source map implementation

There is the popular mozilla/source-map, and there is Chrome DevTools' version. The bundle-duplication audit uses a source map implementation to count how many bytes are mapped to each input source file.

mozilla/source-map uses WASM to parse the mappings. CDT is all JS. They run in equivalent times, as far as bundle-duplication's usage is concerned.

To efficiently attribute the byte costs, a lastGeneratedColumn is required, which is missing in CDT but exists in mozilla/source-map. It was a simple patch, and is something we need to upstream to CDT for @paulirish's CDT bundle visualizer anyhow.
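As a rough sketch of why the end column matters: with it, per-file byte attribution reduces to summing each mapping's span. The field names below (`sourceURL`, `columnNumber`, `lastColumnNumber`) are illustrative, not the exact CDT or mozilla/source-map API, and the sketch assumes single-line mappings.

```javascript
// Illustrative sketch: attribute generated bytes to source files by
// summing each mapping's column span. Field names are made up.
function computeFileSizes(mappings) {
  const sizes = new Map();
  for (const mapping of mappings) {
    // Without an end column we'd have to infer each mapping's extent from
    // the start of the next one; with it, the cost is a simple span.
    const bytes = mapping.lastColumnNumber - mapping.columnNumber;
    sizes.set(mapping.sourceURL, (sizes.get(mapping.sourceURL) || 0) + bytes);
  }
  return sizes;
}

const sizes = computeFileSizes([
  {sourceURL: 'lib/a.js', columnNumber: 0, lastColumnNumber: 10},
  {sourceURL: 'lib/b.js', columnNumber: 10, lastColumnNumber: 25},
  {sourceURL: 'lib/a.js', columnNumber: 25, lastColumnNumber: 30},
]);
console.log(sizes.get('lib/a.js')); // 15
```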

CDT adds 20KB, mozilla/source-map adds 49KB.

CDT is written in ES modules, so I've made a build script that transpiles chrome-devtools-frontend to CommonJS. A few hacks were also required - there is some global prototype pollution, but I've contained all global-scope pollution to a top-level globalThis.cdt.
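A minimal sketch of what that containment could look like - wrap each transpiled module so its "global" writes land on one namespace object instead of the real global (the wrapper and `SDK.parseSourceMap` names are hypothetical, not the actual build output):

```javascript
// Hedged sketch of containing global writes: the generated code receives
// globalThis.cdt as its global stand-in, so nothing leaks elsewhere.
globalThis.cdt = globalThis.cdt || {};

function wrapModule(moduleBody) {
  // Hypothetical wrapper emitted by the build script.
  return moduleBody(globalThis.cdt);
}

wrapModule(self => {
  // A transpiled module that would otherwise write to the real global.
  self.SDK = self.SDK || {};
  self.SDK.parseSourceMap = json => JSON.parse(json);
});

console.log(typeof globalThis.cdt.SDK.parseSourceMap); // 'function'
```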

Question: Currently, there is no speed benefit to using CDT, but there is a size benefit. Which should we use?

Question: Is there a way to turn off tsc for lighthouse-core/lib/cdt/generated?

bundle-duplication

Question: I investigated a threshold for byte duplication, since there can be many "0KB" entries and that's kinda noisy. I landed on a 100-byte threshold with 0.05 granularity. We could go higher, but there can be a long tail, and the sum of all of those can be significant. We could throw them all into a "the rest" item (like font-size does), but that's not actionable. Thoughts?
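To make the threshold idea concrete, a minimal sketch of the filtering step (the 100-byte value is the one discussed above; the item shape and helper name are hypothetical):

```javascript
// Hedged sketch: drop duplicate-module entries whose wasted bytes fall
// under the threshold, so the table isn't dominated by "0KB" noise.
const DUPLICATION_THRESHOLD_BYTES = 100; // value from the discussion above

function filterDuplicates(items) {
  return items.filter(item => item.wastedBytes >= DUPLICATION_THRESHOLD_BYTES);
}

const filtered = filterDuplicates([
  {source: 'node_modules/lodash', wastedBytes: 5200},
  {source: 'node_modules/left-pad', wastedBytes: 40}, // below threshold, dropped
]);
console.log(filtered.length); // 1
```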

multi

bundle-duplication reports files (from source maps) that are duplicated, and for each module (normalized to ignore nested node_modules) there can be n duplicates. For this audit to be actionable, it must be clear which source file each duplicate comes from. Additionally, each item needs to aggregate the information for a module, such that they can be sorted in the table.

That means we need a column that can contain multiple entries. In the interest of backwards compatibility and not needing to change tons of internal code, I suggest the following format:


If the header is marked as multi: true, then the details renderer will look up values in item.multi[key] and render each. To maintain the interface all the other byte-efficiency audits use, the top-level item.wastedBytes is the sum of all multi.wastedBytes.

The other top-level properties are unnecessary for bundle-duplication, but I didn't attempt to tweak the TS interface to drop them entirely. Maybe ByteEfficiencyItem could be a union of a multi type and a normal one?
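To make the proposal concrete, a hedged sketch of what a multi item could look like - field names follow the description above, and the values are made up:

```javascript
// Hypothetical `multi` item: the top-level wastedBytes is the sum of the
// per-duplicate values, so existing byte-efficiency sorting keeps working.
const item = {
  source: 'node_modules/lodash',
  wastedBytes: 7000,
  multi: {
    url: ['https://example.com/a.js', 'https://example.com/b.js'],
    wastedBytes: [4000, 3000],
  },
};

// A renderer seeing a heading with {key: 'url', multi: true} would look up
// item.multi.url and render each entry in that cell.
const sum = item.multi.wastedBytes.reduce((a, b) => a + b, 0);
console.log(item.wastedBytes === sum); // true
```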

Thoughts?

valid-source-maps

Surfaces a few things.

  • Load errors (a map was declared but fetching it failed)
  • Missing sourcesContent (we can punt on this; not used yet)
  • Script looks like a bundle but does not declare a map (not implemented yet)
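A minimal sketch of those three checks - the script/map shapes and the bundle heuristic here are hypothetical, not Lighthouse's actual artifact format:

```javascript
// Hedged sketch of the valid-source-maps checks described above.
function auditSourceMap(script) {
  const issues = [];
  if (script.sourceMapURL && !script.map) {
    issues.push('Load error: map was declared but fetching/parsing failed');
  }
  if (script.map && !(script.map.sourcesContent || []).some(Boolean)) {
    issues.push('Missing sourcesContent');
  }
  if (!script.sourceMapURL && looksLikeBundle(script)) {
    issues.push('Looks like a bundle, but no source map declared');
  }
  return issues;
}

// Purely illustrative heuristic, not the real implementation.
function looksLikeBundle(script) {
  return /webpackJsonp|__webpack_require__/.test(script.content || '');
}

// A map was declared but never loaded: surfaces one load-error issue.
console.log(auditSourceMap({sourceMapURL: 'a.js.map', map: null}));
```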

I'm also interested in pulling in source-map-validator to do some validations, but that isn't critical imo.

Question: is a best-practices audit the best way to surface this data? It's kinda meta to report, in that lack of this data means other audits can't function. What if those other audits hotlink to this one, if it's missing required data?

@connorjclark (Collaborator, Author) commented:

Closing, this has served its purpose.

@paulirish (Member) commented:

FWIW I'm super excited about this whole effort! 🎉
