refactor(es/preset-env): Use phf for corejs3 entry #10712

quininer · 2025-06-26T06:34:01Z

Description:

This is a follow-up to #10684. I simply chose an example to experiment with.

I did some simple testing and I believe there is no large regression in single query performance.

// before
es/preset-env/entry/import
                        time:   [531.47 ns 532.42 ns 533.55 ns]

// after                        
es/preset-env/entry/import
                        time:   [549.62 ns 551.02 ns 552.55 ns]

Because strings are deduplicated when they are packed, the result is more compact than the original JSON, which also reduces the binary size.
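As a rough illustration of why deduplicated packing is smaller, here is a minimal sketch of a string pool that stores each distinct string once and hands out `(offset, length)` spans. The `StrPool` type and its methods are hypothetical names made up for this example; they are not the precomputed-map API, and the real layout may differ.

```rust
use std::collections::HashMap;

/// Hypothetical packed string pool: every distinct string is stored once
/// in a single buffer and referenced by an (offset, length) span.
#[derive(Default)]
pub struct StrPool {
    buf: String,
    seen: HashMap<String, (u32, u32)>,
}

impl StrPool {
    /// Interns `s`, reusing the existing span if the string was packed before.
    pub fn intern(&mut self, s: &str) -> (u32, u32) {
        if let Some(&span) = self.seen.get(s) {
            return span;
        }
        let span = (self.buf.len() as u32, s.len() as u32);
        self.buf.push_str(s);
        self.seen.insert(s.to_owned(), span);
        span
    }

    /// Resolves a span returned by `intern` back to its string slice.
    pub fn get(&self, (off, len): (u32, u32)) -> &str {
        &self.buf[off as usize..(off + len) as usize]
    }

    /// Number of bytes actually stored after deduplication.
    pub fn packed_len(&self) -> usize {
        self.buf.len()
    }
}
```

Interning the same module name twice returns the same span, so a name that appears in many entries costs its bytes only once.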

// before
16397965	../../target/release/deps/polyfills-d11a2ed3340dd897

// after
16070261	../../target/release/deps/polyfills-4b49b166ca0322eb

Since ../../data/core-js-compat/entries.json is about 500k in size, I believe this also saves close to 500k of memory.

BREAKING CHANGE:

Related issue (if exists):

changeset-bot · 2025-06-26T06:34:08Z

⚠️ No Changeset found

Latest commit: 08c3a56

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.


CLAassistant · 2025-06-26T06:34:11Z

All committers have signed the CLA.

codspeed-hq · 2025-06-26T06:54:01Z

CodSpeed Performance Report

Merging #10712 will not alter performance

Comparing quininer:s/use-precomputed-map2 (08c3a56) with main (43a4f11)

Summary

✅ 141 untouched benchmarks

kdy1 · 2025-06-26T08:25:21Z

crates/swc_ecma_preset_env/src/corejs3/entry.rs

-    })
-    .collect()
-});
+include!("../generated/corejs3_entries.rs");


I'm not sure about committing binary files into the repository

In some ways this is no worse than including the original JSON, since these binary files are more compact. But I agree there must be a better way.

One solution is to use .gitignore to hide the generated files, but then we have to generate them manually before running cargo publish. This is similar to what browserslist-rs does.
Another solution is to use git-lfs to manage the binary files. This is common practice when you have to track binaries.

A binary file is hard to inspect so I consider it as a security risk

I expect many people to have similar opinion as me

The binary file is generated deterministically, so we can verify its consistency in CI. I see two options:

Use .gitignore to keep generated files out of the repository and generate them only in the publish-crates CI job. Provide tools so that CI and anyone else can verify that the files in git tags and in the published crates are consistent.

Store the generated files in the repository, and use CI to verify that any binary file changes in a PR are consistent with the original JSON file.
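The second option could be sketched roughly as follows: regenerate the bytes from the source data and compare them with the committed file. The `generate` encoder and `is_up_to_date` helper are hypothetical stand-ins written for this example (the real generator and file format are not shown in this thread); the point is only that a deterministic generator makes the check a plain byte comparison.

```rust
use std::{fs, io, path::Path};

/// Hypothetical deterministic encoder: entries are sorted first so the
/// output depends only on the input set, not on the order it was given in.
pub fn generate(entries: &[(&str, &[&str])]) -> Vec<u8> {
    let mut sorted: Vec<_> = entries.to_vec();
    sorted.sort();
    let mut out = Vec::new();
    for (key, values) in sorted {
        // Length-prefixed key, then a length-prefixed list of values.
        out.extend_from_slice(&(key.len() as u32).to_le_bytes());
        out.extend_from_slice(key.as_bytes());
        out.extend_from_slice(&(values.len() as u32).to_le_bytes());
        for v in values {
            out.extend_from_slice(&(v.len() as u32).to_le_bytes());
            out.extend_from_slice(v.as_bytes());
        }
    }
    out
}

/// Returns Ok(true) when the committed file matches a fresh regeneration.
pub fn is_up_to_date(committed: &Path, entries: &[(&str, &[&str])]) -> io::Result<bool> {
    Ok(fs::read(committed)? == generate(entries))
}
```

A CI job or a unit test can call `is_up_to_date` and fail when it returns `false`, which catches both a stale file and a hand-edited one.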

kdy1 · 2025-06-27T00:52:16Z

crates/swc_ecma_preset_env/benches/polyfills.rs

Can you split this file as a separate PR (and I'll merge it first) so we can profile the difference using codspeed?

I'm not focusing on the performance of this implementation yet; I know there is a lot of room for optimization. The main purpose here is to see whether we are happy with the integration in this form.


kdy1 · 2025-06-27T03:41:35Z

I think this PR regresses the performance, but can you rebase?

#10722 (comment) is 28.1 us

kdy1 · 2025-06-27T05:06:01Z

It regresses the performance

quininer · 2025-06-27T05:18:51Z

Sure. That matches my local testing. I think it's within the range that micro-optimizations can make up for. The main purpose of this PR is to see if we're happy with this integration.

kdy1 · 2025-07-04T05:55:04Z

Is the goal to reduce the binary size, or to reduce the compile time? If it's the former, I'm fine with generating the binary file with a build script at build time.

quininer · 2025-07-04T06:16:59Z

Our goal is binary size, but I don't want to significantly affect the compilation speed.

At our current scale it is acceptable to put this in a build script, but note that building a phf is an iterative solving process that is not guaranteed to terminate. If the data grows much larger in the future, solving the phf in a build script will significantly affect compile time.
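To make the "iterative solving" point concrete, here is a minimal brute-force sketch: try seeds one after another until every key lands in a distinct slot, and give up after a fixed number of attempts. The hash (seeded FNV-1a with a splitmix64-style finalizer), the function names, and the sample keys in the usage note are assumptions made for this example; precomputed-map uses its own algorithm, which is not reproduced here.

```rust
use std::collections::HashSet;

/// Seeded FNV-1a followed by a splitmix64-style finalizer.
/// Illustrative only; not the hash used by precomputed-map.
pub fn hash(key: &str, seed: u64) -> u64 {
    let mut h = 0xcbf2_9ce4_8422_2325u64 ^ seed;
    for &b in key.as_bytes() {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h ^= h >> 30;
    h = h.wrapping_mul(0xbf58_476d_1ce4_e5b9);
    h ^= h >> 27;
    h = h.wrapping_mul(0x94d0_49bb_1331_11eb);
    h ^ (h >> 31)
}

/// Slot of `key` in a table with `n` slots (`n` must be non-zero).
pub fn slot(key: &str, seed: u64, n: usize) -> usize {
    (hash(key, seed) % n as u64) as usize
}

/// Tries seeds `0..max_attempts` and returns the first one that maps every
/// key to a distinct slot. Returns `None` if no such seed is found in time.
pub fn find_seed(keys: &[&str], max_attempts: u64) -> Option<u64> {
    (0..max_attempts).find(|&seed| {
        let mut used = HashSet::new();
        keys.iter().all(|k| used.insert(slot(k, seed, keys.len())))
    })
}
```

For a handful of distinct keys such as `["core-js", "core-js/stable"]` the loop finishes after a few attempts, while a key set containing a duplicate can never succeed and always hits the attempt limit. That is the sense in which completion cannot be guaranteed, and why the number of attempts grows into a compile-time cost as the key set grows.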

I made some optimizations and I believe we are now faster than the main branch.

- Adjusted the hot path
- Eliminated an indirect address access
- Eliminated one bounds check (but kept most of them to ensure safe handling of untrusted binary data)
- Eliminated the UTF-8 check

The biggest speedup, however, came from replacing the hasher. I used foldhash (currently the default hasher for hashbrown), which is still slower than fxhash, but there is currently no suitable fxhash crate: the rustc-hash crate does not guarantee stable output across architectures, and the other fxhash crate is unmaintained.
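Output stability matters here because a table built on one machine is only valid if lookups on another machine compute the same hash. As a small sketch of what a stable hash looks like (this is plain 64-bit FNV-1a, chosen for the example; it is not foldhash, fxhash, or the hasher used in this PR), the function below uses only fixed-width arithmetic over bytes, and a pinned-vector check detects any drift.

```rust
/// 64-bit FNV-1a over bytes. It uses only fixed-width `u64` arithmetic,
/// so the result does not depend on the target's pointer width or endianness.
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut h = 0xcbf2_9ce4_8422_2325u64; // FNV offset basis
    for &b in bytes {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    h
}

/// Checks a couple of pinned outputs; if the hash ever changed, any
/// table generated with the old outputs would no longer be valid.
pub fn pinned_vectors_hold() -> bool {
    fnv1a64(b"") == 0xcbf2_9ce4_8422_2325 && fnv1a64(b"a") == 0xaf63_dc4c_8601_ec8c
}
```

Running a check like `pinned_vectors_hold` in tests on every supported target is one way to notice when a hasher upgrade would invalidate previously generated data.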

main
$ taskset -c 0 env RUST_LOG="off" cargo bench -- import
es/preset-env/entry/import
                        time:   [510.44 ns 511.30 ns 512.85 ns]
                        
phf
$ taskset -c 0 env RUST_LOG="off" cargo bench -- import
es/preset-env/entry/import
                        time:   [505.31 ns 506.69 ns 509.38 ns]
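The remark above about keeping bounds checks for untrusted binary data can be illustrated with a fully checked reader. This is a sketch under assumptions: `read_str` and the `(offset, len)` span layout are hypothetical, and unlike the optimized version described above, this baseline keeps the UTF-8 check.

```rust
/// Reads the string stored at `offset..offset + len` in `data`.
/// Returns `None` instead of panicking when the span is out of range
/// or does not hold valid UTF-8.
pub fn read_str(data: &[u8], offset: u32, len: u32) -> Option<&str> {
    let start = offset as usize;
    // `checked_add` guards against overflow on targets where usize is 32 bits.
    let end = start.checked_add(len as usize)?;
    // `get` performs the bounds check and yields `None` on failure.
    let bytes = data.get(start..end)?;
    std::str::from_utf8(bytes).ok()
}
```

A corrupted or truncated buffer then produces `None` rather than a panic or an out-of-bounds read. Skipping the UTF-8 validation is only sound when the bytes are known to come from a trusted generator.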

kdy1 · 2025-07-04T06:21:33Z

Then I think we should commit the binary file to the repository, but I think we need a way to verify it, like adding many snapshot tests for the embedded binary so that the tests fail if it's modified. (Without cryptic numbers in code, ideally. Is it possible?)

quininer · 2025-07-04T06:24:41Z

> You expect foldhash to have a consistent output across versions or platforms, such as for persistent file formats or communication protocols.

Checking the foldhash README, it doesn't seem suitable for this either. I think one of their issues mentioned keeping the hash output consistent in line with the semver version.

I realized another benefit of the build script: it is not affected by the stability of the hasher.

quininer · 2025-07-04T06:28:11Z

> Then I think we should commit the binary file to the repository, but I think we need a way to verify it, like adding many snapshot tests for the embedded binary so that the tests fail if it's modified.

This is fine because our codegen is completely deterministic. During tests we can generate the data into a temporary directory and then compare it with the committed data to check that they are consistent.

> (Without cryptic numbers in code, ideally. Is it possible?)

The only cryptic numbers I suggest keeping are the seeds used in the last successful phf build, to ensure fast builds.

kdy1 · 2025-07-04T07:37:55Z

Ah sounds good. So the seed numbers should be stored in the build scripts?

quininer · 2025-07-04T07:44:24Z

Yes, it is stored in the build script. I'm working on switching to a build script. At least for now it doesn't affect compilation performance too much.

socket-security · 2025-07-07T05:11:41Z

All alerts resolved. Learn more about Socket for GitHub.

This PR previously contained dependency changes with security issues that have been resolved, removed, or ignored.

View full report

quininer · 2025-07-07T05:19:07Z

I'll release precomputed-map 0.2 tonight so this can be merged. If we're happy with it, I'll start replacing the other static maps.

kdy1 · 2025-07-07T05:33:32Z

Can we use phf crate instead? What’s the difference?

kdy1 · 2025-07-07T05:34:02Z

It has a crate to generate code AFAIK

quininer · 2025-07-07T05:35:54Z

@kdy1 The phf crate uses a different algorithm, requires siphasher, and does not do any string packing. As a result it has no advantage in performance, compilation speed, or size.

quininer · 2025-07-07T05:39:23Z

I have a post about this (English version coming soon)

Cargo.toml

socket-security · 2025-07-07T12:35:01Z

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff	Package	Supply Chain Security	Vulnerability	Quality	Maintenance	License
	copyfiles@2.4.1
	webpack-cli@3.3.12
	webpack-dev-server@3.11.3
	webpack@4.47.0

View full report


quininer force-pushed the s/use-precomputed-map2 branch from df5209f to 9543119 Compare June 26, 2025 06:59

kdy1 reviewed Jun 26, 2025


kdy1 reviewed Jun 27, 2025


quininer mentioned this pull request Jun 27, 2025

test(es/preset-env): Add entry import bench #10722

Merged

kdy1 pushed a commit that referenced this pull request Jun 27, 2025

test(es/preset-env): Add entry import bench (#10722)

9868b4d

**Description:** See #10712 (comment)

quininer force-pushed the s/use-precomputed-map2 branch from 9543119 to 1e42f37 Compare June 27, 2025 03:42

quininer force-pushed the s/use-precomputed-map2 branch from 1e42f37 to 36d7e97 Compare July 4, 2025 05:50

kdy1 added this to the Planned milestone Jul 4, 2025

quininer force-pushed the s/use-precomputed-map2 branch 2 times, most recently from 4e241d9 to 9bf1c8c Compare July 4, 2025 08:35

quininer added 4 commits July 7, 2025 12:21

perf(es): Use phf instead of lazy json

a225b61

perf(es/preset_env): Use phf for corejs3 entry

d0e54f3

Use foldhash and precomputed-map 0.2

c1c730f

Use build script

6a7de93

quininer force-pushed the s/use-precomputed-map2 branch from 9bf1c8c to 9f94427 Compare July 7, 2025 05:10

quininer marked this pull request as ready for review July 7, 2025 05:11

quininer requested a review from a team as a code owner July 7, 2025 05:11

quininer force-pushed the s/use-precomputed-map2 branch from 9f94427 to 74309de Compare July 7, 2025 05:17

fmt

418e297

quininer force-pushed the s/use-precomputed-map2 branch from 74309de to 418e297 Compare July 7, 2025 05:22

kdy1 requested changes Jul 7, 2025


Use precomputed-map 0.2

08c3a56

kdy1 approved these changes Jul 7, 2025


kdy1 changed the title ~~perf(es/preset-env): Use phf for corejs3 entry~~ refactor(es/preset-env): Use phf for corejs3 entry Jul 7, 2025

kdy1 merged commit 658b26d into swc-project:main Jul 7, 2025
169 checks passed

quininer deleted the s/use-precomputed-map2 branch July 7, 2025 15:42

kdy1 modified the milestones: Planned, v1.12.11 Jul 8, 2025

quininer mentioned this pull request Jul 8, 2025

refactor(es/preset-env): Use strpool,phf for corejs2 data #10803

Merged

kdy1 pushed a commit that referenced this pull request Jul 11, 2025

refactor(es/preset-env): Use strpool,phf for corejs2 data (#10803)

1652fd8

**Description:** This is a follow-up to #10712 .

quininer mentioned this pull request Jul 14, 2025

Use a better MPHF algorithm rust-phf/rust-phf#349

Open

swc-project locked as resolved and limited conversation to collaborators Aug 7, 2025
