Skip to content

perf: Speed up TOML parsing by upgrading toml #15736

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 9, 2025
Merged

Conversation

epage
Copy link
Contributor

@epage epage commented Jul 8, 2025

What does this PR try to resolve?

For numbers, see https://epage.github.io/blog/2025/07/toml-09/

Further areas for improvement:

How to test and review this PR?

@rustbot
Copy link
Collaborator

rustbot commented Jul 8, 2025

r? @weihanglo

rustbot has assigned @weihanglo.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-configuration Area: cargo config files and env vars A-manifest Area: Cargo.toml issues S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 8, 2025
@rustbot rustbot added the A-testing-cargo-itself Area: cargo's tests label Jul 8, 2025
Copy link
Member

@weihanglo weihanglo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of these review comments are blocking. Thanks for the efforts!

10 | | registry = "bad name"
| |_____________________________________^
|
--> Cargo.toml:8:17
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a diagnostic regression?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two tiers for this

  • Untagged enums and flattened structs in serde which mean we don't get precise field data, only the entire table
  • its weird to know what spans are "right" for standard tables and implicit tables because their definition is split over multiple locations.

In this case, the error is for registrys value but we are highlighting the entire table.

In toml_parse, I kept things simple by only tracking spans for the table headers and not table bodies. I'm open to changing that.

@@ -2809,11 +2808,11 @@ fn invalid_git_dependency_manifest() {
.with_status(101)
.with_stderr_data(str![[r#"
[UPDATING] git repository `[ROOTURL]/dep1`
[ERROR] duplicate key `categories` in table `package`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Diagnostic regression IMO, but not a big deal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this highlights the duplicate key, I figured that removing it from the message wasn't too big of a deal.

} else {
key.span()
}
Some(key.span())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the span method is effectively the same as the previous implementation that includes decorations?
(Cannot tell unless I look into the implementation)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were getting the span of the entire header by grabbing the spans of the decor and making a span of what came between. Looking at https://github.com/rust-lang/cargo/pull/15736/files/539a48452a03a68d18cd32481e364550ee42f329#diff-1c71ec9c445774ebb8b0c3dd4c35817dda698c78e15fb83d8a61d8b145c5dd34, it appears that that behavior is now the default for spans so this is built-in now. Even if it wasn't, its likely not worth it to keep using toml_edit.

Copy link
Member

@weihanglo weihanglo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the performance optimization and the good write-up!

@weihanglo weihanglo added this pull request to the merge queue Jul 9, 2025
@blyxyas
Copy link
Member

blyxyas commented Jul 9, 2025

Is there anything blocking from adding the fast_hash feature for a further ~14% improvement? Seems stable enough to already be used in hashbrown, why not also use it on Cargo?

@epage
Copy link
Contributor Author

epage commented Jul 9, 2025

@blyxyas I was just wanting to leave that to work on a custom hash more generally (#15649) to avoid any questions blocking this PR on the hash used, what hashes are appropriate, etc.

Merged via the queue into rust-lang:master with commit f9ee73d Jul 9, 2025
26 checks passed
@rustbot rustbot removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 9, 2025
@epage epage deleted the toml branch July 9, 2025 16:59
bors added a commit to rust-lang/rust that referenced this pull request Jul 12, 2025
Update cargo

14 commits in 930b4f62cfcd1f0eabdb30a56d91bf6844b739bf..eabb4cd923deb73e714f7ad3f5234d68ca284dbe
2025-06-28 14:58:43 +0000 to 2025-07-09 22:07:55 +0000
- feat: Implementation and tests for `multiple-build-scripts` (rust-lang/cargo#15704)
- perf: Speed up TOML parsing by upgrading toml (rust-lang/cargo#15736)
- Mark cachelock tests that rely on interprocess blocking behaviour as unsupported on AIX. (rust-lang/cargo#15734)
- feat(publish): Stabilize multi-package publishing (rust-lang/cargo#15636)
- Update to Rust 2024 (rust-lang/cargo#15732)
- Clarify package ID specifications in SBOMs are fully qualified (rust-lang/cargo#15731)
- chore(deps): update cargo-semver-checks to v0.42.0 (rust-lang/cargo#15730)
- test: Switch config tests to use snapshots (rust-lang/cargo#15729)
- implement package feature unification (rust-lang/cargo#15684)
- chore: Upgrade dependencies (rust-lang/cargo#15722)
- Report valid file name when we can't find a build target for `name = "foo.rs"` (rust-lang/cargo#15707)
- chore(release): Publish build-rs on release (rust-lang/cargo#15708)
- Override `Cargo.lock` checksums when doing a dry-run `publish` (rust-lang/cargo#15711)
- test(rustfix): Update for nightly (rust-lang/cargo#15717)

r? ghost
@rustbot rustbot added this to the 1.90.0 milestone Jul 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-configuration Area: cargo config files and env vars A-manifest Area: Cargo.toml issues A-testing-cargo-itself Area: cargo's tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants