-
Notifications
You must be signed in to change notification settings - Fork 2.6k
perf: Speed up TOML parsing by upgrading toml #15736
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Technically this will cause a small regression in performance from `toml_edit@0.23` but not `toml_edit@0.22`
r? @weihanglo rustbot has assigned @weihanglo. Use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of these review comments are blocking. Thanks for the efforts!
10 | | registry = "bad name" | ||
| |_____________________________________^ | ||
| | ||
--> Cargo.toml:8:17 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a diagnostic regression?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two tiers for this
- Untagged enums and flattened structs in serde which mean we don't get precise field data, only the entire table
- its weird to know what spans are "right" for standard tables and implicit tables because their definition is split over multiple locations.
In this case, the error is for registry
s value but we are highlighting the entire table.
In toml_parse
, I kept things simple by only tracking spans for the table headers and not table bodies. I'm open to changing that.
@@ -2809,11 +2808,11 @@ fn invalid_git_dependency_manifest() { | |||
.with_status(101) | |||
.with_stderr_data(str![[r#" | |||
[UPDATING] git repository `[ROOTURL]/dep1` | |||
[ERROR] duplicate key `categories` in table `package` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Diagnostic regression IMO, but not a big deal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this highlights the duplicate key, I figured that removing it from the message wasn't too big of a deal.
} else { | ||
key.span() | ||
} | ||
Some(key.span()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the span method is effectively the same as the previous implementation that includes decorations?
(Cannot tell unless I look into the implementation)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were getting the span of the entire header by grabbing the spans of the decor and making a span of what came between. Looking at https://github.com/rust-lang/cargo/pull/15736/files/539a48452a03a68d18cd32481e364550ee42f329#diff-1c71ec9c445774ebb8b0c3dd4c35817dda698c78e15fb83d8a61d8b145c5dd34, it appears that that behavior is now the default for spans so this is built-in now. Even if it wasn't, its likely not worth it to keep using toml_edit
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the performance optimization and the good write-up!
Is there anything blocking from adding the |
Update cargo 14 commits in 930b4f62cfcd1f0eabdb30a56d91bf6844b739bf..eabb4cd923deb73e714f7ad3f5234d68ca284dbe 2025-06-28 14:58:43 +0000 to 2025-07-09 22:07:55 +0000 - feat: Implementation and tests for `multiple-build-scripts` (rust-lang/cargo#15704) - perf: Speed up TOML parsing by upgrading toml (rust-lang/cargo#15736) - Mark cachelock tests that rely on interprocess blocking behaviour as unsupported on AIX. (rust-lang/cargo#15734) - feat(publish): Stabilize multi-package publishing (rust-lang/cargo#15636) - Update to Rust 2024 (rust-lang/cargo#15732) - Clarify package ID specifications in SBOMs are fully qualified (rust-lang/cargo#15731) - chore(deps): update cargo-semver-checks to v0.42.0 (rust-lang/cargo#15730) - test: Switch config tests to use snapshots (rust-lang/cargo#15729) - implement package feature unification (rust-lang/cargo#15684) - chore: Upgrade dependencies (rust-lang/cargo#15722) - Report valid file name when we can't find a build target for `name = "foo.rs"` (rust-lang/cargo#15707) - chore(release): Publish build-rs on release (rust-lang/cargo#15708) - Override `Cargo.lock` checksums when doing a dry-run `publish` (rust-lang/cargo#15711) - test(rustfix): Update for nightly (rust-lang/cargo#15717) r? ghost
What does this PR try to resolve?
For numbers, see https://epage.github.io/blog/2025/07/toml-09/
Further areas for improvement:
fast_hash
(see Use another hashing algorithm? 250K calls to Siphash #15649)make_owned
call for most packagesHow to test and review this PR?