Optimize surrogate decoding. #894
Use `char ^ 0xD800 <= 0x3FF` to check if a char code is a lead surrogate. That avoids doing a later `& 0x3FF` to get rid of the top bits. The same applies to tail surrogates. This ensures that the `high` function gets values without high bits. Also optimize that function to reduce dependency depth and try to hit `base + (something < small)` expressions that can be optimized into a single x64 address computation. Gives a ~7% increase on backwards traversal and 38% increase for forward traversal, based on tool/benchmark.dart compiled with `dart compile exe`.
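The XOR check described above can be sketched as follows. This is a minimal illustration, not the package's actual code: the function names `isLeadSurrogate`, `isTailSurrogate`, and `codePointAt` are hypothetical, and inputs are assumed to be UTF-16 code units in the range 0..0xFFFF. It relies on Dart's bitwise operators binding tighter than relational operators, so `char ^ 0xD800 <= 0x3FF` parses as `(char ^ 0xD800) <= 0x3FF`.

```dart
// Hypothetical helpers illustrating the XOR-based surrogate check.
// Assumes every input is a UTF-16 code unit (0..0xFFFF).

/// True if [codeUnit] is a lead surrogate (0xD800..0xDBFF).
///
/// XOR with 0xD800 clears the fixed top bits of a lead surrogate and
/// leaves its 10 payload bits, so the result is at most 0x3FF. For any
/// other code unit at least one of the top six bits stays set.
bool isLeadSurrogate(int codeUnit) => (codeUnit ^ 0xD800) <= 0x3FF;

/// True if [codeUnit] is a tail surrogate (0xDC00..0xDFFF).
bool isTailSurrogate(int codeUnit) => (codeUnit ^ 0xDC00) <= 0x3FF;

/// Decodes the code point starting at [index] in [text].
///
/// The XOR results are reused as the payload bits, so no separate
/// `& 0x3FF` mask is needed after the range check.
int codePointAt(String text, int index) {
  final codeUnit = text.codeUnitAt(index);
  final leadBits = codeUnit ^ 0xD800;
  if (leadBits <= 0x3FF && index + 1 < text.length) {
    final tailBits = text.codeUnitAt(index + 1) ^ 0xDC00;
    if (tailBits <= 0x3FF) {
      // Both values are already reduced to their 10 payload bits.
      return 0x10000 + (leadBits << 10) + tailBits;
    }
  }
  // Not a valid surrogate pair: return the code unit unchanged.
  return codeUnit;
}
```

For example, the pair `0xD83D 0xDE00` yields payload bits `0x3D` and `0x200`, which combine to `0x10000 + (0x3D << 10) + 0x200 = 0x1F600`.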
Package publishing
Documentation at https://github.com/dart-lang/ecosystem/wiki/Publishing-automation.
PR Health
Breaking changes ✔️
Changelog Entry ✔️
Changes to files need to be accounted for in their respective changelogs.
Coverage
File | Coverage |
---|---|
pkgs/characters/lib/src/characters_impl.dart | 💚 90 % ⬆️ 0 % |
pkgs/characters/lib/src/grapheme_clusters/breaks.dart | 💚 97 % ⬆️ 0 % |
pkgs/characters/lib/src/grapheme_clusters/table.dart | 💚 100 % |
pkgs/characters/tool/benchmark.dart | 💔 Not covered |
pkgs/characters/tool/bin/generate_tables.dart | 💔 Not covered |
pkgs/characters/tool/src/grapheme_category_loader.dart | 💔 Not covered |
pkgs/characters/tool/src/string_literal_writer.dart | 💔 Not covered |
This check for test coverage is informational (issues shown here will not fail the PR).
This check can be disabled by tagging the PR with `skip-coverage-check`.
API leaks ✔️
The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.
Package | Leaked API symbols |
---|---|
License Headers ✔️
// Copyright (c) 2025, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.
Files |
---|
no missing headers |
All source files should start with a license header.
LGTM - Although I don't have much knowledge of this package, and don't understand its intricacies. But what I understand makes sense to me.
var index = chunkStart + (tail & 255);
return _data.codeUnitAt(index);
var offset = (tail >> 8) + (lead << 2);
tail &= 255;
Why do the assignment instead of the original `chunkStart + (tail & 255)`?
Revisions updated by `dart tools/rev_sdk_deps.dart`.

ai (https://github.com/dart-lang/ai/compare/f2b48c6..12ac0a4):
12ac0a4 2025-06-18 Jacob MacDonald make the log file test less flaky by retrying deletes and reads (dart-lang/ai#180)
dbedc6d 2025-06-18 Jacob MacDonald fix cruft in description (dart-lang/ai#179)
3ab9482 2025-06-18 Jacob MacDonald improve tool description for the dtd connection tool and improve error messages (dart-lang/ai#178)
4ca0ff1 2025-06-18 Jacob MacDonald Use shared Implementation type, add clientInfo field to MCPServer (dart-lang/ai#175)
885a4c5 2025-06-18 Jacob MacDonald Add `--log-file` argument to log all protocol traffic to a file (dart-lang/ai#176)
7ca3eba 2025-06-17 Nate Bosch Add JSON schema for test runner arguments (dart-lang/ai#169)

core (https://github.com/dart-lang/core/compare/dc97530..b59ecf4):
b59ecf4c 2025-06-18 Lasse R.H. Nielsen Optimize surrogate decoding. (dart-lang/core#894)

dartdoc (https://github.com/dart-lang/dartdoc/compare/4ceea6b..f1fe177):
f1fe1775 2025-06-16 Sarah Zakarias Refactor 404 error page to use div instead of p for search form (dart-lang/dartdoc#4064)

ecosystem (https://github.com/dart-lang/ecosystem/compare/64aac3a..d5233c6):
d5233c6 2025-06-13 dependabot[bot] Bump the github-actions group with 5 updates (dart-lang/ecosystem#351)

web (https://github.com/dart-lang/web/compare/c8c1c28..4b2f02e):
4b2f02e 2025-06-18 nikeokoronkwo Add Variable Declaration Support (dart-lang/web#382)

webdev (https://github.com/dart-lang/webdev/compare/661dafd..6dc3dde):
6dc3ddef 2025-06-20 Jessy Yameogo Fix duplicate connection/logs in Webdev (dart-lang/webdev#2635)
0c8a17b4 2025-06-20 Morgan :) Remove dependency overrides. (dart-lang/webdev#2634)
a3218638 2025-06-16 Jessy Yameogo modifying DWDS Injector to always inject client and introduce useDwdsWebSocketConnection flag (dart-lang/webdev#2629)
2eb27546 2025-06-16 Morgan :) Prepare for `build_runner` changes. (dart-lang/webdev#2633)

Change-Id: Ib323bea37dd77ed94387e77d9c504f889bfa8050
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/436021
Auto-Submit: Devon Carew <devoncarew@google.com>
Reviewed-by: Konstantin Shcheglov <scheglov@google.com>
Commit-Queue: Konstantin Shcheglov <scheglov@google.com>
Use `char ^ 0xD800 <= 0x3FF` to check if a char code is a lead surrogate. That avoids doing a later `& 0x3FF` to get rid of the top bits. The same applies to tail surrogates.

This ensures that the `high` function gets values without high bits, which makes it smaller (it tries to get inlined, so a little smaller counts).

Also optimize that function to reduce dependency depth and try to hit `base + (something < small)` expressions that can be optimized into a single x64 address computation.

Gives a ~7% increase on backwards traversal and 30% increase for forward traversal, based on tool/benchmark.dart compiled with `dart compile exe`.

There is actually a small decrease in performance on web for forward iteration and a small increase for backwards iteration, and Wasm follows web in performance here.

(Also found a bug in the generator, which hasn't worked since it was last committed.)

Interestingly, the change makes little-to-no difference on the `benchmark/benchmark.dart` benchmark. (Maybe it even makes it a little slower.)