add simd string quote seeking #52
force-pushed from ca4cf3e to 7123c36
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #52      +/-   ##
==========================================
- Coverage   95.12%   90.40%   -4.73%
==========================================
  Files           8        8
  Lines        1067     1198     +131
==========================================
+ Hits         1015     1083      +68
- Misses         52      115      +63
CodSpeed Performance Report
Merging #52 will degrade performances by 10.04%
force-pushed from 7123c36 to bce908f
I could reproduce locally that the SSE SIMD version was slower than whatever the compiler was generating. On aarch64 the SIMD intrinsics are much newer, and there I did reproduce a speedup of some 25%; this may also be because the aarch64-apple-darwin target is not as optimised as the x86_64-linux-gnu one, which makes it easier to beat the compiler.

We could look at AVX SIMD, but that can't run under Rosetta on macOS, so I'll have to look next week. It's unclear if GitHub Actions supports AVX.
force-pushed from 45b8b73 to 8711efd
I've implemented AVX SIMD for x86_64. It's not clear-cut; for short strings the non-SIMD version wins. On aarch64 (NEON) on my M1 the breakeven is two-character strings; for x86_64 on my desktop the breakeven is four-character strings.

The address sanitizer doesn't like the buffer overflow. This isn't a huge surprise. In practice it shouldn't be a problem, but it is a potential source of danger after refactoring, so it should probably be reworked.

Overall what I've got here is a bit of a mess. Given it's not a guaranteed perf win, especially in pydantic where we might expect a lot of short strings for enum values (?), I'm not particularly in love with this branch. I'll leave it here as a draft, to be decided later whether we refactor or bin it.
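For anyone picking this up later, here is a minimal sketch of one way to keep a vectorised quote search inside the buffer, so that a sanitizer has nothing to flag. It is not the code on this branch. The function names are hypothetical, it uses 16-byte SSE2 loads rather than AVX to stay within the x86_64 baseline, and it assumes the only bytes of interest are `"` and `\` (a real JSON string scanner may also need to stop on control characters). Only full 16-byte chunks are loaded with SIMD; the remainder is handled by the scalar loop.

```rust
/// Scalar reference: index of the first `"` or `\` in `data`, if any.
/// (Hypothetical helper, not taken from this PR.)
fn find_quote_scalar(data: &[u8]) -> Option<usize> {
    data.iter().position(|&b| b == b'"' || b == b'\\')
}

/// Chunked search that never reads past the end of `data`.
/// Assumption: SSE2, which is part of the x86_64 baseline.
#[cfg(target_arch = "x86_64")]
fn find_quote_chunked(data: &[u8]) -> Option<usize> {
    use std::arch::x86_64::*;

    const LANES: usize = 16;
    let mut offset = 0;

    // Only enter the SIMD loop while a full 16-byte chunk is available.
    while offset + LANES <= data.len() {
        // SAFETY: the loop condition guarantees 16 readable bytes at
        // `offset`, and the load is explicitly unaligned.
        let mask = unsafe {
            let chunk = _mm_loadu_si128(data.as_ptr().add(offset) as *const __m128i);
            let quotes = _mm_cmpeq_epi8(chunk, _mm_set1_epi8(b'"' as i8));
            let slashes = _mm_cmpeq_epi8(chunk, _mm_set1_epi8(b'\\' as i8));
            // One bit per byte lane; a set bit means that lane matched.
            _mm_movemask_epi8(_mm_or_si128(quotes, slashes))
        };
        if mask != 0 {
            // The lowest set bit is the first matching byte in the chunk.
            return Some(offset + mask.trailing_zeros() as usize);
        }
        offset += LANES;
    }

    // Fewer than 16 bytes remain: finish with the scalar loop.
    find_quote_scalar(&data[offset..]).map(|i| offset + i)
}

/// Fallback for other targets: scalar only.
#[cfg(not(target_arch = "x86_64"))]
fn find_quote_chunked(data: &[u8]) -> Option<usize> {
    find_quote_scalar(data)
}
```

With this shape, inputs shorter than 16 bytes never touch the SIMD path at all, which fits the observation that short strings favour the non-SIMD version. Whether the extra branch costs more than it saves would need benchmarking.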
force-pushed from cf9d6f6 to 84087c9
Closing as this was superseded on aarch64 by #65, and x86 will need to start again.
No description provided.