Switch to using thread_local regular expressions to stop mutext contention #996

orf · 2023-08-25T16:31:04Z

Fixes #986. This is also rebased on top of #995 to allow me to run the benchmarks.

Summary

When calling libcst_native::parse_module from multiple threads the parsing performance drops dramatically. This is somewhat documented in the regex_automata crate and in other places around the internet:

You need to work-around contention issues with sharing a regex across multiple threads. The meta::Regex::search_with API permits bypassing any kind of synchronization at all by requiring the caller to provide the mutable scratch spaced needed during a search.

Benchmarks

I've added a multi-threaded benchmark using Rayon. The use of thread locals improves the throughput by 22% and reduces the time by 18%:

parse_into_cst_parallel/all
                        time:   [476.86 ms 482.42 ms 488.68 ms]
                        thrpt:  [5.9343 Kelem/s 6.0113 Kelem/s 6.0814 Kelem/s]
                 change:
                        time:   [-19.567% -18.431% -17.452%] (p = 0.00 < 0.05)
                        thrpt:  [+21.142% +22.595% +24.327%]
                        Performance has improved.

However the throughput increase I've seen with high-volume "real-world" parsing is much, much higher.

Benchmarks

These are the results of the benchmark added in this change, with and without the thread_local! change. It charts the time taken to parse the fixture files. The number of files is multiplied by the number of threads, which ranges from 1 to 10. At 10 threads the average time taken is 50% faster.

Benchmark: Before

Benchmark: Before

codecov-commenter · 2023-08-26T10:04:20Z

Codecov Report

Patch and project coverage have no change.

Comparison is base (b28777e) 90.93% compared to head (3f34a66) 90.93%.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #996   +/-   ##
=======================================
  Coverage   90.93%   90.93%           
=======================================
  Files         254      254           
  Lines       25972    25972           
=======================================
  Hits        23617    23617           
  Misses       2355     2355

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

zsol

Thank you! This looks great to me. Happy to merge once #995 is in.

zsol · 2023-08-26T10:47:23Z

Oh, one thing though: you'll have to update the MSRV to (looks like 1.70), but that should be fine.

orf · 2023-08-26T11:19:00Z

Done @zsol! I also bumped some dependencies. notably the regex bump also improves parsing speed quite a lot (the benchmarks above used the latest regex version). Let me know if you want me to split them out from this.

… contention

zsol · 2023-08-26T14:21:11Z

Thank you!

facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 25, 2023

orf force-pushed the use-thread-locals branch 5 times, most recently from f215978 to 77269e7 Compare August 25, 2023 18:53

zsol approved these changes Aug 26, 2023

View reviewed changes

orf force-pushed the use-thread-locals branch from 77269e7 to 2c5f9c7 Compare August 26, 2023 11:15

orf force-pushed the use-thread-locals branch 2 times, most recently from ab536ea to 4ce22ba Compare August 26, 2023 12:21

Switch to using thread_local regular expressions to avoid regex mutex…

3f34a66

… contention

orf force-pushed the use-thread-locals branch from 4ce22ba to 3f34a66 Compare August 26, 2023 12:36

zsol merged commit 75b6331 into Instagram:main Aug 26, 2023
21 checks passed

orf deleted the use-thread-locals branch August 26, 2023 14:21

MichaReiser mentioned this pull request Aug 29, 2023

Run cargo update astral-sh/ruff#6964

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Switch to using thread_local regular expressions to stop mutext contention #996

Switch to using thread_local regular expressions to stop mutext contention #996

orf commented Aug 25, 2023 •

edited

Loading

codecov-commenter commented Aug 26, 2023 •

edited

Loading

zsol left a comment

zsol commented Aug 26, 2023 •

edited

Loading

orf commented Aug 26, 2023

zsol commented Aug 26, 2023

Switch to using thread_local regular expressions to stop mutext contention #996

Switch to using thread_local regular expressions to stop mutext contention #996

Conversation

orf commented Aug 25, 2023 • edited Loading

Summary

Benchmarks

Benchmarks

Benchmark: Before

Benchmark: Before

codecov-commenter commented Aug 26, 2023 • edited Loading

Codecov Report

zsol left a comment

Choose a reason for hiding this comment

zsol commented Aug 26, 2023 • edited Loading

orf commented Aug 26, 2023

zsol commented Aug 26, 2023

orf commented Aug 25, 2023 •

edited

Loading

codecov-commenter commented Aug 26, 2023 •

edited

Loading

zsol commented Aug 26, 2023 •

edited

Loading