Thread-safe python readers #2261

texodus · 2023-06-19T04:31:03Z

This PR rewrites perspective-python's concurrency model, enabling safe, concurrent calls to Perspective's API with the GIL released, along with assorted performance fixes (including a significant improvement to to_arrow() for both Python and Wasm). In addition to Perspective's internal parallelism, PerspectiveManager can now process requests concurrently when a thread pool dispatch is passed to PerspectiveManager.set_loop_callback():

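import concurrent.futures
# Assumes `manager` is an existing PerspectiveManager and `psp_loop` is its event loop.
with concurrent.futures.ThreadPoolExecutor() as executor:
    manager.set_loop_callback(psp_loop.run_in_executor, executor)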
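For readers unfamiliar with the dispatch pattern, here is a minimal standard-library sketch of what handing work to `run_in_executor` looks like. It assumes an asyncio event loop; `handle_request`, `serve`, `run_demo` and the `psp-worker` thread name prefix are hypothetical stand-ins for illustration and are not part of Perspective's API.

```python
import asyncio
import concurrent.futures
import threading


def handle_request(request_id):
    # Hypothetical stand-in for a blocking engine call; not a Perspective API.
    return request_id, threading.current_thread().name


async def serve(executor, n_requests):
    loop = asyncio.get_running_loop()
    # loop.run_in_executor(executor, func, *args) runs func on a pool thread
    # and returns an awaitable, so the event loop itself is never blocked.
    futures = [
        loop.run_in_executor(executor, handle_request, i)
        for i in range(n_requests)
    ]
    # gather() returns results in the order the futures were passed in.
    return await asyncio.gather(*futures)


def run_demo(n_requests=8):
    with concurrent.futures.ThreadPoolExecutor(
        max_workers=4, thread_name_prefix="psp-worker"
    ) as executor:
        return asyncio.run(serve(executor, n_requests))


results = run_demo()
```

Each request is handled on a pool thread rather than on the event loop thread, which is the property that allows several client requests to be in flight at once.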

Fixes a regression, present only in 2.2.1, that disabled PSP_PARALLEL_FOR for all Python builds.
The semantics of async mode have changed. Before this PR, calling set_loop_callback() enabled GIL release, and calls to Perspective APIs from different threads would throw an exception as protection against the engine's lack of internal thread safety. With this PR, the GIL is always released, and Perspective's API implements its own internal locking. As a result, a PerspectiveManager can now be called without the GIL from multiple Python threads, e.g. from a concurrent.futures.ThreadPoolExecutor, substantially increasing thread utilization on servers with many concurrent client requests.
Apache Arrow reading and writing have been ported to use PSP_PARALLEL_FOR across columns.
Removes numerous superfluous heap allocations from to_arrow(), improving serialization performance by ~50%.
Fixes a bug in the Python PerspectiveManager that caused some callbacks to become crossed when multiple clients are connected.
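The internal locking described above is a read/write lock, going by the commit title "Replace async mode with read/write lock". As a rough illustration of that idea only, here is a toy readers-writer lock in pure Python. `ReadWriteLock` and `ToyTable` are hypothetical names invented for this sketch; the engine's actual lock is not shown in this PR description and may differ in every detail.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ReadWriteLock:
    """Toy readers-writer lock: many concurrent readers or one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    def acquire_read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers > 0:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()


class ToyTable:
    """Hypothetical stand-in for an engine table; not a Perspective class."""

    def __init__(self):
        self._lock = ReadWriteLock()
        self._rows = []

    def update(self, rows):
        # Writers get exclusive access.
        self._lock.acquire_write()
        try:
            self._rows.extend(rows)
        finally:
            self._lock.release_write()

    def size(self):
        # Readers may overlap with each other, but never with a writer.
        self._lock.acquire_read()
        try:
            return len(self._rows)
        finally:
            self._lock.release_read()


table = ToyTable()
with ThreadPoolExecutor(max_workers=8) as pool:
    writes = [pool.submit(table.update, [i, i + 1]) for i in range(100)]
    reads = [pool.submit(table.size) for _ in range(100)]
    for f in writes:
        f.result()
    sizes = [f.result() for f in reads]

final_size = table.size()
```

Because each update appends two rows while holding the write lock, a concurrent reader never observes a half-applied update. This toy version does not guard against writer starvation, which a production lock would need to consider.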

[Chart: historic to_arrow() Wasm performance]

[Chart: historic Python performance across a range of benchmarks; 2.2.1 has a specific regression in PSP_PARALLEL_FOR]

texodus added the enhancement, Python, and breaking labels Jun 19, 2023

finos-cla-bot bot added the cla-present label Jun 19, 2023

texodus mentioned this pull request Jun 19, 2023 in Thread-safe Readers #1276 (closed)

texodus force-pushed the rw-lock-v2 branch 2 times, most recently from bf24ad6 to 378fa68, June 19, 2023 05:18

texodus added 7 commits June 19, 2023 19:24

Parallelize arrow column reading and writing

bd3e224

Fix python thread flags

81960c7

Performance improvements for to_arrow()

4dd25f1

Fix benchmarks & examples

aaa6a8d

Fix python server callback ID conflict

dd4f077

Fix CI

829ce73

Replace async mode with read/write lock

14147f7

texodus force-pushed the rw-lock-v2 branch from f486e0d to 14147f7, June 20, 2023 05:42

texodus merged commit e76e396 into master Jun 20, 2023

texodus deleted the rw-lock-v2 branch June 20, 2023 06:49
