
support free-threaded Python #165

Open · davidhewitt wants to merge 14 commits into main from dh/free-threaded
Conversation

davidhewitt (Collaborator)

Add free-threaded Python support.

Pushing a bit early so that I can see results of benchmarks & CI.


codecov bot commented Nov 19, 2024

Codecov Report

Attention: Patch coverage is 72.22222% with 5 lines in your changes missing coverage. Please review.

Project coverage is 88.74%. Comparing base (f970f0b) to head (afc89e6).
Report is 1 commit behind head on main.

Files with missing lines              Patch %   Lines
crates/jiter/src/py_string_cache.rs   61.53%    5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #165      +/-   ##
==========================================
- Coverage   88.92%   88.74%   -0.19%     
==========================================
  Files          13       13              
  Lines        2195     2203       +8     
  Branches     2195     2203       +8     
==========================================
+ Hits         1952     1955       +3     
- Misses        148      153       +5     
  Partials       95       95              
Files with missing lines Coverage Δ
crates/jiter-python/src/lib.rs 93.22% <100.00%> (ø)
crates/jiter/src/py_string_cache.rs 92.98% <61.53%> (-4.19%) ⬇️

... and 1 file with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f970f0b...afc89e6.



codspeed-hq bot commented Nov 19, 2024

CodSpeed Performance Report

Merging #165 will improve performance by 11.93%

Comparing dh/free-threaded (afc89e6) with main (dbf0c52)

Summary

⚡ 1 improvements
✅ 72 untouched benchmarks

Benchmarks breakdown

Benchmark            main     dh/free-threaded   Change
unicode_jiter_iter   8.6 µs   7.7 µs             +11.93%

@@ -86,28 +85,34 @@ impl StringMaybeCache for StringNoCache {
     }
 }

-static STRING_CACHE: GILOnceCell<GILProtected<RefCell<PyStringCache>>> = GILOnceCell::new();
+static STRING_CACHE: OnceLock<Mutex<PyStringCache>> = OnceLock::new();
davidhewitt (Collaborator, Author)

Despite the use of a mutex here, the single-threaded benchmark is not meaningfully impacted.

We can worry about multithreaded performance in the far future, if highly parallel uses of jiter should arise; and if users hit a pathological case before we do anything fancy, they can always turn off the string cache.
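The trade-off described here can be sketched with a toy interned-string cache guarded by a lock. This is a simplified Python model for illustration only, not jiter's actual Rust implementation; the class and method names are invented:

```python
import threading

class LockedStringCache:
    """Toy model of a shared string cache behind a mutex."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._cache: dict[bytes, str] = {}

    def get_or_insert(self, raw: bytes) -> str:
        # Every lookup takes the lock, so concurrent parsers
        # serialize here -- cheap single-threaded, a bottleneck
        # under heavy parallelism.
        with self._lock:
            cached = self._cache.get(raw)
            if cached is None:
                cached = raw.decode("utf-8")
                self._cache[raw] = cached
            return cached

cache = LockedStringCache()
a = cache.get_or_insert(b"key")
b = cache.get_or_insert(b"key")
assert a is b  # repeated keys return the same interned object
```

In single-threaded use the uncontended lock costs almost nothing, which matches the benchmark observation above; the cost only appears when many threads hammer the cache at once.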

Comment on lines +355 to +369
def test_multithreaded_parsing():
"""Basic sanity check that running a parse in multiple threads is fine."""
expected_datas = [json.loads(data) for data in JITER_BENCH_DATAS]

def assert_jiter_ok(data: bytes, expected: Any) -> bool:
return jiter.from_json(data) == expected

with ThreadPoolExecutor(8) as pool:
results = []
for _ in range(1000):
for data, expected_result in zip(JITER_BENCH_DATAS, expected_datas):
results.append(pool.submit(assert_jiter_ok, data, expected_result))

for result in results:
assert result.result()
davidhewitt (Collaborator, Author)

Can confirm with this simple test that the Mutex basically stops parallelism when the cache is enabled; using jiter.from_json(data, cache_mode="none") leads to about an 8x speedup on my machine.

I don't mind that for now; this PR just gets the free-threaded mode working, and we can worry about that sort of optimization later.
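The serialization effect described in the comment can be reproduced with a small stdlib-only model: tasks that funnel through one shared lock run one at a time, while lock-free tasks overlap. This is an illustrative sketch of the contention pattern, not jiter code, and the helper names are invented:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

shared_lock = threading.Lock()

def work(use_lock: bool) -> None:
    # Stand-in for parsing one document; sleep releases the GIL,
    # so only the explicit lock limits parallelism here.
    if use_lock:
        with shared_lock:  # "cache enabled": all workers serialize
            time.sleep(0.01)
    else:                  # "cache_mode='none'": no shared state
        time.sleep(0.01)

def run(use_lock: bool) -> float:
    start = time.perf_counter()
    with ThreadPoolExecutor(8) as pool:
        for _ in range(8):
            pool.submit(work, use_lock)
    return time.perf_counter() - start

locked = run(True)     # ~8 x 10 ms: the lock serializes all workers
unlocked = run(False)  # ~10 ms: all eight sleeps overlap
assert locked > unlocked
```

The same structure explains the roughly 8x speedup reported when the cache (and hence the lock) is disabled with eight worker threads.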

davidhewitt marked this pull request as ready for review November 19, 2024 20:33

davidhewitt (Collaborator, Author)

@samuelcolvin I think this is good to ship; main thing for you to be aware of is the mutex on the cache as per the above, so parallelism is not as good as it could be with the cache.
