@rdw-software rdw-software commented Feb 7, 2025

Needs benchmarking to determine whether the overhead is significant. If yes, also create a C++ wrapper.

Preliminary results from a few simple/throwaway microbenchmarks suggest:

  • If the components rarely or never change, the JIT-optimized traces run faster than a pure C++ implementation
  • For crypto-RNG'ed components of moderate length (256-byte tokens), C++ is slightly (10-20%) faster
  • Varying the token length has a moderate effect on the runtime, but parsing itself isn't a bottleneck
  • For very short components, the C++ version is slightly slower or faster, depending on the exact size
  • I'm guessing this is due to how allocations take place, as the timing changes drastically at power-of-two boundaries
  • For very large sizes (several KB to MB), "naive" alloc/free sequences are evidently causing a huge slowdown
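For anyone wanting to reproduce the allocation effect in isolation, here is a minimal sketch of a throwaway microbenchmark. It is not the benchmark used for the numbers above: `time_alloc_free_ns` and `print_alloc_report` are hypothetical helpers, the size list is an arbitrary pick around a few power-of-two boundaries, and it only times plain `malloc`/`free` cycles with no libcurl or JIT involvement.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical helper: average nanoseconds for one malloc + memset + free
// cycle of `size` bytes. Returns -1.0 if an allocation fails.
double time_alloc_free_ns(std::size_t size, int iterations) {
    assert(iterations > 0);
    if (size == 0) return 0.0; // nothing meaningful to measure

    using clock = std::chrono::steady_clock;
    volatile unsigned char sink = 0; // discourages the compiler from eliding the work

    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        unsigned char* buffer = static_cast<unsigned char*>(std::malloc(size));
        if (buffer == nullptr) return -1.0;
        std::memset(buffer, i & 0xFF, size); // touch the memory so pages are actually used
        sink = buffer[size - 1];
        std::free(buffer);
    }
    const auto elapsed = clock::now() - start;

    (void)sink;
    return std::chrono::duration<double, std::nano>(elapsed).count() / iterations;
}

// Hypothetical helper: prints one line per size, chosen around power-of-two boundaries.
void print_alloc_report() {
    const std::vector<std::size_t> sizes = {15, 16, 17, 255, 256, 257, 2048, 65536, 1048576};
    for (std::size_t size : sizes) {
        std::printf("%8zu bytes: %10.1f ns per alloc/free\n", size, time_alloc_free_ns(size, 1000));
    }
}
```

Calling `print_alloc_report()` from `main` prints the averages. The absolute numbers depend heavily on the allocator, OS, and build flags, so only the relative shape across sizes is worth comparing.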

Regarding size limits:

Although libcurl claims to support up to 8 MB for each URL component, that limit seems impractically high. I've therefore limited my testing to the more sensible range of up to 2 KB. At that point, the measured time seems to be dominated by allocations and cleanup (free), which isn't optimized yet but should be (see the note below).

Semi-related:

The built-in allocator should be integrated with libcurl in any event, same as for some of the other libraries, but that's something of a future concern. Libcurl does offer an API to do it, but that's not currently being used.
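As a rough illustration of what that integration could look like: libcurl documents `curl_global_init_mem`, which takes replacement callbacks for malloc, free, realloc, strdup, and calloc. The sketch below only defines callbacks with matching signatures and counts calls. The `counting_*` names and the counters are made up for this example, they forward to the C allocator rather than the runtime's built-in one, and the registration call is left in a comment so the snippet compiles without libcurl.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical counters; a real integration would forward to the runtime's
// own allocator instead of std::malloc and friends.
static std::atomic<std::size_t> g_alloc_calls{0};
static std::atomic<std::size_t> g_free_calls{0};

extern "C" {

void* counting_malloc(std::size_t size) {
    ++g_alloc_calls;
    return std::malloc(size);
}

void counting_free(void* ptr) {
    if (ptr != nullptr) ++g_free_calls;
    std::free(ptr);
}

void* counting_realloc(void* ptr, std::size_t size) {
    // realloc(nullptr, n) behaves like malloc(n), so count it as a new allocation
    if (ptr == nullptr) ++g_alloc_calls;
    return std::realloc(ptr, size);
}

char* counting_strdup(const char* str) {
    // Must allocate through the same allocator that counting_free releases to
    const std::size_t length = std::strlen(str) + 1;
    char* copy = static_cast<char*>(counting_malloc(length));
    if (copy != nullptr) std::memcpy(copy, str, length);
    return copy;
}

void* counting_calloc(std::size_t count, std::size_t size) {
    ++g_alloc_calls;
    return std::calloc(count, size);
}

} // extern "C"

// With <curl/curl.h> included, registration would happen once, in place of
// curl_global_init and before any other libcurl call:
//
//   curl_global_init_mem(CURL_GLOBAL_DEFAULT, counting_malloc, counting_free,
//                        counting_realloc, counting_strdup, counting_calloc);
```

Counting calls this way would also give a cheap check of how many allocations a single URL parse triggers, which is the number that matters for the benchmark results above.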

Basically, it doesn't seem worth optimizing these bindings right now. Allocations are already on my (long) list.

@rdw-software rdw-software linked an issue Feb 8, 2025 that may be closed by this pull request
15 tasks
Although curl will do this on its own when using various APIs, this is better handled explicitly in the C++ layer.

Scripts could theoretically do it as well, but that sounds like a recipe for disaster.
@rdw-software rdw-software marked this pull request as ready for review February 16, 2025 18:39
Doesn't seem to have a significant performance cost, but it's easier to use.
Although libcurl.a is built correctly, the runtime needs to define this again before including curl.h - an unfortunate side effect of curl being on every system is that these kinds of oversights don't fail loudly.
@rdw-software rdw-software merged commit 83fa85b into main Feb 16, 2025
11 checks passed
@rdw-software rdw-software deleted the libcurl-url-parsing branch February 16, 2025 23:57
Development

Successfully merging this pull request may close these issues.

Add FFI bindings for libcurl's URL interface
