Commit 6c4d666
ext/uri: fast-path canonical URIs in get_normalized_uri
When Uri\Rfc3986\Uri::parse() produces a URI that is already in
canonical form (the common case: http/https URLs with no uppercase
host, no percent-encoded unreserved characters, no ".." path
segments), get_normalized_uri() no longer deep-copies the parsed
struct or runs a full normalization pass. Instead it calls
uriNormalizeSyntaxMaskRequiredExA once to compute the dirty mask; a
zero mask means the raw uri is aliased as the normalized one. The
struct caches the dirty mask, so repeated non-raw reads on the same
instance run the scan only once.
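The scan-once-then-alias idea can be sketched with a toy model. This is
not the ext/uri code and does not call uriparser: the toy_uri struct,
the DIRTY_* bits and the toy_* functions are hypothetical names, and the
scan only checks for an uppercase host and a "/../" path segment.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical component bits; NOT uriparser's normalization mask values. */
enum { DIRTY_HOST = 1u << 0, DIRTY_PATH = 1u << 1 };

typedef struct {
    const char *host;     /* raw host as parsed */
    const char *path;     /* raw path as parsed */
    unsigned dirty_mask;  /* cached result of the scan */
    bool mask_known;      /* false until the scan has run once */
    int scans;            /* demo counter: how many scans actually ran */
} toy_uri;

/* Read-only scan: which components would change under normalization?
 * Simplified on purpose: uppercase host letters and "/../" segments only. */
static unsigned toy_scan(const toy_uri *u) {
    unsigned mask = 0;
    for (const char *p = u->host; *p != '\0'; p++) {
        if (isupper((unsigned char)*p)) { mask |= DIRTY_HOST; break; }
    }
    if (strstr(u->path, "/../") != NULL) mask |= DIRTY_PATH;
    return mask;
}

/* Compute the mask on first use, then serve it from the struct. */
static unsigned toy_dirty_mask(toy_uri *u) {
    assert(u != NULL);
    if (!u->mask_known) {
        u->dirty_mask = toy_scan(u);
        u->mask_known = true;
        u->scans++;
    }
    return u->dirty_mask;
}

/* A zero mask means the raw uri can stand in for the normalized one. */
static bool toy_can_alias_raw(toy_uri *u) {
    return toy_dirty_mask(u) == 0;
}
```

In this model a second call to toy_can_alias_raw() on the same struct
reuses the cached mask, which mirrors the "scan once per instance"
behaviour described above.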
Fallback: when the mask is nonzero, we copy and normalize as before,
but only for the flagged components (uriNormalizeSyntaxExMmA(...,
dirty_mask, ...) instead of (..., -1, ...)).
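A minimal sketch of mask-driven normalization, again as a toy model
rather than the uriparser implementation. The TOY_* bits, toy_uri_buf
and the helper functions are hypothetical; host lowercasing and
uppercasing of percent-escape hex digits stand in for the real
per-component work.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical component bits; NOT uriparser's normalization mask values. */
enum { TOY_HOST = 1u << 0, TOY_PATH = 1u << 1 };

typedef struct {
    char host[64];
    char path[64];
} toy_uri_buf;

/* Host step: lowercase in place. */
static void toy_lower(char *s) {
    for (; *s != '\0'; s++) *s = (char)tolower((unsigned char)*s);
}

/* Path step (stand-in): uppercase the hex digits of %xx escapes in place. */
static void toy_upper_escapes(char *s) {
    for (; s[0] != '\0'; s++) {
        if (s[0] == '%' && isxdigit((unsigned char)s[1])
                        && isxdigit((unsigned char)s[2])) {
            s[1] = (char)toupper((unsigned char)s[1]);
            s[2] = (char)toupper((unsigned char)s[2]);
            s += 2; /* skip the two hex digits just handled */
        }
    }
}

/* Normalize only the components whose bit is set in mask.
 * Returns how many components were touched. */
static int toy_normalize_masked(toy_uri_buf *u, unsigned mask) {
    assert(u != NULL);
    int touched = 0;
    if (mask & TOY_HOST) { toy_lower(u->host); touched++; }
    if (mask & TOY_PATH) { toy_upper_escapes(u->path); touched++; }
    return touched;
}
```

Passing an all-ones mask to toy_normalize_masked() corresponds to the
old "normalize everything" call; passing the computed dirty mask skips
components that are already canonical.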
Measurements on a 17-URL mix with a realistic parse-and-read workload
(10 runs of 1.7M parses each, CPU pinned via taskset, same-session
stash-pop A/B so both builds share machine state):
workload         baseline mean      optimized mean     delta
parse only       0.3992s (4.26M/s)  0.4083s (4.16M/s)  noise
parse + 1 read   0.6687s (2.54M/s)  0.5464s (3.11M/s)  -18.3%
parse + 7 reads  0.8510s (2.00M/s)  0.7305s (2.33M/s)  -14.2%
The "parse + 1 read" row isolates the first-read cost where this
change lands. The "parse + 7 reads" row shows the amortized effect
under a realistic user pattern: the first getter pays the reduced
normalization cost, and the remaining six getters hit the cached
normalized uri and cost the same as before.
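For reference, the throughput and delta columns follow from the mean
times and the 1.7M parses per run. A small sketch of that arithmetic,
with hypothetical helper names:

```c
#include <assert.h>

/* Throughput in millions of parses per second, given the number of
 * parses per run (in millions) and the mean wall time in seconds. */
static double throughput_m(double parses_m, double seconds) {
    assert(seconds > 0.0);
    return parses_m / seconds;
}

/* Relative change of the optimized time against the baseline, in percent.
 * Negative values mean the optimized build is faster. */
static double delta_pct(double baseline_s, double optimized_s) {
    assert(baseline_s > 0.0);
    return (optimized_s - baseline_s) / baseline_s * 100.0;
}
```

With the "parse + 1 read" row, delta_pct(0.6687, 0.5464) gives roughly
-18.3 and throughput_m(1.7, 0.5464) gives roughly 3.11.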
hyperfine cross-check on the whole benchmark script, 15 runs each:
baseline 20.471 s +/- 1.052 s [19.535 .. 22.985]
optimized 17.240 s +/- 0.540 s [16.556 .. 18.190]
optimized runs 1.19 +/- 0.07 times faster.
All 309 tests in ext/uri/tests pass. I checked that URIs needing
normalization (http://EXAMPLE.com/A/%2e%2e/c resolving to /c) still
hit the full normalize path through the nonzero dirty mask.

1 parent 8ad79e1, commit 6c4d666
1 file changed, +16 -1 lines changed