Experiment with pointer-based slice prefix computation in normalizer #2433

hsivonen · 2022-08-22T15:00:38Z

#2378 uses a pattern in which there is a full slice and another slice that is known to be its suffix. The prefix of the full slice is then computed so that it excludes either the suffix alone or the suffix plus a number of code units before the suffix.

Currently, this is done by taking the length of the full slice and subtracting the length of the suffix slice. However, in practice, the suffix slice comes from as_slice()/as_str() on a by-char iterator. The iterator may not actually store the length internally. Whether or not the iterator actually stores the length, it does store its start pointer.

It could be a tiny bit more efficient to compute the prefix length from the pointer distance. Done in the pointer domain, this requires unsafe. By casting the pointers to usize, this can be done in safe code. It's unclear to me if real optimization opportunities are lost by casting away pointerness before subtracting.
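To make the three options concrete, here is a minimal sketch for the `&str` case. The function names are hypothetical and are not taken from #2378; the sketch only assumes that `suffix` really is a suffix of `full`, for example one obtained from `Chars::as_str()`.

```rust
/// Current approach: subtract the suffix length from the full length.
/// (Hypothetical helper name, for illustration only.)
fn prefix_len_by_len(full: &str, suffix: &str) -> usize {
    full.len() - suffix.len()
}

/// Safe alternative: cast both start pointers to `usize` and subtract
/// the addresses. The casts discard "pointerness" before the subtraction.
fn prefix_len_by_addr(full: &str, suffix: &str) -> usize {
    (suffix.as_ptr() as usize) - (full.as_ptr() as usize)
}

/// Pointer-domain alternative: `offset_from` stays in the pointer domain
/// but requires `unsafe`.
///
/// # Safety
/// `suffix` must be a subslice of `full` (same allocation, and it must not
/// start before `full` starts).
unsafe fn prefix_len_by_offset(full: &str, suffix: &str) -> usize {
    // SAFETY: the caller guarantees that both pointers point into the same
    // string slice, so the offset is in bounds and non-negative.
    unsafe { suffix.as_ptr().offset_from(full.as_ptr()) as usize }
}
```

With `full = "héllo"` and a `chars()` iterator advanced past two characters, all three functions return 3 for `suffix = chars.as_str()`, since `h` takes one byte and `é` takes two. Only the first variant needs the suffix length, and only the third needs `unsafe`.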


hsivonen · 2022-08-22T15:01:43Z

(If this is a win for the &str case, it then makes sense to explore making utf8_iter and utf16_iter store a pointer past the end instead of storing the remaining slice.)
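As a rough illustration of what "store a pointer past the end instead of storing the remaining slice" could look like, here is a simplified sketch. `PtrBytes` is a hypothetical type that iterates over bytes rather than decoding characters, so it is not the actual design of utf8_iter or utf16_iter; it only shows the storage layout and how the consumed prefix falls out of the start pointer.

```rust
use core::marker::PhantomData;

/// Hypothetical byte iterator that stores a current pointer and a
/// past-the-end pointer instead of the remaining slice.
struct PtrBytes<'a> {
    ptr: *const u8,
    end: *const u8,
    _marker: PhantomData<&'a [u8]>,
}

impl<'a> PtrBytes<'a> {
    fn new(slice: &'a [u8]) -> Self {
        let range = slice.as_ptr_range();
        PtrBytes { ptr: range.start, end: range.end, _marker: PhantomData }
    }

    /// Rebuilds the remaining slice on demand; the length is computed
    /// from the two pointers rather than stored.
    fn as_slice(&self) -> &'a [u8] {
        // SAFETY: `ptr..end` is always a subrange of the slice passed to
        // `new`, which is borrowed for `'a`.
        unsafe {
            let len = self.end.offset_from(self.ptr) as usize;
            core::slice::from_raw_parts(self.ptr, len)
        }
    }

    /// Number of bytes consumed so far, given the slice passed to `new`.
    /// Uses address subtraction, so this method itself needs no `unsafe`.
    fn consumed(&self, full: &'a [u8]) -> usize {
        (self.ptr as usize) - (full.as_ptr() as usize)
    }
}

impl<'a> Iterator for PtrBytes<'a> {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        if self.ptr == self.end {
            return None;
        }
        // SAFETY: `ptr != end`, so `ptr` points at a valid byte and
        // `ptr + 1` is at most one past the end of the slice.
        unsafe {
            let byte = *self.ptr;
            self.ptr = self.ptr.add(1);
            Some(byte)
        }
    }
}
```

In this layout, advancing the iterator updates one pointer instead of both a pointer and a length, at the cost of recomputing the length whenever `as_slice()` is called.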

sffc · 2022-09-08T18:35:42Z

Is this fixed by #2378?

hsivonen · 2022-09-09T09:34:02Z

> Is this fixed by #2378?

No, this is a follow-up for that one.

> It's unclear to me if real optimization opportunities are lost by casting away pointerness before subtracting.

@Gankra's RustConf talk says that casting a pointer to an integer is, in principle, an operation that can pessimize other uses of the pointer, so it's probably a bad idea to go this way.

Sadly, there appears to be no middle ground analogous to what exists for integer overflow: a pointer distance computation that 1) would not require unsafe to call, 2) would not make the pointers "exposed", and 3) would not make mismatched provenance open-ended UB, but would instead return the address distance on provenance mismatch on architectures that can't efficiently abort on provenance mismatch.

In any case, as far as micro-optimizations go, what's contemplated here is very micro.

sffc · 2022-10-17T21:05:42Z

@hsivonen Can you set an assignee (or "help wanted") and a milestone (or "backlog")?

hsivonen · 2022-10-19T06:06:54Z

I marked this backlog, but I didn't suggest "help wanted" at this time, because this optimization is so micro that it makes more sense to focus on other perf issues first.

hsivonen added the labels A-performance (Area: Performance (CPU, Memory)), S-small (Size: One afternoon (small bug fix or enhancement)), and C-collator (Component: Collation, normalization) Aug 22, 2022

hsivonen mentioned this issue Aug 23, 2022

Optimize normalizers for contiguous-memory input/output case #2378

Merged

hsivonen added the backlog label Oct 19, 2022

sffc added this to the Backlog milestone Dec 22, 2022

sffc removed the backlog label Dec 22, 2022
