SIMD accelerated escape routines #405

Open
@dralley

Unlike the unescape routines, the routines for escaping text don't currently use any SIMD acceleration.

This should be possible via the jetscii crate. The unescape routines currently use memchr, which is supposed to be slightly faster than jetscii but is also more limited: it can only search for up to 3 different bytes at a time, whereas jetscii can handle up to 16. Since escaping text requires searching for up to 5 characters (`<`, `>`, `&`, `"`, `'`), memchr is not an option, but jetscii is.
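To make the shape of the change concrete, here is a minimal, std-only sketch of an escape loop with the search step isolated in one function. The names `find_special` and `escape_sketch` are hypothetical and not part of the crate; the scalar `position` call is only a stand-in for where a SIMD searcher (for example one built with jetscii) would go.

```rust
use std::borrow::Cow;

/// Scalar stand-in for the search step: index of the first byte that needs
/// escaping. A SIMD searcher could replace this body without touching the rest.
fn find_special(bytes: &[u8]) -> Option<usize> {
    bytes
        .iter()
        .position(|&b| matches!(b, b'<' | b'>' | b'&' | b'"' | b'\''))
}

/// Hypothetical escape routine: copies clean runs verbatim and replaces each
/// special byte with its predefined XML entity. Borrows if nothing changes.
fn escape_sketch(raw: &str) -> Cow<'_, str> {
    let bytes = raw.as_bytes();
    let mut out: Option<String> = None;
    let mut pos = 0;

    while let Some(offset) = find_special(&bytes[pos..]) {
        let hit = pos + offset;
        // Allocate lazily, only once the first special byte is found.
        let buf = out.get_or_insert_with(|| String::with_capacity(raw.len() + 8));
        // All five special bytes are ASCII, so `hit` is always a char boundary.
        buf.push_str(&raw[pos..hit]);
        buf.push_str(match bytes[hit] {
            b'<' => "&lt;",
            b'>' => "&gt;",
            b'&' => "&amp;",
            b'"' => "&quot;",
            b'\'' => "&apos;",
            _ => unreachable!("find_special only returns the five bytes above"),
        });
        pos = hit + 1;
    }

    match out {
        Some(mut buf) => {
            buf.push_str(&raw[pos..]);
            Cow::Owned(buf)
        }
        None => Cow::Borrowed(raw),
    }
}
```

With this sketch, `escape_sketch("a < b & c")` yields `a &lt; b &amp; c`, and input with nothing to escape is returned borrowed. Keeping the search in one function means benchmarks can compare a scalar and a SIMD implementation against the same loop.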

jetscii also seems capable of searching for byte sequences as well as single bytes, so it could potentially be used with UTF-16 and other multibyte encodings in the future. I don't think it can search for multiple byte-sequence patterns at the same time, though, so there are limitations to this.
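As a rough illustration of what a byte-sequence search over UTF-16 would have to do, here is a std-only sketch that looks for a single UTF-16LE code unit in raw bytes. The helper name `find_utf16le_unit` is hypothetical. It assumes little-endian input, and it only checks even offsets, because the same two bytes can also appear straddling two neighbouring code units.

```rust
/// Hypothetical helper: byte offset of the first UTF-16LE code unit equal to
/// `unit`, or `None` if it does not occur.
fn find_utf16le_unit(bytes: &[u8], unit: u16) -> Option<usize> {
    let needle = unit.to_le_bytes();
    bytes
        // Walk whole code units so a match can never start at an odd offset.
        .chunks_exact(2)
        .position(|pair| pair == needle.as_slice())
        // Convert the code-unit index back into a byte offset.
        .map(|i| i * 2)
}
```

A searcher that matches the raw two-byte pattern at any offset would need an equivalent alignment check on each hit, and this sketch handles only one pattern per pass, which is the limitation mentioned above.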

Benchmark coverage needs to be added first: #404

Labels: enhancement, help wanted, optimization
