Closed
Description
This is what the benchmarks currently look like:
Text.Parsing.StringParser.CodeUnits
  StringParser.runParser parse23Units
    mean   = 10.10 ms
    stddev = 1.13 ms
    min    = 9.46 ms
    max    = 24.07 ms
Text.Parsing.Parser.String
  runParser parse23
    mean   = 44.20 ms
    stddev = 6.38 ms
    min    = 42.25 ms
    max    = 113.16 ms
Data.String.Regex
  Regex.match pattern23
    mean   = 728.23 μs
    stddev = 339.32 μs
    min    = 613.72 μs
    max    = 2.97 ms
I would like to reduce the roughly 4× slowdown of Parser.String relative to StringParser.CodeUnits (44.20 ms versus 10.10 ms mean).
The difference could be due to:
- `CodePoint` rather than `Char`. Everything goes through the `anyCodePoint` parser since Unicode correctness #119, but I benchmarked it at the time and it didn't make any difference.
- String tail state. Every time the parser advances by one character, we `uncons` the input string and save the tail as the new state. I tried changing that to keep only a code unit index into the input string on this branch and it didn't make any difference. https://github.com/jamesdbrock/purescript-parsing/tree/cursor
- Parsing.Parser.String input position tracking with `Pos { line :: Int, column :: Int}`. I tried changing that to `Pos Int` on this branch and it didn't make any difference. https://github.com/jamesdbrock/purescript-parsing/tree/cursor
- Monad transformers. When I look at the benchmark profiling, it looks like most of the time is spent in `Control.Monad.State.Trans.bind` and `Text.Parsing.Parser.Combinators.tryRethrow`. So this might be the entire problem, but improving this won't be easy.
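For readers unfamiliar with the "string tail" versus "cursor" distinction, here is a minimal sketch of the two ways a parser can advance by one code unit. The module name and the functions `anyCharTail` and `anyCharCursor` are hypothetical and are not taken from the library or from the cursor branch; only `uncons` and `charAt` from `Data.String.CodeUnits` are real.

```purescript
module CursorSketch where

import Prelude

import Data.Maybe (Maybe)
import Data.String.CodeUnits as CU

-- Tail state (hypothetical helper): every step returns the remaining
-- input as a new String value, which becomes the next parser state.
anyCharTail :: String -> Maybe { head :: Char, tail :: String }
anyCharTail = CU.uncons

-- Cursor state (hypothetical helper): the input string never changes
-- and the parser state is only a code unit index into it.
anyCharCursor :: String -> Int -> Maybe { head :: Char, next :: Int }
anyCharCursor input i = do
  c <- CU.charAt i input
  pure { head: c, next: i + 1 }
```

Both functions yield `Nothing` at end of input. As noted above, switching between the two styles made no measurable difference in the benchmark.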
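To illustrate why the monad transformer hypothesis is plausible, here is a rough sketch of a parser written as a plain function instead of a transformer stack. It assumes that the current parser type is built from transformers such as `StateT`, which the profile above suggests. `Direct`, `runDirect`, and `bindDirect` are hypothetical names and are not part of the library.

```purescript
module BindSketch where

import Data.Either (Either(..))
import Data.Tuple (Tuple(..))

-- Hypothetical direct representation: take the parser state, return
-- either an error message or a result, together with the new state.
newtype Direct s a = Direct (s -> Tuple (Either String a) s)

runDirect :: forall s a. Direct s a -> s -> Tuple (Either String a) s
runDirect (Direct f) = f

-- Sequencing two parsers costs one function call and one pattern match.
-- In a transformer stack, each layer's bind delegates to the bind of
-- the layer beneath it.
bindDirect :: forall s a b. Direct s a -> (a -> Direct s b) -> Direct s b
bindDirect (Direct f) k = Direct \s ->
  case f s of
    Tuple (Left err) s' -> Tuple (Left err) s'
    Tuple (Right a) s' -> runDirect (k a) s'
```

This sketch drops the underlying monad `m` entirely, so it is not a drop-in replacement. It only shows where the per-bind overhead could be saved.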