Closed
Description
The count
function is implemented in terms of some C code that looks at each byte: https://github.com/haskell/bytestring/blob/master/cbits/fpstring.c#L75
If you implement count with elemIndex
(which uses C's memchr
underneath), it's much faster on non-consecutive chars:
count1 :: Word8 -> ByteString -> Int
count1 w8 = go 0
where
go c str =
case S.elemIndex w8 str of
Nothing -> c
Just off -> go (c + 1) (S.drop (off+1) str)
{-# INLINE count1 #-}
For non-consecutive characters, such as finding all <
in an XML document, it's obviously way faster because memchr
inspects more than one byte at a time.
For consecutive chars, this is slower. It might be worth including both, or mentioning it in the docs.