Much faster count function in plain Haskell

The `count` function is implemented in terms of some C code that looks at each byte: https://github.com/haskell/bytestring/blob/master/cbits/fpstring.c#L75

If you implement count with `elemIndex` (which uses C's `memchr` underneath), it's much faster on non-consecutive chars:

```haskell
count1 :: Word8 -> ByteString -> Int
count1 w8 = go 0
  where
    go c str =
      case S.elemIndex w8 str of
        Nothing -> c
        Just off -> go (c + 1) (S.drop (off+1) str)
{-# INLINE count1 #-}
```

For non-consecutive characters, such as finding all `<` in an XML document, it's obviously way faster because `memchr` [inspects more than one byte at a time.](https://github.com/lattera/glibc/blob/master/sysdeps/x86_64/memchr.S)

For consecutive chars, this is slower. It might be worth including both, or mentioning it in the docs.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Much faster count function in plain Haskell #109

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Much faster count function in plain Haskell #109

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions