1BRC in Crystal #711
nogginly started this conversation in Show and tell
Replies: 2 comments 1 reply
-
@nogginly Can you update the OP with the various timings?
-
FYI, I've re-arranged the post to move the oldest posts to the bottom and recent ones to the top.
-
Update, Feb 11

- […] `mmap` or not doesn't make a big difference.
- […] `mmap` version, which uses ~1GB of runtime RAM, is going to do well regardless of file size.

Update, Feb 7

- […] `mmap` ... memory! With only 16GB of RAM, memory mapping a 13GB file (1B rows) was causing problems (likely swapping).
- […] `mmap` variant outperforms all others on macOS.
- […] `FxHashMap` (inspired by the merykitty implementation) after digging into the FxHash function (details here) in the Rust compiler.
- […] `b` variants, which all use the new `FxHashMap`; in the TODO you can see the comparison where the `b` variants beat all their predecessors.

Relative performance is now as follows, with my implementation now only 😄 1.87x slower (vs 2.58x as of two days ago).
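
For anyone curious what the FxHash function mentioned in the Feb 7 update boils down to, here is a rough sketch in Rust (where it originates), not the Crystal code from the repo. It shows the classic 64-bit mixing step used by older versions of the `rustc-hash` crate: rotate left by 5, XOR in the next word, multiply by a fixed constant. The names `fx_mix` and `fx_hash` are made up for illustration, and the way the trailing bytes are fed in is simplified, so don't expect bit-identical results with any real library.

```rust
/// Multiplicative constant of the classic 64-bit FxHash.
const FX_SEED: u64 = 0x51_7c_c1_b7_27_22_0a_95;

/// One mixing step: rotate, XOR in the next word, wrapping multiply.
fn fx_mix(hash: u64, word: u64) -> u64 {
    (hash.rotate_left(5) ^ word).wrapping_mul(FX_SEED)
}

/// Hash a byte string such as a station name.
/// Hypothetical helper: consumes 8-byte little-endian words first,
/// then the leftover bytes one at a time (a simplification).
pub fn fx_hash(bytes: &[u8]) -> u64 {
    let mut hash = 0u64;
    let mut chunks = bytes.chunks_exact(8);
    for chunk in &mut chunks {
        let mut word = [0u8; 8];
        word.copy_from_slice(chunk);
        hash = fx_mix(hash, u64::from_le_bytes(word));
    }
    for &b in chunks.remainder() {
        hash = fx_mix(hash, u64::from(b));
    }
    hash
}
```

The appeal for 1BRC-style workloads is that each step is only a rotate, an XOR and a multiply, which is cheap for short keys like station names. The trade-off is that it is not designed to resist adversarial inputs.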
Update, Feb 5

I've added a version that uses `mmap` for loading the file, which turns out to improve performance on Linux but not so much on macOS.

On a PC with an AMD Ryzen 7 7735HS CPU, 16 cores, 32 GB of RAM, and running Linux, comparing with some other `1brc` contenders shows there's still room for improvement. See the TODO list for changes so far and possible further improvements.

- merykittyunsafe
- merykitty
- dannyvankooten/analyze
- 1brc_parallel_mmap 64 8

Relative performance was as follows:
Update, Feb 4

- […] `1brc_parallel2` and `1brc_parallel_ptr2`)

Update Feb 2, 2024

- […] `1brc_parallel_ptr` which uses `Pointer` instead of `Slice` to parse the buffer, essentially eliminating index bounds checking. While unsafe normally, the code is careful about the index maths.
- […] `shards build -Dpreview_mt` to build the binaries

Original post
My implementation in Crystal is here: https://github.com/nogginly/1brc.cr
I've been inspired by the many contributions and over time have incorporated ideas. The following is a list of the different ways in which I've been able to speed up the Crystal implementation (`1brc_parallel.cr` in the repo):

I have not done the following (yet):

- did not use memory mapping for the file since (a) Crystal limits arrays to Int32 indices, (b) the most memory I have is 16GB (see the Feb 5 update above)
- […] `Hash` map use, which turns out to be the biggest bottleneck after latest update
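
To illustrate the `Pointer` vs `Slice` idea from the Feb 2 update, here is a small sketch in Rust rather than Crystal, so it is an analogy and not the code in the repo. Both functions parse a 1BRC-style reading such as `-12.3` into tenths of a degree; one goes through checked slice indexing, the other through a raw pointer with no bounds checks. The function names are hypothetical, and the input is assumed to be well-formed (optional `-`, ASCII digits, one `.`).

```rust
/// Parse e.g. "-12.3" into tenths of a degree (-123) using slice indexing.
/// Each `buf[i]` is bounds-checked unless the compiler can prove it safe.
pub fn parse_tenths_slice(buf: &[u8]) -> i32 {
    let neg = buf.first() == Some(&b'-');
    let mut i = if neg { 1 } else { 0 };
    let mut value = 0i32;
    while i < buf.len() {
        let c = buf[i];
        if c != b'.' {
            value = value * 10 + i32::from(c - b'0');
        }
        i += 1;
    }
    if neg { -value } else { value }
}

/// Same logic through a raw pointer. No bounds checks are emitted,
/// so correctness rests entirely on the index maths below.
pub fn parse_tenths_ptr(buf: &[u8]) -> i32 {
    let len = buf.len();
    let p = buf.as_ptr();
    // SAFETY: `*p` is only read when len > 0.
    let neg = len > 0 && unsafe { *p } == b'-';
    let mut i = if neg { 1 } else { 0 };
    let mut value = 0i32;
    while i < len {
        // SAFETY: the loop condition guarantees i < len.
        let c = unsafe { *p.add(i) };
        if c != b'.' {
            value = value * 10 + i32::from(c - b'0');
        }
        i += 1;
    }
    if neg { -value } else { value }
}
```

In a loop this simple an optimizing compiler may remove the checks on its own, so any gain from the pointer version has to be measured rather than assumed.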