it's possible that this is the fastest non-jit interpreter (see the competition)
it's also possible that this is just tuned to my system and won't be fast for anyone else
see the notable micro-optimisations for fun insights
mine:
- (0.77s)
./build/src/standalone/brainfuck samples/mandelbrot.b 0.77s user 0.00s system 99% cpu 0.769 total
other non-jit entries:
- (0.98s) https://github.com/JohnCGriffin/BrainForked
./bf ../bf-cpp/samples/mandelbrot.b 0.98s user 0.00s system 99% cpu 0.978 total
- (1.14s) https://github.com/rinoldm/sbfi
./a.out ../bf-cpp/samples/mandelbrot.b 1.14s user 0.00s system 99% cpu 1.145 total
- (1.16s) https://github.com/rdebath/Brainfuck
./tritium/bfi.out ../bf-cpp/samples/mandelbrot.b -r 1.16s user 0.01s system 99% cpu 1.170 total
- (1.96s) https://copy.sh/brainfuck/?file=https://copy.sh/brainfuck/prog/mandelbrot.b
- (2.22s) https://github.com/primo-ppcg/bfci
./bfci ../bf-cpp/samples/mandelbrot.b 2.22s user 0.00s system 99% cpu 2.226 total
- (3.72s) https://github.com/apankrat/bff
./a.out ../bf-cpp/samples/mandelbrot.b 3.72s user 0.00s system 99% cpu 3.722 total
- (4.08s) https://github.com/dmitmel/brainwhat
./target/release/brainwhat ../bf-cpp/samples/mandelbrot.b 4.08s user 0.00s system 99% cpu 4.085 total
- (4.49s) https://github.com/phunanon/Barebrain
./C/Barebrain ../bf-cpp/samples/mandelbrot.b 4.49s user 0.00s system 99% cpu 4.494 total
the fastest jit implementation for reference:
- (0.30s) https://github.com/rdebath/Brainfuck
./tritium/bfi.out ../bf-cpp/samples/mandelbrot.b 0.30s user 0.00s system 99% cpu 0.304 total
build:
cmake --preset release
cmake --build --preset all
run:
./build/src/standalone/brainfuck samples/mandelbrot.b
./build/src/tests/tests
- This was the first project in which I was really conscious of code locality impacting performance
- 2016 LLVM Developers’ Meeting: Z. Ansari "Causes of Performance Instability due to Code ..."
- "Performance Matters" by Emery Berger
- 4% performance gains by changing the order of the cpp files in the CMakeLists.txt
- 8% performance loss by removing dead code
- 15% performance loss by changing the value of a constant
- 13% performance gains by ensuring an odd multiple of 64 bits in the Instruction struct
- 12% performance gains by using pointer arithmetic rather than using a dynamic index
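To make the pointer arithmetic point concrete, here is a minimal sketch of the two styles. The `Instruction` layout, the opcode values and the function names are all hypothetical, made up for illustration; this is not the code from this repository.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical instruction layout, for illustration only.
struct Instruction {
  std::int64_t type;   // 0 = halt, 1 = add (assumed encoding)
  std::int64_t value;  // operand of the add
};

// Variant A: keep a dynamic index and look the instruction up on every step.
std::int64_t runIndexed(const std::vector<Instruction>& program) {
  std::int64_t acc = 0;
  std::size_t i = 0;
  while (program[i].type != 0) {
    acc += program[i].value;
    ++i;
  }
  return acc;
}

// Variant B: keep a pointer to the current instruction and bump it.
std::int64_t runPointer(const std::vector<Instruction>& program) {
  std::int64_t acc = 0;
  const Instruction* ip = program.data();
  while (ip->type != 0) {
    acc += ip->value;
    ++ip;
  }
  return acc;
}
```

Both variants compute the same result. Whether the pointer version is actually faster depends on the compiler and the surrounding code, so measure before relying on it.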
- The interpreter loop is a good reminder that C++ is still actually a relatively high level language
- 5% performance gains by using branchless looping
- 2% performance gains by using sequential enum values
- https://github.com/Jumbub/bf-cpp/commit/eb9b1714bd0e1bed281d94bde227510293d466cd
- Generates better switch statement jump tables
- 3% performance gains by telling the compiler that it will never reach the default switch case
- 5% performance gains by using computed gotos (godbolt link)
- https://github.com/Jumbub/bf-cpp/commit/c26dec0dc0ff1661c6a151fca66b733d65ea789f
- Simpler instruction jump table
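The dispatch tricks above can be sketched roughly as follows. The opcodes, the `Instruction` layout and the function names are hypothetical and only for illustration (see the linked commits for the real changes). `__builtin_unreachable()` and computed gotos (`&&label`, `goto *ptr`) are GCC/Clang extensions, so this sketch assumes one of those compilers.

```cpp
#include <cstdint>

// Hypothetical opcodes. Values are sequential from 0, so they can
// index a jump table directly.
enum Op : std::int64_t { ADD = 0, SUB = 1, HALT = 2 };

struct Instruction {
  Op op;
  std::int64_t value;
};

// Switch dispatch: dense case values, and a default case the compiler
// is told it will never reach (so it may skip the range check).
std::int64_t runSwitch(const Instruction* ip) {
  std::int64_t acc = 0;
  for (;;) {
    switch (ip->op) {
      case ADD: acc += ip->value; ++ip; break;
      case SUB: acc -= ip->value; ++ip; break;
      case HALT: return acc;
      default: __builtin_unreachable();  // GCC/Clang builtin
    }
  }
}

// Computed-goto dispatch: every handler jumps straight to the next
// handler through a table of label addresses (GCC/Clang extension).
std::int64_t runGoto(const Instruction* ip) {
  static void* const labels[] = {&&do_add, &&do_sub, &&do_halt};
  std::int64_t acc = 0;
  goto *labels[ip->op];
do_add:
  acc += ip->value;
  ++ip;
  goto *labels[ip->op];
do_sub:
  acc -= ip->value;
  ++ip;
  goto *labels[ip->op];
do_halt:
  return acc;
}
```

Both functions trust the program to contain only valid opcodes and to end with `HALT`. Feeding them anything else is undefined behaviour, which is the price of dropping the default case.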
- Optimising for CPU instruction speed over data compression
- 11% overall performance gains by switching from 8 bit data types to 64
- https://github.com/Jumbub/bf-cpp/commit/0e52ab9c875a291a1e3ba4bf2d8b520ede0da11b
- https://github.com/Jumbub/bf-cpp/commit/039f096668d2f592e41f8033bb16ecd579074cbe
- https://github.com/Jumbub/bf-cpp/commit/53e31164d85d45b4e94c394c210fa60a00ca58c5
- https://github.com/Jumbub/bf-cpp/commit/1bc35a79799e14713de64ab7d2dec016060afd1c
- Current generation CPUs are just much faster when operating on 64 bit data
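As a rough picture of the size versus speed trade-off, compare a compact instruction layout with a wide one. Both structs and their field names are hypothetical, chosen for illustration; the real layout is in the commits linked above.

```cpp
#include <cstdint>

// Compact layout: small fields keep the program's memory footprint down.
struct PackedInstruction {
  std::uint8_t type;
  std::int8_t value;
  std::int16_t offset;
};

// Wide layout: every field is a native 64-bit integer, trading memory
// for values that can be used directly in 64-bit arithmetic.
struct WideInstruction {
  std::int64_t type;
  std::int64_t value;
  std::int64_t offset;
};

// Sizes on typical 64-bit platforms (an assumption about the target ABI).
static_assert(sizeof(PackedInstruction) == 4, "compact layout is 4 bytes");
static_assert(sizeof(WideInstruction) == 24, "wide layout is 24 bytes");
```

The wide layout is six times larger here, so it only pays off when the program still fits comfortably in cache.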