Skip to content

Conversation

HertzDevil
Copy link
Contributor

For some reason, when the GC allocates an 8 MB fiber stack in the interpreter, it might expand the heap only to find that the new pages are already black-listed, so it repeats until the system gives a good heap address. A single allocation could sometimes commit over 4 GB of memory this way.

This PR bypasses the GC entirely and allocates those fiber stacks using mmap or VirtualAlloc directly, the same as non-interpreted code. It doesn't change how stack pools are used, every Crystal::Repl::Interpreter still maintains its own pool. Apart from making specs like channel_spec.cr actually finish on Windows, it looks like the interpreted stdlib specs also run slightly faster on Linux.

The REPL interpreter object still uses the old allocation via Crystal::Repl::Interpreter.new's default argument for @stack.

@ysbaddaden
Copy link
Contributor

Nice! One question: could we pass a grow upwards/downwards flag and protect the appropriate stack end?

@HertzDevil
Copy link
Contributor Author

HertzDevil commented Mar 25, 2024

The guard page violation would go to the interpreter, but its stack trace will be different from that of the interpreted code, so guarding the highest address range of the stack alone wouldn't be sufficient if we need to handle stack overflows in the interpreter. It is probably easier to set a hard limit on the number of interpreter call frames directly

@HertzDevil HertzDevil marked this pull request as ready for review March 25, 2024 21:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants