ghc-debug is a tool for debugging the heap of a Haskell program. Think of it like gdb, but specialised for debugging Haskell programs. It's the tool for you if you've ever wondered things like:
- Why is a specific object not being garbage collected!?
- What is the precise breakdown of the heap at a specific point in the program!?
- What's inside all the lists in my program!?
How does it work? There are two parts to the library. The first part adds a startup hook to the RTS which creates a socket; another process can connect to this socket to control the program and request information from it. The second part is the debugger, which connects to this socket, executes a user debugging script and reports information back to the user.
It's important that there are two different processes because, in order to report accurate information to the user, you have to completely pause the execution of the program you want to debug. If you don't do this then GC will quickly invalidate any information that you obtain.
The API already allows you to perform some simple tasks.
- Decode the GC roots
- Save specific objects for inspection in the debugger
- Use findPtr to find closures which reference a specific address.
- Read DWARF information from an executable and map info table pointers to source locations.
- Visualise closures on a web interface.
There are still many parts of the library which could be improved. At Munihac 2019 we (@bgamari and @mpickering) will work on implementing the crucial features that are still missing from the library.
The library lives at: http://www.github.com/bgamari/ghc-debug
In order to build the library you need to use a custom version of GHC from this branch: https://gitlab.haskell.org/ghc/ghc/tree/wip/ghc-debug
If you end up building ghc yourself then make sure you use at least the quick flavour so that the RTS is built in all the necessary ways. Then you can use cabal new-configure to set the path to ghc.
Once the environment is set up you can use cabal new-build as normal in order to build the project.
There are a few components which are useful to know about.
- debugger: The main debugger
- debug-test: An executable which is used for testing with the debugger
There are two tasks which Ben and I will focus on in order to fix some fundamental problems with the library.
A full heap traversal is not currently possible because we don't decode STACK closures. They have a slightly different structure to other closures but Ben has indicated that he knows what to do here.
Currently there are two pause modes: one where the debugger initiates the pause and one where the debuggee initiates it.
- If the debugger initiates the pause then unpausing works correctly.
- If the debuggee initiates the pause then unpausing causes an assertion failure.
This second case needs to be fixed, because initiating the pause from the debuggee is far more useful: you have precise control over exactly when the pause happens.
We welcome help with any of these future goals.
The ghc-debug API implemented in GHC is only available with the threaded runtime. Therefore, for the single-threaded runtime, stub implementations of all the methods are provided. At the moment they silently do nothing, but it would be much better to warn the user that they need to use the threaded runtime.
TSO closures are only half implemented. There is quite a bit more information in a TSO which you might plausibly care about.
You can use this recent MR which added support for WEAK closures as a starting point.
https://gitlab.haskell.org/ghc/ghc/merge_requests/1475
When a file snippet is printed out it's important that the file is still the same one as when the executable was compiled. The DWARF information inside the executable records the modtime of the file so in the debugger we can check if the file has been modified in the meantime. There is a similar check already implemented in gdb.
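The check itself is small. The following is a minimal sketch of the comparison, not code from ghc-debug: ModTime, SnippetStatus and checkSnippet are hypothetical names, and reading the recorded time out of the DWARF data and the current time from disk is left to the caller. It assumes the common DWARF convention that a recorded time of 0 means "not available".

```haskell
import Data.Word (Word64)

-- Hypothetical types for illustration; none of these names exist in ghc-debug.

-- | A file modification time, in seconds since the Unix epoch.
newtype ModTime = ModTime Word64
  deriving (Eq, Ord, Show)

-- | What the debugger can say about a source file before printing a snippet.
data SnippetStatus
  = SourceUnchanged                 -- recorded and on-disk times agree
  | SourceModified ModTime ModTime  -- recorded time, then time found on disk
  | SourceMissing                   -- the file is no longer on disk
  | SourceUnknown                   -- the executable recorded no usable time
  deriving (Eq, Show)

-- | Compare the time recorded in the executable with the time on disk now.
-- The second argument is 'Nothing' when the file could not be found.
checkSnippet :: ModTime -> Maybe ModTime -> SnippetStatus
checkSnippet (ModTime 0) _ = SourceUnknown  -- assumption: 0 means "not recorded"
checkSnippet _ Nothing = SourceMissing
checkSnippet recorded (Just onDisk)
  | recorded == onDisk = SourceUnchanged
  | otherwise          = SourceModified recorded onDisk
```

A debugger could print the snippet as usual for SourceUnchanged and attach a warning to the output in the other three cases rather than refusing to print.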
Tools such as ghc-debug work much better if memory is zeroed after it is freed. Currently the only way to do this in GHC is to use the -DS flag but this also enables some expensive sanity checks.
It would be good to implement an option -DZ which zeros memory appropriately so it can be enabled when using ghc-debug.
The current API to write debugging programs could be refined. In particular, the Debuggee record is explicitly passed around everywhere rather than using a Reader-like monad.
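As a rough illustration of the Reader-like approach, here is a minimal sketch that uses only base. The Debuggee record below is a hypothetical stand-in with a single field, and DebugM, askDebuggee, describePause and script are illustrative names rather than part of the ghc-debug API.

```haskell
-- Hypothetical stand-in for the real Debuggee record, which would hold
-- the socket handle and related state.
newtype Debuggee = Debuggee { debuggeeName :: String }

-- | A Reader-like monad: every action implicitly receives the Debuggee.
newtype DebugM a = DebugM { runDebugM :: Debuggee -> IO a }

instance Functor DebugM where
  fmap f (DebugM g) = DebugM (fmap f . g)

instance Applicative DebugM where
  pure x = DebugM (\_ -> pure x)
  DebugM f <*> DebugM g = DebugM (\d -> f d <*> g d)

instance Monad DebugM where
  DebugM g >>= k = DebugM (\d -> g d >>= \a -> runDebugM (k a) d)

-- | Fetch the implicit Debuggee when an operation really needs it.
askDebuggee :: DebugM Debuggee
askDebuggee = DebugM pure

-- | Lift an ordinary IO action into DebugM.
liftDebug :: IO a -> DebugM a
liftDebug io = DebugM (\_ -> io)

-- | An example operation; note that no Debuggee argument appears.
describePause :: DebugM String
describePause = do
  d <- askDebuggee
  pure ("pausing " ++ debuggeeName d)

-- | A small script that sequences operations without threading the record.
script :: DebugM [String]
script = do
  a <- describePause
  d <- askDebuggee
  pure [a, "resuming " ++ debuggeeName d]
```

A script is then run once at the top level, for example with runDebugM script (Debuggee "debug-test"). In a real implementation ReaderT Debuggee IO from the transformers package would provide the same structure without the hand-written instances.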
No particular thought has been given to what API endpoints are exposed so organising them and writing documentation would be worthwhile.
ClosurePtrs are only valid in the current pause window. Information about the current pause window should be tracked and the API should enforce that you can only look up pointers which were obtained in the current pause window.
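One way to get this enforced by the type checker is the trick used by the ST monad: tag each pointer with a type variable that names the pause window and introduce that variable with a rank-2 type. The sketch below is one possible design under that assumption, not the library's actual API; Pause, withPause, savedRoots and derefClosure are hypothetical names and the bodies are placeholders where real code would talk to the debuggee.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Word (Word64)

-- | A token for one pause window; 's' names the window at the type level.
data Pause s = Pause

-- | A closure pointer tagged with the window in which it was obtained.
newtype ClosurePtr s = ClosurePtr Word64

-- | Run a body inside a fresh pause window. Because 's' is universally
-- quantified, the result type 'a' cannot mention it, so pointers cannot
-- escape. A real version would pause the debuggee before the body and
-- resume it afterwards.
withPause :: (forall s. Pause s -> IO a) -> IO a
withPause body = body Pause

-- | Placeholder: pretend the debuggee saved two objects.
savedRoots :: Pause s -> IO [ClosurePtr s]
savedRoots _ = pure [ClosurePtr 0x1000, ClosurePtr 0x2000]

-- | Placeholder lookup; it only accepts pointers from the same window.
derefClosure :: Pause s -> ClosurePtr s -> IO String
derefClosure _ (ClosurePtr w) = pure ("closure@" ++ show w)

-- | Pointers are obtained and used within a single window, so this compiles.
example :: IO [String]
example = withPause (\p -> do
  ptrs <- savedRoots p
  mapM (derefClosure p) ptrs)

-- By contrast, 'withPause (\p -> savedRoots p)' is rejected by the type
-- checker, because the result type would mention the window variable 's'.
```

The cost of this design is an extra type parameter on every pointer-carrying type, which is the usual trade-off for scoped resources encoded this way.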
It will be much more efficient if we can batch together non-dependent requests to decode closures. However, it is much more convenient to provide an API which performs an individual request. The haxl library is designed to solve this problem so it would be good to implement support for it.
The RequestClosures request already supports requesting multiple closures at once so all the work to implement this will be in the Haskell code.
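To make the batching idea concrete, here is a small hand-rolled Applicative that collects independent requests and issues them in one round trip. It is a deliberately simplified sketch and not haxl itself: Batch, fetch, requests and runBatch are hypothetical names, Closure is a placeholder type, and the send function passed to runBatch stands in for a multi-closure request such as RequestClosures.

```haskell
import Data.Word (Word64)

type Ptr = Word64
type Closure = String  -- placeholder for a decoded closure

-- | A computation that needs some closures: the pointers to request, and a
-- continuation that consumes the replies in the same order.
data Batch a = Batch [Ptr] ([Closure] -> a)

instance Functor Batch where
  fmap f (Batch ps k) = Batch ps (f . k)

instance Applicative Batch where
  pure x = Batch [] (const x)
  -- Independent computations are merged: requests are concatenated and
  -- the replies are split back between the two continuations.
  Batch ps f <*> Batch qs g =
    Batch (ps ++ qs) (\rs ->
      let (here, rest) = splitAt (length ps) rs
      in f here (g rest))

-- | The convenient single-closure API.
fetch :: Ptr -> Batch Closure
fetch p = Batch [p] pick
  where
    pick (c : _) = c
    pick []      = error "fetch: debuggee returned too few closures"

-- | Inspect which pointers a computation would request.
requests :: Batch a -> [Ptr]
requests (Batch ps _) = ps

-- | Execute a batch with a single call to the multi-closure request.
runBatch :: ([Ptr] -> IO [Closure]) -> Batch a -> IO a
runBatch send (Batch ps k) = k <$> send ps
```

With this, traverse fetch [1, 2, 3] produces one request for all three pointers rather than three separate requests. Unlike haxl, the sketch has no Monad instance, so requests which depend on earlier replies need a separate runBatch call per round.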
A very basic visualisation mode is currently implemented which starts a web server and draws a specific closure on an HTML canvas. It should be possible to do much better than this. The details are up to you! The heap is just a graph, but a large one, so some standard visualisation techniques would be useful.
Testing the library is a little bit difficult because you always need two programs: one to debug and a debugger. I'm not sure if you can nicely do this in a normal Cabal testing framework. It could be possible to compile simple test programs using the GHC API.
It's also possible to add a hook to redirect the eventlog. This could be used to read eventlog events and dispatch on them. It's not clear to me how easy this would be to implement or even what would need to be implemented where.
You might want to trigger a pause if the stack size reaches a certain amount or if residency crosses a certain threshold.
It's unclear exactly how to implement this but the basic idea is that it should be possible to request that the RTS evaluates a THUNK closure.
Once the request is made, the RTS should be unpaused, the closure evaluated, the RTS repaused and the resulting closure sent back to the debugger.
It would be nice to implement a TUI which allowed for a more interactive exploration of the heap. We have not deeply considered the design of this.
It would be nice to implement a Web GUI which allowed for a more interactive exploration of the heap. We have not deeply considered the design of this.