
Improve recompilation avoidance in the presence of TH #2316


Merged
merged 8 commits into from
Mar 2, 2022
Improve recompilation avoidance in the presence of TH
The old recompilation avoidance scheme performs quite poorly when code
generation is needed. We end up needing to recompile modules basically
any time anything in their transitive dependency closure changes.

Most versions of GHC we currently support don't have a working implementation of
code unloading for object code, and no version of GHC supports this on certain
platforms like Windows. This makes object code completely infeasible for
interactive use, as symbols from previous compiles will shadow those from all
future compiles.

This means that we need to use bytecode when generating code for Template
Haskell. Unfortunately, we can't serialize bytecode, so we will always need
to recompile when the IDE starts. However, we can put in place a much tighter
recompilation avoidance scheme for subsequent compiles:

1. If the source file changes, then we always need to recompile
   a. For files of interest, we will get explicit `textDocument/didChange`
      events that will let us invalidate our build products
   b. For files we read from disk, we can detect source file changes
      by comparing the mtime of the source file with the build
      product (.hi/.o) file on disk.
2. If GHC's recompilation avoidance scheme based on interface file hashes
   says that we need to recompile, then we need to recompile.
3. If the file in question requires code generation, then we need to recompile
   if we don't have the appropriate kind of build products.
   a. If we already have the build products in memory, and neither condition
      1 nor condition 2 demands a recompile, then we don't need to recompile.
   b. If we are generating object code, then we can also search for it on
      disk and ensure it is up to date.
   Notably, we did _not_ previously re-use old bytecode from memory when
   hls-graph/shake decided to rebuild the 'HiFileResult' for some reason.
4. If the file in question used Template Haskell on the previous compile,
   then we need to recompile if any `Linkable` in its transitive closure
   changed.
   This sounds bad, but it is possible to make some improvements.
   In particular, we only need to recompile if any of the `Linkable`s
   actually used during the previous compile change.

   How can we tell if a `Linkable` was actually used while running some
   TH?

   GHC provides a `hscCompileCoreExprHook` which lets us intercept bytecode
   as it is being compiled and linked. We can inspect the bytecode to see
   which `Linkable` dependencies it requires, and record this for use in
   recompilation checking.
   We record all the home package modules of the free names that occur in the
   bytecode. The `Linkable`s required are then the transitive closure of these
   modules in the home-package environment. This is the same scheme as used by
   GHC to find the correct things to link in before running bytecode.

   This works fine if we already have previous build products in memory, but
   what if we are reading an interface from disk? Well, we can smuggle in the
   necessary information (linkable `Module`s required as well as the time they
   were generated) using `Annotation`s, which provide a somewhat general purpose
   way to serialise arbitrary information along with interface files.

   Then when deciding whether to recompile, we need to check that the versions of
   the linkables used during a previous compile match whatever is currently in the
   HPT.
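
The check in point 4 can be sketched roughly as follows. This is a minimal
model, not ghcide's implementation: `ModuleName`, `Timestamp`, `ImportGraph`,
`recordUsedLinkables` and `usedLinkablesChanged` are all hypothetical names,
and plain `Map`s stand in for the HPT and for the data stored in `Annotation`s.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set
import Data.Map.Strict (Map)
import Data.Set (Set)

-- Hypothetical stand-ins, not ghcide's real types.
type ModuleName = String
type Timestamp  = Integer  -- models the time a linkable was generated

-- | Home-package import graph: module -> modules it imports directly.
type ImportGraph = Map ModuleName [ModuleName]

-- | Modules reachable from the roots, roots included.
transitiveClosure :: ImportGraph -> [ModuleName] -> Set ModuleName
transitiveClosure graph = go Set.empty
  where
    go seen [] = seen
    go seen (m : rest)
      | m `Set.member` seen = go seen rest
      | otherwise =
          go (Set.insert m seen) (Map.findWithDefault [] m graph ++ rest)

-- | What to record after running a splice: the closure of the home modules
-- of the bytecode's free names, each with the generation time of its
-- linkable in the HPT at that point.
recordUsedLinkables
  :: ImportGraph
  -> Map ModuleName Timestamp  -- model of the HPT at compile time
  -> [ModuleName]              -- home modules of the free names
  -> Map ModuleName Timestamp
recordUsedLinkables graph hpt freeNameModules =
  Map.restrictKeys hpt (transitiveClosure graph freeNameModules)

-- | Recompile only if a linkable that was actually used is missing from the
-- current HPT or has a different generation time there.
usedLinkablesChanged
  :: Map ModuleName Timestamp  -- recorded during the previous compile
  -> Map ModuleName Timestamp  -- model of the current HPT
  -> Bool
usedLinkablesChanged recorded currentHpt =
  any changed (Map.toList recorded)
  where
    changed (m, t) = Map.lookup m currentHpt /= Just t
```

In this model, a change to a module outside the recorded set leaves
`usedLinkablesChanged` at `False`, which is the improvement over recompiling
on any change in the transitive closure.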

The changes that were made to `ghcide` in order to implement this scheme include:

1. Add `RuleWithOldValue` to define Rules which have access to the previous value.
   This is the magic bit that lets us re-use bytecode from previous compiles
2. The `IsHiFileStable` rule was removed as we don't need it with this scheme in place.
3. Everything in the store is properly versioned with a `FileVersion`, not just
   FOIs.
4. The `VFSHandle` type was removed. Instead we now take a VFS snapshot on every
   restart, and use this snapshot for all the `Rules` in that build. This ensures
   that Rules see a consistent version of the VFS.
   The `setVirtualFileContents` function was removed since it was not being used anywhere.
   If needed in the future, we can easily just modify the VFS using functions from
   `lsp`.
5. Fix a bug with the `DependencyInformation` calculation, where modules at the top
   of the hierarchy (no incoming edges) weren't being recorded properly.
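
The idea behind point 1, a rule that can see its previous value, can be
illustrated with a deliberately simplified model. The real `RuleWithOldValue`
in `ghcide` has a richer type; `Old` and `runWithOldValue` below are
hypothetical names, and an `Int` fingerprint stands in for whatever inputs
decide whether the old product is still valid.

```haskell
-- | Product of a previous run, with the input fingerprint it was built from.
-- Hypothetical model type, not part of ghcide.
data Old v = Old
  { oldFingerprint :: Int
  , oldValue       :: v
  }

-- | Run a rule that may consult its previous value.
-- Returns the (possibly re-used) product and whether a rebuild happened.
runWithOldValue
  :: Maybe (Old v)  -- previous value, if the rule ran before
  -> Int            -- fingerprint of the current inputs
  -> (Int -> v)     -- expensive build step, e.g. bytecode generation
  -> (Old v, Bool)
runWithOldValue (Just old) fp _
  | oldFingerprint old == fp = (old, False)    -- re-use the old product
runWithOldValue _ fp build = (Old fp (build fp), True)  -- build afresh
```

Without access to the old value, the rule would have to rebuild every time it
is re-run, even when its inputs are unchanged; with it, the in-memory product
survives the re-run.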

A possible future improvement is to use object-code on the first load (so we
have a warm cache) and use bytecode for subsequent compiles.
wz1000 committed Mar 2, 2022
commit 07d70cd9cf2074383e91ffc537e73be149ced698