Description
If we have a sufficiently large program to type check we should be able to speed up type checking significantly by using multiple type checker processes that work in parallel.
One potential way to implement this:
- Have a coordinator process that maintains a module dependency graph and keeps track of a work list of modules that haven't been type checked. Initially this will contain all modules in the program.
- Have multiple worker processes that repeatedly get (sets of cyclically dependent) modules from the coordinator that have their dependencies processed, type check them and mark them as processed on the coordinator.
For this to really work we are going to need a way of efficiently communicating type check results for a module between processes (to avoid type checking shared dependencies multiple times). Having a JSON serialization format (see #932) would be sufficient.
Additionally we need a quick way of figuring out the dependency graph of all modules (or at least an approximation of it). We'll probably have to cache that between runs and update it incrementally, similar to the approach outlined in #932.
So how much would this help? Obviously this depends on the shape of the dependency graph. Under reasonable assumptions we should be able to hit an effective level of parallelism of at least 4 for larger programs, but I wouldn't be surprised if we could get even better than that. Cyclic module dependencies can add a limit to how far we can parallelize. We can probably estimate the achievable level of parallelism for a particular program by analyzing the module dependency graph.
This is probably only worth implementing after we have incremental type checking (#932), and we should preserve incrementalism -- i.e., we'd only type check modules modified since the last run and modules that depend on them.