Implement equivalence checkers with input handling and other improvements #2

sepp2k · 2023-06-12T19:37:33Z

Hi,

I saw your blog post on HN a while back and figured it would be a fun project to add a proper equivalence checker to your optimizer that can determine program equivalence in the presence of input. I ended up implementing two: a brute force one that just generates all possible inputs and runs the program on them; and an SMT-based one.

I also ended up refactoring a lot of things and changing the semantics of the language slightly (by making it use fixed-width integers) to enable these changes, so the PR ended up rather large. Sorry about that.

For a state that's all zeros, the superoptimizer used to output "LOAD 0" as the shortest program when in fact "LOAD 0" is a no-op and the shortest program to get all zeros is the empty program.

Since programs are generated in ascending order of length, the first program to be found that produces the right state will be the optimal one. Thus there's no reason to keep searching once a program has been found. This improves performance quite a bit in some cases and removes the need to guess the optimal length to provide a short maximum length. Additionally, this removes the program cache as it would only help when reusing results from previous searches, which is currently being done and wouldn't be correct when using different options (like max_value) anyway.

Also implement an equivalence checker that can determine program equivalence in the presence of input (by trying all possible input values).

The program representation now consists of classes with a `__str__` method, so `assembler.output` is no longer needed. More importantly, a program now contains the information how much memory it needs, removing the need to manually set this parameter and removing the possibility of accidentally using a wrong value. The opcodes are now represented as strings rather than functions from the CPU, so that the program representation is no longer tied to the CPU. This removes the need to instantiate dummy CPU objects when you don't actually want to run a program and more importantly makes this representation more usable when working with the program independently of the CPU. The program representation is now defined in the instruction_set module, which removes the dependency from the assembler module to the CPU module. This made it possible to move the `run` helper method into the CPU module (where it belongs) without introducing a cyclical dependency). The CPU, assembler and program generator have been updated according to these changes. The assembler is now also a bit more robust. It now allows leading white space and produces errors on invalid lines, including non-existant opcodes, wrong numbers of arguments or invalid memory addresses. The arity checks and the tracking of the memory size happens based on the definitions in instruction_set and when new instructions are added there, the assembler should be able to parse and check those new instructions without needing to be modified. Similarly the code generator now works solely from the defintions in `instruction_set` and should likewise work without modification when new instructions are added.

This makes the instruction set more realistic as real CPUs don't support arbitrary precision integers either. More impoortantly, having a fixed width means there's only a finite number of possible input, allowing the equivalence checker to actually try all inputs representable using the current bit width rather than unsoundly stopping at an arbitrary limit. Even more importantly, this paves the way for the introduction of an SMT solver as those don't tend to support XOR on unbounded integers, only on bitvectors.

The SMT-based equivalence checker is significantly more expensive in cases where there are no or few inputs. However, it easily outperforms the brute force one in cases where there's a large bit width and/or number of inputs. Sadly, in those cases where the SMT solver would be better, the superoptimizer will already be unbearably slow regardless of the equivalence checker just from generating all the possible candidate programs. So until the program generation is optimized, the SMT based equivalence checker won't do much good for performance.

This improves the performance of the superoptimizer using the SMT-based equivalence checker quite a bit, but it's still noticably slower than the brute force one for cases with little input.

sepp2k added 9 commits June 3, 2023 20:48

Remove IO from functions and add tests

fe4da56

Fix wrong results when the shortest program is empty

a2297b2

For a state that's all zeros, the superoptimizer used to output "LOAD 0" as the shortest program when in fact "LOAD 0" is a no-op and the shortest program to get all zeros is the empty program.

Fix exception when no program could be found

79a3137

Implement specifying the first n bytes of memory as input

8483c9a

Also implement an equivalence checker that can determine program equivalence in the presence of input (by trying all possible input values).

Reuse SMT state in SMT-based equivalence checker

f2e4e47

This improves the performance of the superoptimizer using the SMT-based equivalence checker quite a bit, but it's still noticably slower than the brute force one for cases with little input.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Implement equivalence checkers with input handling and other improvements #2

Implement equivalence checkers with input handling and other improvements #2

Uh oh!

sepp2k commented Jun 12, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Implement equivalence checkers with input handling and other improvements #2

Are you sure you want to change the base?

Implement equivalence checkers with input handling and other improvements #2

Uh oh!

Conversation

sepp2k commented Jun 12, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant