feat: #evaluation in ... command for easier evaluation #1706
base: main
Conversation
…single JSON object
some options suggested by @bollu, such as omitting the details of messages and including walltime, and some other options to make it easier to run tests
@bollu I added the options you mentioned. Could you have a look to see whether this is usable in its current form, or whether we need more?
bv_decide solved 0 theorems.
@alexkeizer No, the LLVM evaluation has not been changed by me. What's in the repo is the "latest". I attempted to improve it using
So when I open, say,

theorem test_sext_zext_thm (e : IntW 16) : sext 64 (zext 32 e) ⊑ zext 64 e := by
  simp_alive_undef
  simp_alive_ops
  simp_alive_case_bash
  simp_alive_split
  extract_goals
  all_goals sorry

I assumed the
This PR adds an #evaluation in $cmd command, similar to #guard_msgs in $cmd, which collects all messages generated by an arbitrary command and serializes them into a JSON object. The idea is that this will be a building block for any kind of evaluation we'd like to do: existing tactics can produce JSONL output without having to be modified.

I was originally planning to use this to convert the existing LLVM evaluation to JSONL, but it seems that evaluation has fundamentally changed. @bollu, is the LLVM evaluation superseded by your evaluation?
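For reference, here is a minimal sketch of what a message-collecting command wrapper in this style could look like, following the same pattern as the #guard_msgs in elaborator. This is not the implementation in this PR: the syntax declaration, elaborator name, JSON schema, and the absence of any option or walltime handling are all assumptions made purely for illustration.

```lean
import Lean

open Lean Elab Command

/-- Hypothetical syntax, for illustration only; the PR may declare it differently. -/
syntax (name := evaluationIn) "#evaluation" " in" command : command

/--
Sketch: elaborate the wrapped command against a fresh message log, then serialize
whatever messages it produced into a single JSON object (one line, JSONL-friendly).
-/
@[command_elab evaluationIn]
def elabEvaluationIn : CommandElab
  | `(command| #evaluation in $cmd) => do
    -- Swap in an empty message log so we only capture the inner command's messages.
    let initMsgs ← modifyGet fun st => (st.messages, { st with messages := {} })
    elabCommand cmd
    let msgs := (← get).messages
    -- Restore the surrounding message log.
    modify fun st => { st with messages := initMsgs }
    -- Turn each captured message into a small JSON object (schema is an assumption).
    let entries ← msgs.toList.mapM fun msg => do
      let severity := match msg.severity with
        | .information => "information"
        | .warning     => "warning"
        | .error       => "error"
      return Json.mkObj
        [("severity", Json.str severity), ("message", Json.str (← msg.data.toString))]
    logInfo <| (Json.mkObj [("messages", Json.arr entries.toArray)]).compress
  | _ => throwUnsupportedSyntax
```

Usage would then mirror #guard_msgs in: write #evaluation in before a theorem, and each wrapped declaration contributes one line of JSON that can be collected into a .jsonl file downstream.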
TODO:
- EvaluationHarness