Skip to content

Load memory snapshot into cbmc #3755

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

danpoe
Copy link
Contributor

@danpoe danpoe commented Jan 10, 2019

This is a basic first version of loading a concrete memory snapshot (represented as a JSON symbol table) to initialize goto symex with, and then start symex from a given program location.

This is done in goto symex (as opposed to generating a goto program harness) as this will allow to potentially make the stack part of the snapshot in the future, which would allow to return from functions that have been called during the concrete execution.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • n/a My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@kroening
Copy link
Member

Sure this can't be done by building a harness, in goto-instrument? The symbolic execution engine appears to be the wrong place for that.

@danpoe
Copy link
Contributor Author

danpoe commented Jan 11, 2019

@kroening There was a discussion about this with @tautschnig on the respective Slack channel. If we just want to model check a single function (and its callees) in a specific context it would be possible via a harness. But if we want to be able to model check a program starting from a concrete snapshot theoretically until the end of the program this seems hard to do that way. For that we will also need to initialize the symex call stack from a concrete call stack snapshot in order to be able to return from functions that have been called during the concrete execution. I'm not sure if it's possible to build a harness that can do that.

This adds const iterators to iterate over the internal_symbols map of the symbol
table. So far, only non-const iterators existed.
… location

This adds a feature to load a concrete memory snapshot (represented as a JSON
symbol table) into cbmc to initialize goto symex with, and then start symex from
a given program location. The path to the snapshot must be passed to
--memory-snapshot and the initial program location must be specified via
--initial-location. This is a basic first version that only supports the
initialisation of global variables. Moreover, the concrete memory snapshot does
not contain a representation of the stack of the program, hence symex must stop
when it reaches the end of the function in which it was started (but it can go
into callees and return from them).
This adds several regression tests for the feature of initialising goto symex
with a concrete JSON memory snapshot, and start symex from a given program
location.
@danpoe danpoe force-pushed the feature/start-bmc-from-memory-snapshot branch from 722c6ca to f051a4e Compare January 11, 2019 14:30
@smowton
Copy link
Contributor

smowton commented Jan 14, 2019

For stack snapshotting, would it be best to create a GOTO program with specialisations of the functions on the stack? For example I would add a prefix that declares and sets local variable values, then jumps to the point where the snapshot was taken (subsequent execution will use the real GOTO program)

That way the snapshot isn't tied into symex, but rather is a general concept that can be subject to any GOTO program analysis / transformation.

@danpoe
Copy link
Contributor Author

danpoe commented Jan 14, 2019

@smowton Yes, I think that would work. Basically we'd add a boolean variable func_init_done for every function on the stack in the snapshot. When a function is entered for the first time func_init_done will be false and we'd do the local init code, set the variable to true, and then jump to the call of the next function on the call stack.

@tautschnig Would you be ok too with doing the memory snapshot loading via instrumentation?

@tautschnig
Copy link
Collaborator

@danpoe My apologies for leaving this on my stack for this long. The suggestion from @smowton seemed very good, but you might have gone down a different route now?

@danpoe
Copy link
Contributor Author

danpoe commented Feb 1, 2019

@tautschnig No problem, I think we have resolved this in a Slack conversation back then anyways. So the plan is to move this into the goto-harness tool that @hannes-steffenhagen-diffblue and @NlightNFotis have been working on and do it via instrumentation.

@tautschnig
Copy link
Collaborator

@danpoe Yes, going via goto-harness will be cool, I think a big chunk of the code in this PR can remain in place as you've put together very nice tests already. The extra bit of work will be to translate the JSON-loaded symbol table to a harness rather than into a state to symex from.

@tautschnig tautschnig removed their assignment Feb 1, 2019
@danpoe
Copy link
Contributor Author

danpoe commented Feb 10, 2019

Closed in favor of #4150.

@danpoe danpoe closed this Feb 10, 2019
@danpoe danpoe deleted the feature/start-bmc-from-memory-snapshot branch June 2, 2020 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants