Skip to content

Create harness to initialize memory from snapshot #4150

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Mar 15, 2019
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Add harness generator to initialise memory from a memory snapshot
This adds a harness generator to goto-harness that takes a goto program and a
memory snapshot in JSON format (via the option --memory-snapshot), and then
builds a harness that initialises the global variables from the snapshot. In
addition the harness generator provides the option --initial-location to specify
the starting point of the execution. The harness thus behaves as if the original
program would start execution at the given program location.
  • Loading branch information
danpoe committed Mar 15, 2019
commit a809e2cffea32fa7c07f6c9ca57753b6ef263b1e
2 changes: 2 additions & 0 deletions src/goto-harness/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -5,4 +5,6 @@ generic_includes(goto-harness)
target_link_libraries(goto-harness
util
goto-programs
json
json-symtab-language
)
3 changes: 3 additions & 0 deletions src/goto-harness/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ SRC = \
goto_harness_generator_factory.cpp \
goto_harness_main.cpp \
goto_harness_parse_options.cpp \
memory_snapshot_harness_generator.cpp \
recursive_initialization.cpp \
# Empty last line

Expand All @@ -13,6 +14,8 @@ OBJ += \
../big-int/big-int$(LIBEXT) \
../langapi/langapi$(LIBEXT) \
../linking/linking$(LIBEXT) \
../json/json$(LIBEXT) \
../json-symtab-language/json-symtab-language$(LIBEXT) \
# Empty last line

INCLUDES= -I ..
Expand Down
10 changes: 9 additions & 1 deletion src/goto-harness/goto_harness_parse_options.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@ Author: Diffblue Ltd.

#include "function_call_harness_generator.h"
#include "goto_harness_generator_factory.h"
#include "memory_snapshot_harness_generator.h"

// The basic idea is that this module is handling the following
// sequence of events:
Expand Down Expand Up @@ -103,7 +104,8 @@ void goto_harness_parse_optionst::help()
"generate\n"
<< "--harness-type one of the harness types listed below\n"
<< "\n\n"
<< FUNCTION_HARNESS_GENERATOR_HELP << messaget::eom;
<< FUNCTION_HARNESS_GENERATOR_HELP << "\n\n"
<< MEMORY_SNAPSHOT_HARNESS_GENERATOR_HELP << messaget::eom;
}

goto_harness_parse_optionst::goto_harness_parse_optionst(
Expand Down Expand Up @@ -162,6 +164,12 @@ goto_harness_generator_factoryt goto_harness_parse_optionst::make_factory()
return util_make_unique<function_call_harness_generatort>(
ui_message_handler);
});

factory.register_generator("initialise-with-memory-snapshot", [this]() {
return util_make_unique<memory_snapshot_harness_generatort>(
ui_message_handler);
});

return factory;
}

Expand Down
1 change: 1 addition & 0 deletions src/goto-harness/goto_harness_parse_options.h
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ Author: Diffblue Ltd.
"(version)" \
GOTO_HARNESS_FACTORY_OPTIONS \
FUNCTION_HARNESS_GENERATOR_OPTIONS \
MEMORY_SNAPSHOT_HARNESS_GENERATOR_OPTIONS \
// end GOTO_HARNESS_OPTIONS

// clang-format on
Expand Down
306 changes: 306 additions & 0 deletions src/goto-harness/memory_snapshot_harness_generator.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,306 @@
/******************************************************************\

Module: Harness to initialise memory from memory snapshot

Author: Daniel Poetzl

\******************************************************************/

#include "memory_snapshot_harness_generator.h"

#include <goto-programs/goto_convert.h>

#include <json/json_parser.h>

#include <json-symtab-language/json_symbol_table.h>

#include <util/exception_utils.h>
#include <util/fresh_symbol.h>
#include <util/message.h>
#include <util/string2int.h>
#include <util/string_utils.h>
#include <util/symbol_table.h>

#include "goto_harness_generator_factory.h"

void memory_snapshot_harness_generatort::handle_option(
const std::string &option,
const std::list<std::string> &values)
{
if(option == "memory-snapshot")
{
memory_snapshot = require_exactly_one_value(option, values);
}
else if(option == "initial-location")
{
const std::string initial_location =
require_exactly_one_value(option, values);

std::vector<std::string> start;
split_string(initial_location, ':', start, true);

if(
start.size() == 0 || (start.size() >= 1 && start.front().empty()) ||
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Use start.empy() instead of size() == 0
  2. start.size() >= 1 is trivially true here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

(start.size() == 2 && start.back().empty()) || start.size() > 2)
{
throw invalid_command_line_argument_exceptiont(
"invalid initial location specification", "--initial-location");
}

function = start.front();

if(start.size() == 2)
{
location_number = optionalt<unsigned>(safe_string2unsigned(start.back()));
}
}
else
{
throw invalid_command_line_argument_exceptiont(
"unrecognized option for memory snapshot harness generator",
"--" + option);
}
}

void memory_snapshot_harness_generatort::validate_options()
{
if(memory_snapshot.empty())
{
throw invalid_command_line_argument_exceptiont(
"option --memory_snapshot is required",
"--harness-type initialise-from-memory-snapshot");
}

if(function.empty())
{
INVARIANT(
location_number.has_value(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be negated (if no function is provided, then location_number must not be set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

"when `function` is empty then the option --initial-location was not "
"given and thus `location_number` was not set");

throw invalid_command_line_argument_exceptiont(
"option --initial-location is required",
"--harness-type initialise-from-memory-snapshot");
}
}

void memory_snapshot_harness_generatort::add_init_section(
goto_modelt &goto_model) const
{
goto_functionst &goto_functions = goto_model.goto_functions;
symbol_tablet &symbol_table = goto_model.symbol_table;

goto_functiont &goto_function = goto_functions.function_map[function];
const symbolt &function_symbol = symbol_table.lookup_ref(function);

goto_programt &goto_program = goto_function.body;

const symbolt &symbol = get_fresh_aux_symbol(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might not give you a symbol of the expected lifetime. I think you may want to set symbol.is_static_lifetime = true; afterwards.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give symbol a more meaningful name, such as func_init_done or func_init_done_symbol if you want to keep its symbol-ness part of the name?

Also, can you add a comment here about what func_init_done means?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

bool_typet(),
id2string(function),
"func_init_done",
source_locationt(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use function_symbol.location - it's always useful to have some idea where this might belong to in terms of source code files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

function_symbol.mode,
symbol_table);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add symbol.value = false_exprt(); (see below)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


const symbol_exprt &var = symbol.symbol_expr();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't take a reference here, symbol_expr returns a fresh value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give this a more meaningful name, like func_init_done_variable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better names added.


// initialise var in __CPROVER_initialize if it is present
const auto f_it =

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give this is a more meaningful name, such as cprover_init_it or something along those lines?

Also, can you elaborate why you need to change __CPROVER_initialize to make this work? (ideally as a comment in the code as well)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

goto_functions.function_map.find(CPROVER_PREFIX "initialize");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the INITIALIZE_FUNCTION macro from static_lifetime_init.h

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why you need this at all, shouldn't the generated harness function be enough for whatever you were going to to with initialize?


if(f_it != goto_functions.function_map.end())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what if this fails? I think that's where the initial value comes into play, i.e., you need to do symbol.value = false_exprt();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right now validate_options has a check for the presence of initialize. If absent, we throw exception. Do you want it to proceed and simply skip the modification of initialize altogether?

{
goto_programt &goto_program = f_it->second.body;

auto init_it =
goto_program.insert_before(std::prev(goto_program.instructions.end()));

init_it->make_assignment(code_assignt(var, false_exprt()));
}

const goto_programt::const_targett start_it =
goto_program.instructions.begin();

goto_programt::const_targett it;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having an uninitialised iterator you might just use optionalt<goto_programt::const_targett> opt_it; right here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


if(location_number.has_value())
{
const auto opt_it = goto_program.get_target(*location_number);

if(!opt_it.has_value())
{
throw invalid_command_line_argument_exceptiont(
"no instruction with location number " +
std::to_string(*location_number) + " in function " +
id2string(function),
"--initial-location");
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this check should be done as part of the options parsing very early on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At present goto-harness loads the goto binary only after the options checking. We could move the loading of the goto binary to before the options checking, and then give a reference to the goto model to the constructor of every harness generator. Then we could move some of the checks out of generate() and into handle_option(). Does that sound reasonable?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although that's a non-trivial change it would seem like a good idea to me. The earlier we know about a problem the better?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


it = *opt_it;
}
else
{
it = start_it;
}

auto ins_it1 = goto_program.insert_before(start_it);
ins_it1->make_goto(goto_program.const_cast_target(start_it));
ins_it1->guard = var;

auto ins_it2 = goto_program.insert_after(ins_it1);
ins_it2->make_assignment(code_assignt(var, true_exprt()));

auto ins_it3 = goto_program.insert_after(ins_it2);
ins_it3->make_goto(goto_program.const_cast_target(it));
}

void memory_snapshot_harness_generatort::add_assignments_to_globals(
const symbol_tablet &snapshot,
code_blockt &code) const
{
for(const auto &pair : snapshot)
{
const symbolt &symbol = pair.second;

if(!symbol.is_static_lifetime)
continue;

code_assignt code_assign(symbol.symbol_expr(), symbol.value);

code.add(code_assign);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use std::move or simply do code.add(code_assignt(....));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}
}

void memory_snapshot_harness_generatort::add_call_with_nondet_arguments(
const symbolt &called_function_symbol,
code_blockt &code) const
{
const code_typet &code_type = to_code_type(called_function_symbol.type);

code_function_callt::argumentst arguments;

for(const code_typet::parametert &parameter : code_type.parameters())
{
arguments.push_back(side_effect_expr_nondett(parameter.type()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This constructor of side_effect_expr_nondett is deprecated, please use the one that also take as source location (just use called_function_symbol.location).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

code_function_callt function_call(
called_function_symbol.symbol_expr(), arguments);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::move(arguments)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

code.add(function_call);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use std::move or inline the constructor call.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

void memory_snapshot_harness_generatort::insert_function(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is supposed to insert the harness function specifically rather than just any function, can you rename this to insert_harness_function_into_goto_model or something like this? (I know these names are getting rather lengthy, this is just an example)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed.

goto_modelt &goto_model,
const symbolt &function) const
{
const auto r = goto_model.symbol_table.insert(function);
CHECK_RETURN(r.second);

auto f_it = goto_model.goto_functions.function_map.insert(
std::make_pair(function.name, goto_functiont()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use emplace(function.name, goto_functiont())

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

CHECK_RETURN(f_it.second);

goto_functiont &harness_function = f_it.first->second;
harness_function.type = to_code_type(function.type);

goto_convert(
to_code_block(to_code(function.value)),
goto_model.symbol_table,
harness_function.body,
message_handler,
function.mode);

harness_function.body.add(goto_programt::make_end_function());
}

void memory_snapshot_harness_generatort::get_memory_snapshot(
const std::string &file,
symbol_tablet &snapshot) const
{
jsont json;

const bool r = parse_json(memory_snapshot, message_handler, json);

if(r)
{
throw deserialization_exceptiont("failed to read JSON memory snapshot");
}

// throws a deserialization_exceptiont or an incorrect_goto_program_exceptiont
// on failure to read JSON symbol table
symbol_table_from_json(json, snapshot);
}

void memory_snapshot_harness_generatort::generate(
goto_modelt &goto_model,
const irep_idt &harness_function_name)
{
symbol_tablet snapshot;
get_memory_snapshot(memory_snapshot, snapshot);

symbol_tablet &symbol_table = goto_model.symbol_table;
goto_functionst &goto_functions = goto_model.goto_functions;

const symbolt *called_function_symbol = symbol_table.lookup(function);

if(called_function_symbol == nullptr)
{
throw invalid_command_line_argument_exceptiont(
"function `" + id2string(function) + "` not found in the symbol table",
"--initial-location");
}

{
const auto f_it = goto_functions.function_map.find(function);

if(f_it == goto_functions.function_map.end())
{
throw invalid_command_line_argument_exceptiont(
"goto function `" + id2string(function) + "` not found",
"--initial-location");
}

if(!f_it->second.body_available())
{
throw invalid_command_line_argument_exceptiont(
"given function `" + id2string(function) + "` does not have a body",
"--initial-location");
}
}

add_init_section(goto_model);

if(symbol_table.has_symbol(harness_function_name))
{
throw invalid_command_line_argument_exceptiont(
"harness function `" + id2string(harness_function_name) +
"` already in "
"the symbol table",
"--" GOTO_HARNESS_GENERATOR_HARNESS_FUNCTION_NAME_OPT);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd do that check as early as possible, no need to waste time doing add_init_section or the likes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


code_blockt harness_function_body;

add_assignments_to_globals(snapshot, harness_function_body);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make that function return a code_blockt rather than passing it by reference.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


add_call_with_nondet_arguments(
*called_function_symbol, harness_function_body);

// Create harness function symbol

symbolt harness_function_symbol;
harness_function_symbol.name = harness_function_name;
harness_function_symbol.base_name = harness_function_name;
harness_function_symbol.pretty_name = harness_function_name;

harness_function_symbol.is_lvalue = true;
harness_function_symbol.mode = called_function_symbol->mode;
harness_function_symbol.type = code_typet({}, empty_typet());
harness_function_symbol.value = harness_function_body;

// Add harness function to goto model and symbol table
insert_function(goto_model, harness_function_symbol);

goto_functions.update();
}
Loading