Replace CBMC output parser #1433

adpaco-aws · 2022-08-01T19:41:30Z

Description of changes:

Replaces the cbmc_json_parser.py script with the Rust module cbmc_output_parser.rs.

In contrast to the old parser, this one doesn't wait to read from a file. Instead, it pipes CBMC output into Kani, which allows us to perform parsing on the fly. The main advantage of doing this is that users will get feedback even if CBMC is stuck in automatic unwinding (i.e., this solves #493).

While the basic functionality is being thoroughly tested by our regression tests, there are some places where we aren't doing the exact same thing as in the old script. See the "Callouts" section for more details.

Here's the output for a non-terminating example:

For the test in tests/expected/assert-eq/main.rs (regular output format):

For the test in tests/expected/assert-eq/main.rs (terse output format):

Resolved issues:

Resolves #493
Resolves #1431

Call-outs:

The new parser relies on diff_path in order to print locations, which are relative to the current working directory. The old script had different branches based on the location of the involved files (i.e., it distinguished between files located below the current working directory, the user directory and others).
~~Colored results are gone, but will come back soon: New parser doesn't produce colored output #1431~~
Messages produced after looking at a result should be handled as regular messages. I have some ideas on how to handle this, but that will have to wait for now: Messages based on results should be printed as regular messages #1432
Some Python regexes used to match on Kani reachability IDs were too greedy in Rust, so I changed them to NOT match on square brackets, which gives the expected result.
Had to change a dry-run regression because now we don't produce a CBMC output file.
I'm sure there are places where we can use more idiomatic Rust. Please point those out!

Testing:

How is this change tested? Existing regression. In addition, I used an ad-hoc script that diffs the output between the old and new versions. No "Status" lines appeared there.
Is this a refactor change? Mostly yes.

Checklist

Each commit message has a non-empty body, explaining why the change was made
Methods or procedures are documented
Regression or unit tests are included, or existing tests cover the modified code
My PR is restricted to a single feature or bugfix

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

danielsn · 2022-08-01T20:14:14Z

kani-driver/src/session.rs

Is there a risk using pipe? I know in some implementations, if the pipe isn't drained fast enough, the upstream process blocks.

Well, I don't think it's risk-free. But I haven't found any issues so far.

For example, I think the scenario you mention would correspond to running a non-terminating example through Kani. I've tried doing that locally and it's still going after 45 minutes. Kani's memory usage is below 0.1%, and CBMC's memory usage has slowly increased to 4.3% (it's doing automatic unwinding).
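To make the pipe question concrete, here is a minimal sketch of a consumer that reads a child's stdout line by line, so the pipe is drained as soon as data arrives. The names `drain_lines` and `run_and_stream` are hypothetical and do not come from kani-driver; this only illustrates the general pattern with the standard library.

```rust
use std::io::{BufRead, BufReader, Read};
use std::process::{Command, Stdio};

/// Hypothetical helper: consumes `source` line by line and hands each line to
/// `on_line` as soon as it is available. Returns the number of lines read.
fn drain_lines<R: Read>(source: R, mut on_line: impl FnMut(&str)) -> std::io::Result<usize> {
    let mut count = 0;
    for line in BufReader::new(source).lines() {
        let line = line?;
        on_line(line.as_str());
        count += 1;
    }
    Ok(count)
}

/// Illustrative only: spawns `program` with a piped stdout and drains it until
/// EOF, then waits for the child to exit.
fn run_and_stream(program: &str, args: &[&str]) -> std::io::Result<usize> {
    let mut child = Command::new(program).args(args).stdout(Stdio::piped()).spawn()?;
    let stdout = child.stdout.take().expect("stdout was requested as piped");
    let lines = drain_lines(stdout, |line| println!("{line}"))?;
    child.wait()?;
    Ok(lines)
}
```

If the reader falls behind, the operating system blocks the writer once the pipe buffer is full, so the child slows down rather than losing output. Keeping the per-line work cheap is what keeps that from becoming noticeable.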

jaisnan · 2022-08-01T20:30:04Z

Thank you so much for this! Just so we have a fair idea of what the output looks like now, can you add a few screenshots to the PR description?

kani-driver/src/call_cbmc.rs

kani-driver/src/cbmc_output_parser.rs

adpaco-aws · 2022-08-01T21:02:41Z

Thank you so much for this! Just so we have a fair idea of what the output looks like now, can you add a few screenshots to the PR description?

Screenshots added!
They correspond to one test (with output format regular and terse) and the non-terminating example:

fn rev(mut x: [i32; 4]) -> [i32; 4] {
    x.reverse();
    x
}

#[cfg(kani)]
#[kani::proof]
fn example() {
    let var = kani::any();
    let reversed = rev(var);
    let double_reversed = rev(reversed);

    assert_eq!(var, double_reversed);
}

The main differences are highlighted in the call-outs section (e.g., no color). Other than that, I've tried to keep it as faithful as possible.

zhassan-aws

First round of comments. Only halfway through cbmc_output_parser.rs.

kani-driver/src/cbmc_output_parser.rs

adpaco-aws · 2022-08-03T18:56:25Z

Added colored output. Will update screenshots in a moment.

kani-driver/src/cbmc_output_parser.rs

zhassan-aws · 2022-08-04T23:10:53Z

kani-driver/src/cbmc_output_parser.rs

Would using something like the StreamDeserializer be possible here?

Good point.

I tried using StreamDeserializer for parsing from the buffer in one of my earliest attempts, but it wasn't successful. The problem is that it expects self-delineating values, but the output we process is a JSON array with delineated values. In other words, the output we get is of the form:

[ Program, Message, ..., Message, Result, Message, ProverStatus ]

But for StreamDeserializer to work we would need an output of the form:

Program Message ... Message Result Message ProverStatus

Note that the parsing logic is mostly concerned with this problem: It assumes it's already within a JSON array, ignoring its delimiters. Changing the CBMC output to have the second form described is a no-go, at least according to the discussions I had then.

fzaiser · 2022-08-05T20:42:06Z

kani-driver/src/cbmc_output_parser.rs

I think it would be cleaner to use write!(&mut fmt_str, ":{line}")? instead of these two lines. The same applies to the rest of the function (and maybe other instances in the file).

kani-driver/src/cbmc_output_parser.rs

scripts/setup/ubuntu/install_deps.sh

fzaiser · 2022-08-08T16:32:31Z

kani-driver/src/cbmc_output_parser.rs

+            write!(&mut fmt_str, " in function {function}")?;
+        }
+
+        write! {f, "{}", fmt_str}


Since we're first writing to fmt_str and then writing fmt_str to f, can't we just write to f directly?
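A minimal sketch of what writing to f directly could look like. `SourceLocation` here is a hypothetical stand-in with assumed fields, not the type from cbmc_output_parser.rs; only the " in function {function}" and ":{line}" fragments are taken from the snippets above.

```rust
use std::fmt;

/// Hypothetical stand-in for the parser's source-location type.
struct SourceLocation {
    file: Option<String>,
    line: Option<String>,
    function: Option<String>,
}

impl fmt::Display for SourceLocation {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Each piece goes straight to the formatter; no intermediate String.
        if let Some(file) = &self.file {
            write!(f, "{file}")?;
            if let Some(line) = &self.line {
                write!(f, ":{line}")?;
            }
        }
        if let Some(function) = &self.function {
            write!(f, " in function {function}")?;
        }
        Ok(())
    }
}
```

One trade-off: piecewise writes ignore width and alignment flags such as `{:>20}`, whereas building the string first and calling `f.pad` would honor them. That only matters if callers rely on padding.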

zhassan-aws

This is a huge effort! Thanks @adpaco-aws!

Only have a few more minor comments.

zhassan-aws · 2022-08-08T18:42:36Z

kani-driver/src/cbmc_output_parser.rs

+        if input.starts_with('[') || input.starts_with(']') {
+            return Some(Action::ClearInput);
+        }
+        if input.starts_with("  }") {


Should we handle whitespace separately, i.e. remove it before matching on the patterns?

The spaces are important here: Because we're iterating over a JSON array, matching on this specific string guarantees that we'll always get an item when we attempt to process a parser item. Otherwise, we wouldn't have this guarantee, which would likely result in a slow-down of the parser.

But can we guarantee that CBMC's JSON format will not change in a way that would break the current matching (that uses a specific number of spaces)?

Since we pin to a specific cbmc version in our releases, we'd at least get notification that it broke in all our tests.

But I do think this explanation of what's going on here should be added as a comment on this line of code.

But can we guarantee that CBMC's JSON format will not change in a way that would break the current matching (that uses a specific number of spaces)?

We don't guarantee that. This could be changed to make it more resilient, but in my opinion it's better to break here than in some other places.

Added a comment to explain this.
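For readers following this thread, here is a simplified sketch of the line-based scheme being discussed. `Action::ClearInput`, `Action::ProcessItem`, `input_so_far` and the two `starts_with` checks come from the snippets above; the `Accumulator` type, its `feed` method and the trailing-comma handling are assumptions made for illustration. The real parser deserializes each item with serde_json instead of returning text.

```rust
/// Actions a line of input can trigger (names mirror the snippets in this thread).
#[derive(Debug, PartialEq)]
enum Action {
    ClearInput,
    ProcessItem,
}

/// Hypothetical, simplified accumulator for pretty-printed JSON array output.
struct Accumulator {
    input_so_far: String,
}

impl Accumulator {
    fn new() -> Self {
        Accumulator { input_so_far: String::new() }
    }

    fn triggers_action(line: &str) -> Option<Action> {
        // Array delimiters carry no item data.
        if line.starts_with('[') || line.starts_with(']') {
            return Some(Action::ClearInput);
        }
        // Exactly two spaces before '}' marks the end of a top-level item;
        // nested objects are indented deeper and do not match.
        if line.starts_with("  }") {
            return Some(Action::ProcessItem);
        }
        None
    }

    /// Feeds one line; returns the text of a complete item if this line closed one.
    fn feed(&mut self, line: &str) -> Option<String> {
        self.input_so_far.push_str(line);
        self.input_so_far.push('\n');
        match Self::triggers_action(line) {
            Some(Action::ClearInput) => {
                self.input_so_far.clear();
                None
            }
            Some(Action::ProcessItem) => {
                // Assumption: items are separated by a trailing comma, which is dropped.
                let item = self.input_so_far.trim().trim_end_matches(',').to_string();
                self.input_so_far.clear();
                Some(item)
            }
            None => None,
        }
    }
}
```

Because only a two-space-indented closing brace triggers processing, every string handed on is a whole top-level object. This is the guarantee described above, and also why a change in indentation would break the matching loudly.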

zhassan-aws · 2022-08-08T18:45:00Z

kani-driver/src/cbmc_output_parser.rs

+
+    /// Clears the input accumulated so far.
+    fn clear_input(&mut self) {
+        self.input_so_far = String::new();


Suggested change

-        self.input_so_far = String::new();
+        self.input_so_far.clear();

Using clear instead of new would keep the same capacity, which should reduce allocations/deallocations.

Right, thanks!
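A small sketch of the difference, using only the standard library. The function name and the buffer contents are made up for illustration.

```rust
/// Returns (capacity before reset, capacity after `clear`, capacity of a fresh `String::new()`).
fn capacities_after_reset() -> (usize, usize, usize) {
    let mut buffer = String::with_capacity(1024);
    buffer.push_str("{ \"messageText\": \"example\" }");
    let before = buffer.capacity();

    // `clear` drops the contents but keeps the allocation for reuse.
    buffer.clear();
    let after_clear = buffer.capacity();

    // A new String starts without any allocation.
    let fresh = String::new().capacity();

    (before, after_clear, fresh)
}
```

With `clear`, the next item accumulated into the buffer can reuse the existing allocation. With `String::new()`, the old buffer is freed and the new one has to grow again.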

zhassan-aws · 2022-08-08T18:54:15Z

kani-driver/src/cbmc_output_parser.rs

+        }
+        let complete_string = &self.input_so_far[0..self.input_so_far.len()];
+        let result_item: Result<ParserItem, _> = serde_json::from_str(complete_string);
+        assert!(result_item.is_ok());


This assert shouldn't be necessary since the unwrap on the next line already includes a panic.

zhassan-aws · 2022-08-08T20:00:27Z

kani-driver/src/cbmc_output_parser.rs

+        // Both formatting and printing could be handled by objects which
+        // implement a trait `Printer`.
+        let formatted_item = format_item(&processed_item, output_format);
+        if formatted_item.is_some() {


if let Some
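A short sketch of the suggested pattern. This `format_item` is a hypothetical one-argument stand-in for the function in the snippet, and `collect_output` is made up so the example can be run on its own.

```rust
/// Hypothetical formatter: returns `None` for items that should not be printed.
fn format_item(item: &str) -> Option<String> {
    if item.is_empty() { None } else { Some(format!("Check: {item}")) }
}

fn collect_output(items: &[&str]) -> Vec<String> {
    let mut printed = Vec::new();
    for &item in items {
        // Binds the inner value directly, so no `is_some()` check followed by `unwrap()`.
        if let Some(formatted) = format_item(item) {
            printed.push(formatted);
        }
    }
    printed
}
```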

zhassan-aws · 2022-08-08T20:18:23Z

kani-driver/src/cbmc_output_parser.rs

zhassan-aws · 2022-08-08T20:19:35Z

kani-driver/src/cbmc_output_parser.rs

+    ///  * Curly closing bracket ('}') preceded by two spaces will trigger the
+    ///    `ProcessItem` action.
+    fn triggers_action(&self, input: String) -> Option<Action> {
+        if input.starts_with('[') || input.starts_with(']') {


This may ignore any characters that appear after [ or ]. If CBMC's current output never includes other characters on those lines, perhaps we should assert that the line only has [ or ]?

zhassan-aws · 2022-08-08T20:20:58Z

kani-driver/src/cbmc_output_parser.rs

+    if let Some(alt_descriptions) = description_alternatives {
+        for (desc_to_match, opt_desc_to_replace) in alt_descriptions {
+            if original.contains(desc_to_match) {
+                if opt_desc_to_replace.is_some() {


if let Some

tedinski

I like your description of how you tested it. Seems like it should be pretty high-confidence?

I think there's some things that could be improved, but nothing I think should block this improvement. Do you want to merge before release tomorrow?

tedinski · 2022-08-08T20:43:51Z

kani-driver/src/call_cbmc.rs

+        if self.args.output_format != OutputFormat::Old {
+            args.push("--json-ui".into());
+        }


Does this impact --visualize?

Thanks for pointing this out. It wasn't clear to me: --visualize still works, but I don't know if it's adding more files. So I reverted this change and added a comment.

tedinski · 2022-08-08T20:45:24Z

kani-driver/src/cbmc_output_parser.rs

+};
+use structopt::lazy_static::lazy_static;
+
+lazy_static! {


I do not want this to block this PR, but IIRC lazy_static isn't really maintained anymore and I forget what crate is supposed to replace it. once or oncecell or something like that.
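For reference, the commit list further down shows that this PR moved to once_cell. As a dependency-free illustration of the same lazy-initialization idea, here is a sketch using `std::sync::OnceLock`, which became stable in Rust 1.70, after this PR. The `replacements` function and its table contents are hypothetical.

```rust
use std::sync::OnceLock;

/// Hypothetical table of description replacements, built on first access.
fn replacements() -> &'static Vec<(&'static str, &'static str)> {
    static TABLE: OnceLock<Vec<(&'static str, &'static str)>> = OnceLock::new();
    // The closure runs at most once; later calls return the same reference.
    TABLE.get_or_init(|| vec![("old wording", "new wording")])
}
```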

tedinski · 2022-08-08T20:51:24Z

kani-driver/src/cbmc_output_parser.rs

+            result_str.push_str(&check_id);
+            result_str.push_str(&status_msg);
+            result_str.push_str(&description_msg);


I think you can use write! on a String, but I'd not worry about this now, do a follow-up refactoring!
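A sketch of that follow-up refactoring, assuming a hypothetical `format_check` function and output layout. `write!` works on a `String` once the `std::fmt::Write` trait is in scope.

```rust
use std::fmt::Write as _;

/// Hypothetical formatter that builds a result line with `write!` instead of
/// formatting each piece separately and calling `push_str`.
fn format_check(check_id: &str, status: &str, description: &str) -> String {
    let mut result_str = String::new();
    write!(result_str, "{check_id}: ").expect("writing to a String does not fail");
    write!(result_str, "{status} - ").expect("writing to a String does not fail");
    write!(result_str, "{description}").expect("writing to a String does not fail");
    result_str
}
```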

adpaco-aws · 2022-08-09T17:00:41Z

Addressed all comments except "use write! on strings directly", which I'm leaving as a refactoring task in #1480

adpaco-aws requested a review from a team as a code owner August 1, 2022 19:41

danielsn reviewed Aug 1, 2022

fzaiser reviewed Aug 1, 2022

kani-driver/src/call_cbmc.rs

fzaiser reviewed Aug 1, 2022

kani-driver/src/cbmc_output_parser.rs

fzaiser reviewed Aug 1, 2022

kani-driver/src/cbmc_output_parser.rs

zhassan-aws reviewed Aug 2, 2022

adpaco-aws force-pushed the replace-cbmc-output-parser branch from be24eca to 71df083 Compare August 3, 2022 15:23

This was referenced Aug 3, 2022

Add live streaming JSON output from CBMC to prevent Kani output hang #1278

Closed

Executable trace feature #1409

Merged

zhassan-aws reviewed Aug 4, 2022

fzaiser reviewed Aug 5, 2022

kani-driver/src/cbmc_output_parser.rs

fzaiser reviewed Aug 5, 2022

kani-driver/src/cbmc_output_parser.rs

fzaiser reviewed Aug 5, 2022

kani-driver/src/cbmc_output_parser.rs

fzaiser reviewed Aug 5, 2022

kani-driver/src/cbmc_output_parser.rs

fzaiser reviewed Aug 5, 2022

scripts/setup/ubuntu/install_deps.sh

adpaco-aws added 11 commits August 8, 2022 15:21

Main changes to kani-drive (call_cbmc) to enable piped output

c8e1cb8

Remove call_display_results from kani-driver

525bdd7

Adds main file for CBMC output parser, and dependencies

3ea8c4b

Remove remaining declaration of call_display_results

50168ec

Remove cbmc_json_parser.py and references in session & tests

1e73a16

Remove colorama dependency from install scripts and bundler

ae96ee8

Add issue URLs

b8c3e2e

Fix clippy warnings

4253bb5

Fix clippy warnings (2)

5ebae08

Remove debug prints and update variable names

9c73c83

Address 1st round comments

a3bcadf

adpaco-aws added 8 commits August 8, 2022 15:21

Document process_cbmc_output

edd4839

Remove line after rebase

ff6a4ed

Fix more clippy warnings...

d5c2ee3

Add colored output

7591905

Use console instead of colored for colored output

29ba95f

Use write! in fmt for source locations

8f8aef8

Remove unnecessary as_str() calls

52959d9

Format change

0527ff0

adpaco-aws force-pushed the replace-cbmc-output-parser branch from 28b7930 to 0527ff0 Compare August 8, 2022 15:22

Use as_str() in a couple of places (because of clippy)

e611c9e

fzaiser reviewed Aug 8, 2022

zhassan-aws approved these changes Aug 8, 2022

tedinski approved these changes Aug 8, 2022

sanjit-bhat mentioned this pull request Aug 9, 2022

Update concrete values parser to use serde structs #1477

Closed

adpaco-aws and others added 9 commits August 9, 2022 15:35

Use once_cell instead of lazy_static

69538ee

Move --json-ui cmd to where it was

e2a2168

Add tracking issue for string formatting work

804fa53

Address comments from Zyad

592cb39

Remove TODO item (colors)

3e5610d

Remove OutputFormat import

77627d7

Fix clippy warnings

bed23fd

Merge branch 'main' into replace-cbmc-output-parser

9fa3eb7

Merge branch 'main' into replace-cbmc-output-parser

68769e7

adpaco-aws and others added 2 commits August 9, 2022 13:13

Add comment on triggers_action function

899c29d

Merge branch 'main' into replace-cbmc-output-parser

20a8f4d

tedinski merged commit bca694a into model-checking:main Aug 9, 2022
