
Archive coverage data alongside corpus archives #2020

Closed
addisoncrump wants to merge 11 commits

Conversation

addisoncrump
Contributor

Currently, only corpora are saved in the archive, and coverage summaries are only provided at the end of the experiment. This change saves coverage data snapshots alongside the trial corpus snapshots.
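For illustration, the shape of the change is roughly the following (a minimal sketch with hypothetical names, not the PR's actual code):

```python
import os
import shutil

def archive_coverage_snapshot(coverage_json_path: str,
                              corpus_archive_path: str) -> str:
    """Copy a cycle's coverage data next to that cycle's corpus archive.

    Hypothetical helper for illustration only; the real change lives in the
    experiment runner/measurer code.
    """
    archive_dir = os.path.dirname(corpus_archive_path)
    cycle = os.path.splitext(os.path.basename(corpus_archive_path))[0]
    destination = os.path.join(archive_dir, f'{cycle}-coverage.json')
    shutil.copy(coverage_json_path, destination)
    return destination
```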

@addisoncrump
Contributor Author

Forgot to format...

@addisoncrump addisoncrump marked this pull request as draft August 8, 2024 13:34
@addisoncrump
Contributor Author

The saving doesn't seem to work as expected. I'm going to keep trying, but it's quite difficult to debug.

@addisoncrump addisoncrump marked this pull request as ready for review August 8, 2024 15:12
@addisoncrump
Contributor Author

Okay, this should work now. I had originally mixed up the direction of the copy.

@DonggeLiu
Contributor

Thanks @addisoncrump!
The code looks great to me. But before merging this, let's run an experiment on this PR to triple-check that this also works in the cloud instances : )
Could you please make a trivial modification to service/gcbrun_experiment.py?
This will allow me to launch experiments from this PR for final validation. Here is an example that adds a dummy comment.
We can revert this after the experiment.
Thanks!
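For illustration, the trivial modification can be as small as a throwaway comment (a sketch, not the linked example):

```python
# In service/gcbrun_experiment.py -- dummy comment so experiments can be
# launched from this PR; to be reverted before merging.
```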

@addisoncrump
Contributor Author

> let's run an experiment on this PR to triple-check that this also works in the cloud instances

Sure, and also to collect the corresponding coverage data for the "standard" fuzzers. I'll make that change shortly.

@addisoncrump
Contributor Author

Also, a local experiment shows that we get warning text mixed into the JSON output (!):

```
warning: 6 functions have mismatched data
{"data":[{"files":[{"branches":[[102,22,102,36,0,0,0,0,4],[103,9,103,41,0,0,0,0,4],...]}]}]}
```

Should we remove this?

@DonggeLiu
Contributor

> Should we remove this?

Do you happen to know the cause of this?

@addisoncrump
Contributor Author

addisoncrump commented Aug 9, 2024

To be honest, I've looked around a bit now and do not see the root cause.

It seems to be using new_process.execute, which redirects stdout only. I presume, then, that llvm-cov is actually printing warnings to stdout (!). I'll see if I can find the appropriate command-line switch to suppress them.

@addisoncrump
Contributor Author

It seems to be a known issue, btw; get_coverage_infomation (a typo for "information" in the codebase) already handles this.
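For context, the workaround amounts to tolerating a non-JSON prefix when reading the export; a minimal sketch of that kind of defensive parse (an approximation, not a quote of the actual function):

```python
import json

def read_coverage_summary(summary_file):
    """Parse an llvm-cov JSON export that may be preceded by warning lines.

    llvm-cov can print lines like 'warning: 6 functions have mismatched data'
    before the JSON document; the JSON itself is emitted on a single line,
    so taking the last line of the file skips any warning prefix.
    """
    with open(summary_file, encoding='utf-8') as summary:
        return json.loads(summary.readlines()[-1])
```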

@addisoncrump
Contributor Author

addisoncrump commented Aug 9, 2024

That seems to have done it. The get_coverage_infomation function can remain as-is without loss of functionality.

Running a quick local test, and then I'll stage the cloud test.

@addisoncrump
Contributor Author

addisoncrump commented Aug 9, 2024

Okay, so I spent quite a while debugging a weird change that occurred whenever presubmit ran; namely, make presubmit was modifying the file analysis/test_data/pairwise_unique_coverage_heatmap-failed-diff.png. This was a result of the seaborn version being incompatible with the pinned matplotlib version; I fixed it by updating the dependency in requirements.txt. Even after that, metadata changes still caused the file to be modified on disk, and since it is test output, I added it to the gitignore.

This also implies that the test should be failing, but it isn't. I think this is a minor difference in how seaborn now renders heatmaps (it looks like a small offset change).
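Concretely, the fix amounts to two one-line edits (the seaborn version shown is illustrative, not the actual pin):

```
# requirements.txt: bump seaborn to a release compatible with the pinned
# matplotlib (illustrative version, not the actual pin):
seaborn==0.13.0

# .gitignore: the failed-diff image is test output, not a source asset:
analysis/test_data/pairwise_unique_coverage_heatmap-failed-diff.png
```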

@addisoncrump
Contributor Author

Also, experimenting with compression, because the coverage dumps are quite large and easily compressible.

@addisoncrump
Contributor Author

```
llvm-cov export: Unknown command line argument '-no-warn'. Try: 'llvm-cov export --help'
```

Well, the version of llvm-cov used is too old. I'll revert this now.

@addisoncrump
Contributor Author

Compression reduces a 15 MB dump to about 1 MB, so it seems worth it. This is now in a stable state and ready for a test run!
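For reference, a minimal sketch of the compression step, assuming plain gzip from the Python standard library (the actual call sites in the PR may differ):

```python
import gzip
import shutil

def compress_coverage_dump(json_path: str) -> str:
    """Gzip a coverage dump before archiving it.

    Coverage JSON is highly repetitive, so gzip routinely shrinks it by an
    order of magnitude (here: ~15 MB -> ~1 MB).
    """
    gz_path = json_path + '.gz'
    with open(json_path, 'rb') as src, gzip.open(gz_path, 'wb') as dst:
        shutil.copyfileobj(src, dst)
    return gz_path
```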

@DonggeLiu
Contributor

Nice! Let's start with a simple one.

> collect the corresponding coverage data for the "standard" fuzzers.

Then we collect these.

@DonggeLiu
Contributor

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-09-dg-2020 --fuzzers libfuzzer --benchmarks libxml2_xml

@DonggeLiu
Contributor

Experiment 2024-08-10-base data and results will be available later at:
The experiment data.
The experiment report.
The experiment report (experimental).

@DonggeLiu
Contributor

> Out of curiosity, what is the measurement bottleneck? I did notice that, despite having lots of corpus archives available, the snapshots don't seem to have been measured yet.

Currently we measure coverage of all results in one VM, which becomes insanely slow when there are too many fuzzers (e.g., > 8) in one experiment.
We are working on fixing this.
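To make the scaling concrete: every snapshot of every trial funnels through that single measurer VM, so the queue grows multiplicatively. A toy sketch of the shape of the problem (hypothetical names and counts, not FuzzBench's actual measurer code):

```python
import itertools
import multiprocessing

FUZZERS = ['aflplusplus', 'centipede', 'honggfuzz', 'libfuzzer']
BENCHMARKS = ['libxml2_xml', 'lcms_cms_transform_fuzzer']
NUM_CYCLES = 96  # illustrative snapshot count per trial

def measure_snapshot(task):
    """Stand-in for unpacking one corpus archive and running llvm-cov on it."""
    fuzzer, benchmark, cycle = task
    return fuzzer, benchmark, cycle

if __name__ == '__main__':
    # One VM has to chew through |FUZZERS| x |BENCHMARKS| x |NUM_CYCLES| tasks.
    tasks = list(itertools.product(FUZZERS, BENCHMARKS, range(NUM_CYCLES)))
    with multiprocessing.Pool() as pool:
        pool.map(measure_snapshot, tasks)
```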

@addisoncrump
Contributor Author

Ah, okay, I understand. With the other experiment still running, it is effectively entirely overloaded, then.

@DonggeLiu
Contributor

DonggeLiu commented Aug 10, 2024

Wait, something is going wrong with the 2024-08-10-base.
The data directory was generated as expected, but the report was not.

This is weird; it seems all the errors are related to libafl:
example 1
example 2

Let me test re-running the experiment without it.

@tokatoka Do you happen to know why libafl failed to build with many benchmarks?
Sorry for the fuss (No pun intended :P).

@DonggeLiu
Contributor

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-10-test --fuzzers aflplusplus centipede honggfuzz libfuzzer

@addisoncrump
Contributor Author

> The data directory was generated as expected, but the report was not.

If none of the measurements have happened yet, it won't have created a report, no?

@tokatoka
Contributor

I guess we need to update libafl.
@addisoncrump
Can you change the commit we are using for libafl?
And also use fuzzers/fuzzbench/fuzzbench instead of fuzzers/fuzzbench.

@addisoncrump
Contributor Author

@DonggeLiu Any complaints if I make the libafl change in this PR as well?

@DonggeLiu
Contributor

> @DonggeLiu Any complaints if I make the libafl change in this PR as well?

Ah, we would really appreciate it if you could do that in a different PR, given it is a stand-alone change.
Hope that won't cause too much trouble : )

Thanks!

@DonggeLiu
Contributor

Thanks for the info, @tokatoka.

> can you change the commit we are using for libafl?

What is the preferred commit to use?

@tokatoka
Contributor

I'd say we can just use the latest.

@addisoncrump
Contributor Author

addisoncrump commented Aug 12, 2024

> Wait, something is going wrong with the 2024-08-10-base.

@DonggeLiu, was the root cause ever discovered?

@DonggeLiu
Contributor

> @DonggeLiu, was the root cause ever discovered?

I think this is the reason: #2023.

There are other warnings/errors, but I reckon this is the reason.

@DonggeLiu
Contributor

Also seeing a lot of this, but I presume that's unrelated to your PR?

```python
Traceback (most recent call last):
  File "/work/src/experiment/measurer/coverage_utils.py", line 74, in generate_coverage_report
    coverage_reporter.generate_coverage_summary_json()
  File "/work/src/experiment/measurer/coverage_utils.py", line 141, in generate_coverage_summary_json
    result = generate_json_summary(coverage_binary,
  File "/work/src/experiment/measurer/coverage_utils.py", line 269, in generate_json_summary
    with open(output_file, 'w', encoding='utf-8') as dst_file:
FileNotFoundError: [Errno 2] No such file or directory: '/work/measurement-folders/lcms_cms_transform_fuzzer-centipede/merged.json'
```

@addisoncrump
Contributor Author

I don't think so -- the only modifications to that file were applied by the formatter. I can just revert the whole file if needed.

@DonggeLiu
Contributor

> I can just revert that whole file if needed.

No need, I've addressed this in #2023.
Later we can merge that into here.

@DonggeLiu
Contributor

Oh, thanks for doing this.
I don't think that was caused by your modification, but since you have reverted it, let's run an experiment to confirm.

@DonggeLiu
Contributor

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-12-2020 --fuzzers aflplusplus centipede honggfuzz libfuzzer

@addisoncrump
Contributor Author

👍 I figure since I didn't make any meaningful changes to that file anyway, better to leave it untouched. If the experiment magically starts working, I have no idea what that means, but I'll be happy about it lol

@DonggeLiu
Contributor

Experiment 2024-08-12-2020 data and results will be available later at:
The experiment data.
The experiment report.
The experiment report (experimental).

@addisoncrump
Contributor Author

Yeah, looks like it's not working. This run should probably be cancelled, if only to save some CPU time.

@DonggeLiu
Contributor

DonggeLiu commented Aug 13, 2024

Yep, I suspect this is due to a benchmark compatibility issue.
Let me verify this.


Also, seeing a lot of instances in this experiment being preempted:
[screenshot: list of preempted instances]

@addisoncrump
Contributor Author

Superseded by #2028.

DonggeLiu added a commit that referenced this pull request Aug 15, 2024
1. Fix `TypeError: expected str, bytes or os.PathLike object, not
NoneType` in
[`2024-08-10-test`](#2020 (comment)).
```python
Traceback (most recent call last):
  File "/src/experiment/runner.py", line 468, in experiment_main
    runner.conduct_trial()
  File "/src/experiment/runner.py", line 290, in conduct_trial
    self.set_up_corpus_directories()
  File "/src/experiment/runner.py", line 275, in set_up_corpus_directories
    _unpack_clusterfuzz_seed_corpus(target_binary, input_corpus)
  File "/src/experiment/runner.py", line 144, in _unpack_clusterfuzz_seed_corpus
    seed_corpus_archive_path = get_clusterfuzz_seed_corpus_path(
  File "/src/experiment/runner.py", line 98, in get_clusterfuzz_seed_corpus_path
    fuzz_target_without_extension = os.path.splitext(fuzz_target_path)[0]
  File "/usr/local/lib/python3.10/posixpath.py", line 118, in splitext
    p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not NoneType
```
This happens on [many
benchmarks+fuzzers](https://pantheon.corp.google.com/logs/query;query=%222024-08-10-test%22%0Aseverity%3E%3DERROR%0A--Hide%20similar%20entries%0A-%2528jsonPayload.message%3D~%22Error%20watching%20metadata:%20context%20canceled%22%2529%0A--End%20of%20hide%20similar%20entries;cursorTimestamp=2024-08-10T11:04:34.735815901Z;duration=P7D?project=fuzzbench&mods=logs_tg_prod).
To be investigated later:
1. Why `fuzz_target_path` is `None`.
2. Why this did not happen in other recent experiments.
3. I thought I had seen this long ago; déjà vu?

2. Fix `No such file or directory:
'/work/measurement-folders/<benchmark>-<fuzzer>/merged.json'`:
```python
Traceback (most recent call last):
  File "/work/src/experiment/measurer/coverage_utils.py", line 74, in generate_coverage_report
    coverage_reporter.generate_coverage_summary_json()
  File "/work/src/experiment/measurer/coverage_utils.py", line 141, in generate_coverage_summary_json
    result = generate_json_summary(coverage_binary,
  File "/work/src/experiment/measurer/coverage_utils.py", line 269, in generate_json_summary
    with open(output_file, 'w', encoding='utf-8') as dst_file:
FileNotFoundError: [Errno 2] No such file or directory: '/work/measurement-folders/lcms_cms_transform_fuzzer-centipede/merged.json'
```

3. Remove incompatible benchmarks: `openh264_decoder_fuzzer`,
`stb_stbi_read_fuzzer`
DonggeLiu pushed a commit that referenced this pull request Aug 16, 2024
Changing forks so @tokatoka can collaborate with me on this. Supersedes
#2021.
As requested in #2020.