Add new measurer based on Mutation Analysis #1901
base: master
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Co-authored-by: Philipp Görz <phi-go@users.noreply.github.com>
Hey @alan32liu, the current commit should hopefully pass the CI checks, though we are still encountering some errors for the integration tests with our setup; I believe this is just caused by differences in the environment. We have merged the fixes for #1937, thank you for your help there. The current commit is still a work in progress regarding report generation, and currently only 10 mutant binaries are built to keep testing times low. We added the more granular timestamps as well (see the relevant changes). If the CI checks pass, could you start a gcbrun? I think the following command should be correct:
We also encounter two errors that we are quite confused by; maybe you have an idea? One is part of the integration tests:
The other happens during report generation, which we tried to debug, but we are not even sure which of our changes could have caused it:
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-23-mua-measurer --fuzzers aflplusplus aflplusplus_407 --mutation-analysis
I will list my initial thoughts below, but I did not have a chance to read your code thoroughly and may make mistakes.
It seems related to these lines: fuzzbench/fuzzers/libfuzzer/fuzzer.py, lines 49 to 52 at 2bc06d4.
The corresponding lines (fuzzbench/analysis/data_utils.py, lines 162 to 164 at 2bc06d4) find the first bug covered by each fuzzer on each benchmark and add it as a new column in the original data frame.
The error complains that the size of the new column does not align with the dataframe. For example, is there a column mismatch in the result you pasted above?
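To illustrate, here is a minimal standalone sketch of that failure mode (hypothetical data, not the actual `data_utils.py` code):

```python
import pandas as pd

# One row per (fuzzer, benchmark, snapshot time) triple.
df = pd.DataFrame({
    'fuzzer': ['afl', 'afl', 'libfuzzer'],
    'benchmark': ['libxml2_xml'] * 3,
    'time': [900, 1800, 900],
})

# One value per (fuzzer, benchmark) group: 2 values here, but df has 3 rows.
firsts = df.groupby(['fuzzer', 'benchmark'])['time'].min()

try:
    # Assigning the group-level result directly fails because its length
    # does not match the frame's index.
    df['first_time'] = firsts.values
except ValueError as error:
    print(error)  # Length of values (2) does not match length of index (3)

# transform() returns one value per original row, so the sizes align.
df['first_time'] = df.groupby(['fuzzer', 'benchmark'])['time'].transform('min')
```

If the new column is computed at group level somewhere, a transform-style fix might apply; without the full traceback this is just a guess, though.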
Also, I would fix the first error and re-run the experiment to check whether the second one still exists, just in case it is caused by the first.
This failed because there is no fuzzer named `aflplusplus_407`.
So the presubmit error seems to be this:
Regarding openh264, I have merged the main branch, but I see now that the fix is not in there; let me add the fix from the pull request.
I just copied the command assuming those fuzzers existed in this branch; it doesn't really matter which fuzzer the experiment is run with, so we can also just use one. Thank you for the comments on the errors; we won't have time this weekend but will look into this starting Monday.
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-12-23-mua-measurer --fuzzers aflplusplus --mutation-analysis
This is because your statement requires `gsutil`. Could that statement be replaced by mocks?
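E.g., a rough sketch of what I mean, with hypothetical names, assuming the statement shells out to gsutil via subprocess:

```python
import subprocess
from unittest import mock

# Hypothetical stand-in for the setup statement that shells out to gsutil.
def fetch_corpus(bucket_url, dest_dir):
    subprocess.check_call(['gsutil', '-m', 'cp', '-r', bucket_url, dest_dir])

@mock.patch('subprocess.check_call')
def test_fetch_corpus_without_gsutil(mocked_check_call):
    # The patch intercepts the call, so neither gsutil nor network access
    # is needed in the test environment.
    fetch_corpus('gs://example-bucket/corpus', '/tmp/corpus')
    mocked_check_call.assert_called_once_with(
        ['gsutil', '-m', 'cp', '-r', 'gs://example-bucket/corpus',
         '/tmp/corpus'])
```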
How does this test work for the coverage measurer? As far as I can see there is no fixture setting up the data for the coverage measurer. We just added the mua commands (which require gsutil) there so that this test prepares the environment for the mua measurer correctly, but it seems there is no similar setup needed for the coverage measurer. We can of course also try to use mocking instead.
Don't know, I would have to read the code to learn :)
Also, could you give it another go? I did some more performance optimizations that should have improved things:
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-01-03-mua-xml2-2f --fuzzers afl libfuzzer --benchmarks libxml2_xml --mutation-analysis
/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2023-01-04-mua-xml2-2f --fuzzers afl libfuzzer --benchmarks libxml2_xml --mutation-analysis
Seems like […]. You might want to delete the data for […].
Done, thanks for the reminder!
We recently published a paper describing an approach to using mutation analysis for fuzzer benchmarking. This pull request aims to support this approach (Phase I only) in FuzzBench.
There are a few mismatches between FuzzBench and our framework that need solutions; while I have some ideas, I would be interested in your input as well!
My understanding of measurers for FuzzBench is as follows:
At the start of a run, all docker images are built. Specifically, for the coverage measurer, `generated.mk` contains targets of the form `build-coverage-{benchmark}` which build the images containing the executables used for coverage collection. These make commands are started in `builder.py`, and after the image is built the coverage executables are extracted and stored on the host.

The coverage executables are then used by the code in `measure_manager.py` to measure `unmeasured_snapshots`, which I would expect to be newly created fuzzer queue entries. Coverage measurement happens on the host in a temporary directory as far as I can tell, with a timeout of 900 seconds per snapshot.
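To make the above concrete, here is a rough sketch of that flow as I understand it (hypothetical names and commands; this is not the actual `builder.py`/`measure_manager.py` code):

```python
import subprocess
import tempfile

SNAPSHOT_TIMEOUT = 900  # seconds, per snapshot

def build_coverage_binaries(benchmark):
    # generated.mk contains one such target per benchmark; building the
    # image produces the executables used for coverage collection.
    subprocess.check_call(['make', f'build-coverage-{benchmark}'])
    # The coverage executables are then extracted from the image and
    # stored on the host (details elided here).

def measure_snapshot(coverage_binary, corpus_dir):
    # Each unmeasured snapshot (a fuzzer's corpus at one point in time)
    # is measured on the host in a temporary working directory, bounded
    # by the per-snapshot timeout.
    with tempfile.TemporaryDirectory() as tmp_dir:
        subprocess.check_call([coverage_binary, corpus_dir],
                              cwd=tmp_dir, timeout=SNAPSHOT_TIMEOUT)
```

Open points: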