ConFuzz is a novel LLM-driven fuzzer designed for security testing against the unsafe consumption of APIs.
It is a scientific prototype developed for my bachelor's thesis, which designs and evaluates an LLM-driven fuzzer for consumer-side API security testing.
Run ConFuzz with the qwen3:8b Ollama model in 'auto' mode (runs through all six scenarios one after another):
python3 main.py --strategy llm --model qwen3:8b --auto
This is the result of ConFuzz running against the six scenario endpoints implemented in the test-environment:

Run the baseline fuzzer with the custom.txt list in 'auto' mode:
python3 main.py --strategy baseline --auto
This is the result of the mutation-based baseline fuzzer that will be used for the comparative analysis with ConFuzz.

The autorun.py script orchestrates the runs for the empirical evaluation. Configure the strategy and model in the config; the script then executes ConFuzz with those options and stores the resulting log files in the evaluation/ directory for further analysis.
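The orchestration idea can be sketched roughly as follows. This is not the actual autorun.py: the CONFIG dictionary, its keys, the log file naming, and the helper names build_command and run_once are assumptions made for illustration. Only the main.py flags (--strategy, --model, --auto) and the evaluation/ directory are taken from this README.

```python
import subprocess
import sys
from datetime import datetime
from pathlib import Path

# Hypothetical configuration; the real autorun.py config may look different.
CONFIG = {"strategy": "llm", "model": "qwen3:8b", "log_dir": "evaluation"}


def build_command(config):
    """Build the main.py invocation from the flags shown in this README."""
    cmd = [sys.executable, "main.py", "--strategy", config["strategy"]]
    # The baseline command in this README is shown without a --model flag.
    if config["strategy"] == "llm":
        cmd += ["--model", config["model"]]
    cmd.append("--auto")
    return cmd


def run_once(config):
    """Run one fuzzing session and write its output to a timestamped log."""
    log_dir = Path(config["log_dir"])
    log_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    log_file = log_dir / f"{config['strategy']}-{stamp}.log"  # assumed naming scheme
    with log_file.open("w") as handle:
        subprocess.run(
            build_command(config),
            stdout=handle,
            stderr=subprocess.STDOUT,
            check=False,  # keep going even if a run exits with an error code
        )
    return log_file


if __name__ == "__main__":
    print(run_once(CONFIG))
```

Keeping the command construction in its own function makes it easy to check which flags a given configuration produces before starting a long evaluation run.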
Some commands and their results used in the empirical evaluation are documented here.
Warning
ConFuzz is only effective for fuzzing the test environment because it lacks a general feedback analysis engine.
To use it in other projects, either the detect_exploit() function must be adapted or a proper feedback analysis module must be implemented.
In addition, the driver configuration must be adapted to the host and endpoints to be fuzzed.
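As a starting point for such an adaptation, a feedback check could look roughly like the sketch below. It does not reflect the real detect_exploit() in ConFuzz: the signature, the Observation type, the ERROR_MARKERS list, and the three heuristics are all assumptions chosen for illustration and would need to be tuned to the target application.

```python
from dataclasses import dataclass

# Hypothetical error markers; a real target needs its own list.
ERROR_MARKERS = ("traceback", "sql syntax", "exception")


@dataclass
class Observation:
    """What the fuzzer saw after sending one payload (assumed shape)."""
    status_code: int
    body: str
    payload: str


def detect_exploit(obs: Observation) -> bool:
    """Flag a response as suspicious using three simple heuristics."""
    body = obs.body.lower()
    # Heuristic 1: a server error suggests the payload was not handled safely.
    if obs.status_code >= 500:
        return True
    # Heuristic 2: the payload is reflected unchanged in the response body.
    if obs.payload and obs.payload.lower() in body:
        return True
    # Heuristic 3: the body leaks a known error signature.
    return any(marker in body for marker in ERROR_MARKERS)
```

For example, detect_exploit(Observation(200, "ok", "abc")) returns False, while a response with status code 500 is always flagged. Heuristics like these produce false positives and false negatives, which is why a dedicated feedback analysis module is the more robust option.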