For this exercise we will fuzz the LibXML2 XML parsing library. The goal is to find a crash/PoC for CVE-2017-9048 in LibXML2 2.9.4.
More information about the CVE-2017-9048 vulnerability:
CVE-2017-9048 is a stack buffer overflow vulnerability affecting the DTD validation functionality of LibXML2.
A stack buffer overflow is a type of buffer overflow where the buffer being overwritten is allocated on the stack.
As a result, a remote attacker can exploit this issue to execute arbitrary code within the context of an application using the affected library.
You can find more information about stack buffer overflow vulnerabilities at the following link: https://cwe.mitre.org/data/definitions/121.html
Once you complete this exercise you will know how:
- To use custom dictionaries to help the fuzzer find new execution paths
- To parallelize the fuzzing job across multiple cores
- I suggest that you try to solve the exercise by yourself without checking the solution. Try as hard as you can, and check the example solution below only if you get stuck.
- AFL uses a non-deterministic testing algorithm, so two fuzzing sessions are never the same. That's why I highly recommend setting a fixed seed (-s 123). This way your fuzzing results will be similar to those shown here, which will make the exercises easier to follow.
- If you find a new vulnerability, please submit a security report to the project. If you need help or have any doubts about the process, the GitHub Security Lab can help you with it :)
Are you stuck and looking for help? Do you have suggestions for making this course better or just positive feedback so that we create more fuzzing content?
Do you want to share your fuzzing experience with the community?
Join the GitHub Security Lab Slack and head to the #fuzzing channel. Request an invite to the GitHub Security Lab Slack.
All the exercises have been tested on Ubuntu 20.04.2 LTS. I highly recommend that you use the same OS version to avoid different fuzzing results, and that you run AFL++ on bare-metal hardware rather than in a virtual machine for best performance.
Otherwise, you can find an Ubuntu 20.04.2 LTS VMware image here. You can also use VirtualBox instead of VMware.
The username / password for this VM are fuzz / fuzz.
When we want to fuzz complex text-based file formats (such as XML), it’s useful to provide the fuzzer with a dictionary containing a list of basic syntax tokens.
In the case of AFL, such a dictionary is simply a set of words or values which is used by AFL to apply changes to the current in-memory file. Specifically, AFL performs the following changes with the values provided in the dictionary:
- Override: Replaces n bytes at a specific position with a dictionary entry, where n is the length of that entry.
- Insert: Inserts the dictionary entry at the current file position, forcing all characters to move n positions down and increasing file size.
You can find a good bunch of examples here
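As a rough illustration of the two operations described above, the following sketch applies them to a small text buffer. The helpers `override_at` and `insert_at` are hypothetical names made up for this example; they only mimic the effect on ASCII text and are not AFL code.

```shell
#!/bin/sh
# Illustration only: mimics the two dictionary mutations on an ASCII buffer.
# override_at and insert_at are hypothetical helpers, not part of AFL.

# Replace length(token) bytes starting at offset pos (0-based).
# The buffer size does not change.
override_at() {
    awk -v buf="$1" -v pos="$2" -v tok="$3" 'BEGIN {
        print substr(buf, 1, pos) tok substr(buf, pos + length(tok) + 1)
    }'
}

# Insert token at offset pos (0-based), shifting the rest of the buffer
# and growing it by length(token) bytes.
insert_at() {
    awk -v buf="$1" -v pos="$2" -v tok="$3" 'BEGIN {
        print substr(buf, 1, pos) tok substr(buf, pos + 1)
    }'
}

override_at '<a>AAAAAAAA</a>' 3 '<!--'   # prints <a><!--AAAA</a>
insert_at   '<a>AAAAAAAA</a>' 3 '<!--'   # prints <a><!--AAAAAAAA</a>
```

Note how the override keeps the buffer at 15 bytes, while the insert grows it to 19 bytes.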
If you have a multi-core system, it is a good idea to parallelize your fuzzing job to make the most of your CPU resources.
Running independent instances is the simplest parallelization strategy. In this mode, we run fully separate instances of afl-fuzz.
It is important to remember that AFL uses a non-deterministic testing algorithm. So, if we run multiple AFL instances, we increase our chances of success.
For this, you only need to run multiple instances of "afl-fuzz" in multiple terminal windows, setting a different "output folder" for each one of them. A simple approach is to run as many fuzzing jobs as there are cores on your system.
Note: If you're using the -s flag, you need to use a different seed for each instance.
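A minimal sketch of the independent-instances setup: the hypothetical helper `gen_independent` below only prints one command line per instance, so you can paste each one into its own terminal. The folder names `afl_out_N`, the seed values and `./program` are placeholders chosen for this example.

```shell
#!/bin/sh
# Prints (does not run) one afl-fuzz command per independent instance.
# Each instance gets its own output folder and its own seed.
# gen_independent, afl_out_N and the seed scheme are illustrative assumptions.
gen_independent() {
    n="$1"
    i=1
    while [ "$i" -le "$n" ]; do
        echo "afl-fuzz -i afl_in -o afl_out_$i -s $((122 + i)) -- ./program @@"
        i=$((i + 1))
    done
}

gen_independent 3
```

To generate one command per core, call it as `gen_independent "$(nproc)"`.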
The use of shared instances is a better approach to parallel fuzzing. In this case, each fuzzer instance gathers any test cases found by other fuzzers.
You will usually have only one master instance at a time:
./afl-fuzz -i afl_in -o afl_out -M Master -- ./program @@
and N-1 slaves:
./afl-fuzz -i afl_in -o afl_out -S Slave1 -- ./program @@
./afl-fuzz -i afl_in -o afl_out -S Slave2 -- ./program @@
...
./afl-fuzz -i afl_in -o afl_out -S SlaveN -- ./program @@
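The same pattern can be generated for any N. The sketch below uses a hypothetical helper, `gen_shared`, that prints one master command and N-1 slave commands. Unlike independent instances, all of them share the same output folder, which is how the instances exchange test cases.

```shell
#!/bin/sh
# Prints (does not run) the commands for N shared instances:
# one master (-M) and N-1 slaves (-S), all using the same output folder.
# gen_shared is a hypothetical helper; ./program is a placeholder target.
gen_shared() {
    n="$1"
    echo "./afl-fuzz -i afl_in -o afl_out -M Master -- ./program @@"
    i=1
    while [ "$i" -lt "$n" ]; do
        echo "./afl-fuzz -i afl_in -o afl_out -S Slave$i -- ./program @@"
        i=$((i + 1))
    done
}

gen_shared 4
```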
In order to complete this exercise, you need to:
- Find an interface application that makes use of the LibXML2 library
- Copy the SampleInput.xml file to your AFL input folder
- Create a custom dictionary for fuzzing XML
- Fuzz LibXML2 until you have a few unique crashes. I recommend that you use as many AFL instances as possible (one per CPU core)
- Triage the crashes to find a PoC for the vulnerability
- Fix the issues
Estimated time = 3 hours
SPOILER ALERT! : Solution inside
Let's first get our fuzzing target. Create a new directory for the project you want to fuzz:
cd $HOME
mkdir Fuzzing_libxml2 && cd Fuzzing_libxml2
Download and uncompress libxml2-2.9.4.tar.gz:
wget http://xmlsoft.org/download/libxml2-2.9.4.tar.gz
tar xvf libxml2-2.9.4.tar.gz && cd libxml2-2.9.4/
Build and install libxml2:
sudo apt-get install python-dev
CC=afl-clang-lto CXX=afl-clang-lto++ CFLAGS="-fsanitize=address" CXXFLAGS="-fsanitize=address" LDFLAGS="-fsanitize=address" ./configure --prefix="$HOME/Fuzzing_libxml2/libxml2-2.9.4/install" --disable-shared --without-debug --without-ftp --without-http --without-legacy --without-python LIBS='-ldl'
make -j$(nproc)
make install
Now, we can test that everything is working with:
./xmllint --memory ./test/wml.xml
and you should see the parsed XML document printed to the terminal.
First of all, we need to get some XML samples. We're gonna use the SampleInput.xml provided in this repository:
mkdir afl_in && cd afl_in
wget https://raw.githubusercontent.com/antonio-morales/Fuzzing101/main/Exercise%205/SampleInput.xml
cd ..
Now, you need to create an XML dictionary. Alternatively, you can use the XML dictionary provided with AFL++:
mkdir dictionaries && cd dictionaries
wget https://raw.githubusercontent.com/AFLplusplus/AFLplusplus/stable/dictionaries/xml.dict
cd ..
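If you would rather write your own dictionary, a minimal sketch could look like the following. The file name `xml_custom.dict` and the selection of tokens are assumptions made for this example. Since the bug lives in the DTD validation code, the list leans towards DTD keywords. Each entry uses the `name="value"` form, and lines starting with `#` are comments.

```shell
#!/bin/sh
# Writes a small, hand-made XML dictionary for afl-fuzz.
# xml_custom.dict and the chosen tokens are illustrative assumptions.
mkdir -p dictionaries
cat > dictionaries/xml_custom.dict <<'EOF'
# Minimal XML/DTD tokens, one entry per line
tag_open="<"
tag_close=">"
tag_end="</"
comment_open="<!--"
comment_close="-->"
doctype="<!DOCTYPE"
element="<!ELEMENT"
attlist="<!ATTLIST"
entity="<!ENTITY"
cdata_open="<![CDATA["
pcdata="#PCDATA"
EOF
```

You would then pass it to the fuzzer with `-x ./dictionaries/xml_custom.dict`.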
In order to catch the bug, it is mandatory to enable the --valid parameter. I also set the dictionary path with the -x flag and enabled deterministic mutations with the -D flag (only for the master fuzzer).
For example, I ran the fuzzer with the following command:
afl-fuzz -m none -i ./afl_in -o afl_out -s 123 -x ./dictionaries/xml.dict -D -M master -- ./xmllint --memory --noenc --nocdata --dtdattr --loaddtd --valid --xinclude @@
You can run another slave instance with:
afl-fuzz -m none -i ./afl_in -o afl_out -s 234 -S slave1 -- ./xmllint --memory --noenc --nocdata --dtdattr --loaddtd --valid --xinclude @@
Are you interested in fuzzing command-line arguments? Take a look at the "Fuzzing command-line arguments" section of the following blog post.
After a while, you should have multiple crashes:
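Debugging a program built with ASan is much easier than in the previous exercises. All you need to do is feed the crash file to the program. Crash files are stored in a folder named after the instance that found them (for example afl_out/master/crashes/), so adjust the path and file name below to match your own run:
./xmllint --memory --noenc --nocdata --dtdattr --loaddtd --valid --xinclude './afl_out/master/crashes/id:000000,sig:06,src:003963,time:12456489,op:havoc,rep:4'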
You will then get a nice summary of the crash from ASan, including the execution trace.
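When several instances are running, crashes end up spread across one folder per instance. The sketch below, with the hypothetical helpers `list_crashes` and `replay_crashes`, collects every crash file under the output folder and replays each one, printing only ASan's "SUMMARY:" line so that similar crashes are easy to spot.

```shell
#!/bin/sh
# list_crashes (hypothetical helper): prints every crash file saved by any
# instance under the given AFL output folder, skipping AFL's README.txt.
list_crashes() {
    find "$1" -path '*/crashes/id*' -type f | sort
}

# replay_crashes (hypothetical helper): runs the given command once per
# crash file and prints the ASan "SUMMARY:" line, if there is one.
# Usage: replay_crashes OUT_DIR COMMAND [ARGS...]
replay_crashes() {
    out_dir="$1"
    shift
    list_crashes "$out_dir" | while IFS= read -r crash; do
        echo "== $crash"
        "$@" "$crash" </dev/null 2>&1 | grep 'SUMMARY:' || echo "(no ASan summary)"
    done
}
```

For this exercise it could be called as `replay_crashes afl_out ./xmllint --memory --noenc --nocdata --dtdattr --loaddtd --valid --xinclude`.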
The last step of the exercise is to fix the bug! Rebuild your target after the fix and check that your PoC doesn't crash the program anymore. This last part is left as an exercise for the student.
Solution inside
Official fix:
Alternatively, you can download a newer version of LibXML2 and check that the bug has been fixed.