heiFIP stands for Heidelberg Flow Image Processor. It is a tool designed to extract essential parts of packets and convert them into images for deep learning purposes. heiFIP supports different formats and orientations. Currently, we only support offline network data analysis. However, we plan to adapt our library to support online network data too to enable live-probing of models.
Latest release: Version 1.0
The idea to create heiFIP came from working with Deep Learning approaches to classify malware traffic on images. Many papers use image representation of network traffic, but reproducing their results was quite cumbersome. As a result, we found that there is currently no official library that supports reproducible images of network traffic. For this reason, we developed heiFIP to easily create images of network traffic and reproduce ML/DL results. Researchers can use this library as a baseline for their work to enable other researchers to easily recreate their findings.
- Different Images: Currently, we support plain packet-to-byte and flow-to-byte representations, each with one channel. Images are created with equal width and height, giving a square representation.
  - Flow Images convert a set of packets into an image. They support the following modifications:
    - Max image dimension allows you to specify the maximum image dimension. If the packet is larger than the specified size, the remaining pixels are cut off.
    - Min image dimension allows you to specify the minimum image dimension. If the packet is smaller than the specified size, the remaining pixels are filled with 0.
    - Remove duplicates allows you to automatically remove identical traffic.
    - Append: either append the packets of a flow to one another or write each packet to a new row.
    - Tiled: each flow is tiled into a square image representation.
    - Min packets per flow allows you to specify the minimum number of packets per flow. If a flow contains fewer packets, no image is created.
    - Max packets per flow allows you to specify the maximum number of packets per flow. If a flow contains more packets, the excess packets are discarded.
  - Packet Image converts a single packet into an image.
  - Markov Transition Matrix Image converts a packet or a flow into a Markov representation.
- Header processing allows you to customize header fields of different protocols. It aims to remove biasing fields.
- Remove payload option allows you to work on header data only.
- Fast and flexible: The main image processing operates on raw bytes inside the image classes, while header preprocessing uses PcapPlusPlus.
- Machine learning orientation: heiFIP aims to make Deep Learning approaches using network data as images reproducible and deployable. Using heiFIP as a common framework enables researchers to test and verify their models.
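To make the minimum and maximum image dimension options more concrete, here is a minimal sketch of how raw packet bytes could be laid out as a one-channel square image. It is an illustration under assumptions, not heiFIP's actual implementation: `SquareImage`, `chooseDim`, and `bytesToSquareImage` are hypothetical names, and the rule for picking the side length (the smallest square that holds all bytes, clamped to the allowed range) is our assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for a one-channel square image (not a heiFIP type).
struct SquareImage {
    std::size_t dim = 0;               // width == height
    std::vector<std::uint8_t> pixels;  // row-major, dim * dim entries
};

// Assumed rule: smallest side whose square holds n bytes, clamped to [minDim, maxDim].
std::size_t chooseDim(std::size_t n, std::size_t minDim, std::size_t maxDim) {
    assert(minDim >= 1 && minDim <= maxDim);
    std::size_t side = 0;
    while (side * side < n) {
        ++side;
    }
    return std::clamp(side, minDim, maxDim);
}

SquareImage bytesToSquareImage(const std::vector<std::uint8_t>& bytes,
                               std::size_t minDim, std::size_t maxDim,
                               std::uint8_t fill = 0) {
    SquareImage img;
    img.dim = chooseDim(bytes.size(), minDim, maxDim);
    // Start with an image made entirely of the fill value (padding).
    img.pixels.assign(img.dim * img.dim, fill);
    // Copy as many bytes as fit; anything beyond dim * dim is cut off.
    const std::size_t n = std::min(bytes.size(), img.pixels.size());
    std::copy_n(bytes.begin(), n, img.pixels.begin());
    return img;
}
```

With a minimum of 2 and a maximum of 8, a 10-byte packet would end up in a 4x4 image whose last six pixels hold the fill value, while a 100-byte packet limited to a maximum of 3 would be cut to its first nine bytes.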
- C++ Compiler: GCC ≥ 9.0, Clang ≥ 10, or MSVC 2019 with C++17 support.
- CMake: Version ≥ 3.14
- PcapPlusPlus: Installed system‑wide or built locally. (https://github.com/seladb/PcapPlusPlus)
- OpenSSL: For SHA256 hashing (libcrypto).
- OpenCV: Version ≥ 4.0 for image handling and saving (e.g., cv::imwrite).
- pthread: POSIX threads (Linux/macOS). Windows users need to link against `-lws2_32` and `-lIPHLPAPI`.
- libpcap: PCAP support (Linux/macOS).

Optional:
- getopt_long: For CLI parsing (provided by libc on Linux/macOS). Windows may need a `getopt` replacement.
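Since `getopt_long` is listed for CLI parsing, the following minimal sketch shows how a handful of long options could be parsed with it on Linux or macOS. The option names are borrowed from heiFIP's command line interface, but the `CliOptions` struct and the `parseCli` function are hypothetical and only serve as an illustration; heiFIP's real parser may be organized differently.

```cpp
#include <cassert>
#include <cstdlib>
#include <getopt.h>
#include <string>

// Hypothetical options struct; field names are illustrative, not heiFIP's.
struct CliOptions {
    std::string input;
    std::string output;
    int threads = 1;
    bool removeDup = false;
    bool ok = true;  // set to false when an unknown option is seen
};

CliOptions parseCli(int argc, char* const args[]) {
    assert(args != nullptr);
    static const struct option longOpts[] = {
        {"input",      required_argument, nullptr, 'i'},
        {"output",     required_argument, nullptr, 'o'},
        {"threads",    required_argument, nullptr, 't'},
        {"remove-dup", no_argument,       nullptr, 'r'},  // long-only flag
        {nullptr, 0, nullptr, 0}
    };

    CliOptions opts;
    optind = 1;  // restart scanning so the function can be called more than once
    int c = 0;
    while ((c = getopt_long(argc, args, "i:o:t:", longOpts, nullptr)) != -1) {
        switch (c) {
            case 'i': opts.input = optarg; break;
            case 'o': opts.output = optarg; break;
            case 't': opts.threads = std::atoi(optarg); break;
            case 'r': opts.removeDup = true; break;
            default:  opts.ok = false; break;  // getopt_long already printed a message
        }
    }
    return opts;
}
```

On Windows, where `<getopt.h>` is not part of the standard toolchain, the same function would need one of the `getopt` replacements mentioned above.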
```bash
# Clone this repo
git clone https://github.com/yourusername/heiFIPCpp.git
cd heiFIP/heiFIP/

# Create the build directory and configure with automatic dependency detection
mkdir build && cd build
cmake ..

# We highly recommend locating the dependencies manually, especially PcapPlusPlus,
# which is often not installed in a standard location. We use scripts to detect the
# dependencies automatically; if those scripts fail, you can specify the paths to the
# include directories and to the libraries manually, as shown below. Do not forget to
# specify all three PcapPlusPlus libraries: libCommon++, libPacket++, and libPcap++.
# For OpenCV, specifying paths manually is possible but tedious because of the number
# of libraries to link; since OpenCV ships its own CMake configuration, it is usually
# unnecessary. On macOS with Apple Silicon, make sure the linked libraries are ARM64
# builds and not Intel (x86_64) bottles. Mixing architectures either fails at link time
# or forces the whole program to run as x86_64 under Rosetta 2, which incurs significant
# overhead. If possible, use a Linux distribution.
cmake .. \
  -DCMAKE_BUILD_TYPE=Release \
  -DUSE_MANUAL_PCAPPLUSPLUS=ON \
  -DPcapPlusPlus_INCLUDE_DIRS="/opt/homebrew/Cellar/pcapplusplus/25.05/include" \
  -DPcapPlusPlus_LIBRARIES="/opt/homebrew/Cellar/pcapplusplus/25.05/lib/libCommon++.a\;/opt/homebrew/Cellar/pcapplusplus/25.05/lib/libPacket++.a\;/opt/homebrew/Cellar/pcapplusplus/25.05/lib/libPcap++.a" \
  -DUSE_MANUAL_OPENSSL=ON \
  -DOPENSSL_INCLUDE_DIR="/opt/homebrew/opt/openssl@3/include" \
  -DOPENSSL_CRYPTO_LIBRARY="/opt/homebrew/opt/openssl@3/lib/libcrypto.a"

# Compile from inside build/ (on macOS, replace $(nproc) with $(sysctl -n hw.ncpu))
make -j$(nproc)
# or
cmake --build .

# The executable 'heiFIPCpp' will be produced in build/
```
After installation, the command line interface can be used to extract images from pcap files with the following command:
```bash
./heiFIPCpp \
  --name HelloHeiFIP \
  --input /path/to/capture.pcap \
  --output /path/to/outdir \
  --threads 4 \
  --processor HEADER \
  --mode FlowImageTiledAuto \
  --dim 16 \
  --append \
  --fill 0 \
  --min-dim 10 \
  --max-dim 2000 \
  --min-pkts 10 \
  --max-pkts 100 \
  --remove-dup
```
| Flag | Description |
|---|---|
| `-i`, `--input` | Input PCAP file path |
| `-o`, `--output` | Output directory |
| `-t`, `--threads` | Number of worker threads (default: 1) |
| `-p`, `--processor` | Preprocessing: `NONE` or `HEADER` |
| `-m`, `--mode` | Image type: `PacketImage`, `FlowImage`, `FlowImageTiledFixed`, `FlowImageTiledAuto`, `MarkovTransitionMatrixFlow`, `MarkovTransitionMatrixPacket` |
| `--dim` | Base dimension for the image (e.g. width/height in pixels) |
| `--fill` | Fill or padding value (0-255) |
| `--cols` | Number of columns (for tiled/fixed or Markov flow) |
| `--auto-dim` | Enable auto-dimension selection (bool) |
| `--append` | Append packets to one another instead of writing each packet to a new row (bool) |
| `--min-dim` | Minimum allowed image dimension |
| `--max-dim` | Maximum allowed image dimension |
| `--min-pkts` | Minimum packets per flow (for tiled/flow modes) |
| `--max-pkts` | Maximum packets per flow |
| `--remove-dup` | Remove duplicate flows/packets by hash |
| `--name` | Filename of the processed image |
| `-h`, `--help` | Show this help message |
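The `MarkovTransitionMatrixPacket` and `MarkovTransitionMatrixFlow` modes turn traffic into a Markov representation. As a rough illustration of the general idea, the sketch below counts transitions between consecutive symbols, normalizes each row into probabilities, and scales the result to grey values. It assumes that each byte is split into two 4-bit symbols, which gives a 16x16 matrix; this state space and all names (`toNibbles`, `transitionMatrix`, `toGreyscale`) are our assumptions, and heiFIP's actual encoding may differ.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Matrix16 = std::array<std::array<double, 16>, 16>;

// Assumption: each byte contributes two states, its high and low nibble (0..15).
std::vector<std::uint8_t> toNibbles(const std::vector<std::uint8_t>& bytes) {
    std::vector<std::uint8_t> states;
    states.reserve(bytes.size() * 2);
    for (std::uint8_t b : bytes) {
        states.push_back(static_cast<std::uint8_t>(b >> 4));
        states.push_back(static_cast<std::uint8_t>(b & 0x0F));
    }
    return states;
}

// Count transitions state[k] -> state[k + 1], then normalize every non-empty row.
Matrix16 transitionMatrix(const std::vector<std::uint8_t>& states) {
    Matrix16 m{};  // zero-initialized
    for (std::size_t k = 0; k + 1 < states.size(); ++k) {
        assert(states[k] < 16 && states[k + 1] < 16);
        m[states[k]][states[k + 1]] += 1.0;
    }
    for (auto& row : m) {
        double sum = 0.0;
        for (double v : row) sum += v;
        if (sum > 0.0) {
            for (double& v : row) v /= sum;
        }
    }
    return m;
}

// Map probabilities in [0, 1] to grey values in [0, 255], row-major.
std::array<std::uint8_t, 256> toGreyscale(const Matrix16& m) {
    std::array<std::uint8_t, 256> img{};
    for (std::size_t r = 0; r < 16; ++r) {
        for (std::size_t c = 0; c < 16; ++c) {
            img[r * 16 + c] = static_cast<std::uint8_t>(m[r][c] * 255.0 + 0.5);
        }
    }
    return img;
}
```

Under these assumptions, the resulting image always has the same size regardless of packet length, since the matrix dimension depends only on the number of states.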
To add a new image type:
- Define a new `ImageArgs` struct in `extractor.cpp`.
- Extend the `ImageType` enum.
- Implement the conversion in `PacketProcessor::createImageFromPacket()`.
- Update the CLI `--mode` parser to include your new type.
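The four steps above can be sketched in a self-contained form. Everything below is hypothetical and simplified: the `InvertedPacketImage` type, its argument struct, and the free functions `createImageFromBytes` and `parseMode` are stand-ins invented for this illustration and do not mirror the real signatures in `extractor.cpp` or `PacketProcessor`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

using Pixels = std::vector<std::uint8_t>;

// Step 1: argument struct for the new (hypothetical) image type.
struct InvertedPacketImageArgs {
    std::size_t dim = 16;   // side length of the square image
    std::uint8_t fill = 0;  // padding value
};

// Step 2: the enum gains one member; InvertedPacketImage is the invented addition.
enum class ImageType { PacketImage, InvertedPacketImage };

// Step 3: stand-in for the conversion routine, dispatching on the image type.
Pixels createImageFromBytes(ImageType type, const Pixels& bytes,
                            const InvertedPacketImageArgs& args) {
    assert(args.dim > 0);
    Pixels img(args.dim * args.dim, args.fill);
    const std::size_t n = std::min(bytes.size(), img.size());
    for (std::size_t i = 0; i < n; ++i) {
        switch (type) {
            case ImageType::PacketImage:
                img[i] = bytes[i];
                break;
            case ImageType::InvertedPacketImage:
                img[i] = static_cast<std::uint8_t>(255 - bytes[i]);
                break;
        }
    }
    return img;
}

// Step 4: the --mode parser learns the new name.
std::optional<ImageType> parseMode(const std::string& mode) {
    if (mode == "PacketImage") return ImageType::PacketImage;
    if (mode == "InvertedPacketImage") return ImageType::InvertedPacketImage;
    return std::nullopt;  // unknown mode
}
```

Keeping the mode parser and the dispatch `switch` next to the enum makes it harder to forget one of the steps, because a compiler with warnings enabled will typically flag a `switch` that misses an enum member.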
- S. Machmeier, M. Hoecker, V. Heuveline, "Explainable Artificial Intelligence for Improving a Session-Based Malware Traffic Classification with Deep Learning", in 2023 IEEE Symposium Series on Computational Intelligence (SSCI), Mexico-City, Mexico, 2023. https://doi.org/10.1109/SSCI52147.2023.10371980
- S. Machmeier, M. Trageser, M. Buchwald, and V. Heuveline, "A generalizable approach for network flow image representation for deep learning", in 2023 7th Cyber Security in Networking Conference (CSNet), Montréal, Canada, 2023. https://doi.org/10.1109/CSNet59123.2023.10339761
The following people contributed to heiFIP:
- Stefan Machmeier: Creator
- Manuel Trageser: Header extraction and customization.
- Henri Rebitzky: Conversion from Python to C++.
This project is licensed under the EUPL-1.2 License. See the License file for details.