GitHub - stefanDeveloper/heiFIP: heiFIP: A tool to convert network traffic into images for ML use cases

heiFIP stands for Heidelberg Flow Image Processor. It is a tool designed to extract essential parts of packets and convert them into images for deep learning purposes. heiFIP supports different formats and orientations. It currently supports only offline analysis of network data; we plan to add support for online network data to enable live probing of models.

Latest Release	Version 1.0

Motivation

The idea to create heiFIP came from working with Deep Learning approaches that classify malware traffic from images. Many papers use image representations of network traffic, but reproducing their results was quite cumbersome. We found that no official library currently supports creating reproducible images of network traffic. For this reason, we developed heiFIP to easily create images of network traffic and reproduce ML/DL results. Researchers can use this library as a baseline for their work so that others can easily recreate their findings.

Main Features

Different Images: Currently, we support a plain packet-to-byte representation and a flow-to-byte representation, each with one channel. Images are created with equal width and height, giving a square representation.
- Flow Image: converts a set of packets into an image. It supports the following modifications:
  - Max image dimension: specifies the maximum image dimension. If the packet is larger than the specified size, the remaining pixels are cut off.
  - Min image dimension: specifies the minimum image dimension. If the packet is smaller than the specified size, the remaining pixels are filled with 0.
  - Remove duplicates: automatically removes identical traffic.
  - Append: append the packets of each flow to one another, or write each packet to a new row.
  - Tiled: each flow is tiled into a square image representation.
  - Min packets per flow: specifies the minimum number of packets per flow. If the total number of packets is too small, no image is created.
  - Max packets per flow: specifies the maximum number of packets per flow. If the total number of packets is too large, the remaining packets are discarded.
- Packet Image: converts a single packet into an image.
- Markov Transition Matrix Image: converts a packet or a flow into a Markov representation.
Header processing: allows you to customize header fields of different protocols. It aims to remove fields that introduce bias.
Remove Payload: an option to work on header data only.
Fast and flexible: The main image processing operates on raw bytes inside the image classes, while header preprocessing uses PcapPlusPlus.
Machine learning orientation: heiFIP aims to make Deep Learning approaches that use network data as images reproducible and deployable. Using heiFIP as a common framework enables researchers to test and verify their models.
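
As a rough illustration of the byte-to-image idea described above, the following C++ sketch lays raw bytes out row by row in a square, single-channel image, pads missing pixels with a fill value, and cuts off bytes that exceed the maximum dimension. The names `SquareImage` and `bytesToSquareImage` and the exact clamping rule are assumptions made for this example; they are not heiFIP's API, and the real image classes may size images differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for a square, single-channel image (row-major).
struct SquareImage {
    std::size_t dim = 0;               // width == height
    std::vector<std::uint8_t> pixels;  // dim * dim values
};

// Illustrative only: place the bytes in the smallest square that holds them,
// clamped to [minDim, maxDim]. Unused pixels are set to `fill`; bytes that do
// not fit into the clamped square are cut off.
SquareImage bytesToSquareImage(const std::vector<std::uint8_t>& bytes,
                               std::size_t minDim, std::size_t maxDim,
                               std::uint8_t fill = 0) {
    assert(minDim >= 1 && minDim <= maxDim);

    // Smallest dim with dim * dim >= bytes.size().
    std::size_t dim = 0;
    while (dim * dim < bytes.size()) ++dim;
    dim = std::clamp(dim, minDim, maxDim);

    SquareImage img;
    img.dim = dim;
    img.pixels.assign(dim * dim, fill);

    // Copy as many bytes as fit; the rest is truncated.
    const std::size_t n = std::min(bytes.size(), dim * dim);
    std::copy(bytes.begin(),
              bytes.begin() + static_cast<std::ptrdiff_t>(n),
              img.pixels.begin());
    return img;
}
```

With five input bytes, a minimum dimension of 2, and a maximum of 10, this sketch yields a 3x3 image whose last four pixels hold the fill value.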

Examples

Image Type	Description	Example
Packet	Converts a single packet into a square image. Size depends on the total length.
Flow	Converts the packets of a flow into a square image.
Markov Transition Matrix Packet	Converts a packet into a Markov Transition Matrix. Size is fixed to 16x16.
Markov Transition Matrix Flow	Converts a flow into a Markov Transition Matrix. It squares the image based on the number of packets
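
The fixed 16x16 size is consistent with a transition matrix over 4-bit values, which have 16 possible states. The sketch below assumes exactly that encoding: each byte is split into a high and a low nibble, transitions between consecutive nibbles are counted, and every row is normalized to probabilities. This is an assumption for illustration, not a description of heiFIP's code, and `markovTransitionMatrix` is a hypothetical name.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using Matrix16 = std::array<std::array<double, 16>, 16>;

// Illustrative only: build a row-normalized 16x16 transition matrix from the
// nibble sequence of the input bytes (high nibble first, then low nibble).
Matrix16 markovTransitionMatrix(const std::vector<std::uint8_t>& bytes) {
    std::vector<std::uint8_t> states;
    states.reserve(bytes.size() * 2);
    for (std::uint8_t b : bytes) {
        states.push_back(static_cast<std::uint8_t>(b >> 4));
        states.push_back(static_cast<std::uint8_t>(b & 0x0F));
    }

    Matrix16 m{};  // all entries start at zero

    // Count transitions from each state to its successor.
    for (std::size_t i = 0; i + 1 < states.size(); ++i) {
        m[states[i]][states[i + 1]] += 1.0;
    }

    // Normalize each non-empty row so that it sums to 1.
    for (auto& row : m) {
        double sum = 0.0;
        for (double v : row) sum += v;
        if (sum > 0.0) {
            for (double& v : row) v /= sum;
        }
    }
    return m;
}
```

Rows for states that never occur stay all zero in this sketch; how such rows are rendered in an image is left open here.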

Requirements

C++ Compiler: GCC ≥ 9.0, Clang ≥ 10, or MSVC 2019 with C++17 support.
CMake: Version ≥ 3.14
PcapPlusPlus: Installed system‑wide or built locally. (https://github.com/seladb/PcapPlusPlus)
OpenSSL: For SHA256 hashing (libcrypto).
OpenCV: Version ≥ 4.0 for image handling and saving (e.g., cv::imwrite).
pthread: POSIX threads (Linux/macOS). Windows users require linking against -lws2_32 and -lIPHLPAPI.
libpcap: PCAP Support (Linux/macOS)

Optional:

getopt_long: For CLI parsing (provided by libc on Linux/macOS). Windows may need getopt replacement.

Building from source

# Clone this repo
git clone https://github.com/stefanDeveloper/heiFIP.git
cd heiFIP/heiFIP/

# Create build directory
mkdir build && cd build

cmake ..

# We highly recommend locating the dependencies manually, especially PcapPlusPlus, which is
# often not installed in a standard location. The build scripts try to detect the necessary
# dependencies automatically; if they fail, you can specify the include directories of the
# header files as well as the paths to the libraries manually, as shown below. Do not forget
# to specify all three PcapPlusPlus libraries: libCommon++, libPacket++, and libPcap++.
# Doing the same for OpenCV is possible but very difficult because of the number of libraries
# to link, and it is unnecessary because OpenCV ships its own CMake configuration.
# On macOS, make sure that the linked libraries are not Intel (x86_64) bottles. Otherwise the
# code is still compiled for ARM64 but links dynamically against x86_64 .dylib files, which
# forces macOS to translate at runtime using Rosetta 2 and incurs significant overhead.
# If possible, use a Linux distribution.

cmake .. \
  -DCMAKE_BUILD_TYPE=Release \
  -DUSE_MANUAL_PCAPPLUSPLUS=ON \
  -DPcapPlusPlus_INCLUDE_DIRS="/opt/homebrew/Cellar/pcapplusplus/25.05/include" \
  -DPcapPlusPlus_LIBRARIES="/opt/homebrew/Cellar/pcapplusplus/25.05/lib/libCommon++.a\;/opt/homebrew/Cellar/pcapplusplus/25.05/lib/libPacket++.a\;/opt/homebrew/Cellar/pcapplusplus/25.05/lib/libPcap++.a" \
  -DUSE_MANUAL_OPENSSL=ON \
  -DOPENSSL_INCLUDE_DIR="/opt/homebrew/opt/openssl@3/include" \
  -DOPENSSL_CRYPTO_LIBRARY="/opt/homebrew/opt/openssl@3/lib/libcrypto.a"

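# Compile (nproc is not available on macOS by default; use `sysctl -n hw.ncpu` there)
make -j$(nproc)

# or, from inside the build directory
cmake --build .

# The executable 'heiFIPCpp' will be produced in build/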

Getting Started

After installation, the command-line interface can be used to extract images from PCAP files with the following command:

./heiFIPCpp \
  --name HelloHeiFIP \
  --input /path/to/capture.pcap \
  --output /path/to/outdir \
  --threads 4 \
  --processor HEADER \
  --mode FlowImageTiledAuto \
  --dim 16 \
  --append \
  --fill 0 \
  --min-dim 10 \
  --max-dim 2000 \
  --min-pkts 10 \
  --max-pkts 100 \
  --remove-dup

Options

Flag	Description
`-i`, `--input`	Input PCAP file path
`-o`, `--output`	Output directory
`-t`, `--threads`	Number of worker threads (default: 1)
`-p`, `--processor`	Preprocessing: `NONE` or `HEADER`
`-m`, `--mode`	Image type: `PacketImage`, `FlowImage`, `FlowImageTiledFixed`, `FlowImageTiledAuto`, `MarkovTransitionMatrixFlow`, `MarkovTransitionMatrixPacket`
`--dim`	Base dimension for image (e.g. width/height in pixels)
`--fill`	Fill or padding value (0–255)
`--cols`	Number of columns (for tiled/fixed or Markov flow)
`--auto-dim`	Enable auto‑dimension selection (bool)
`--append`	Append the packets of a flow to each other instead of writing each packet to a new row (bool)
`--min-dim`	Minimum allowed image dimension
`--max-dim`	Maximum allowed image dimension
`--min-pkts`	Minimum packets per flow (for tiled/flow modes)
`--max-pkts`	Maximum packets per flow
`--remove-dup`	Remove duplicate flows/packets by hash
`--name`	Filename of the processed image
`-h`, `--help`	Show this help message
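
To make the interplay of `--min-pkts`, `--max-pkts`, and `--remove-dup` more concrete, here is a minimal C++ sketch of how such a filter could behave. The order of the steps (deduplicate first, then check the minimum, then truncate to the maximum) is an assumption, as are the names `Packet`, `Flow`, `digest`, and `filterFlow`. The sketch uses `std::hash` as a dependency-free stand-in for the SHA256 hashing listed in the requirements, so unlike a cryptographic hash it may occasionally treat distinct packets as duplicates.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <unordered_set>
#include <vector>

using Packet = std::vector<std::uint8_t>;  // raw packet bytes (hypothetical alias)
using Flow = std::vector<Packet>;          // packets of one flow (hypothetical alias)

// Stand-in digest: std::hash over the packet bytes instead of SHA-256.
std::size_t digest(const Packet& p) {
    return std::hash<std::string>{}(std::string(p.begin(), p.end()));
}

// Illustrative only: returns the filtered flow, or std::nullopt when the flow
// has too few packets and no image should be created.
std::optional<Flow> filterFlow(const Flow& flow, std::size_t minPkts,
                               std::size_t maxPkts, bool removeDup) {
    Flow kept;
    std::unordered_set<std::size_t> seen;
    for (const Packet& p : flow) {
        // insert() reports false in .second when the digest was seen before.
        if (removeDup && !seen.insert(digest(p)).second) continue;
        kept.push_back(p);
    }
    if (kept.size() < minPkts) return std::nullopt;   // too few packets
    if (kept.size() > maxPkts) kept.resize(maxPkts);  // discard the surplus
    return kept;
}
```

A flow of four packets in which two are identical, filtered with a minimum of 2, a maximum of 2, and duplicate removal enabled, keeps the first two distinct packets.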

Extending

To add a new image type:

Define a new ImageArgs struct in extractor.cpp.
Extend the ImageType enum.
Implement the conversion in PacketProcessor::createImageFromPacket().
Update the CLI --mode parser to include your new type.
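
The enum and parser steps above might look roughly like the following C++ sketch. The enumerators simply mirror the `--mode` names from the options table, `MyNewImage` and `parseMode` are hypothetical names, and the actual `ImageType` enum and CLI parser in the repository may be organized differently.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical mirror of the ImageType enum, extended by one new entry.
enum class ImageType {
    PacketImage,
    FlowImage,
    FlowImageTiledFixed,
    FlowImageTiledAuto,
    MarkovTransitionMatrixFlow,
    MarkovTransitionMatrixPacket,
    MyNewImage,  // the newly added image type (placeholder name)
};

// Illustrative --mode parser: maps the mode string onto the enum and returns
// std::nullopt for unknown modes so the caller can print the help text.
std::optional<ImageType> parseMode(const std::string& mode) {
    static const std::unordered_map<std::string, ImageType> table = {
        {"PacketImage", ImageType::PacketImage},
        {"FlowImage", ImageType::FlowImage},
        {"FlowImageTiledFixed", ImageType::FlowImageTiledFixed},
        {"FlowImageTiledAuto", ImageType::FlowImageTiledAuto},
        {"MarkovTransitionMatrixFlow", ImageType::MarkovTransitionMatrixFlow},
        {"MarkovTransitionMatrixPacket", ImageType::MarkovTransitionMatrixPacket},
        {"MyNewImage", ImageType::MyNewImage},
    };
    const auto it = table.find(mode);
    if (it == table.end()) return std::nullopt;
    return it->second;
}
```

Keeping the string-to-enum mapping in a single table means a new image type needs only one extra enumerator and one extra table row before the conversion itself is implemented.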

Publications that use heiFIP

S. Machmeier, M. Hoecker, V. Heuveline, "Explainable Artificial Intelligence for Improving a Session-Based Malware Traffic Classification with Deep Learning", in 2023 IEEE Symposium Series on Computational Intelligence (SSCI), Mexico-City, Mexico, 2023. https://doi.org/10.1109/SSCI52147.2023.10371980
S. Machmeier, M. Trageser, M. Buchwald, and V. Heuveline, "A generalizable approach for network flow image representation for deep learning", in 2023 7th Cyber Security in Networking Conference (CSNet), Montréal, Canada, 2023. https://doi.org/10.1109/CSNet59123.2023.10339761

Authors

The following people contributed to heiFIP:

Stefan Machmeier: Creator
Manuel Trageser: Header extraction and customization.
Henri Rebitzky: Conversion from Python to C++

License

This project is licensed under the EUPL-1.2 License. See the LICENSE file for details.
