This document explains everything about this project - from basic networking concepts to the complete code architecture. After reading this, you should understand exactly how packets flow through the system without needing to read the code.
- What is DPI?
- Networking Background
- Project Overview
- File Structure
- The Journey of a Packet (Simple Version)
- The Journey of a Packet (Multi-threaded Version)
- Deep Dive: Each Component
- How SNI Extraction Works
- How Blocking Works
- Building and Running
- Understanding the Output
Deep Packet Inspection (DPI) is a technology used to examine the contents of network packets as they pass through a checkpoint. Unlike simple firewalls that only look at packet headers (source/destination IP), DPI looks inside the packet payload.
- ISPs: Throttle or block certain applications (e.g., BitTorrent)
- Enterprises: Block social media on office networks
- Parental Controls: Block inappropriate websites
- Security: Detect malware or intrusion attempts
User Traffic (PCAP) → [DPI Engine] → Filtered Traffic (PCAP)
↓
- Identifies apps (YouTube, Facebook, etc.)
- Blocks based on rules
- Generates reports
When you visit a website, data travels through multiple "layers":
┌─────────────────────────────────────────────────────────┐
│ Layer 7: Application │ HTTP, TLS, DNS │
├─────────────────────────────────────────────────────────┤
│ Layer 4: Transport │ TCP (reliable), UDP (fast) │
├─────────────────────────────────────────────────────────┤
│ Layer 3: Network │ IP addresses (routing) │
├─────────────────────────────────────────────────────────┤
│ Layer 2: Data Link │ MAC addresses (local network)│
└─────────────────────────────────────────────────────────┘
Every network packet is like a Russian nesting doll - headers wrapped inside headers:
┌──────────────────────────────────────────────────────────────────┐
│ Ethernet Header (14 bytes) │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ IP Header (20 bytes) │ │
│ │ ┌──────────────────────────────────────────────────────────┐ │ │
│ │ │ TCP Header (20 bytes) │ │ │
│ │ │ ┌──────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ Payload (Application Data) │ │ │ │
│ │ │ │ e.g., TLS Client Hello with SNI │ │ │ │
│ │ │ └──────────────────────────────────────────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
A connection (or "flow") is uniquely identified by 5 values:
| Field | Example | Purpose |
|---|---|---|
| Source IP | 192.168.1.100 | Who is sending |
| Destination IP | 172.217.14.206 | Where it's going |
| Source Port | 54321 | Sender's application identifier |
| Destination Port | 443 | Service being accessed (443 = HTTPS) |
| Protocol | TCP (6) | TCP or UDP |
Why is this important?
- All packets with the same 5-tuple belong to the same connection
- If we block one packet of a connection, we should block all of them
- This is how we "track" conversations between computers
Server Name Indication (SNI) is part of the TLS/HTTPS handshake. When you visit https://www.youtube.com:
- Your browser sends a "Client Hello" message
- This message includes the domain name in plaintext (not encrypted yet!)
- The server uses this to know which certificate to send
TLS Client Hello:
├── Version: TLS 1.2
├── Random: [32 bytes]
├── Cipher Suites: [list]
└── Extensions:
└── SNI Extension:
└── Server Name: "www.youtube.com" ← We extract THIS!
This is the key to DPI: Even though HTTPS is encrypted, the domain name is visible in the first packet!
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Wireshark │ │ DPI Engine │ │ Output │
│ Capture │ ──► │ │ ──► │ PCAP │
│ (input.pcap)│ │ - Parse │ │ (filtered) │
└─────────────┘ │ - Classify │ └─────────────┘
│ - Block │
│ - Report │
└─────────────┘
| Version | File | Use Case |
|---|---|---|
| Simple (Single-threaded) | src/com/dpi/SimpleMain.java |
Learning, small captures |
| Multi-threaded | src/com/dpi/MultiThreadedMain.java |
Production, large captures |
- Live Interface Capture: Added
--live <interfaceName>flag. It dynamically swaps to inline capturing usingpcap4jor falls back to a custom internaltcpdumpprocess pipe. - Kernel-Bypass Mitigation / Buffer Pooling: Implemented
PacketBufferPoolto entirely remove per-packet allocation inPcapReaderandLiveCapture, alongside explicit memory barriers releasing buffers back to the thread-safeArrayBlockingQueueupon processing. Added strict ZGC configurations mimicking zero-GC performance profiles. - High-Precision Timestamps: Removed
System.currentTimeMillis()relying purely onSystem.nanoTime()and raw EPOCHtsUsec/tsSecvalues extracted directly from the byte headers, scaling flow tracking resolution strictly to microscopic boundaries and removing sub-millisecond visibility gaps. - Garbage Collection Optimization: SNI resolution strings are explicitly pooled using JVM
.intern(), queue capacities boosted strictly to65536elements to seamlessly absorb full GC pauses at 300k+ PPS, and an inlineLiveStatsthread has been attached printing continuous live MXBean polling of active Garbage Collection behavior dynamically into stdout alongside processing metrics. - Hot-reloading rules: Uses
WatchServiceto reload blocked IPs/domains from file on-the-fly without engine restart. - Extended Signatures: More than 10 new app signatures added to
AppType. - JSON Report Export: Engine dynamically writes a final
report.jsonwith all packet and thread statistics. - Top 10 Flows Logging: Final output report includes top 10 flows by bandwidth consumed.
packet_analyzer/
├── include/ # Header files (declarations)
│ ├── pcap_reader.h # PCAP file reading
│ ├── packet_parser.h # Network protocol parsing
│ ├── sni_extractor.h # TLS/HTTP inspection
│ ├── types.h # Data structures (FiveTuple, AppType, etc.)
│ ├── rule_manager.h # Blocking rules (multi-threaded version)
│ ├── connection_tracker.h # Flow tracking (multi-threaded version)
│ ├── load_balancer.h # LB thread (multi-threaded version)
│ ├── fast_path.h # FP thread (multi-threaded version)
│ ├── thread_safe_queue.h # Thread-safe queue
│ └── dpi_engine.h # Main orchestrator
│
├── src/ # Implementation files
│ ├── pcap_reader.cpp # PCAP file handling
│ ├── packet_parser.cpp # Protocol parsing
│ ├── sni_extractor.cpp # SNI/Host extraction
│ ├── types.cpp # Helper functions
│ ├── main_working.cpp # ★ SIMPLE VERSION ★
│ ├── dpi_mt.cpp # ★ MULTI-THREADED VERSION ★
│ └── [other files] # Supporting code
│
├── generate_test_pcap.py # Creates test data
├── test_dpi.pcap # Sample capture with various traffic
└── README.md # This file!
Let's trace a single packet through main_working.cpp:
PcapReader reader;
reader.open("capture.pcap");What happens:
- Open the file in binary mode
- Read the 24-byte global header (magic number, version, etc.)
- Verify it's a valid PCAP file
PCAP File Format:
┌────────────────────────────┐
│ Global Header (24 bytes) │ ← Read once at start
├────────────────────────────┤
│ Packet Header (16 bytes) │ ← Timestamp, length
│ Packet Data (variable) │ ← Actual network bytes
├────────────────────────────┤
│ Packet Header (16 bytes) │
│ Packet Data (variable) │
├────────────────────────────┤
│ ... more packets ... │
└────────────────────────────┘
while (reader.readNextPacket(raw)) {
// raw.data contains the packet bytes
// raw.header contains timestamp and length
}What happens:
- Read 16-byte packet header
- Read N bytes of packet data (N = header.incl_len)
- Return false when no more packets
PacketParser::parse(raw, parsed);What happens (in packet_parser.cpp):
raw.data bytes:
[0-13] Ethernet Header
[14-33] IP Header
[34-53] TCP Header
[54+] Payload
After parsing:
parsed.src_mac = "00:11:22:33:44:55"
parsed.dest_mac = "aa:bb:cc:dd:ee:ff"
parsed.src_ip = "192.168.1.100"
parsed.dest_ip = "172.217.14.206"
parsed.src_port = 54321
parsed.dest_port = 443
parsed.protocol = 6 (TCP)
parsed.has_tcp = true
Parsing the Ethernet Header (14 bytes):
Bytes 0-5: Destination MAC
Bytes 6-11: Source MAC
Bytes 12-13: EtherType (0x0800 = IPv4)
Parsing the IP Header (20+ bytes):
Byte 0: Version (4 bits) + Header Length (4 bits)
Byte 8: TTL (Time To Live)
Byte 9: Protocol (6=TCP, 17=UDP)
Bytes 12-15: Source IP
Bytes 16-19: Destination IP
Parsing the TCP Header (20+ bytes):
Bytes 0-1: Source Port
Bytes 2-3: Destination Port
Bytes 4-7: Sequence Number
Bytes 8-11: Acknowledgment Number
Byte 12: Data Offset (header length)
Byte 13: Flags (SYN, ACK, FIN, etc.)
FiveTuple tuple;
tuple.src_ip = parseIP(parsed.src_ip);
tuple.dst_ip = parseIP(parsed.dest_ip);
tuple.src_port = parsed.src_port;
tuple.dst_port = parsed.dest_port;
tuple.protocol = parsed.protocol;
Flow& flow = flows[tuple]; // Get or createWhat happens:
- The flow table is a hash map:
FiveTuple → Flow - If this 5-tuple exists, we get the existing flow
- If not, a new flow is created
- All packets with the same 5-tuple share the same flow
// For HTTPS traffic (port 443)
if (pkt.tuple.dst_port == 443 && pkt.payload_length > 5) {
auto sni = SNIExtractor::extract(payload, payload_length);
if (sni) {
flow.sni = *sni; // "www.youtube.com"
flow.app_type = sniToAppType(*sni); // AppType::YOUTUBE
}
}What happens (in sni_extractor.cpp):
-
Check if it's a TLS Client Hello:
Byte 0: Content Type = 0x16 (Handshake) ✓ Byte 5: Handshake Type = 0x01 (Client Hello) ✓ -
Navigate to Extensions:
Skip: Version, Random, Session ID, Cipher Suites, Compression -
Find SNI Extension (type 0x0000):
Extension Type: 0x0000 (SNI) Extension Length: N SNI List Length: M SNI Type: 0x00 (hostname) SNI Length: L SNI Value: "www.youtube.com" ← FOUND! -
Map SNI to App Type:
// In types.cpp if (sni.find("youtube") != std::string::npos) { return AppType::YOUTUBE; }
if (rules.isBlocked(tuple.src_ip, flow.app_type, flow.sni)) {
flow.blocked = true;
}What happens:
// Check IP blacklist
if (blocked_ips.count(src_ip)) return true;
// Check app blacklist
if (blocked_apps.count(app)) return true;
// Check domain blacklist (substring match)
for (const auto& dom : blocked_domains) {
if (sni.find(dom) != std::string::npos) return true;
}
return false;if (flow.blocked) {
dropped++;
// Don't write to output
} else {
forwarded++;
// Write packet to output file
output.write(packet_header);
output.write(packet_data);
}After processing all packets:
// Count apps
for (const auto& [tuple, flow] : flows) {
app_stats[flow.app_type]++;
}
// Print report
"YouTube: 150 packets (15%)"
"Facebook: 80 packets (8%)"
...The multi-threaded version (dpi_mt.cpp) adds parallelism for high performance:
┌─────────────────┐
│ Reader Thread │
│ (reads PCAP) │
└────────┬────────┘
│
┌──────────────┴──────────────┐
│ hash(5-tuple) % 2 │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ LB0 Thread │ │ LB1 Thread │
│ (Load Balancer)│ │ (Load Balancer)│
└────────┬────────┘ └────────┬────────┘
│ │
┌──────┴──────┐ ┌──────┴──────┐
│hash % 2 │ │hash % 2 │
▼ ▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│FP0 Thread│ │FP1 Thread│ │FP2 Thread│ │FP3 Thread│
│(Fast Path)│ │(Fast Path)│ │(Fast Path)│ │(Fast Path)│
└─────┬────┘ └─────┬────┘ └─────┬────┘ └─────┬────┘
│ │ │ │
└────────────┴──────────────┴────────────┘
│
▼
┌───────────────────────┐
│ Output Queue │
└───────────┬───────────┘
│
▼
┌───────────────────────┐
│ Output Writer Thread │
│ (writes to PCAP) │
└───────────────────────┘
- Load Balancers (LBs): Distribute work across FPs
- Fast Paths (FPs): Do the actual DPI processing
- Consistent Hashing: Same 5-tuple always goes to same FP
Why consistent hashing matters:
Connection: 192.168.1.100:54321 → 142.250.185.206:443
Packet 1 (SYN): hash → FP2
Packet 2 (SYN-ACK): hash → FP2 (same FP!)
Packet 3 (Client Hello): hash → FP2 (same FP!)
Packet 4 (Data): hash → FP2 (same FP!)
All packets of this connection go to FP2.
FP2 can track the flow state correctly.
// Main thread reads PCAP
while (reader.readNextPacket(raw)) {
Packet pkt = createPacket(raw);
// Hash to select Load Balancer
size_t lb_idx = hash(pkt.tuple) % num_lbs;
// Push to LB's queue
lbs_[lb_idx]->queue().push(pkt);
}void LoadBalancer::run() {
while (running_) {
// Pop from my input queue
auto pkt = input_queue_.pop();
// Hash to select Fast Path
size_t fp_idx = hash(pkt.tuple) % num_fps_;
// Push to FP's queue
fps_[fp_idx]->queue().push(pkt);
}
}void FastPath::run() {
while (running_) {
// Pop from my input queue
auto pkt = input_queue_.pop();
// Look up flow (each FP has its own flow table)
Flow& flow = flows_[pkt.tuple];
// Classify (SNI extraction)
classifyFlow(pkt, flow);
// Check rules
if (rules_->isBlocked(pkt.tuple.src_ip, flow.app_type, flow.sni)) {
stats_->dropped++;
} else {
// Forward: push to output queue
output_queue_->push(pkt);
}
}
}void outputThread() {
while (running_ || output_queue_.size() > 0) {
auto pkt = output_queue_.pop();
// Write to output file
output_file.write(packet_header);
output_file.write(pkt.data);
}
}The magic that makes multi-threading work:
template<typename T>
class TSQueue {
std::queue<T> queue_;
std::mutex mutex_;
std::condition_variable not_empty_;
std::condition_variable not_full_;
void push(T item) {
std::lock_guard<std::mutex> lock(mutex_);
queue_.push(item);
not_empty_.notify_one(); // Wake up waiting consumer
}
T pop() {
std::unique_lock<std::mutex> lock(mutex_);
not_empty_.wait(lock, [&]{ return !queue_.empty(); });
T item = queue_.front();
queue_.pop();
return item;
}
};How it works:
push(): Producer adds item, signals waiting consumerspop(): Consumer waits until item available, then takes itmutex: Only one thread can access at a timecondition_variable: Efficient waiting (no busy-loop)
Purpose: Read network captures saved by Wireshark
Key structures:
struct PcapGlobalHeader {
uint32_t magic_number; // 0xa1b2c3d4 identifies PCAP
uint16_t version_major; // Usually 2
uint16_t version_minor; // Usually 4
uint32_t snaplen; // Max packet size captured
uint32_t network; // 1 = Ethernet
};
struct PcapPacketHeader {
uint32_t ts_sec; // Timestamp (seconds)
uint32_t ts_usec; // Timestamp (microseconds)
uint32_t incl_len; // Bytes saved in file
uint32_t orig_len; // Original packet size
};Key functions:
open(filename): Open PCAP, validate headerreadNextPacket(raw): Read next packet into bufferclose(): Clean up
Purpose: Extract protocol fields from raw bytes
Key function:
bool PacketParser::parse(const RawPacket& raw, ParsedPacket& parsed) {
parseEthernet(...); // Extract MACs, EtherType
parseIPv4(...); // Extract IPs, protocol, TTL
parseTCP(...); // Extract ports, flags, seq numbers
// OR
parseUDP(...); // Extract ports
}Important concepts:
Network Byte Order: Network protocols use big-endian (most significant byte first). Your computer might use little-endian. We use ntohs() and ntohl() to convert:
// ntohs = Network TO Host Short (16-bit)
uint16_t port = ntohs(*(uint16_t*)(data + offset));
// ntohl = Network TO Host Long (32-bit)
uint32_t seq = ntohl(*(uint32_t*)(data + offset));Purpose: Extract domain names from TLS and HTTP
For TLS (HTTPS):
std::optional<std::string> SNIExtractor::extract(
const uint8_t* payload,
size_t length
) {
// 1. Verify TLS record header
// 2. Verify Client Hello handshake
// 3. Skip to extensions
// 4. Find SNI extension (type 0x0000)
// 5. Extract hostname string
}For HTTP:
std::optional<std::string> HTTPHostExtractor::extract(
const uint8_t* payload,
size_t length
) {
// 1. Verify HTTP request (GET, POST, etc.)
// 2. Search for "Host: " header
// 3. Extract value until newline
}Purpose: Define data structures used throughout
FiveTuple:
struct FiveTuple {
uint32_t src_ip;
uint32_t dst_ip;
uint16_t src_port;
uint16_t dst_port;
uint8_t protocol;
bool operator==(const FiveTuple& other) const;
};AppType:
enum class AppType {
UNKNOWN,
HTTP,
HTTPS,
DNS,
GOOGLE,
YOUTUBE,
FACEBOOK,
// ... more apps
};sniToAppType function:
AppType sniToAppType(const std::string& sni) {
if (sni.find("youtube") != std::string::npos)
return AppType::YOUTUBE;
if (sni.find("facebook") != std::string::npos)
return AppType::FACEBOOK;
// ... more patterns
}When you visit https://www.youtube.com:
┌──────────┐ ┌──────────┐
│ Browser │ │ Server │
└────┬─────┘ └────┬─────┘
│ │
│ ──── Client Hello ─────────────────────►│
│ (includes SNI: www.youtube.com) │
│ │
│ ◄─── Server Hello ───────────────────── │
│ (includes certificate) │
│ │
│ ──── Key Exchange ─────────────────────►│
│ │
│ ◄═══ Encrypted Data ══════════════════► │
│ (from here on, everything is │
│ encrypted - we can't see it) │
We can only extract SNI from the Client Hello!
Byte 0: Content Type = 0x16 (Handshake)
Bytes 1-2: Version = 0x0301 (TLS 1.0)
Bytes 3-4: Record Length
-- Handshake Layer --
Byte 5: Handshake Type = 0x01 (Client Hello)
Bytes 6-8: Handshake Length
-- Client Hello Body --
Bytes 9-10: Client Version
Bytes 11-42: Random (32 bytes)
Byte 43: Session ID Length (N)
Bytes 44 to 44+N: Session ID
... Cipher Suites ...
... Compression Methods ...
-- Extensions --
Bytes X-X+1: Extensions Length
For each extension:
Bytes: Extension Type (2)
Bytes: Extension Length (2)
Bytes: Extension Data
-- SNI Extension (Type 0x0000) --
Extension Type: 0x0000
Extension Length: L
SNI List Length: M
SNI Type: 0x00 (hostname)
SNI Length: K
SNI Value: "www.youtube.com" ← THE GOAL!
std::optional<std::string> SNIExtractor::extract(
const uint8_t* payload, size_t length
) {
// Check TLS record header
if (payload[0] != 0x16) return std::nullopt; // Not handshake
if (payload[5] != 0x01) return std::nullopt; // Not Client Hello
size_t offset = 43; // Skip to session ID
// Skip Session ID
uint8_t session_len = payload[offset];
offset += 1 + session_len;
// Skip Cipher Suites
uint16_t cipher_len = readUint16BE(payload + offset);
offset += 2 + cipher_len;
// Skip Compression Methods
uint8_t comp_len = payload[offset];
offset += 1 + comp_len;
// Read Extensions Length
uint16_t ext_len = readUint16BE(payload + offset);
offset += 2;
// Search for SNI extension
size_t ext_end = offset + ext_len;
while (offset + 4 <= ext_end) {
uint16_t ext_type = readUint16BE(payload + offset);
uint16_t ext_data_len = readUint16BE(payload + offset + 2);
offset += 4;
if (ext_type == 0x0000) { // SNI!
// Parse SNI structure
uint16_t sni_len = readUint16BE(payload + offset + 3);
return std::string(
(char*)(payload + offset + 5),
sni_len
);
}
offset += ext_data_len;
}
return std::nullopt; // SNI not found
}| Rule Type | Example | What it Blocks |
|---|---|---|
| IP | 192.168.1.50 |
All traffic from this source |
| App | YouTube |
All YouTube connections |
| Domain | tiktok |
Any SNI containing "tiktok" |
Packet arrives
│
▼
┌─────────────────────────────────┐
│ Is source IP in blocked list? │──Yes──► DROP
└───────────────┬─────────────────┘
│No
▼
┌─────────────────────────────────┐
│ Is app type in blocked list? │──Yes──► DROP
└───────────────┬─────────────────┘
│No
▼
┌─────────────────────────────────┐
│ Does SNI match blocked domain? │──Yes──► DROP
└───────────────┬─────────────────┘
│No
▼
FORWARD
Important: We block at the flow level, not packet level.
Connection to YouTube:
Packet 1 (SYN) → No SNI yet, FORWARD
Packet 2 (SYN-ACK) → No SNI yet, FORWARD
Packet 3 (ACK) → No SNI yet, FORWARD
Packet 4 (Client Hello) → SNI: www.youtube.com
→ App: YOUTUBE (blocked!)
→ Mark flow as BLOCKED
→ DROP this packet
Packet 5 (Data) → Flow is BLOCKED → DROP
Packet 6 (Data) → Flow is BLOCKED → DROP
...all subsequent packets → DROP
Why this approach?
- We can't identify the app until we see the Client Hello
- Once identified, we block all future packets of that flow
- The connection will fail/timeout on the client
- macOS/Linux with C++17 compiler
- g++ or clang++
- No external libraries needed!
Simple Version:
g++ -std=c++17 -O2 -I include -o dpi_simple \
src/main_working.cpp \
src/pcap_reader.cpp \
src/packet_parser.cpp \
src/sni_extractor.cpp \
src/types.cppMulti-threaded Version:
g++ -std=c++17 -pthread -O2 -I include -o dpi_engine \
src/dpi_mt.cpp \
src/pcap_reader.cpp \
src/packet_parser.cpp \
src/sni_extractor.cpp \
src/types.cppBasic usage:
./dpi_engine test_dpi.pcap output.pcapWith blocking:
./dpi_engine test_dpi.pcap output.pcap \
--block-app YouTube \
--block-app TikTok \
--block-ip 192.168.1.50 \
--block-domain facebookConfigure threads (multi-threaded only):
./dpi_engine input.pcap output.pcap --lbs 4 --fps 4
# Creates 4 LB threads × 4 FP threads = 16 processing threadspython3 generate_test_pcap.py
# Creates test_dpi.pcap with sample traffic╔══════════════════════════════════════════════════════════════╗
║ DPI ENGINE v2.0 (Multi-threaded) ║
╠══════════════════════════════════════════════════════════════╣
║ Load Balancers: 2 FPs per LB: 2 Total FPs: 4 ║
╚══════════════════════════════════════════════════════════════╝
[Rules] Blocked app: YouTube
[Rules] Blocked IP: 192.168.1.50
[Reader] Processing packets...
[Reader] Done reading 77 packets
╔══════════════════════════════════════════════════════════════╗
║ PROCESSING REPORT ║
╠══════════════════════════════════════════════════════════════╣
║ Total Packets: 77 ║
║ Total Bytes: 5738 ║
║ TCP Packets: 73 ║
║ UDP Packets: 4 ║
╠══════════════════════════════════════════════════════════════╣
║ Forwarded: 69 ║
║ Dropped: 8 ║
╠══════════════════════════════════════════════════════════════╣
║ THREAD STATISTICS ║
║ LB0 dispatched: 53 ║
║ LB1 dispatched: 24 ║
║ FP0 processed: 53 ║
║ FP1 processed: 0 ║
║ FP2 processed: 0 ║
║ FP3 processed: 24 ║
╠══════════════════════════════════════════════════════════════╣
║ APPLICATION BREAKDOWN ║
╠══════════════════════════════════════════════════════════════╣
║ HTTPS 39 50.6% ########## ║
║ Unknown 16 20.8% #### ║
║ YouTube 4 5.2% # (BLOCKED) ║
║ DNS 4 5.2% # ║
║ Facebook 3 3.9% ║
║ ... ║
╚══════════════════════════════════════════════════════════════╝
[Detected Domains/SNIs]
- www.youtube.com -> YouTube
- www.facebook.com -> Facebook
- www.google.com -> Google
- github.com -> GitHub
...
| Section | Meaning |
|---|---|
| Configuration | Number of threads created |
| Rules | Which blocking rules are active |
| Total Packets | Packets read from input file |
| Forwarded | Packets written to output file |
| Dropped | Packets blocked (not written) |
| Thread Statistics | Work distribution across threads |
| Application Breakdown | Traffic classification results |
| Detected SNIs | Actual domain names found |
-
Add More App Signatures
// In types.cpp if (sni.find("twitch") != std::string::npos) return AppType::TWITCH;
-
Add Bandwidth Throttling
// Instead of DROP, delay packets if (shouldThrottle(flow)) { std::this_thread::sleep_for(10ms); }
-
Add Live Statistics Dashboard
// Separate thread printing stats every second void statsThread() { while (running) { printStats(); sleep(1); } }
-
Add QUIC/HTTP3 Support
- QUIC uses UDP on port 443
- SNI is in the Initial packet (encrypted differently)
-
Add Persistent Rules
- Save rules to file
- Load on startup
This DPI engine demonstrates:
- Network Protocol Parsing - Understanding packet structure
- Deep Packet Inspection - Looking inside encrypted connections
- Flow Tracking - Managing stateful connections
- Multi-threaded Architecture - Scaling with thread pools
- Producer-Consumer Pattern - Thread-safe queues
The key insight is that even HTTPS traffic leaks the destination domain in the TLS handshake, allowing network operators to identify and control application usage.
If you have questions about any part of this project, the code is well-commented and follows the same flow described in this document. Start with the simple version (main_working.cpp) to understand the concepts, then move to the multi-threaded version (dpi_mt.cpp) to see how parallelism is added.
Happy learning! 🚀