This project detects structural network anomalies using a GNN autoencoder. It contrasts this deep learning approach with the classic DBSCAN method. While DBSCAN only uses node features (CPU, RAM), the GNN learns the graph's topology to identify statistically improbable links, proving superior for structural analysis.

SecurDrgorP/Network_Anomaly_Detection_GNN

Structural Anomaly Detection in Network Topologies using Graph Neural Networks


Plan

  1. Overview
  2. Project Objectives
  3. Dataset
  4. Anomaly Types Modeled
  5. Methodology
  6. Anomaly Scoring Strategy
  7. Evaluation Protocol
  8. Results Summary
  9. Project Structure
  10. Usage Instructions
  11. Outputs & Deliverables
  12. Applications
  13. Key Takeaway

Overview

Modern network infrastructures (cloud, ISP backbones, enterprise VLANs, SOC environments) face increasingly complex failure and attack patterns. Traditional monitoring systems rely heavily on threshold-based alerts (CPU, memory, bandwidth), which are insufficient to detect topological misconfigurations or stealthy lateral connections.

This project proposes a graph-based anomaly detection framework that detects both:

  • Attribute anomalies (e.g., abnormal resource usage)
  • Structural anomalies (e.g., unauthorized links between isolated network segments)

by explicitly modeling the network topology using Graph Neural Networks (GNNs).


Project Objectives

  • Detect structural anomalies that cannot be identified using classical tabular methods
  • Compare traditional ML (DBSCAN) with Graph Representation Learning
  • Simulate a realistic secure network scenario (strict VLAN isolation)
  • Demonstrate why topological context is essential for anomaly detection in networks

Dataset

Base Topology

  • Source: Internet Topology Zoo (conceptually inspired)
  • Implementation: Synthetic VLAN-based network topology
  • Model: Stochastic Block Model (SBM)

Each VLAN represents a secure subnet, where:

  • Intra-VLAN communication is allowed
  • Inter-VLAN communication is strictly forbidden

This design provides a clean ground truth for detecting structural violations.
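The VLAN topology described above can be sketched with a small Stochastic Block Model sampler. This is a minimal illustration using only the standard library, not the repository's generator: the function name `generate_vlan_topology`, the block sizes, and the intra-VLAN probability are assumptions chosen for the example.

```python
import random
from itertools import combinations


def generate_vlan_topology(sizes, p_intra, p_inter=0.0, seed=0):
    """Sample an undirected SBM graph and return (vlan_of, edges).

    sizes   : number of nodes in each VLAN (block)
    p_intra : edge probability inside a VLAN (assumed value)
    p_inter : edge probability across VLANs; 0.0 models strict isolation
    """
    rng = random.Random(seed)

    # Assign consecutive node ids to each VLAN.
    vlan_of = {}
    node = 0
    for vlan, size in enumerate(sizes):
        for _ in range(size):
            vlan_of[node] = vlan
            node += 1

    # Draw each possible edge independently with the block-pair probability.
    edges = set()
    for u, v in combinations(range(node), 2):
        p = p_intra if vlan_of[u] == vlan_of[v] else p_inter
        if rng.random() < p:
            edges.add((u, v))
    return vlan_of, edges


# Hypothetical configuration: 4 VLANs of 10 nodes each.
vlan_of, edges = generate_vlan_topology(sizes=[10, 10, 10, 10], p_intra=0.4)
```

With `p_inter` left at 0.0, every sampled edge stays inside one VLAN, so any inter-VLAN edge that appears later is a violation by construction.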

💡 The framework is dataset-agnostic and can be applied to any real network topology provided as an edge list.


Anomaly Types Modeled

1. Attribute Anomalies

Simulated as extreme CPU usage spikes:

  • Normal nodes: CPU ∈ [0.1, 1.0]
  • Anomalous nodes: CPU ∈ [90, 100]

These anomalies are designed to be easily detectable by DBSCAN, serving as a baseline.
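A minimal sketch of how such CPU features could be generated is shown below. Only the two ranges come from the text above. Uniform sampling, the function name `generate_cpu_features`, and the choice of anomalous nodes are assumptions for illustration.

```python
import random


def generate_cpu_features(num_nodes, anomalous_nodes, seed=0):
    """Return {node: cpu_value} using the ranges stated in the README."""
    rng = random.Random(seed)
    anomalous = set(anomalous_nodes)
    cpu = {}
    for node in range(num_nodes):
        if node in anomalous:
            # Injected spike: CPU in [90, 100].
            cpu[node] = rng.uniform(90.0, 100.0)
        else:
            # Normal behaviour: CPU in [0.1, 1.0].
            cpu[node] = rng.uniform(0.1, 1.0)
    return cpu


# Hypothetical choice of anomalous nodes.
cpu = generate_cpu_features(num_nodes=40, anomalous_nodes=[3, 17])
```

Because the two ranges are far apart, the anomalous nodes sit in an isolated region of feature space, which is the situation density-based clustering handles well.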


2. Structural Anomalies (Core Contribution)

Injected as unauthorized links between distant VLANs:

  • Example: Direct connections between VLAN 0 and VLAN 3

  • Represent:

    • Firewall misconfigurations
    • Unauthorized tunnels
    • Lateral movement / backdoors

These anomalies do not affect node attributes, making them invisible to classical ML.
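The injection step can be sketched as adding a few edges between two VLANs that should never communicate. The helper `inject_bridges` and the toy graph below are hypothetical and only illustrate the idea; the repository's own injection logic may differ.

```python
import random


def inject_bridges(edges, vlan_of, src_vlan, dst_vlan, count, seed=0):
    """Add `count` forbidden links between two VLANs.

    Returns (edges_with_anomalies, injected_bridges).
    """
    rng = random.Random(seed)
    src_nodes = sorted(n for n, v in vlan_of.items() if v == src_vlan)
    dst_nodes = sorted(n for n, v in vlan_of.items() if v == dst_vlan)

    # All possible cross-VLAN pairs, stored as (smaller_id, larger_id).
    candidates = [(min(u, v), max(u, v)) for u in src_nodes for v in dst_nodes]
    candidates = [e for e in candidates if e not in edges]

    bridges = set(rng.sample(candidates, count))
    return set(edges) | bridges, bridges


# Toy example: 4 VLANs with 2 nodes each, one intra-VLAN link per VLAN.
vlan_of = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}
clean_edges = {(0, 1), (2, 3), (4, 5), (6, 7)}
test_edges, bridges = inject_bridges(clean_edges, vlan_of, 0, 3, count=2)
```

Node attributes are never touched here, which is why a purely feature-based detector has nothing to react to.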


Methodology

Phase 1: Baseline - DBSCAN (Tabular ML)

Description

  • Nodes are treated as independent samples

  • Features used:

    • CPU usage
    • Memory usage
  • No graph structure is considered

Hypothesis

Anomalous nodes lie in low-density regions of the feature space.

Limitations

  • Ignores adjacency and topology
  • Cannot detect structural anomalies
  • Fails when anomalies are purely relational
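To make the baseline concrete, here is a dependency-free sketch of the rule DBSCAN uses to label noise: a point is noise when it is neither a core point nor within `eps` of one. This is not the project's `baseline.py`. The feature values, `eps`, and `min_samples` are illustrative assumptions.

```python
import math


def dbscan_noise(points, eps, min_samples):
    """Return the indices DBSCAN would label as noise."""
    n = len(points)
    # eps-neighbourhood of every point (a point counts as its own neighbour).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # Core points have at least min_samples points in their neighbourhood.
    core = {i for i in range(n) if len(neighbors[i]) >= min_samples}
    # Noise: not core and not reachable from any core point.
    return [
        i for i in range(n)
        if i not in core and not any(j in core for j in neighbors[i])
    ]


# (cpu, memory) per node; values are made up for the example.
features = [(0.2, 0.3), (0.3, 0.35), (0.25, 0.4), (0.35, 0.3), (95.0, 0.3)]
print(dbscan_noise(features, eps=0.5, min_samples=3))  # [4]
```

The node with the CPU spike is isolated in feature space and is flagged. A node that gains a forbidden link keeps the same `(cpu, memory)` pair, so its row in `features` does not change and this rule cannot see it.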

Phase 2: Graph-Based Learning - Graph Auto-Encoder (GNN)

Model Architecture

Encoder

  • GraphSAGE-based encoder

  • Learns node embeddings by aggregating neighborhood information

  • Captures:

    • VLAN structure
    • Connectivity patterns
    • Structural regularities
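The neighborhood aggregation performed by the encoder can be illustrated with a single GraphSAGE-style layer written in plain Python. The mean aggregator, the identity weight matrices, and the toy graph are assumptions for the example; a trained model would learn its weights and would normally be built with a deep learning library.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sage_mean_layer(features, adjacency, W_self, W_neigh):
    """One layer: h_v = ReLU(W_self * x_v + W_neigh * mean(x_u for u in N(v)))."""
    out = {}
    for v, x_v in features.items():
        nbrs = adjacency.get(v, [])
        dim = len(x_v)
        if nbrs:
            mean = [sum(features[u][k] for u in nbrs) / len(nbrs) for k in range(dim)]
        else:
            mean = [0.0] * dim  # isolated node: nothing to aggregate
        h = [a + b for a, b in zip(matvec(W_self, x_v), matvec(W_neigh, mean))]
        out[v] = [max(0.0, value) for value in h]  # ReLU
    return out


# Toy graph: node 0 is connected to nodes 1 and 2.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
adjacency = {0: [1, 2], 1: [0], 2: [0]}
identity = [[1.0, 0.0], [0.0, 1.0]]  # untrained placeholder weights
embeddings = sage_mean_layer(features, adjacency, identity, identity)
```

Each embedding mixes a node's own features with the average of its neighbors' features, so nodes in the same densely connected VLAN tend to end up with similar embeddings.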

Decoder

  • Dot-product decoder
  • Reconstructs the adjacency matrix
  • Outputs link existence probabilities
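A dot-product decoder can be sketched in a few lines. The sigmoid that maps the dot product to a probability is the usual choice in graph auto-encoders and is assumed here. The embedding values are invented for illustration.

```python
import math


def link_probability(z_u, z_v):
    """Sigmoid of the dot product of two node embeddings."""
    score = sum(a * b for a, b in zip(z_u, z_v))
    # Numerically stable sigmoid for large positive or negative scores.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    exp_score = math.exp(score)
    return exp_score / (1.0 + exp_score)


# Hypothetical embeddings: nodes 0 and 1 share a VLAN, node 2 does not.
z = {0: [2.0, 0.0], 1: [1.5, 0.0], 2: [-2.0, 0.0]}
p_same_vlan = link_probability(z[0], z[1])   # aligned embeddings, high probability
p_cross_vlan = link_probability(z[0], z[2])  # opposed embeddings, low probability
```

Applying this function to every node pair reconstructs the full adjacency matrix; applying it only to observed edges gives the per-link probabilities used for scoring.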

Learning Principle

The model is trained on a clean topology only.

At inference time, links that the model cannot reconstruct accurately are assigned a high reconstruction error and flagged as structural anomalies.

Anomaly Scoring Strategy

  • Edge-level: Low reconstructed probability ⇒ suspicious link
  • Node-level: A node is anomalous if it participates in at least one suspicious link

Final node anomaly score:

score(node) = 1 − min(reconstructed_link_probability)

where the minimum is taken over all links the node participates in.
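The node-level score can be computed directly from per-edge probabilities. The sketch below assumes the probabilities are available as a dictionary keyed by edge. The function name, the example values, and the 0.5 decision threshold are hypothetical.

```python
def node_anomaly_scores(edge_probs):
    """Map {(u, v): probability} to {node: 1 - min incident probability}."""
    min_prob = {}
    for (u, v), p in edge_probs.items():
        for node in (u, v):
            min_prob[node] = min(p, min_prob.get(node, 1.0))
    return {node: 1.0 - p for node, p in min_prob.items()}


# Illustrative probabilities: (0, 6) plays the role of a forbidden bridge.
edge_probs = {(0, 1): 0.95, (1, 2): 0.90, (0, 6): 0.05}
scores = node_anomaly_scores(edge_probs)

# Flag nodes whose score exceeds an assumed threshold of 0.5.
flagged = sorted(node for node, s in scores.items() if s > 0.5)
print(flagged)  # [0, 6]
```

Taking the minimum means a single improbable link is enough to raise a node's score, which matches the "at least one suspicious link" rule.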

Evaluation Protocol

Ground Truth

  • Known injected CPU anomalies
  • Known injected inter-VLAN bridges

Metrics

  • Precision
  • Recall
  • F1-Score
  • ROC-AUC (GNN only)
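For reference, the metrics listed above can be computed from labels and scores as in the following standard-library sketch. The labels and scores are toy values, not results from this project, and the 0.5 threshold is an assumption.

```python
def precision_recall_f1(y_true, y_pred):
    """Binary precision, recall, and F1 with 1 as the anomalous class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """Probability that a random anomaly outranks a random normal node."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy ground truth and anomaly scores for five nodes.
y_true = [1, 0, 0, 1, 0]
scores = [0.95, 0.10, 0.60, 0.40, 0.05]
y_pred = [1 if s > 0.5 else 0 for s in scores]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

ROC-AUC needs a continuous score per node, which the GNN provides and DBSCAN's hard noise labels do not; that is consistent with the metric being reported for the GNN only.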

Results Summary

| Method | Attribute Anomalies | Structural Anomalies | Topology-Aware |
|---|---|---|---|
| DBSCAN | ✅ Detected | ❌ Missed | ❌ No |
| GNN (GraphSAGE + GAE) | ✅ Detected | ✅ Detected | ✅ Yes |

Key Findings

  • DBSCAN performs well only when anomalies affect raw features
  • GNN successfully detects stealth structural violations
  • Structural context is critical for robust network anomaly detection

Project Structure

Network_Anomaly_Detection/
│
├── data/
│   ├── raw/                # Clean topology
│   └── processed/          # Nodes, edges, predictions
│
├── utils/
│   ├── data_loader.py
│   ├── feature_generator.py
│   ├── dataset.py
│   ├── models.py
│   ├── baseline.py
│   ├── train.py
│   └── visualization.py
│
├── notebooks/
│   └── NADGNN.ipynb
│
├── models/
│   └── gnn_model.pth
│
├── output/
│   ├── dashboard.png
│   └── risk_map.png
│
├── config.py
├── main.py
├── run.sh
└── requirements.txt

Usage Instructions

1. Installation

pip install -r requirements.txt

2. Run Full Pipeline

./run.sh

This will:

  • Generate the network topology
  • Inject anomalies
  • Train the GNN
  • Evaluate DBSCAN vs GNN
  • Save results, metrics, and visualizations

Outputs & Deliverables

Data

  • nodes.csv: node features + ground truth
  • edges_train.csv: clean topology
  • edges_test.csv: topology with anomalies
  • results_gnn_predictions.csv: final scores & predictions

Models

  • gnn_model.pth: trained Graph Auto-Encoder

Visualizations

  • Dashboard: Training loss, ROC, confusion matrix, metrics comparison
  • Risk Map: Network visualization with detected anomalous links

Applications

  • SOC automation & zero-trust validation
  • Cloud network misconfiguration detection
  • ISP backbone monitoring
  • Insider threat & lateral movement detection
  • Digital twin simulation of secure networks

Key Takeaway

Anomalies in networks are not always about “high values”; they are often about “wrong connections.”

Graph Neural Networks provide the inductive bias needed to model network structure, which makes them well suited to network security and monitoring systems that must catch structural violations as well as abnormal values.
