- Overview
- Project Objectives
- Dataset
- Anomaly Types Modeled
- Methodology
- Anomaly Scoring Strategy
- Evaluation Protocol
- Results Summary
- Project Structure
- Usage Instructions
- Outputs & Deliverables
- Applications
- Key Takeaway
Modern network infrastructures (cloud, ISP backbones, enterprise VLANs, SOC environments) face increasingly complex failure and attack patterns. Traditional monitoring systems rely heavily on threshold-based alerts (CPU, memory, bandwidth), which are insufficient to detect topological misconfigurations or stealthy lateral connections.
This project proposes a graph-based anomaly detection framework that detects both:
- Attribute anomalies (e.g., abnormal resource usage)
- Structural anomalies (e.g., unauthorized links between isolated network segments)
by explicitly modeling the network topology using Graph Neural Networks (GNNs).
- Detect structural anomalies that cannot be identified using classical tabular methods
- Compare traditional ML (DBSCAN) with Graph Representation Learning
- Simulate a realistic secure network scenario (strict VLAN isolation)
- Demonstrate why topological context is essential for anomaly detection in networks
- Source: Internet Topology Zoo (conceptually inspired)
- Implementation: Synthetic VLAN-based network topology
- Model: Stochastic Block Model (SBM)
Each VLAN represents a secure subnet, where:
- Intra-VLAN communication is allowed
- Inter-VLAN communication is strictly forbidden
This design provides a clean ground truth for detecting structural violations.
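The VLAN design above can be sketched as a stochastic block model whose inter-block edge probability is zero. The snippet below is a minimal, standard-library illustration and not the project's own generator: the function name `generate_vlan_topology`, the block sizes, and the intra-VLAN probability `p_intra` are all assumptions chosen for the example.

```python
import random
from itertools import combinations

def generate_vlan_topology(vlan_sizes, p_intra=0.3, seed=0):
    """Sample an undirected SBM graph in which inter-VLAN edges never occur.

    Returns (vlan_of, edges): vlan_of maps node id -> VLAN index,
    edges is a set of (u, v) tuples with u < v.
    """
    rng = random.Random(seed)
    vlan_of = {}
    node = 0
    for vlan, size in enumerate(vlan_sizes):
        for _ in range(size):
            vlan_of[node] = vlan
            node += 1
    edges = set()
    for u, v in combinations(range(node), 2):
        # Inter-block probability is 0: inter-VLAN links are forbidden.
        p = p_intra if vlan_of[u] == vlan_of[v] else 0.0
        if rng.random() < p:
            edges.add((u, v))
    return vlan_of, edges

# Four VLANs of ten nodes each (sizes are illustrative, not from the project).
vlan_of, edges = generate_vlan_topology([10, 10, 10, 10])
```

Because every sampled edge stays inside one block, any inter-VLAN edge that appears later is a violation by construction, which is what gives the ground truth its clean form.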
💡 The framework is dataset-agnostic and can be applied to any real network topology provided as an edge list.
Attribute anomalies are simulated as extreme CPU usage spikes:
- Normal nodes: CPU ∈ [0.1, 1.0]
- Anomalous nodes: CPU ∈ [90, 100]
These anomalies are designed to be easily detectable by DBSCAN, serving as a baseline.
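As a rough illustration of the CPU ranges above, the following sketch draws one CPU value per node and records the ground-truth label. Uniform sampling and the helper name `generate_cpu_features` are assumptions; the source only states the two ranges. Memory is left out because its range is not given.

```python
import random

def generate_cpu_features(num_nodes, anomalous_nodes, seed=0):
    """Return (cpu, labels) with label 1 for injected CPU anomalies."""
    rng = random.Random(seed)
    anomalous = set(anomalous_nodes)
    cpu, labels = [], []
    for node in range(num_nodes):
        if node in anomalous:
            # Anomalous range from the text; uniform sampling is an assumption.
            cpu.append(rng.uniform(90.0, 100.0))
            labels.append(1)
        else:
            # Normal range from the text.
            cpu.append(rng.uniform(0.1, 1.0))
            labels.append(0)
    return cpu, labels

# Nodes 3 and 7 are arbitrary choices for the example.
cpu, labels = generate_cpu_features(20, anomalous_nodes=[3, 7])
```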
Structural anomalies are injected as unauthorized links between distant VLANs:
- Example: direct connections between VLAN 0 and VLAN 3
- They represent:
  - Firewall misconfigurations
  - Unauthorized tunnels
  - Lateral movement / backdoors
These anomalies do not affect node attributes, making them invisible to classical ML.
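One way to picture the injection step is to add a few edges between two VLANs to an otherwise clean edge set and keep the added edges as ground truth. This is a hedged sketch: `inject_bridges`, its parameters, and the toy topology are hypothetical and not taken from the project code.

```python
import random

def inject_bridges(vlan_of, edges, vlan_a, vlan_b, count, seed=0):
    """Add `count` random edges between two VLANs.

    Returns (test_edges, bridges); node attributes are left untouched.
    """
    rng = random.Random(seed)
    side_a = [n for n, v in vlan_of.items() if v == vlan_a]
    side_b = [n for n, v in vlan_of.items() if v == vlan_b]
    # All possible cross-VLAN pairs, stored with the smaller id first.
    candidates = [(min(u, v), max(u, v)) for u in side_a for v in side_b]
    bridges = set(rng.sample(candidates, count))
    return set(edges) | bridges, bridges

# Toy topology: nodes 0-1 in VLAN 0, nodes 2-3 in VLAN 3.
vlan_of = {0: 0, 1: 0, 2: 3, 3: 3}
clean_edges = {(0, 1), (2, 3)}
test_edges, bridges = inject_bridges(vlan_of, clean_edges, 0, 3, count=2)
```

Only the edge set changes, so a detector that looks at node features alone receives exactly the same input before and after the injection.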
- Nodes are treated as independent samples
- Features used:
  - CPU usage
  - Memory usage
- No graph structure is considered

DBSCAN assumes that anomalous nodes lie in low-density regions of the feature space.
- Ignores adjacency and topology
- Cannot detect structural anomalies
- Fails when anomalies are purely relational
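The density assumption can be illustrated without any library. The sketch below computes only the part of DBSCAN that matters for the baseline, namely which points end up as noise: a point is noise when it is not a core point and has no core point within `eps`. The values of `eps`, `min_samples`, and the feature rows are illustrative assumptions; a real run would more likely use an existing DBSCAN implementation.

```python
import math

def dbscan_noise(points, eps, min_samples):
    """Return True for each point that DBSCAN would label as noise."""
    n = len(points)
    # Neighborhoods include the point itself, as in the usual DBSCAN definition.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_samples for nb in neighbors]
    # Noise: not a core point and not within eps of any core point.
    return [
        not is_core[i] and not any(is_core[j] for j in neighbors[i])
        for i in range(n)
    ]

# (cpu, memory) per node; the last row mimics an injected CPU spike.
features = [(0.2, 0.3), (0.4, 0.5), (0.3, 0.4), (0.5, 0.3), (0.6, 0.6), (95.0, 0.4)]
flags = dbscan_noise(features, eps=1.0, min_samples=3)
```

The function never sees an edge list, so a node whose only irregularity is a forbidden link has the same features as its neighbors and cannot be flagged.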
Encoder
- GraphSAGE-based encoder
- Learns node embeddings by aggregating neighborhood information
- Captures:
  - VLAN structure
  - Connectivity patterns
  - Structural regularities
Decoder
- Dot-product decoder
- Reconstructs the adjacency matrix
- Outputs link existence probabilities
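To make the encoder and decoder roles concrete, here is a minimal forward pass in plain Python: one GraphSAGE-style layer with mean aggregation, followed by a dot-product decoder passed through a sigmoid. It is a sketch under stated assumptions, not the project's model: the weights are fixed identity matrices instead of trained parameters, and the helper names `sage_layer` and `link_probability` are made up for the example.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sage_layer(features, adjacency, W_self, W_neigh):
    """h_v = ReLU(W_self x_v + W_neigh mean(x_u for u in N(v)))."""
    out = {}
    for v, x_v in features.items():
        nbrs = adjacency.get(v, [])
        dim = len(x_v)
        if nbrs:
            mean = [sum(features[u][k] for u in nbrs) / len(nbrs) for k in range(dim)]
        else:
            mean = [0.0] * dim  # isolated node: nothing to aggregate
        h = [a + b for a, b in zip(matvec(W_self, x_v), matvec(W_neigh, mean))]
        out[v] = [max(0.0, val) for val in h]
    return out

def link_probability(z_u, z_v):
    """Dot-product decoder: sigmoid of the embedding inner product."""
    dot = sum(a * b for a, b in zip(z_u, z_v))
    return 1.0 / (1.0 + math.exp(-dot))

# Toy graph: nodes 0 and 1 are linked and share features; node 2 is isolated.
features = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
adjacency = {0: [1], 1: [0], 2: []}
identity = [[1.0, 0.0], [0.0, 1.0]]  # assumption: untrained placeholder weights
z = sage_layer(features, adjacency, identity, identity)
p_same = link_probability(z[0], z[1])
p_cross = link_probability(z[0], z[2])
```

In this toy case the linked pair receives a higher reconstructed probability than the unrelated pair, which is the signal the scoring step relies on.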
The model is trained on a clean topology only.
At inference time, links that cannot be reconstructed accurately are assigned a high reconstruction error and flagged as structural anomalies.
- Edge-level: Low reconstructed probability → suspicious link
- Node-level: A node is anomalous if it participates in at least one suspicious link
Final node anomaly score:
score(node) = 1 − min(reconstructed_link_probability)
Here the minimum is taken over all links incident to the node.
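The node score can be computed directly from per-edge probabilities. The sketch below assumes a dictionary that maps each link to its reconstructed probability. The example values and the 0.5 threshold are hypothetical, since the source does not state how the cut-off is chosen.

```python
def node_scores(edge_probs):
    """edge_probs maps (u, v) -> reconstructed link probability.

    Returns node -> 1 - (minimum probability over its incident links).
    """
    min_prob = {}
    for (u, v), p in edge_probs.items():
        for node in (u, v):
            min_prob[node] = min(p, min_prob.get(node, 1.0))
    return {node: 1.0 - p for node, p in min_prob.items()}

# Link (2, 7) is poorly reconstructed, so both endpoints score high.
edge_probs = {(0, 1): 0.95, (1, 2): 0.90, (2, 7): 0.05}
scores = node_scores(edge_probs)
threshold = 0.5  # hypothetical cut-off, not from the source
flagged = sorted(n for n, s in scores.items() if s > threshold)
```

Taking the minimum means a single suspicious link is enough to raise a node's score, matching the node-level rule stated above.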
- Known injected CPU anomalies
- Known injected inter-VLAN bridges
- Precision
- Recall
- F1-Score
- ROC-AUC (GNN only)
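For reference, the listed metrics can be computed from ground-truth labels, binary predictions, and continuous scores as sketched below. The labels and scores are made-up illustration values, not project results. ROC-AUC is computed here as the fraction of (anomalous, normal) pairs in which the anomalous item scores higher, with ties counting one half.

```python
def precision_recall_f1(y_true, y_pred):
    """Binary precision, recall, and F1 with 1 as the anomaly class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def roc_auc(y_true, scores):
    """Pairwise ROC-AUC; requires at least one positive and one negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative values only.
y_true = [0, 0, 1, 1, 0]
scores = [0.1, 0.6, 0.9, 0.4, 0.2]
y_pred = [1 if s > 0.5 else 0 for s in scores]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

ROC-AUC needs a continuous score, which the reconstruction-based GNN provides and a hard DBSCAN noise label does not; that is consistent with the metric being reported for the GNN only.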
| Method | Attribute Anomalies | Structural Anomalies | Topology-Aware |
|---|---|---|---|
| DBSCAN | ✓ Detected | ✗ Missed | ✗ No |
| GNN (GraphSAGE + GAE) | ✓ Detected | ✓ Detected | ✓ Yes |
- DBSCAN performs well only when anomalies affect raw features
- GNN successfully detects stealth structural violations
- Structural context is critical for robust network anomaly detection
Network_Anomaly_Detection/
│
├── data/
│   ├── raw/            # Clean topology
│   └── processed/      # Nodes, edges, predictions
│
├── utils/
│   ├── data_loader.py
│   ├── feature_generator.py
│   ├── dataset.py
│   ├── models.py
│   ├── baseline.py
│   ├── train.py
│   └── visualization.py
│
├── notebooks/
│   └── NADGNN.ipynb
│
├── models/
│   └── gnn_model.pth
│
├── output/
│   ├── dashboard.png
│   └── risk_map.png
│
├── config.py
├── main.py
├── run.sh
└── requirements.txt
Install the dependencies with `pip install -r requirements.txt`, then launch the pipeline with `./run.sh`.

This will:
- Generate the network topology
- Inject anomalies
- Train the GNN
- Evaluate DBSCAN vs GNN
- Save results, metrics, and visualizations
- nodes.csv → node features + ground truth
- edges_train.csv → clean topology
- edges_test.csv → topology with anomalies
- results_gnn_predictions.csv → final scores & predictions
- gnn_model.pth → trained Graph Auto-Encoder
- Dashboard: Training loss, ROC, confusion matrix, metrics comparison
- Risk Map: Network visualization with detected anomalous links
- SOC automation & zero-trust validation
- Cloud network misconfiguration detection
- ISP backbone monitoring
- Insider threat & lateral movement detection
- Digital twin simulation of secure networks
Anomalies in networks are not always about "high values"; they are often about "wrong connections."
Graph Neural Networks provide the necessary inductive bias to understand and protect network structure, making them indispensable for next-generation network security and monitoring systems.