lukiod/TrustFlow-Zk

TrustFlow ZK - Federated Learning with Simulated Audit Trail

This project implements a Federated Learning simulation using the Flower framework and TensorFlow. It includes a simulated blockchain-based audit trail (Model Registry, Inference Log, Feedback Log), SHAP for explainable AI, and EZKL for generating Zero-Knowledge Proofs of model inferences.

Project Structure

  • src/: Contains the source code for the simulation.
    • client.py: FlowerClient implementations (Honest and Malicious).
    • strategy.py: Custom server strategy to capture aggregated model weights.
    • model.py: Keras model definition.
    • data_processing.py: Data loading, preprocessing, and partitioning.
    • blockchain_sim.py: Simulation of smart contracts/ledgers using Pandas.
    • explainability.py: Logic for SHAP analysis and summary plots.
    • zk_proof.py: Logic for converting models to ONNX and generating ZKP using EZKL.
  • data/: Directory to store the datasets.
  • notebooks/: Original Jupyter notebooks.
  • main.py: The entry point to run the full simulation workflow.

Features

  1. Federated Learning Simulation: Simulates multiple clients (honest and malicious) training a model collaboratively without sharing raw data.
  2. Blockchain Audit Trail: Logs model updates and inference events to a simulated tamper-proof ledger.
  3. Explainable AI (XAI): Uses SHAP (SHapley Additive exPlanations) to explain global model predictions.
  4. Zero-Knowledge Proofs (ZKP): Uses EZKL to generate validity proofs for model inferences, ensuring computational integrity without revealing weights.
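
The audit-trail idea in feature 2 can be illustrated with a hash-chained log, where each entry stores the hash of the one before it so that later edits become detectable. The following is a minimal sketch using only the standard library; it is not the code in blockchain_sim.py (which uses Pandas), and the names `entry_hash`, `append_entry`, `verify`, and the record fields are hypothetical, chosen for illustration.

```python
import hashlib
import json


def entry_hash(body: dict) -> str:
    """Hash an entry body deterministically with SHA-256."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_entry(ledger: list, record: dict) -> dict:
    """Append a record, linking it to the hash of the previous entry."""
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    body = {"index": len(ledger), "prev_hash": prev_hash, "record": record}
    entry = dict(body, hash=entry_hash(body))
    ledger.append(entry)
    return entry


def verify(ledger: list) -> bool:
    """Recompute every hash and link; an edited past entry fails the check."""
    prev_hash = "0" * 64
    for entry in ledger:
        body = {k: entry[k] for k in ("index", "prev_hash", "record")}
        if entry["prev_hash"] != prev_hash or entry["hash"] != entry_hash(body):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical events: a model registration followed by an inference log entry.
registry = []
append_entry(registry, {"event": "model_registered", "round": 5})
append_entry(registry, {"event": "inference", "sample_id": 0})
intact = verify(registry)

# Tampering with an earlier record breaks verification.
registry[0]["record"]["round"] = 4
tampered_ok = verify(registry)
```

Here `intact` is `True` and `tampered_ok` is `False`. Strictly speaking, a chain like this is tamper-evident rather than tamper-proof: it does not prevent edits, it only makes them detectable on verification.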

Prerequisites

  • Python 3.9+
  • pip

Installation

  1. Clone the repository:

    git clone <repository_url>
    cd trustflow-task
  2. Install the required dependencies:

    pip install -r requirements.txt

Dataset Setup

This project supports two datasets: Heart Disease and Breast Cancer. You need to download them manually and place them in the data/ directory.

1. Heart Disease Dataset

2. Breast Cancer Dataset (Optional)

To switch datasets, modify the DATASET_NAME constant in main.py (default is 'heart_disease').

Usage

To start the full workflow simulation:

python main.py

This will:

  1. Initialize the simulated blockchain ledgers.
  2. Load and partition the data.
  3. Run a Federated Averaging simulation (5 rounds by default).
  4. Log the final global model to the Model Registry.
  5. Run SHAP analysis to generate explanations for test set predictions.
  6. Run EZKL to generate a Zero-Knowledge Proof for a sample inference.
  7. Save metrics and charts (e.g., simulation_results.png, zkp_overhead_chart.png).
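
For readers unfamiliar with step 3, Federated Averaging combines client models by averaging their weights, with each client weighted by the number of training examples it holds. The sketch below shows that aggregation rule with NumPy; it is an illustration only, not the code in strategy.py, and the `fed_avg` function and the toy client weights are hypothetical.

```python
import numpy as np


def fed_avg(client_weights, num_examples):
    """Average per-layer weights, weighting each client by its example count.

    client_weights: one list of np.ndarray layers per client.
    num_examples:   number of training examples held by each client.
    """
    total = float(sum(num_examples))
    num_layers = len(client_weights[0])
    averaged = []
    for layer in range(num_layers):
        # Weighted sum of this layer across all clients.
        layer_avg = sum(
            weights[layer] * (n / total)
            for weights, n in zip(client_weights, num_examples)
        )
        averaged.append(layer_avg)
    return averaged


# Two toy clients with two "layers" each; client_a holds 3x more data.
client_a = [np.array([1.0, 1.0]), np.array([0.0])]
client_b = [np.array([3.0, 5.0]), np.array([4.0])]
global_weights = fed_avg([client_a, client_b], num_examples=[30, 10])
```

With these inputs the first layer averages to `[1.5, 2.0]` and the second to `[1.0]`, since client_a contributes 75% of the weight. Because the plain average trusts every update equally per example, a malicious client can shift the global model, which is the scenario the honest and malicious clients in this simulation are set up to explore.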
