TrustFlow ZK - Federated Learning with Simulated Audit Trail

This project implements a Federated Learning simulation using the Flower framework and TensorFlow. It includes a simulated blockchain-based audit trail (Model Registry, Inference Log, Feedback Log), SHAP for explainable AI, and EZKL for generating Zero-Knowledge Proofs of model inferences.

Project Structure

src/: Contains the source code for the simulation.
- client.py: FlowerClient implementations (Honest and Malicious).
- strategy.py: Custom server strategy to capture aggregated model weights.
- model.py: Keras model definition.
- data_processing.py: Data loading, preprocessing, and partitioning.
- blockchain_sim.py: Simulation of smart contracts/ledgers using Pandas.
- explainability.py: Logic for SHAP analysis and summary plots.
- zk_proof.py: Logic for converting models to ONNX and generating ZKP using EZKL.
data/: Directory to store the datasets.
notebooks/: Original Jupyter notebooks.
main.py: The entry point to run the full simulation workflow.

Features

Federated Learning Simulation: Simulates multiple clients (honest and malicious) training a model collaboratively without sharing raw data.
Blockchain Audit Trail: Logs model updates and inference events to a simulated ledger (backed by Pandas) that stands in for a tamper-proof blockchain.
Explainable AI (XAI): Uses SHAP (SHapley Additive exPlanations) to explain global model predictions.
Zero-Knowledge Proofs (ZKP): Uses EZKL to generate validity proofs for model inferences, ensuring computational integrity without revealing weights.
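
As a rough illustration of the aggregation step in Federated Averaging, the sketch below computes a weighted average of client weight vectors, with each client weighted by its number of local training examples. It is a minimal, dependency-free sketch and not the code in strategy.py; the function name `federated_average` and the flat-list weight representation are assumptions made for illustration.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    num_examples: Sequence[int],
) -> List[float]:
    """Average flat weight vectors, weighting each client by its example count.

    Hypothetical helper for illustration; real models hold one array per layer.
    """
    if not client_weights or len(client_weights) != len(num_examples):
        raise ValueError("need one example count per client")
    total = sum(num_examples)
    if total <= 0:
        raise ValueError("total number of examples must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must report vectors of the same length")

    averaged = [0.0] * dim
    for weights, n in zip(client_weights, num_examples):
        for i, value in enumerate(weights):
            # Each client contributes in proportion to its share of the data.
            averaged[i] += value * n / total
    return averaged


if __name__ == "__main__":
    # Two clients; the second holds three times as much data as the first.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [10, 30]))
```

Because the average is weighted by example count, a single malicious client holding little data has limited pull on the global model, while one claiming a large example count has more.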

Prerequisites

Python 3.9+
pip

Installation

Clone the repository:

```
git clone <repository_url>
cd trustflow-task
```

Install the required dependencies:
```
pip install -r requirements.txt
```

Dataset Setup

This project supports two datasets: Heart Disease and Breast Cancer. You need to download them manually and place them in the data/ directory.

1. Heart Disease Dataset

Source: Kaggle - Heart Disease Dataset
Action: Download heart.csv and place it in data/heart.csv.

2. Breast Cancer Dataset (Optional)

Source: Kaggle - Breast Cancer Wisconsin (Diagnostic) Data
Action: Download data.csv and place it in data/data.csv.

To switch datasets, modify the DATASET_NAME constant in main.py (default is 'heart_disease').
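
Since the datasets must be downloaded by hand, it can help to fail early when a file is missing. The sketch below maps a dataset name to the file locations listed above and checks that the file exists. The helper names, the `DATASET_FILES` mapping, and the `'breast_cancer'` key are hypothetical; only `'heart_disease'` is named in this README, so check main.py for the actual values.

```python
from pathlib import Path

# Hypothetical mapping from DATASET_NAME values to the files described above.
# Only 'heart_disease' appears in this README; 'breast_cancer' is an assumed key.
DATASET_FILES = {
    "heart_disease": "heart.csv",
    "breast_cancer": "data.csv",
}


def resolve_dataset_path(dataset_name: str, data_dir: str = "data") -> Path:
    """Return the expected CSV path for a dataset name."""
    try:
        filename = DATASET_FILES[dataset_name]
    except KeyError:
        known = ", ".join(sorted(DATASET_FILES))
        raise ValueError(
            f"unknown dataset {dataset_name!r}; expected one of: {known}"
        ) from None
    return Path(data_dir) / filename


def check_dataset(dataset_name: str, data_dir: str = "data") -> Path:
    """Raise a clear error if the manually downloaded file is missing."""
    path = resolve_dataset_path(dataset_name, data_dir)
    if not path.is_file():
        raise FileNotFoundError(
            f"expected {path}; download it as described in Dataset Setup"
        )
    return path
```

Calling `check_dataset("heart_disease")` before the simulation starts would surface a missing download immediately instead of partway through the run.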

Usage

To start the full workflow simulation:

```
python main.py
```

This will:

Initialize the simulated blockchain ledgers.
Load and partition the data.
Start a simulation of Federated Averaging (default 5 rounds).
Log the final global model to the Model Registry.
Run SHAP analysis to generate explanations for test set predictions.
Run EZKL to generate a Zero-Knowledge Proof for a sample inference.
Save metrics and charts (e.g., simulation_results.png, zkp_overhead_chart.png).
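
To illustrate how a simulated ledger can make tampering detectable, the sketch below chains entries by hash: each entry stores the hash of the previous one, so editing an earlier record invalidates every later link. This is an assumption about one possible design, not a description of blockchain_sim.py, which uses Pandas; the sketch uses plain lists and the standard library to stay dependency-free, and all names in it are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(index: int, prev_hash: str, payload: Dict[str, Any]) -> str:
    """Hash an entry's contents deterministically (keys sorted before hashing)."""
    body = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(ledger: List[Dict[str, Any]], payload: Dict[str, Any]) -> Dict[str, Any]:
    """Append a payload, linking it to the hash of the previous entry."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    index = len(ledger)
    entry = {
        "index": index,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": _entry_hash(index, prev_hash, payload),
    }
    ledger.append(entry)
    return entry


def verify_ledger(ledger: List[Dict[str, Any]]) -> bool:
    """Recompute every hash and check that each entry links to its predecessor."""
    prev_hash = GENESIS_HASH
    for index, entry in enumerate(ledger):
        if entry["index"] != index or entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(index, prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    registry: List[Dict[str, Any]] = []
    append_entry(registry, {"event": "model_registered", "round": 5})
    append_entry(registry, {"event": "inference_logged", "sample_id": 0})
    print(verify_ledger(registry))  # chain is intact
    registry[0]["payload"]["round"] = 4  # simulate tampering
    print(verify_ledger(registry))  # first entry's hash no longer matches
```

A hash chain only detects modification; it does not prevent it. A real blockchain adds consensus among independent parties, which a single-process simulation does not provide.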
