
Conversation


kth0910 commented on Sep 9, 2025

feat: Add FedMD (Federated Model Distillation) support

Issue

Description

Flower currently supports traditional federated learning approaches such as FedAvg and FedProx, but lacks support for Federated Model Distillation (FedMD). FedMD enables knowledge sharing through logit distillation on public data, allowing heterogeneous models to learn from each other without exchanging raw data or model parameters.

This approach is particularly valuable for:

  • Privacy-preserving federated learning with heterogeneous client models
  • Knowledge transfer between different architectures
  • Scenarios where clients have different model capacities but access to shared public data
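
To make the mechanics concrete, the sketch below walks through one FedMD round in plain NumPy: each client scores the shared public set, the server averages the logits into a consensus, and each client distills toward that consensus. All names here are illustrative and are not part of this PR's API.

```python
import numpy as np


def fedmd_round(client_models, public_data, distill_fn):
    """One illustrative FedMD round (hypothetical helper, not this PR's API).

    client_models : list of callables mapping public inputs to logits (N x C)
    public_data   : array of public samples shared by all clients
    distill_fn    : callable(model, public_data, consensus_logits) performing
                    the local knowledge-distillation update for one client
    """
    # 1. Each client computes logits on the shared public data
    #    (no raw private data or model weights leave the client).
    client_logits = [model(public_data) for model in client_models]

    # 2. The server aggregates the logits into a consensus,
    #    e.g. a simple element-wise mean across clients.
    consensus = np.mean(np.stack(client_logits, axis=0), axis=0)

    # 3. Each client distills its own model toward the consensus logits.
    for model in client_models:
        distill_fn(model, public_data, consensus)

    return consensus
```

The key property is that only logits on public data cross the network; private training data and model weights stay local.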

Related issues/PRs

This is a new feature request. No related issues found, but this addresses the broader need for more diverse federated learning strategies in Flower.

Proposal

Explanation

This PR introduces comprehensive FedMD support to Flower with the following key components:

Core Protocol & Communication:

  • New protobuf messages (Tensor, DistillIns, DistillRes, ConsensusIns) for efficient logit exchange
  • Tensor serialization utilities for seamless numpy array ↔ protobuf conversion (round-trip sketched after this list)
  • Public data management system with registry and provider patterns
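
The actual .proto schema is part of this PR and not reproduced here, but the serialization utilities amount to a round trip of roughly this shape (the dict stands in for the Tensor message; field names below are hypothetical):

```python
import numpy as np


def ndarray_to_wire(arr: np.ndarray) -> dict:
    """Illustrative stand-in for a Tensor message: raw bytes plus the dtype
    and shape needed to rebuild the array on the other side."""
    return {
        "data": arr.tobytes(),     # raw buffer
        "dtype": str(arr.dtype),   # e.g. "float32"
        "shape": list(arr.shape),  # e.g. [32, 10] for a batch of logits
    }


def wire_to_ndarray(msg: dict) -> np.ndarray:
    """Reverse conversion: rebuild the numpy array from bytes + metadata."""
    return np.frombuffer(msg["data"], dtype=msg["dtype"]).reshape(msg["shape"])


# Round-trip check on a small batch of fake logits
logits = np.random.randn(4, 10).astype(np.float32)
assert np.array_equal(logits, wire_to_ndarray(ndarray_to_wire(logits)))
```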

Client-Side Implementation:

  • FedMDNumPyClient: Extends NumPyClient with logit generation and distillation capabilities
  • Public data provider for accessing shared datasets
  • Support for temperature-scaled soft target learning
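
As a rough sketch of the client-side distillation step, a temperature-scaled soft-target loss typically looks like the following (a generic knowledge-distillation formulation in PyTorch; the exact loss used by FedMDNumPyClient may differ):

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits: torch.Tensor,
                      consensus_logits: torch.Tensor,
                      temperature: float = 3.0) -> torch.Tensor:
    """Temperature-scaled soft-target loss for the client-side distillation step.

    Both tensors have shape (batch_size, num_classes); the consensus logits come
    from the server-side aggregation over all clients.
    """
    # Soften both distributions with the temperature, then match them with KL divergence.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=1)
    consensus_probs = F.softmax(consensus_logits / temperature, dim=1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures,
    # as in the standard knowledge-distillation formulation.
    return F.kl_div(student_log_probs, consensus_probs, reduction="batchmean") * temperature ** 2
```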

Server-Side Strategy:

  • FedMDStrategy: Handles logit collection, aggregation, and consensus building (aggregation sketched after this list)
  • Configurable sampling strategies for public data selection
  • Flexible temperature and training parameter control
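
For intuition, the consensus-building step can be as simple as a (weighted) mean over the per-client logits on the selected public samples. The sketch below is illustrative and not necessarily the exact aggregation rule implemented by FedMDStrategy:

```python
import numpy as np


def aggregate_logits(client_logits, weights=None):
    """Illustrative consensus builder.

    client_logits : list of per-client arrays, each of shape (num_public_samples, num_classes)
    weights       : optional per-client weights, e.g. local dataset sizes
    """
    stacked = np.stack(client_logits, axis=0)   # (num_clients, N, C)
    if weights is None:
        return stacked.mean(axis=0)             # plain mean -> consensus logits
    w = np.asarray(weights, dtype=stacked.dtype)
    w = w / w.sum()                             # normalize the client weights
    return np.tensordot(w, stacked, axes=1)     # weighted mean over clients
```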

Comprehensive Example & Validation:

  • Complete PyTorch CIFAR-10 implementation with 3 clients
  • Real-time monitoring and convergence analysis tools
  • Automated validation of FedMD effectiveness through client-consensus distance analysis
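
One plausible form of the client-consensus distance metric used for convergence analysis is sketched below (the PR's monitoring tools may define it slightly differently):

```python
import numpy as np


def client_consensus_distance(client_logits, consensus_logits):
    """Mean L2 distance between one client's public-data logits and the consensus.

    Both arrays have shape (num_public_samples, num_classes). A downward trend
    across rounds indicates the clients are converging on the public data.
    """
    return float(np.linalg.norm(client_logits - consensus_logits, axis=1).mean())
```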

Key Benefits:

  • Privacy-preserving: Only logits (not raw data) are shared
  • Heterogeneous model support: Different architectures can participate
  • Effective knowledge transfer through consensus learning
  • Scalable and extensible design following Flower patterns

Checklist

  • Implement proposed change
  • Write tests (comprehensive example with validation)
  • Update documentation (comprehensive docs added)
  • Make CI checks pass
  • Ping maintainers on Slack (channel #contributions)

Any other comments?

This implementation follows Flower's established patterns and integrates seamlessly with existing infrastructure. The example demonstrates successful convergence with clear metrics showing client-consensus distance reduction over training rounds.

The code includes comprehensive error handling, detailed logging, and validation tools to ensure FedMD effectiveness. All new components are properly documented and follow Flower's coding standards.

Commit summary:

- Add FedMD protocol messages (Tensor, DistillIns, DistillRes, ConsensusIns)
- Implement FedMDNumPyClient for client-side logit generation and distillation
- Add FedMDStrategy for server-side logit aggregation and consensus
- Create public data registry and provider for shared dataset management
- Add tensor serialization utilities for protobuf communication
- Include comprehensive PyTorch CIFAR-10 example with validation
- Add detailed monitoring and convergence analysis tools

This implementation enables federated learning through knowledge distillation
where clients share logits on public data and learn from aggregated consensus.

- Add explanation-fedmd.rst with detailed FedMD concept explanation
- Add tutorial-fedmd-pytorch.rst with complete PyTorch example
- Add ref-api-fedmd.rst with full API reference documentation
- Update index.rst and reference.rst to include FedMD sections
- Include usage examples, configuration options, and best practices
- Add troubleshooting guide and customization examples
The github-actions bot added the Contributor label (used to determine which PRs come from external contributors) on Sep 9, 2025.
kth0910 and others added 2 commits on September 9, 2025:
- Add heterogeneous_models.py with different CNN architectures (Small, Medium, Large, ResNet-like)
- Add heterogeneous_client.py for creating clients with different models
- Add run_heterogeneous_simulation.py for testing heterogeneous FedMD
- Support models with 545K to 4.3M parameters
- Demonstrate FedMD effectiveness with different model complexities
- Show complex models achieve better consensus convergence
- Validate heterogeneous federated learning capabilities
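
To illustrate why heterogeneous architectures can participate at all: the only contract FedMD imposes is that every model emits logits over the same label space. The two toy CNNs below are illustrative only, not the Small/Medium/Large/ResNet-like architectures shipped in the example:

```python
import torch
import torch.nn as nn


class SmallCNN(nn.Module):
    """Low-capacity client model (illustrative)."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        return self.classifier(torch.flatten(self.features(x), 1))


class LargeCNN(nn.Module):
    """Higher-capacity client model (illustrative)."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(256 * 8 * 8, num_classes)

    def forward(self, x):
        return self.classifier(torch.flatten(self.features(x), 1))


# Both models can score the same public batch and exchange comparable logits.
x = torch.randn(4, 3, 32, 32)                      # fake CIFAR-10-shaped batch
assert SmallCNN()(x).shape == LargeCNN()(x).shape  # (4, 10) for both

```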
kth0910 requested a review from panh99 as a code owner on September 20, 2025.
@jafermarq (Member) commented:

Hello @kth0910, thanks for opening the PR to introduce a new example using FedMD. For new examples, it's better if the framework doesn't need to be modified. If your ServerApp needs a custom strategy, you can create a new one and leave it (e.g. as a strategy.py file) inside your example directory. If what you want to contribute is a brand new strategy, that's amazing! But it needs to follow the signature of the new Strategy ABC, and it would therefore be a single file in flwr.serverapp.strategy (you can take a look at some of the strategies in there).

Also note that examples need to follow the structure of all other examples (this primarily means that each example is a Flower App that's executed via flwr run). You can take a look at the quickstart-pytorch for a simple example or the advanced-pytorch for a more complex one (which actually includes a custom strategy!). I also encourage you to write your example using the Message API directly, since it provides much more versatility for what ServerApp and ClientApp objects can communicate (in both simulated and real-world federations). You can find a guide to migrating your code to the Message API here. For a complete step-by-step tutorial on the Message API, I recommend you take a look at the recently updated (and much enhanced!) Flower tutorial.

Let me know if you need some support. In my view, the steps to add both the strategy and the example into Flower would be:

  • one PR for FedMD -- introduces a single .py file in flwr.serverapp.strategy and tests.
  • one PR for the example -- introduces a new example under examples/ making use of the FedMD strategy.
