
Uncertainty Makes It Stable: Curiosity-Driven Quantized Mixture-of-Experts

curious-qmoe is a curiosity-driven quantized Mixture-of-Experts framework for efficient audio classification on resource-constrained edge devices. It achieves 99.9% of full-precision accuracy with 4× compression and an 82% reduction in latency variance through routing based on Bayesian epistemic uncertainty.

Key Features:

  • Heterogeneous Quantization: BitNet ternary, BitLinear (1-16 bit), post-training quantization (PTQ) with bitwise operations
  • Curiosity-Driven Routing: Bayesian router with Monte Carlo dropout for epistemic uncertainty estimation (see the sketch after this list)
  • Mixture-of-Experts: Dynamic expert selection across quantized models for adaptive precision
  • Hardware-Efficient: Optimized for edge deployment with predictable latency (29 ms std)
  • Comprehensive Evaluation: Energy consumption, carbon emissions, and statistical significance testing
  • Reproducible: Hydra configuration management, cross-validation, experiment tracking
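
The curiosity signal is simple to illustrate. The sketch below is a minimal, hypothetical router, not the repository's moe.py API: dropout is kept active at inference, and the variance of repeated softmax outputs serves as the epistemic-uncertainty ("curiosity") score.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CuriosityRouter(nn.Module):
    """Hypothetical router: MC dropout turns softmax variance into a curiosity score."""

    def __init__(self, in_dim: int, num_experts: int, p: float = 0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Dropout(p),
            nn.Linear(128, num_experts),
        )

    def forward(self, x: torch.Tensor, mc_samples: int = 10):
        self.train()  # keep dropout active at inference (Monte Carlo dropout)
        probs = torch.stack(
            [F.softmax(self.net(x), dim=-1) for _ in range(mc_samples)]
        )
        expert = probs.mean(0).argmax(-1)   # top-1 expert from the mean routing
        curiosity = probs.var(0).sum(-1)    # epistemic uncertainty per sample
        return expert, curiosity

router = CuriosityRouter(in_dim=640, num_experts=3)
expert, curiosity = router(torch.randn(8, 640))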

Datasets: ESC-50, Quinn, UrbanSound8K


Setup

conda create -n curious-qmoe python=3.11 -y
conda activate curious-qmoe
git clone https://github.com/sebasmos/QWave.git
cd QWave
pip install -e .
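
To verify the editable install, a quick check (assuming the package exposes the curious_qmoe module listed under Project Structure):

import curious_qmoe              # should resolve after pip install -e .
print(curious_qmoe.__file__)     # path into the cloned checkout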

Quick Start

Basic Usage

cd scripts
python benchmark.py \
  --config-path /path/to/curious-qmoe/config \
  --config-name esc50 \
  experiment.datasets.esc.csv=/path/to/esc-50.csv \
  experiment.device=cpu \
  experiment.models_to_run=[esc]

MoE with Curiosity Mode

python benchmark.py \
  --config-path /path/to/curious-qmoe/config \
  --config-name esc50 \
  experiment.device=cpu \
  experiment.datasets.esc.csv=/path/to/esc-50.csv \
  experiment.models_to_run=[moe] \
  experiment.router.expert_quantizations="[bitnet,'1','2','4','8','16',qesc]" \
  experiment.router.num_experts=3 \
  experiment.router.top_k=1 \
  experiment.router.use_curiosity=true \
  experiment.metadata.tag=esc_moe_curiosity

Curiosity outputs (saved per fold):

  • curiosity_values.json - Raw uncertainty values
  • curiosity_histogram.png - Distribution of epistemic uncertainty
  • curiosity_per_class.png - Average uncertainty per class
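
The JSON output is easy to inspect directly. A minimal sketch, assuming curiosity_values.json holds a flat list of per-sample uncertainty values (the actual schema may differ):

import json
import statistics

with open("curiosity_values.json") as f:
    values = json.load(f)  # assumed: a flat list of floats

print(f"n={len(values)}  mean={statistics.mean(values):.4f}  "
      f"std={statistics.pstdev(values):.4f}")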

Project Structure

curious-qmoe/
├── config/                    # Hydra configs
│   └── esc50.yaml             # ESC-50 configuration
├── curious_qmoe/              # Core source code
│   ├── datasets.py            # EmbeddingDataset and normalization
│   ├── models.py              # Neural architectures (MLP, ESCModel)
│   ├── bitnnet.py             # BitNet quantized layers
│   ├── qmoe_layers.py         # Quantized MoE layers
│   ├── moe.py                 # MoE training and Bayesian Router
│   ├── train_utils.py         # Training/validation utilities
│   ├── memory.py              # Model size calculation
│   ├── graphics.py            # Plotting (ROC, losses, curiosity)
│   └── utils.py               # Helpers (seeding, device, metrics)
├── scripts/
│   ├── benchmark.py           # Main benchmarking pipeline
│   └── tables/                # Results analysis scripts
│       ├── organize-results.py      # Combine CSV results
│       ├── analyze-std.py           # Generate tables with mean±std
│       ├── analyze-significance.py  # Statistical testing (t-tests, Levene)
│       └── README-significance.md   # Model nomenclature reference
├── outputs/                   # Auto-generated results
└── pyproject.toml

Results Analysis

After running experiments, analyze results with the scripts in scripts/tables/:

1. Organize Results

Combine CSV files from multiple experiments:

cd scripts/tables
python organize-results.py  # Edit dataset path in script

2. Generate Tables (mean±std)

Create 5 tables with mean±std from 5-fold cross-validation:

python analyze-std.py

Output: tables-std/ folder with 4 main tables + 1 supplementary

3. Statistical Significance Testing

Run paired t-tests and variance tests (a SciPy sketch follows the output list below):

python analyze-significance.py

Output: significance-tests/ folder with 6 CSV files:

  • F1-score comparisons (Tables 1-3)
  • Latency speedup tests (Table 4)
  • Energy efficiency tests (Table 3)
  • Variance reduction analysis (Levene's test)
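
As a sketch of what these tests compute (hypothetical per-fold scores; the actual script reads the combined CSVs):

import numpy as np
from scipy import stats

# Hypothetical per-fold F1 scores for two models over 5-fold CV
f1_fp32 = np.array([0.91, 0.90, 0.92, 0.89, 0.91])
f1_q4 = np.array([0.90, 0.90, 0.91, 0.89, 0.90])

t_stat, p_t = stats.ttest_rel(f1_fp32, f1_q4)  # paired t-test across folds
w_stat, p_l = stats.levene(f1_fp32, f1_q4)     # Levene's test for equal variance
print(f"paired t-test: t={t_stat:.3f}, p={p_t:.3f}")
print(f"Levene's test: W={w_stat:.3f}, p={p_l:.3f}")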

Model nomenclature: See scripts/tables/README-significance.md for standardized names (FP32-Base, Q8-Base-PTQ, etc.)


Config Overview

Key parameters in config/esc50.yaml:

experiment:
  models_to_run: [esc]  # Options: esc, bitnet, moe, qmoe, 1, 2, 4, 8, 16, qesc
  device: "cpu"  # or "cuda", "mps"

  datasets:
    esc:
      csv: "/path/to/esc-50.csv"
      normalization_type: "standard"

  model:
    batch_size: 64
    hidden_sizes: [640, 320]
    learning_rate: 0.0005793146438537801
    epochs: 10

  router:  # For MoE models
    expert_quantizations: [1, 2, 4, 16]
    num_experts: 4
    top_k: 1
    use_curiosity: false  # Enable Bayesian Router
    load_balancing: true

  cross_validation:
    n_splits: 5
    shuffle: true
    random_seed: 42

Supported schemes (sketched below):

  • 1-bit to 16-bit: Symmetric quantization with scale factors
  • BitNet: Ternary weights {-1, 0, 1} with per-channel scaling
  • qesc: Bitwise popcount with 2-bit ternary encoding
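
A minimal sketch of the first two ideas (illustrative only, assuming per-output-channel scaling; the qesc popcount path is omitted):

import torch

def symmetric_quantize(w: torch.Tensor, bits: int):
    """k-bit symmetric quantization: integer levels plus a single scale factor."""
    qmax = 2 ** (bits - 1) - 1 if bits > 1 else 1
    scale = w.abs().max() / qmax
    q = torch.clamp(torch.round(w / scale), -qmax, qmax)
    return q, scale  # dequantize with q * scale

def bitnet_ternary(w: torch.Tensor):
    """BitNet-style ternary weights {-1, 0, 1} with per-channel scaling."""
    scale = w.abs().mean(dim=1, keepdim=True)  # one scale per output channel
    q = torch.clamp(torch.round(w / (scale + 1e-8)), -1, 1)
    return q, scale

w = torch.randn(320, 640)                # e.g. a hidden layer's weight matrix
q4, s4 = symmetric_quantize(w, bits=4)   # 4-bit symmetric
qt, st = bitnet_ternary(w)               # ternary {-1, 0, 1}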

License

This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).


Citation

@software{Cajas2025_curious_qmoe,
  author = {Cajas Ordóñez, Sebastián Andrés and Torres, Luis and Meno, Mackenzie and Lai, Yuan and Durán, Carlos and Celi, Leo Anthony},
  title = {curious-qmoe: Learning to Route Curiously in Low-Bit Mixture-of-Experts},
  year = {2025},
  url = {https://github.com/sebasmos/QWave},
  license = {CC-BY-NC-SA-4.0}
}
