This repo reproduces the MNIST MLP experiment from Sliced Information Plane for Analysis of Deep Neural Networks (Wongso, Ghosh, Motani, Jan 31, 2025) [TechRxiv preprint] using 1‑SMI to study learning dynamics on the Sliced Information Plane.
The goal is to track how each layer’s representation T evolves during training in terms of:
- S_1(X; T): information retained about the input
- S_1(T; Y): information about the label
Plotting these quantities across epochs yields the Sliced Information Plane (SIP).
Mutual Information (MI, Eq. 1)
Sliced Mutual Information (1‑SMI, Eq. 2)
where, when Y is discrete (for example, a class label), only X is projected.
k‑Sliced Mutual Information (k‑SMI, Eq. 3)
where A and B are random orthonormal projection matrices (uniform on the Stiefel manifold).
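For reference, the three quantities named above are conventionally written as follows. This is a sketch in standard notation, not a copy of the paper's Eq. 1 to 3: the symbols d_x, d_y, θ, φ, and S_k are assumptions made here, and the paper may normalize or label things differently.

```latex
\begin{align}
I(X; Y) &= \mathbb{E}_{p(x,y)}\!\left[\log \frac{p(x,y)}{p(x)\,p(y)}\right] \\
S_1(X; Y) &= \mathbb{E}_{\theta,\phi}\!\left[ I(\theta^\top X;\, \phi^\top Y) \right],
  \quad \theta \sim \mathrm{Unif}(\mathbb{S}^{d_x-1}),\ \phi \sim \mathrm{Unif}(\mathbb{S}^{d_y-1}) \\
S_k(X; Y) &= \mathbb{E}_{A,B}\!\left[ I(A^\top X;\, B^\top Y) \right],
  \quad A \sim \mathrm{Unif}(\mathrm{St}(k, d_x)),\ B \sim \mathrm{Unif}(\mathrm{St}(k, d_y))
\end{align}
```

When only X is projected, the second line reduces to an expectation over θ alone of I(θᵀX; Y).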
- Train an MLP and save ReLU activations from hidden layers on the test set at snapshot epochs.
- Estimate S_1(X; T) and S_1(T; Y) for each saved layer and epoch.
- Default estimator: KSG (k_neighbors = 5), with m random projections.
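The estimation step can be sketched as a Monte Carlo average over random directions. The block below is a minimal illustration for the discrete-label case, S_1(T; Y), not the repo's estimator. It swaps the KSG estimator for a simple quantile-binned plug-in estimate so that it needs only NumPy. The names `binned_mi_discrete`, `sliced_mi`, and `n_bins` are hypothetical.

```python
import numpy as np


def binned_mi_discrete(z, y, n_bins=16):
    """Plug-in estimate (in nats) of I(Z; Y) for scalar Z and discrete Y."""
    z = np.asarray(z, dtype=float).ravel()
    y = np.asarray(y).ravel()
    # Interior quantile edges give roughly equal-mass bins for Z.
    edges = np.quantile(z, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    z_bin = np.searchsorted(edges, z)  # integers in [0, n_bins - 1]
    classes, y_idx = np.unique(y, return_inverse=True)
    # Joint histogram over (bin of Z, class of Y), normalized to probabilities.
    joint = np.zeros((n_bins, classes.size))
    np.add.at(joint, (z_bin, y_idx.ravel()), 1.0)
    joint /= joint.sum()
    pz = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (pz @ py)[mask])))


def sliced_mi(x, y, m=200, n_bins=16, seed=0):
    """Monte Carlo 1-SMI: mean of I(theta^T X; Y) over m random unit directions."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # Normalized Gaussian vectors are uniformly distributed on the unit sphere.
    theta = rng.standard_normal((m, x.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    proj = x @ theta.T  # shape (n_samples, m): one 1-D projection per column
    return float(np.mean([binned_mi_discrete(proj[:, j], y, n_bins)
                          for j in range(m)]))
```

Here `x` would be a saved activation matrix of shape (n_samples, n_units) and `y` the matching label vector. For S_1(X; T), where both arguments are continuous, both sides would be projected and a continuous-continuous estimator such as KSG would take the place of the binned one.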
The Sliced Information Plane traces each layer's (S_1(X; T), S_1(T; Y)) pair across epochs, showing how representations evolve in terms of input retention and label relevance.
| h32 (32 hidden units) | h64 (64 hidden units) |
|---|---|
| ![]() | ![]() |
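Each curve in a plane like this is one layer's sequence of (S_1(X; T), S_1(T; Y)) points ordered by epoch. A minimal sketch of that grouping step is shown below. It assumes a hypothetical record layout of (layer, epoch, s_xt, s_ty), which is not necessarily how the repo stores its metrics, and the numbers in the usage lines are made up for illustration only.

```python
from collections import defaultdict


def sip_trajectories(records):
    """Group (layer, epoch, s_xt, s_ty) records into per-layer curves.

    Returns a dict mapping each layer name to a list of (s_xt, s_ty)
    points sorted by epoch, ready to be drawn as one line per layer.
    """
    curves = defaultdict(list)
    for layer, epoch, s_xt, s_ty in sorted(records, key=lambda r: (r[0], r[1])):
        curves[layer].append((s_xt, s_ty))
    return dict(curves)


# Toy values, purely illustrative (not results from this repo).
toy_records = [
    ("layer1", 10, 0.50, 0.30),
    ("layer1", 1, 0.60, 0.10),
    ("layer2", 1, 0.40, 0.12),
]
toy_curves = sip_trajectories(toy_records)
```

A plotting library can then draw each list as a line, with color or marker encoding the epoch.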
Training Curves. Loss and accuracy over epochs for train/test splits.
| h32 (32 hidden units) | h64 (64 hidden units) |
|---|---|
| ![]() | ![]() |
My contribution: Redundancy Comparison (h32 vs h64). Estimated redundancy dynamics, side by side.
Setup

    uv sync
    uv add --dev --editable .

Train and save activation snapshots

    bash experiments/mnist_mlp/run_training.sh h64

Compute 1‑SMI metrics

    bash experiments/mnist_mlp/run_analysis.sh h64

Compute Redundancy metric

    bash experiments/mnist_mlp/run_pid_analysis.sh h64

Generate plots

    bash experiments/mnist_mlp/generate_plots.sh h64

Variants:
- h64 matches the paper (64 units per hidden layer).
- h32 is the reduced‑capacity variant (32 units).
Outputs are written to experiments/mnist_mlp/runs/<variant>/ (configs, checkpoints, activations, analysis metrics, and plots).
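To run the whole pipeline for one or both variants, the four scripts can be chained in order. The helper below is a hypothetical convenience sketch, not part of the repo. It only assembles the same `bash` commands listed above, and it defaults to a dry run that prints them instead of executing.

```python
import subprocess

# Script names as they appear in the usage steps above, in pipeline order.
STEPS = [
    "run_training.sh",
    "run_analysis.sh",
    "run_pid_analysis.sh",
    "generate_plots.sh",
]


def pipeline_commands(variant, root="experiments/mnist_mlp"):
    """Return the ordered list of commands for one variant (e.g. 'h32')."""
    return [["bash", f"{root}/{step}", variant] for step in STEPS]


def run_pipeline(variant, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in pipeline_commands(variant):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


for variant in ("h32", "h64"):
    run_pipeline(variant, dry_run=True)
```

Setting `dry_run=False` runs the steps for real. Stopping on the first failure avoids analyzing or plotting stale outputs from an earlier run.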
- experiments/mnist_mlp/run.py: loads configs and launches training.
- experiments/mnist_mlp/run_analysis.py: computes 1‑SMI from saved activations.
- experiments/mnist_mlp/generate_plots.py: renders training curves and SIP plots.
- src/smi_training_dynamics/neural_networks/: MLP and training loop.
- src/smi_training_dynamics/measures/: MI, SMI, k‑SMI estimators (implementation based on Wongso et al., 2023).
- src/smi_training_dynamics/visualizations/: plotting utilities.




