Flow matching as a small, backend-agnostic Rust library primitive.
This crate provides a minimal, readable reference implementation of a few flow-matching variants that are useful as building blocks in larger pipelines (and as a testbed for evaluation).
This crate currently focuses on a semidiscrete flow matching setup:
- Discrete target support: a finite set of prototypes (y_j) with weights (b_j).
- Semidiscrete conditioning / assignment: uses wass::semidiscrete potentials + hard assignment to pick an index (j).
- Flow matching regression: trains a conditional vector field (v_\theta(x,t; y_j)) against a simple linear-path target.
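The three ingredients above can be sketched in plain Rust without any dependencies. This is an illustration only, not the crate's code: the function names (`hard_assign`, `linear_path`, `fm_loss`) are hypothetical, the cost is assumed to be squared Euclidean distance, and the sign convention for the potentials (pick the `j` minimizing `||x - y_j||^2 - g_j`) is an assumption that may differ from what `wass` uses.

```rust
/// Squared Euclidean distance between two points of equal dimension.
fn sq_dist(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(p, q)| (p - q) * (p - q)).sum()
}

/// Hard semidiscrete assignment (assumed convention): the index j that
/// minimizes ||x - y_j||^2 - g_j, with one dual potential g_j per prototype.
fn hard_assign(x: &[f64], ys: &[Vec<f64>], g: &[f64]) -> usize {
    let mut best = 0;
    let mut best_score = f64::INFINITY;
    for (j, y) in ys.iter().enumerate() {
        let score = sq_dist(x, y) - g[j];
        if score < best_score {
            best = j;
            best_score = score;
        }
    }
    best
}

/// Linear path x_t = (1 - t) x0 + t y and its constant velocity target y - x0.
fn linear_path(x0: &[f64], y: &[f64], t: f64) -> (Vec<f64>, Vec<f64>) {
    let xt = x0.iter().zip(y).map(|(a, b)| (1.0 - t) * a + t * b).collect();
    let ut = x0.iter().zip(y).map(|(a, b)| b - a).collect();
    (xt, ut)
}

/// Per-sample regression loss: squared error between prediction and target.
fn fm_loss(pred: &[f64], target: &[f64]) -> f64 {
    sq_dist(pred, target)
}

fn main() {
    let ys = vec![vec![0.0, 0.0], vec![4.0, 0.0]];
    let g = [0.0, 0.0]; // zero potentials reduce to nearest-prototype assignment
    let x0 = [1.0, 0.0];
    let j = hard_assign(&x0, &ys, &g);
    let (xt, ut) = linear_path(&x0, &ys[j], 0.5);
    // A model that predicts zero velocity pays the full squared target norm.
    let loss = fm_loss(&[0.0, 0.0], &ut);
    println!("j = {j}, x_t = {xt:?}, target = {ut:?}, loss = {loss}");
}
```

In a training loop, the prediction passed to `fm_loss` would come from the conditional field evaluated at `(x_t, t, y_j)`; raising a prototype's potential enlarges the region of points assigned to it.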
Code entrypoints:
- flowmatch::sd_fm::train_sd_fm_semidiscrete_linear
- flowmatch::sd_fm::TrainedSdFm::{sample,sample_with_x0}
- flowmatch::linear::LinearCondField (intentionally “boring baseline”)
Related primitives implemented in this crate:
- flowmatch::ode: fixed-step ODE samplers (Euler, Heun)
- flowmatch::rfm: coupling helpers for rectified / OT-based pairing
- flowmatch::simplex: simplex validation + Dirichlet sampling (simplex-based “discrete FM” scaffolding)
- flowmatch::discrete_ctmc: CTMC generator validation + a minimal probability evolution step
- flowmatch::non_euclidean: geodesic interpolant scaffolding (currently includes only Euclidean baseline)
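For readers unfamiliar with the two fixed-step samplers named above, here is a minimal standalone sketch of Euler and Heun integration from t = 0 to t = 1. It does not use or reproduce the `flowmatch::ode` API; the function names and signatures are hypothetical, and `steps` is assumed to be at least 1. Each function also returns its number of vector-field evaluations (NFE): Heun spends two evaluations per step, so an equal-budget comparison gives Heun half as many steps as Euler.

```rust
/// Explicit Euler for dx/dt = v(x, t) on [0, 1]; returns (final state, NFE).
fn euler<F>(v: F, x0: &[f64], steps: usize) -> (Vec<f64>, usize)
where
    F: Fn(&[f64], f64) -> Vec<f64>,
{
    let dt = 1.0 / steps as f64;
    let mut x = x0.to_vec();
    for k in 0..steps {
        let t = k as f64 * dt;
        let dx = v(&x, t);
        for (xi, di) in x.iter_mut().zip(&dx) {
            *xi += dt * di;
        }
    }
    (x, steps)
}

/// Heun (explicit trapezoidal) for dx/dt = v(x, t) on [0, 1]; returns (final state, NFE).
fn heun<F>(v: F, x0: &[f64], steps: usize) -> (Vec<f64>, usize)
where
    F: Fn(&[f64], f64) -> Vec<f64>,
{
    let dt = 1.0 / steps as f64;
    let mut x = x0.to_vec();
    for k in 0..steps {
        let t = k as f64 * dt;
        let k1 = v(&x, t);
        // Euler predictor, then average the slopes at both ends of the step.
        let pred: Vec<f64> = x.iter().zip(&k1).map(|(xi, ki)| xi + dt * ki).collect();
        let k2 = v(&pred, t + dt);
        for ((xi, a), b) in x.iter_mut().zip(&k1).zip(&k2) {
            *xi += 0.5 * dt * (a + b);
        }
    }
    (x, 2 * steps)
}

fn main() {
    // Toy field dx/dt = x; the exact value at t = 1 is e times the start value.
    let field = |x: &[f64], _t: f64| x.to_vec();
    let (xe, nfe_e) = euler(field, &[1.0], 4);
    let (xh, nfe_h) = heun(field, &[1.0], 2);
    println!("euler: x = {:.4}, NFE = {}", xe[0], nfe_e);
    println!("heun:  x = {:.4}, NFE = {}", xh[0], nfe_h);
}
```

On this toy field, with four evaluations each, Heun lands closer to e than Euler does. For a field that is constant along each trajectory (the ideal straight-path case), a single Euler step is already exact, which is one reason few-step evaluation is of interest.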
Related (adjacent meaning of “distribution matching”):
- decipher/: symbolic distribution matching for classical text deciphers (letter-frequency scoring, etc.). See canon/topics/distribution-matching.md.
- Semidiscrete FM baseline: sd_fm_semidiscrete_linear (end-to-end, intentionally simple).
- RFM on real geodata: rfm_usgs_full_pipeline_report (flowmatch + tier + jin; includes metrics + timings).
- Cluster-mass evaluation: rfm_usgs_earthquakes_cluster_mass (structure-aware scoring; tier-evals feature).
These are the conceptual anchors for the objective + design space:
- Lipman et al., Flow Matching for Generative Modeling (arXiv:2210.02747).
- Lipman et al., Flow Matching Guide and Code (arXiv:2412.06264).
Also useful as an applications-oriented map (especially for discrete / non-Euclidean variants):
- Li et al., Flow Matching Meets Biology and Life Science: A Survey (arXiv:2507.17731, 2025).
Curated resources: Awesome list
- Chen & Lipman, Flow Matching on General Geometries (arXiv:2302.03660): Riemannian FM
- Dao et al., Flow Matching in Latent Space (arXiv:2307.08698): latent FM and guidance
- Gat et al., Discrete Flow Matching (NeurIPS 2024): discrete state spaces
- Klein et al., Equivariant Flow Matching (NeurIPS 2023): symmetry/equivariance constraints
cargo run -p flowmatch --example sd_fm_semidiscrete_linear

RFM minibatch OT demo:
cargo run -p flowmatch --example rfm_minibatch_ot_linear

RFM demo on token embeddings + TF-IDF-ish weights:
cargo run -p flowmatch --example rfm_textish_tokens

RFM demo on real USGS earthquake locations (sphere-ish geodata):
cargo run -p flowmatch --example rfm_usgs_earthquakes_sphere

RFM demo on real USGS earthquake locations, evaluated via cluster-mass structure (uses tier):
cargo run -p flowmatch --example rfm_usgs_earthquakes_cluster_mass --features tier-evals

Full engine composition demo (flowmatch + tier + jin): kNN graph → Leiden communities:
cargo run -p flowmatch --example rfm_usgs_knn_leiden --features tier-evals

Full pipeline report (all metrics + timings, including deterministic exact-kNN Leiden and optional HNSW-kNN):
cargo run -p flowmatch --example rfm_usgs_full_pipeline_report --features tier-evals

NFE/steps curve (paper-style “few-step” evaluation):
cargo run -p flowmatch --example rfm_usgs_nfe_curve

Solver NFE tradeoff (Euler vs Heun under equal evaluation budgets):
cargo run -p flowmatch --example rfm_usgs_solver_nfe_tradeoff

Protein torsions NFE/steps curve (seed-averaged, Ramachandran JS):
cargo run -p flowmatch --example rfm_torsions_nfe_curve

Minibatch OT outlier forcing + partial pairing mitigation:
cargo run -p flowmatch --example rfm_minibatch_outlier_partial

Controls:
- FLOWMATCH_PAIRING=partial_rowwise uses RfmMinibatchPairing::PartialRowwise
- FLOWMATCH_PAIRING=sinkhorn_selective uses Sinkhorn, then selective matching
- FLOWMATCH_PAIRING_PARTIAL_KEEP_FRAC=0.8 controls the fraction of rows that are forced one-to-one
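As a rough illustration of how environment-variable controls like these could be read and validated, the sketch below parses the two variables with the standard library only. It is not taken from the examples' source: the `PairingChoice` enum and both parsing functions are hypothetical, treating an unset `FLOWMATCH_PAIRING` as Sinkhorn pairing is an assumption, and so is rejecting keep fractions outside [0, 1].

```rust
use std::env;

/// Hypothetical mirror of the pairing modes; not the crate's own enum.
#[derive(Debug, PartialEq)]
enum PairingChoice {
    Sinkhorn,
    Rowwise,
    PartialRowwise,
    SinkhornSelective,
}

/// Map the raw FLOWMATCH_PAIRING value to a pairing mode.
/// Assumption: an unset variable means Sinkhorn pairing.
fn parse_pairing(raw: Option<&str>) -> Result<PairingChoice, String> {
    match raw {
        None => Ok(PairingChoice::Sinkhorn),
        Some("rowwise") => Ok(PairingChoice::Rowwise),
        Some("partial_rowwise") => Ok(PairingChoice::PartialRowwise),
        Some("sinkhorn_selective") => Ok(PairingChoice::SinkhornSelective),
        Some(other) => Err(format!("unknown FLOWMATCH_PAIRING value: {other}")),
    }
}

/// Parse FLOWMATCH_PAIRING_PARTIAL_KEEP_FRAC as a fraction in [0, 1].
/// Returns Ok(None) when unset so the caller can apply its own default.
fn parse_keep_frac(raw: Option<&str>) -> Result<Option<f64>, String> {
    match raw {
        None => Ok(None),
        Some(s) => {
            let f: f64 = s
                .trim()
                .parse()
                .map_err(|e| format!("bad keep fraction {s:?}: {e}"))?;
            if (0.0..=1.0).contains(&f) {
                Ok(Some(f))
            } else {
                Err(format!("keep fraction {f} is outside [0, 1]"))
            }
        }
    }
}

fn main() {
    let pairing = env::var("FLOWMATCH_PAIRING").ok();
    let keep = env::var("FLOWMATCH_PAIRING_PARTIAL_KEEP_FRAC").ok();
    println!("{:?}", parse_pairing(pairing.as_deref()));
    println!("{:?}", parse_keep_frac(keep.as_deref()));
}
```

Keeping the parsing in pure functions over `Option<&str>` makes it testable without mutating the process environment.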
Speed knobs for the full pipeline report:
# Default (highest quality): Sinkhorn pairing every step.
# Faster: reuse Sinkhorn pairing for 4 SGD steps (usually ~4× faster coupling).
FLOWMATCH_PAIRING_EVERY=4 cargo run -p flowmatch --example rfm_usgs_full_pipeline_report
# Fastest: no Sinkhorn at all (row-wise nearest pairing).
FLOWMATCH_PAIRING=rowwise cargo run -p flowmatch --example rfm_usgs_full_pipeline_report
# U-shaped timestep sampling (more weight near t=0 and t=1)
FLOWMATCH_T_SCHEDULE=ushaped cargo run -p flowmatch --example rfm_usgs_full_pipeline_report

RFM demo on real protein φ/ψ torsions (a torus-shaped domain, scored via Ramachandran JS divergence):
cargo run -p flowmatch --example rfm_protein_torsions_1bpi

Timing breakdown (“poor man's profiling”): where time goes (sampling vs Sinkhorn vs SGD):
cargo run -p flowmatch --example profile_breakdown_usgs
cargo run -p flowmatch --example profile_breakdown_torsions

cargo test -p flowmatch

flowmatch is ndarray-only by default, but it now includes an opt-in Burn-backed Euclidean FM module behind the burn feature (see flowmatch::burn_euclidean / flowmatch::burn_sd_fm).
cargo test -p flowmatch --features burn

Run the Burn-backed toy examples:
cargo run -p flowmatch --example burn_sd_fm_semidiscrete_linear --features burn
cargo run -p flowmatch --example burn_rfm_minibatch_ot_linear --features burn