Skip to content

Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold

License

Notifications You must be signed in to change notification settings

lazaratan/meta-flow-matching

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

25 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Meta Flow Matching

Paper python pytorch lightning hydra license Template

Description

Meta Flow Matching (MFM) is a practical approach to integrating along vector fields on the Wasserstein manifold by amortizing the flow model over the initial distributions. Current flow-based models are limited to a single initial distribution/population and a set of predefined conditions which describe different dynamics.

In natural sciences, multiple processes can be represented as vector fields on the Wasserstein manifold of probability densities - i.e. the change of the population at any moment in time depends on the population itself due to the interactions between samples/particles. One domain of applications is personalized medicine, where the development of diseases and the respective effect/response of treatments depend on the microenvironment of cells specific to each patient.

In MFM, we jointly train a vector field model $v_t(\cdot | \varphi(p_0; \theta); \omega)$ and a population embedding model $\varphi(p_0; \theta)$. Initial populations are embedded into lower dimensional representations using a Graph Neural Network (GNN). This gives MFM the ability to generalize over unseen distributions, unlike previously proposed methods. We show the ability of MFM to improve prediction of individual treatment responses on a large-scale multi-patient single-cell drug screen dataset (Ramos Zapatero et al. Cell, 2023).

This repo contains all elements needed to reproduce our results. See this http link for the paper.

The preprocessed data can be downloaded here: Preprocessed organoid data

The raw data can be downloaded here: Raw organoid data. For usability, we provide the notebooks trellis_data_replica_splits.ipynb and trellis_data_3_pdo_splits.ipynb which contain further dataset details, code for the data preprocessing, and code for generating the replica and patient splits.

If you find this code useful in your research, please cite our work.

L. Atanackovic*, X. Zhang*, B. Amos, M. Blanchette, L.J. Lee, Y. Bengio, A. Tong, K. Neklyudov. "Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold." International Conference on Learning Representations (ICLR), 2025.
@article{atanackovic2024meta,
      title={Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold}, 
      author={Lazar Atanackovic and Xi Zhang and Brandon Amos and Mathieu Blanchette and Leo J. Lee and Yoshua Bengio and Alexander Tong and Kirill Neklyudov},
      year={2024},
      eprint={2408.14608},
      archivePrefix={arXiv},
}

How to run

Install dependencies

# clone project
git clone https://github.com/lazaratan/meta-flow-matching.git
cd meta-flow-matching

# [OPTIONAL] create conda environment
conda create -n mfm python=3.9
conda activate mfm

# install pytorch according to instructions
# https://pytorch.org/get-started/

# install requirements
pip install -r requirements.txt

Train model with chosen experiment configuration from src.conf/experiment/

python train.py experiment=experiment_name.yaml

You can override any parameter from command line like this

python train.py experiment=experiment_name.yaml trainer.max_epochs=1234 seed=42

To train a model via MFM on the synthetic letters setting, use

python train.py experiment=letters_mfm.yaml

To run the biological experiments, first download the preprocessed data here. Then, similar to the synthetic letters experiment, executing

python train.py experiment=trellis_mfm.yaml

will train 1 seed of an MFM model on the organoid drug-screen dataset.

To replicate an experiment, for example, the last row of Table 1 (in the paper), you can use the multi-run feature:

python train.py -m experiment=letters_mfm.yaml seed=1,2,3

Contributions

Have a question? Found a bug? Missing a specific feature? Feel free to file a new issue, discussion or PR with respective title and description.

Before making an issue, please verify that:

  • The problem still exists on the current main branch.
  • Your python dependencies are updated to recent versions.

Suggestions for improvements are always welcome!

About

Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages