Categorical Flow Maps

Daan Roos*, Oscar Davis*, Floor Eijkelboom*,
Michael Bronstein, Max Welling, İsmail İlkan Ceylan, Luca Ambrogioni, Jan-Willem van de Meent

Official implementation of the text experiments. 🚀

❓ About

This repository contains all the code for the text experiments from the Categorical Flow Maps paper. The main module is located in semicat/models/semicat.py 🧠; it is general and designed to accept many other data types. Text-specific code lives in semicat/models/textsemicat.py 📝.

⚙️ Running the code

Install the dependencies:

mamba env create -f environment.yaml

Activate the environment:

mamba activate semicat

Create a .env file that sets DATASET_CACHE_DIR, the directory in which the processed LM1B data will be cached:

DATASET_CACHE_DIR=/the/dir/for/lm1b
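
As a quick sanity check before launching a run, the sketch below parses a .env file and reports whether DATASET_CACHE_DIR is set. It assumes plain KEY=VALUE lines; the helper name read_env_file is hypothetical, and the repository may load the file differently (for instance through a dotenv library).

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines; blank lines and '#' comments are skipped.

    Hypothetical helper for illustration, not part of the repository.
    """
    values = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


if __name__ == "__main__":
    env_path = Path(".env")
    env = read_env_file(env_path) if env_path.exists() else {}
    cache_dir = env.get("DATASET_CACHE_DIR")
    if cache_dir is None:
        print("DATASET_CACHE_DIR is not set in .env")
    else:
        print(f"LM1B cache directory: {cache_dir}")
```

Run it from the repository root, next to the .env file.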

Run the experiment you want! 💥 For example,

python -m semicat.train experiment=lm1b_dit trainer=gpu

For wandb logging, add logger=wandb as an argument.
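
If you launch several runs from a script, the command above can be assembled programmatically. The sketch below is only a convenience wrapper around the documented command line; the function name build_train_command is hypothetical, and the only arguments it uses are the ones shown in this README (experiment=, trainer=, logger=wandb).

```python
import subprocess
import sys


def build_train_command(experiment, trainer="gpu", use_wandb=False):
    """Assemble the argument list for `python -m semicat.train`.

    Hypothetical helper: it only mirrors the arguments shown in this README.
    """
    command = [
        sys.executable, "-m", "semicat.train",
        f"experiment={experiment}",
        f"trainer={trainer}",
    ]
    if use_wandb:
        command.append("logger=wandb")
    return command


if __name__ == "__main__":
    command = build_train_command("lm1b_dit", use_wandb=True)
    print(" ".join(command))
    # Uncomment to launch training from inside the activated environment:
    # subprocess.run(command, check=True)
```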

📊 Data

Text8

To download the dataset, follow the steps in github.com/andrew-cr/discrete_flow_models and place the data in ./data/text8.

LM1B

LM1B is automatically downloaded into DATASET_CACHE_DIR and then preprocessed (sequence packing, etc.). To set up the data before launching your runs, you can also run python -m semicat.data.lm1b separately.
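
Sequence packing commonly means concatenating tokenized examples, separated by an end-of-sequence token, and slicing the resulting stream into fixed-length blocks so that no padding is needed. The sketch below illustrates that general idea only; the function name pack_sequences, the eos_id value, and the choice to drop the trailing remainder are assumptions, and the packing performed by semicat.data.lm1b may differ.

```python
def pack_sequences(token_sequences, block_size, eos_id):
    """Concatenate token lists (EOS-separated) and cut them into full blocks.

    Illustrative only: tokens left over after the last full block are dropped.
    """
    stream = []
    for tokens in token_sequences:
        stream.extend(tokens)
        stream.append(eos_id)  # mark the boundary between examples
    n_blocks = len(stream) // block_size
    return [
        stream[i * block_size:(i + 1) * block_size]
        for i in range(n_blocks)
    ]


if __name__ == "__main__":
    toy = [[5, 6, 7], [8, 9], [10, 11, 12, 13]]
    for block in pack_sequences(toy, block_size=4, eos_id=0):
        print(block)
```

Every packed block has exactly block_size tokens, and a single example may span two consecutive blocks.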

📘 Citation

To cite the paper or the code, please use the following:

@misc{roos2026categoricalflowmaps,
    title={Categorical Flow Maps}, 
    author={Daan Roos and Oscar Davis and Floor Eijkelboom and Michael Bronstein and Max Welling and İsmail İlkan Ceylan and Luca Ambrogioni and Jan-Willem van de Meent},
    year={2026},
    eprint={2602.12233},
    archivePrefix={arXiv},
    primaryClass={cs.LG},
    url={https://arxiv.org/abs/2602.12233}, 
}
