This is the official repository for Neural Persistence Dynamics.
We are still actively updating the repository!
If you use the code, please cite as:
@misc{Zeng24a,
      title={Neural Persistence Dynamics}, 
      author={Sebastian Zeng and Florian Graf and Martin Uray and Stefan Huber and Roland Kwitt},
      year={2024},
      eprint={2405.15732},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

In the following, we assume that the repository has been cloned into /tmp/neural_persistence_dynamics.
conda create -n "pytorch23" python=3.10
conda activate pytorch23

cd /tmp/neural_persistence_dynamics
mkdir -p data # stores all data
mkdir -p logs # stores all logs
mkdir -p 3rdparty # 3rdparty code
mkdir -p runs # stores all tensorboard related stuff

conda install pytorch=2.3.0 torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
python -c 'import torch' # check

cd /tmp/neural_persistence_dynamics
cd 3rdparty
git clone https://github.com/c-hofer/torchph.git 
cd torchph
pip install -e .
python -c 'import torchph' # check

pip3 install torchdiffeq
python -c 'import torchdiffeq' # check

For installing ripser++, please see the installation instructions here.
Typically, one would use
pip3 install git+https://github.com/simonzhang00/ripser-plusplus.git
Depending on your system's GCC and CUDA version, you might need to downgrade to, e.g., GCC-12. On an Ubuntu system, this can be done via
sudo apt-get install g++-12
sudo apt-get install gcc-12
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-12 100
sudo update-alternatives --config g++ # then select the version you want
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 100
sudo update-alternatives --config gcc # then select the version you want

cd /tmp/neural_persistence_dynamics
cd 3rdparty
git clone https://github.com/patrick-kidger/signatory.git
cd signatory
python setup.py install 
python -c 'import signatory' # check

pip install tensorboard permetrics halo einops h5py rich_argparse gdown matplotlib torch_geometric torch_cluster torch_scatter torch_sparse

In the following, we replicate the experiments with the dorsogna-1k data from the paper. All other experiments follow the same procedure.
cd /tmp/neural_persistence_dynamics
mkdir -p data/Giusti23a/1k
python download.py --dataset dorsogna-1k --destination data/Giusti23a/1k

python compute_pds.py \
    --simu-inp-file data/Giusti23a/1k/simu_1k.pt \
    --prms-inp-file data/Giusti23a/1k/prms_1k.pt \
    --dgms-out-file data/Giusti23a/1k/dgms_1k_vr_h0h1.pt \
    --compute-ph

This will compute all Vietoris-Rips persistence diagrams (the maximal homology dimension can be set via, e.g., --max-dim 2 on the command line).
Diagrams are saved to data/Giusti23a/1k/dgms_1k_vr_h0h1.pt.
If, during execution, an error similar to
 OSError: /home/User/envs/pytorch23/lib/python3.10/site-packages/torch/lib/../../../.././libstdc++.so.6:  version `GLIBCXX_3.4.32' not found (required by /home/USER/.miniconda3/envs/pytorch23/lib/python3.10/site-packages/ripserplusplus/libphmap.so)
occurs, please refer to this link.
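The H0 part of such a diagram records when connected components of the Vietoris-Rips complex merge. As a rough, illustrative sketch (not the repository's implementation, which relies on ripser++), the H0 persistence pairs can be read off a single-linkage merge process over the pairwise distances:

```python
from itertools import combinations

def h0_diagram(points):
    """H0 Vietoris-Rips persistence of a point cloud via union-find:
    every point is born at 0; a component dies when it merges into another."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Edges sorted by Euclidean length = the Vietoris-Rips filtration values.
    edges = sorted(
        (sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5, i, j)
        for i, j in combinations(range(n), 2)
    )
    diagram = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            diagram.append((0.0, d))  # a component born at 0 dies at scale d
    return diagram  # the surviving component is the one essential (infinite) bar

# Three points on a line merge at scales 1.0 and 2.0:
print(h0_diagram([(0.0,), (1.0,), (3.0,)]))  # [(0.0, 1.0), (0.0, 2.0)]
```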
Next, we can compute the vectorizations:
python compute_vec.py \
    --dgms-inp-file data/Giusti23a/1k/dgms_1k_vr_h0h1.pt \
    --vecs-out-base data/Giusti23a/1k/vecs \
    --num-elements 20 \
    --nu 0.005 \
    --subsample 50000

The dorsogna-1k dataset contains 1,000 simulations with 100 time points each. Hence, we
have a total of 100,000 available persistence diagrams. In this example,
we randomly subsample 50,000 to parametrize the vectorization, meaning 50,000 diagrams are
used to compute the centers of 20 exponential structure elements. Overall, this
yields 20-dimensional vectorizations per diagram and dimension. The relevant output
file is named vecs_20_0.005.pt and, for the given setting, contains 40-dimensional
vectors per time point.
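Conceptually, each exponential structure element evaluates a Gaussian-like bump of width nu at a center in the birth-death plane and sums its response over all points of a diagram; 20 elements therefore yield one 20-dimensional vector per diagram. A minimal sketch (the center placement and the exact torchph parametrization are simplified assumptions here):

```python
import math

def exp_structure_vec(diagram, centers, nu):
    """Map a persistence diagram (list of (birth, death) pairs) to a
    len(centers)-dimensional vector: one summed Gaussian response per center."""
    return [
        sum(math.exp(-((b - cb) ** 2 + (d - cd) ** 2) / nu) for b, d in diagram)
        for cb, cd in centers
    ]

dgm = [(0.0, 1.0), (0.0, 2.0)]
centers = [(0.0, 1.0), (0.0, 2.0)]  # hypothetical centers, one on each point
vec = exp_structure_vec(dgm, centers, nu=0.005)  # each coordinate is close to 1
```

With nu = 0.005 the bumps are very narrow, so each coordinate essentially counts diagram points near its center; larger nu values smooth the responses over wider regions.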
We are now ready to train and evaluate the continuous latent variable model using
dynamics.py.
python dynamics.py \
    --vec-inp-file data/Giusti23a/1k/vecs_20_0.005.pt \
    --aux-inp-file data/Giusti23a/1k/prms_1k_norm.pt \
    --batch-size 64 \
    --lr 0.001 \
    --n-epochs 210 \
    --kl-weight 0.001 \
    --aux-weight 1000 \
    --restart 30 \
    --device cuda:0 \
    --z-dim 16 \
    --tps-frac 0.5 \
    --weight-decay 1e-3 \
    --run-dir runs/ \
    --log-out-file logs/logfile.pt \
    --backbone topdyn_only \
    --processor z_mtantwins \
    --mtan-h-dim 64 \
    --experiment-id debug \
    --seed 9000

Here, --tps-frac 0.5 specifies that we only want to
keep 50% of (all 100) time points for training, the --aux-inp-file holds a (M, 2) tensor with normalized (to
[-1,1]) simulation parameters that we wish to predict, and
--kl-weight as well as --aux-weight specify the
weight given to the KL divergence in the ELBO and the
regression objective, respectively.
Note: we normalize simulation parameter ranges to [-1,1] for training only, but evaluate on the actual range for R2 and SMAPE computation later.
Note: Overall, a run with the above configuration should yield an average testing R2 score of approx. 0.845.
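The normalization mentioned in the first note above is plain per-parameter min-max scaling; a minimal sketch (the range [0.1, 2.0] below is a hypothetical example, not an actual simulation parameter range):

```python
def to_unit_range(x, lo, hi):
    """Min-max scale a raw parameter value from [lo, hi] to [-1, 1]."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def from_unit_range(y, lo, hi):
    """Invert the scaling, e.g. before computing scores on the actual range."""
    return (y + 1.0) * (hi - lo) / 2.0 + lo

# A parameter drawn from a hypothetical range [0.1, 2.0]:
z = to_unit_range(1.05, 0.1, 2.0)            # midpoint maps close to 0
assert abs(from_unit_range(z, 0.1, 2.0) - 1.05) < 1e-9  # round-trips
```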
You can monitor training progress by starting a tensorboard as follows:
cd /tmp/neural_persistence_dynamics
tensorboard --logdir=runs --port <PORT>

The console output currently only provides information about the overall (extended) ELBO, the training/testing MSE (averaged over all parameters), and the training/testing R2 score (averaged over all parameters).
To compute other scores, it is more convenient to use the logged
predictions throughout training (or, alternatively, load the model
as in ModelLoading.ipynb and run the testing data through the
model) and use the Evaluation.ipynb notebook.
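For reference, R2 and SMAPE can be computed from logged predictions as sketched below. Note that SMAPE has several conventions in the literature; the variant here (in percent, symmetric denominator) is an assumption and may differ from the definition used by permetrics:

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error (one common variant, in %)."""
    return 100.0 / len(y_true) * sum(
        abs(p - t) / ((abs(t) + abs(p)) / 2.0)
        for t, p in zip(y_true, y_pred)
    )

# Perfect predictions give R2 = 1 and SMAPE = 0:
assert abs(r2_score([1, 2, 3], [1, 2, 3]) - 1.0) < 1e-12
assert smape([1, 2, 3], [1, 2, 3]) == 0.0
```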
When running dynamics.py you can specify the --net-out-file FILE
parameter in which case the trained model will be saved as FILE for
later use. To load the model, see ModelLoading.ipynb in the notebooks
subfolder for an example use case.
We provide precomputed simulations, persistence diagrams and vectorizations. They can be downloaded as follows:
cd /tmp/neural_persistence_dynamics
mkdir -p data/<OUTFOLDER>
python download.py --dataset <DATASETNAME> --destination data/<OUTFOLDER>

Here, <OUTFOLDER> is the desired output folder name under data/, and
<DATASETNAME> is the name of the dataset as referenced in the
paper, i.e., dorsogna-1k, dorsogna-10k, volex-10k, or
vicsek-10k.
Note: Destination folders will be created if they do not exist already.
To run your own simulations, you also need to install the sisyphe package via

pip3 install sisyphe
python -c 'import sisyphe' # check

Simulation code for (almost) all simulations can be found under simulations.
Currently, the variations in the simulation model parameters are fixed (according
to the paper), but are easy to modify if required.
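For instance, drawing per-simulation model parameters uniformly from given ranges could look as follows (the sample_params helper and the parameter ranges are hypothetical illustrations, not the repository's code):

```python
import random

def sample_params(ranges, n, seed=0):
    """Draw n parameter tuples, each coordinate uniform over its (lo, hi) range."""
    rng = random.Random(seed)  # fixed seed for reproducible simulation batches
    return [tuple(rng.uniform(lo, hi) for lo, hi in ranges) for _ in range(n)]

# Hypothetical example: two parameters, each varied over [0.1, 2.0]:
params = sample_params([(0.1, 2.0), (0.1, 2.0)], n=3)
```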
To simulate the dorsogna-1k data, we used the original simulation code provided
here.
Use generate_attractionrepulsion.py for this dataset. In particular, we used
cd /tmp/neural_persistence_dynamics
cd simulations
python generate_attractionrepulsion.py \
    --device cuda:0 \
    --freq 10 \
    --points 200 \
    --dim 3 \
    --root ../data/attraction_repulsion \
    --steps 1000 \
    --simulations 10000

to generate the dorsogna-10k data.
Use generate_volumeexclusion.py for this dataset. In particular, we used
cd /tmp/neural_persistence_dynamics
cd simulations
python generate_volumeexclusion.py \
    --device cuda:0 \
    --freq 10 \
    --points 50 \
    --dim 3 \
    --root ../data/volume_exclusion \
    --steps 1000 \
    --simulations 10000

to generate the volex-10k data.
Use generate_vicsek.py for this dataset. In particular, we used
python generate_vicsek.py \
    --device cuda:0 \
    --freq 10 \
    --points 200 \
    --dim 3 \
    --root ../data/vicsek \
    --steps 1000 \
    --simulations 10000

to generate the vicsek-10k data.
All experiments were run on Ubuntu 22.04 (kernel 5.15.0-112-generic) with CUDA 11.8 (NVIDIA driver 555.42.02, 2x GeForce RTX 3090) and PyTorch 2.3.
The path signature kernel (PSK) approach is implemented in psk.py.
The Crocker stacks baseline comparison is implemented in crocker_stacks.py.
To execute this script, you must first prepare the data using compute_cs.py.
Additionally, you need to install the teaspoon library with the appropriate
version for computing the Crocker stacks.
pip install teaspoon scikit-optimize

To prepare the Crocker stacks for the dorsogna-1k dataset, execute the
compute_cs.py script with the path to the persistence diagrams:
python compute_cs.py --dgms_file data/Giusti23a/1k/dgms_1k_vr_h0h1.pt

After the calculation, the Crocker stacks will be saved next to the specified file.
To execute the experiment for the Crocker stack, use the crocker_stacks.py
script. Here is an example command for the dorsogna-1k dataset:
python crocker_stacks.py \
    --prms_file=data/Vicsek/43_rips/vicsek_prms_10k.pt \
    --vecs_file=data/Vicsek/43_rips/crocker_vecs.pt

The output will directly display the final results.
Several notebooks on loading and evaluating models can be found under notebooks.
