Corresponding author: Shijun Cheng (sjcheng.academic@gmail.com)
This repository is organized as follows:
- 📂 gsfm: Python code containing routines for the generative seismic foundation model;
- 📂 asset: folder containing the logo;
- 📂 dataset: folder to store the data sets.
To ensure reproducibility, we provide the data set for the pre-training stage and our pre-trained GSFM model. Field data are not shared here due to restricted permissions.
- Pre-training data set: Download the pre-training data set here. Then, use unzip to extract the contents to dataset/syn/train/.
- Synthetic test data set: Download the synthetic test data set here. Then, use unzip to extract the contents to dataset/syn/test/.
- Pre-trained model: Download our pre-trained GSFM model here. Then, extract the contents to /checkpoints_pretrain/.
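If you prefer to script the extraction step rather than call unzip by hand, the following is a minimal sketch using only the Python standard library. It is not part of the repository: the helper names extract_archive and missing_or_empty are hypothetical, and the archive file names depend on what you downloaded. Only the two dataset directories named above are checked; the checkpoint folder can be added to the list in the same way.

```python
import zipfile
from pathlib import Path


def extract_archive(archive, target_dir):
    """Extract a zip archive into target_dir, creating the directory if needed."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    return target


def missing_or_empty(dirs):
    """Return the directories that do not exist or contain no entries."""
    return [d for d in dirs if not Path(d).is_dir() or not any(Path(d).iterdir())]


if __name__ == "__main__":
    # Directory names are taken from the download instructions above.
    expected = ["dataset/syn/train", "dataset/syn/test"]
    for d in missing_or_empty(expected):
        print(f"Missing or empty: {d}")
```

Running the script from the repository root before training gives a quick check that the data sets landed where the training and sampling scripts are expected to look for them.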
To ensure reproducibility of the results, we suggest using the environment.yml file when creating an environment.
Simply run:
./install_env.sh
This will take some time. If you see the word Done! on your terminal at the end, you are ready to go. Activate the environment by typing:
conda activate gsfm
After that, you can install the package:
pip install .
or in developer mode:
pip install -e .
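To confirm that the installation succeeded, a quick check like the sketch below can help. It assumes the importable package name is gsfm, matching the folder name in this repository; adjust the name if your setup differs. The helper is_importable is hypothetical and not part of the repository.

```python
import importlib.util


def is_importable(module_name):
    """Return True if module_name can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "gsfm" is assumed to be the importable package name.
    print("gsfm importable:", is_importable("gsfm"))
```

Run it inside the activated gsfm environment so that the check uses the same interpreter as the training scripts.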
When you have downloaded the supplementary files and installed the environment, you can run the pre-training and fine-tuning code. For pre-training, run:
python pretrain.py
For fine-tuning, choose the script that matches the task you want to fine-tune for:
python denoise_finetune.py # backscattered noise attenuation task
python interpolation_finetune.py # interpolation task
python lowfreq_finetune.py # low-frequency extrapolation task
To test the performance of our pre-trained GSFM, you can use the synthetic test data we provide. To predict without uncertainty quantification, run:
python sample.py
For prediction with uncertainty quantification, run:
python sample_with_uq.py
Disclaimer: All experiments have been carried out on an Intel(R) Xeon(R) CPU @ 2.10GHz equipped with a single NVIDIA A100 GPU. Different environment configurations may be required for different combinations of workstation and GPU. If your graphics card does not support training with a large batch size, please reduce the configuration value of args (batch_size) in the gsfm/pretrain.py file.
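One way to find a workable batch_size is to halve it until a single training step fits in memory. The sketch below illustrates that search in plain Python. It is not part of the repository: largest_fitting_batch_size and try_step are hypothetical names, and try_step stands in for whatever code runs one step at a given batch size. PyTorch reports CUDA out-of-memory errors as a RuntimeError subclass, so the sketch catches RuntimeError as well as MemoryError. Note that this also swallows unrelated runtime errors, which a real script should handle more selectively.

```python
def largest_fitting_batch_size(try_step, start, minimum=1):
    """Halve the batch size until try_step(batch_size) runs without a memory error."""
    batch_size = start
    while batch_size >= minimum:
        try:
            try_step(batch_size)
            return batch_size
        except (MemoryError, RuntimeError):
            # Assumed failure mode: the step did not fit, so try half the size.
            batch_size //= 2
    raise RuntimeError("no batch size >= minimum fits in memory")


if __name__ == "__main__":
    def fake_step(batch_size):
        # Stand-in for one training step; pretends sizes above 8 run out of memory.
        if batch_size > 8:
            raise MemoryError

    print(largest_fitting_batch_size(fake_step, start=64))
```

Once a fitting value is found, set it as the batch_size in the configuration mentioned above.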
This implementation is motivated by the paper Improved Denoising Diffusion Probabilistic Models, and the code is adapted from their repository. We are grateful for their open-source code.
@article{cheng2025gsfm,
title={A generative foundation model for an all-in-one seismic processing framework},
author={Cheng, Shijun and Harsuko, Randy and Alkhalifah, Tariq},
journal={Surveys in Geophysics},
year={2025},
publisher={Springer}
}
