noisepy/NoisePy

About NoisePy

NoisePy is a Python package designed for fast and easy computation of ambient noise cross-correlation functions. It provides additional functionality for noise monitoring and surface wave dispersion analysis.

Major updates coming

NoisePy is going through a major refactoring to make this package easier to develop and deploy. Submit an issue, fork the repository and create pull requests to contribute.

Installation

Because NoisePy is composed of Python scripts, installation is flexible: it essentially amounts to installing the libraries that the scripts and related functions depend on. We recommend using conda or pip to install.

Note: the order of the commands below matters.

With Conda and pip

conda create -n noisepy -y python=3.10 pip
conda activate noisepy
pip install noisepy-seis

To use Jupyter notebooks, also install the Jupyter dependencies and register the environment as a kernel:

pip install ipykernel notebook
python -m ipykernel install --user --name noisepy

With Conda and pip and MPI support

conda create -n noisepy -y python=3.10 pip mpi4py
conda activate noisepy
pip install "noisepy-seis[mpi]"

With virtual environment

python -m venv noisepy
source noisepy/bin/activate
pip install noisepy-seis

With virtual environment and MPI support

An MPI installation is required. For example, on macOS using Homebrew:

brew install open-mpi
python -m venv noisepy
source noisepy/bin/activate
pip install "noisepy-seis[mpi]"

Functionality

Here is a list of features of the package:

  • download continuous noise data based:
    • on web services using obspy's core functions get_stations and get_waveforms
    • on AWS S3 bucket calls, with a test on the SCEDC AWS Open Dataset.
  • save seismic data in ASDF format, which conveniently assembles metadata, waveform, and auxiliary data into a single file (Tutorials on reading/writing ASDF files)
  • precondition data sets before cross-correlation using the provided scripts. This involves working with gappy data in various formats (SAC/miniSEED) and storing it locally in ASDF.
  • perform fast and easy cross-correlation, with the option to run in parallel through MPI
  • Applications module:
    • Ambient noise monitoring: measure dv/v using a wide variety of techniques in the time, Fourier, and wavelet domains (Yuan et al., 2021)
    • Surface wave dispersion: construct dispersion images using conventional techniques.
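To make the cross-correlation feature concrete, here is a minimal, self-contained sketch of a lag-limited cross-correlation between two equally sampled traces. It is not NoisePy's implementation (which is built for speed and MPI parallelism); the function name `cross_correlate` and its arguments are hypothetical, and the loop is a plain time-domain sum chosen for readability rather than performance.

```python
def cross_correlate(a, b, maxlag):
    """Time-domain cross-correlation of two equal-length traces.

    Hypothetical helper for illustration only. Returns (lags, cc), where
    cc[k] = sum_i a[i] * b[i + lags[k]] and lags spans -maxlag..+maxlag
    in samples. A positive peak lag means b is a delayed copy of a.
    """
    if len(a) != len(b):
        raise ValueError("traces must have the same length")
    n = len(a)
    lags = list(range(-maxlag, maxlag + 1))
    cc = []
    for lag in lags:
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:  # ignore samples that fall outside the trace
                total += a[i] * b[j]
        cc.append(total)
    return lags, cc


# Two synthetic traces: b is a copy of a delayed by 3 samples.
a = [0, 0, 1, 0, 0, 0, 0, 0]
b = [0, 0, 0, 0, 0, 1, 0, 0]
lags, cc = cross_correlate(a, b, maxlag=4)
peak_lag = lags[cc.index(max(cc))]
print(peak_lag)
```

Keeping only lags within ±maxlag is what limits the size of the stored correlation; production codes typically compute the same quantity in the frequency domain.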

Usage

To run the code on a single core, open a terminal and activate the noisepy environment before running the following commands. To run on institutional clusters, see the installation notes for individual packages in the cluster's module list.

Deploy using Docker

NoisePy reads from and writes to disk, so users need root access to the file system. To install rootless Docker, see the instructions here.

docker pull ghcr.io/noisepy/noisepy:latest
docker run -v ~/tmp:/tmp ghcr.io/noisepy/noisepy:latest cross_correlate --path /tmp

Tutorials

Short tutorials on how to use NoisePy are available here and can be run directly in Colab. These tutorials present simple examples of how NoisePy works. We strongly encourage you to download the NoisePy package and try it on your own! If you have any comments or suggestions while running the code, please do not hesitate to contact us by email or open an issue on this GitHub page.

Chengxin Jiang (chengxinjiang@gmail.com) Marine Denolle (mdenolle@uw.edu) Yiyu Ni (niyiyu@uw.edu)

Taxonomy

Taxonomy of the NoisePy variables.

  • station refers to the site that hosts the seismic instruments that record ground shaking.
  • channel refers to the direction of ground motion investigated for 3-component seismometers. For DAS projects, it may refer to single-channel sensors.
  • ista is the index name for looping over stations.
  • cc_len is the correlation length, i.e. the basic window length in seconds.
  • step is the interval, in seconds, that is skipped when sliding from one window to the next.
  • smooth_N is the number of points used for smoothing the time- or frequency-domain discrete arrays.
  • maxlag is the maximum lag, in seconds, saved in files on each side of the correlation (to save on storage).
  • substack, substack_windows are a boolean and the number of windows over which to substack the correlation (to save storage or do monitoring).
  • time_chunk, nchunk refer to the time unit that defines a single job. For instance, cc_len is the correlation length (e.g., 1 hour, 30 min) and the overall duration of the experiment is the total length (1 month, 1 year, ...). The time chunk could be 1 day: the code would loop through each cc_len window in a for loop, but each day would be sent as a separate thread.
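The relationship between the total duration, the time chunk, cc_len, and step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not NoisePy code: the helper names `split_into_chunks` and `window_starts` are hypothetical, all times are in whole seconds, and step is assumed to be the shift between the starts of consecutive windows.

```python
def split_into_chunks(total_len, chunk_len):
    """Split the experiment duration into (start, end) time chunks, in seconds.

    Hypothetical helper: each chunk would correspond to one job.
    """
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    return [(t, min(t + chunk_len, total_len))
            for t in range(0, total_len, chunk_len)]


def window_starts(chunk_len, cc_len, step):
    """Start times (s) of every full cc_len window inside one time chunk.

    Assumes step is the shift between consecutive window starts.
    """
    if cc_len <= 0 or step <= 0:
        raise ValueError("cc_len and step must be positive")
    starts = []
    t = 0
    while t + cc_len <= chunk_len:  # keep only windows that fit entirely
        starts.append(t)
        t += step
    return starts


DAY = 86400  # seconds in one day

# A 3-day experiment processed in 1-day chunks.
chunks = split_into_chunks(3 * DAY, DAY)

# Within one chunk: 30-minute windows shifted by 450 s (overlapping windows).
starts = window_starts(DAY, cc_len=1800, step=450)
print(len(chunks), len(starts))  # prints "3 189"
```

With step equal to cc_len the windows do not overlap, so a 1-day chunk with 1-hour windows yields 24 windows.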

Acknowledgements

Thanks to our contributors so far!

Please use the following references when publishing work that uses NoisePy.

Main code:

Algorithms used:

This research received software engineering support from the University of Washington’s Scientific Software Engineering Center (SSEC) supported by Schmidt Futures, as part of the Virtual Institute for Scientific Software (VISS). We would like to acknowledge Carlos Garcia Jurado Suarez and Nicholas Rich for their collaboration and contributions to the software.