
da-instaseis

Data assimilation of global wavefields using instaseis.

Repository layout

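da-instaseis/
├── data/               # Packaged events (hosted externally, not committed)
│   └── README.md       # Pointer to the externally hosted datasets
├── notebooks/          # Jupyter notebooks (examples, tutorials)
│   ├── getting_started.ipynb      # Introduction and basic usage
│   ├── generate_wavefields.ipynb  # Synthetic & real data workflows
│   ├── get_waveforms.ipynb        # Per-event packaging of observed + synthetic data
│   └── read_packaged.ipynb        # Standalone reader for packaged events
├── src/
│   └── da_instaseis/   # Main Python package
│       ├── __init__.py
│       ├── download.py
│       ├── waveforms.py
│       └── plotting.py
├── tests/              # pytest unit tests
├── pixi.toml           # Pixi environment & task definitions
├── pyproject.toml      # Python package metadata (PEP 517/518)
└── README.md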

Installation

This project uses Pixi to manage the conda + pip environment.

1 – Install Pixi (once per machine)

curl -fsSL https://pixi.sh/install.sh | bash

2 – Clone and install the environment

git clone https://github.com/Denolle-Lab/da-instaseis.git
cd da-instaseis
pixi install

Pixi reads pixi.toml and installs all dependencies (obspy, matplotlib, scipy, numpy, jupyterlab, … from conda-forge; torch and instaseis via pip) into an isolated environment under .pixi/.

3 – Activate the environment

pixi shell

Or prefix individual commands with pixi run:

pixi run python -c "import obspy; print(obspy.__version__)"

Usage

Launch JupyterLab

pixi run lab

This opens JupyterLab in your browser. Navigate to the notebooks/ directory to access:

  • getting_started.ipynb - Introduction and basic usage examples

  • generate_wavefields.ipynb - Synthetic-only wavefields on a sphere, visualization and GIF animation

  • get_waveforms.ipynb - End-to-end pipeline that packages, for each past earthquake (M≥7 since 2010):

    • real observed seismograms at permanent global stations,
    • matching Syngine synthetics at those same stations,
    • a semi-continuous synthetic wavefield on a Fibonacci sphere of virtual receivers,

    all saved as a single .npz per event. Stations are discovered from the FDSN station service (permanent networks recording continuously since 2010: IU, II, IC, G, GE, GT, CU), so the workflow no longer depends on the broken libcomcat.get_phase_dataframe. The reusable logic lives in src/da_instaseis/download.py:

    from da_instaseis import download as D
    
    cat     = D.build_event_catalog("2010-01-01", "2024-12-31", minmagnitude=7.0)
    src     = D.extract_source(cat[0])                 # origin + moment tensor
    inv, st = D.select_permanent_stations("IRIS")      # continuous since 2010
    path    = D.build_event_package(src, stations=st)  # -> data/<event_id>.npz

    Each .npz holds real_obs and real_syn, each of shape (n_stations, 3, n_samples), sphere_syn of shape (n_receivers, n_samples), their coordinates, the source moment tensor and full metadata. Traces are band-passed 0.01–0.1 Hz and stored at a 0.25 Hz sampling rate. The resulting Nyquist frequency of 0.125 Hz sits above the 0.1 Hz upper corner, so the decimation is effectively lossless for that band (~1.2 MB/event).

    Events are written to the repo-root data/ folder, then bundled into zip chunks of 50 events (data/events_NNN.zip) for upload. The download loop is resume-safe: it skips events already inside a chunk.

  • read_packaged.ipynb - Standalone reader (numpy + matplotlib + stdlib only) that opens and inspects the packaged events straight out of the data/events_NNN.zip chunks.
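The packaging layout described above can be illustrated with a small round trip. This is a minimal sketch, not the code in download.py or read_packaged.ipynb: the array names real_obs, real_syn and sphere_syn and their shapes come from the description above, while the helper names (fibonacci_sphere, make_toy_package, append_events, read_event), the coordinate keys sphere_lat and sphere_lon, and the random toy data are assumptions made for illustration.

```python
import io
import tempfile
import zipfile
from pathlib import Path

import numpy as np


def fibonacci_sphere(n):
    """Return (lat, lon) in degrees for n quasi-uniform points on a sphere."""
    i = np.arange(n) + 0.5
    lat = np.degrees(np.arcsin(1.0 - 2.0 * i / n))       # equal-area latitude bands
    golden_angle = np.pi * (3.0 - np.sqrt(5.0))
    lon = np.degrees((golden_angle * i) % (2.0 * np.pi)) - 180.0
    return lat, lon


def make_toy_package(n_stations=4, n_receivers=16, n_samples=32, seed=0):
    """Build random arrays with the shapes described for one event package."""
    rng = np.random.default_rng(seed)
    lat, lon = fibonacci_sphere(n_receivers)
    return {
        "real_obs": rng.standard_normal((n_stations, 3, n_samples)),
        "real_syn": rng.standard_normal((n_stations, 3, n_samples)),
        "sphere_syn": rng.standard_normal((n_receivers, n_samples)),
        "sphere_lat": lat,  # hypothetical key name
        "sphere_lon": lon,  # hypothetical key name
    }


def append_events(chunk_path, packages):
    """Add events to a zip chunk, skipping any event already stored in it."""
    written = []
    with zipfile.ZipFile(chunk_path, "a") as zf:
        existing = set(zf.namelist())
        for event_id, arrays in packages.items():
            name = f"{event_id}.npz"
            if name in existing:
                continue  # resume-safe: do not rewrite packaged events
            buffer = io.BytesIO()
            np.savez_compressed(buffer, **arrays)
            zf.writestr(name, buffer.getvalue())
            written.append(event_id)
    return written


def read_event(chunk_path, event_id):
    """Load one event's arrays straight out of a zip chunk."""
    with zipfile.ZipFile(chunk_path) as zf:
        payload = zf.read(f"{event_id}.npz")
    with np.load(io.BytesIO(payload)) as npz:
        return {key: npz[key] for key in npz.files}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        chunk = Path(tmp) / "events_000.zip"
        append_events(chunk, {"toy_event": make_toy_package()})
        event = read_event(chunk, "toy_event")
        for key, value in event.items():
            print(key, value.shape)
```

Reading each .npz through an in-memory buffer avoids unpacking the whole chunk to disk, which is one way a reader could depend only on numpy and the standard library.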

Data hosting

The packaged datasets are not committed to git (too large). They live in an external Dropbox folder; only the small pointer data/README.md is tracked. Put the Dropbox direct-download URL (append ?dl=1) in data/README.md and in the DATA_URL cell at the top of read_packaged.ipynb, which downloads and unpacks the chunks into data/ on demand.
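As a rough illustration of the download step, the sketch below forces dl=1 on a Dropbox share link and unpacks a zip chunk into a data/ folder. It uses only the standard library. The function names dropbox_direct_url and fetch_and_unpack and the example URL are hypothetical, and the actual DATA_URL cell in read_packaged.ipynb may work differently.

```python
import io
import zipfile
from pathlib import Path
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import urlopen


def dropbox_direct_url(share_url):
    """Return the share URL with its dl query parameter set to 1."""
    parts = urlsplit(share_url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["dl"] = "1"  # dl=1 requests the file itself instead of a preview page
    return urlunsplit(parts._replace(query=urlencode(query)))


def fetch_and_unpack(url, data_dir="data"):
    """Download one zip chunk and extract its members into data_dir."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    with urlopen(url) as response:
        payload = response.read()
    with zipfile.ZipFile(io.BytesIO(payload)) as zf:
        zf.extractall(data_dir)
        return sorted(zf.namelist())


if __name__ == "__main__":
    # Placeholder link, not the project's real data URL.
    print(dropbox_direct_url("https://www.dropbox.com/s/abc123/events_000.zip?dl=0"))
```

Rewriting the query string instead of appending text keeps other parameters on the link intact and avoids a duplicate dl entry when the link already carries dl=0.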

Run individual notebooks from command line

# Execute all cells in a notebook
pixi run jupyter nbconvert --to notebook --execute notebooks/getting_started.ipynb

# Or use papermill for parameterized execution
pixi run pip install papermill
pixi run papermill notebooks/generate_wavefields.ipynb output.ipynb

Run the tests

pixi run test

Install the package in editable mode (optional)

Inside pixi shell:

pip install -e .

Key dependencies

Package      Source        Purpose
obspy        conda-forge   Seismological data handling & FDSN access
instaseis    pip           Green's function database access
matplotlib   conda-forge   Visualization
numpy        conda-forge   Numerical computing
scipy        conda-forge   Scientific algorithms
pandas       conda-forge   Data manipulation & analysis
cartopy      conda-forge   Geographic map visualizations
pillow       conda-forge   Image processing
h5py         conda-forge   HDF5 file I/O
torch        pip           Deep learning / data assimilation
jupyterlab   conda-forge   Interactive notebooks

Optional dependencies

  • longboard (install separately with pip install longboard) - Interactive seismic waveform visualization, used only if it is available in your Python environment

Working with Real Seismic Data

The generate_wavefields.ipynb notebook includes functionality to download real seismic data from FDSN web services:

  1. Earthquake catalog queries - Query global earthquake catalogs (e.g., M≥7.0 events in 2023)
  2. Waveform downloads - Download 3-component long-period data (LHZ, LHN, LHE) from networks II, IU
  3. Automatic preprocessing - Remove instrument response, filter, and organize by station
  4. Multiple visualization approaches:
    • Traditional matplotlib record sections
    • Interactive longboard explorer (optional)
    • Geographic maps with Cartopy
  5. Data export - Save processed data as NumPy NPZ arrays for machine learning workflows

Note: Internet connection required for FDSN data downloads. Downloads may take several minutes depending on the number of earthquakes and stations.
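The steps listed above could look roughly like the following with ObsPy's FDSN client. This is a sketch under stated assumptions, not the notebook's code: the function names, the ANMO station, the "00" location code, the one-hour window, the pre-filter corners and the output file name are all illustrative choices, and the component lookup assumes the station reports LHN/LHE rather than LH1/LH2. ObsPy is imported inside the functions, so the pure NumPy helper can be used without it.

```python
import numpy as np


def stack_components(by_component, components=("Z", "N", "E")):
    """Stack 1-D traces into shape (3, n_samples), trimmed to the shortest one."""
    n = min(len(by_component[c]) for c in components)
    return np.stack([np.asarray(by_component[c][:n], dtype=float) for c in components])


def query_large_events(start="2023-01-01", end="2024-01-01", minmagnitude=7.0):
    """Query an FDSN event service for large earthquakes (needs network access)."""
    from obspy import UTCDateTime
    from obspy.clients.fdsn import Client

    client = Client("IRIS")
    return client.get_events(
        starttime=UTCDateTime(start), endtime=UTCDateTime(end), minmagnitude=minmagnitude
    )


def download_station(origin_time, network="IU", station="ANMO", duration=3600.0):
    """Download, correct and filter LH? data for one station (needs network access)."""
    from obspy import UTCDateTime
    from obspy.clients.fdsn import Client

    client = Client("IRIS")
    t0 = UTCDateTime(origin_time)
    st = client.get_waveforms(
        network, station, "00", "LH?", t0, t0 + duration, attach_response=True
    )
    st.detrend("demean")
    st.taper(max_percentage=0.05)
    # Deconvolve the instrument response to displacement; the pre-filter
    # corners (in Hz) are an illustrative choice.
    st.remove_response(output="DISP", pre_filt=(0.005, 0.01, 0.1, 0.2))
    st.filter("bandpass", freqmin=0.01, freqmax=0.1, corners=4, zerophase=True)
    # Assumes component codes Z, N, E; stations reporting 1/2 need remapping.
    by_component = {tr.stats.channel[-1]: tr.data for tr in st}
    return stack_components(by_component)


if __name__ == "__main__":
    catalog = query_large_events()
    event = catalog[0]
    origin = event.preferred_origin() or event.origins[0]
    waveforms = download_station(origin.time)
    np.savez_compressed("example_event.npz", waveforms=waveforms)
```

Trimming to the shortest component before stacking guards against traces that differ by a sample or two after download, which would otherwise make the array ragged.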

Quick Start

# Clone and set up the environment
git clone https://github.com/Denolle-Lab/da-instaseis.git
cd da-instaseis
pixi install

# Launch JupyterLab
pixi run lab

# Or run tests
pixi run test

Then open notebooks/generate_wavefields.ipynb or notebooks/getting_started.ipynb in JupyterLab.

License

See LICENSE.
