🌊 PosteriFlow — Adaptive Hierarchical Signal Decomposition (AHSD)


A next-generation gravitational-wave analysis system that detects, decomposes, and characterizes overlapping signals in real time using neural posterior estimation and adaptive signal subtraction.


🎯 What is PosteriFlow?

PosteriFlow is a cutting-edge machine learning pipeline for gravitational-wave astronomy that solves a critical problem: how to extract multiple overlapping signals from noisy gravitational-wave detector data.

The Core Problem

Modern gravitational-wave detectors (LIGO, Virgo) record weak signals buried in noise. When multiple sources merge close together in time, their signals overlap, creating a complex mixture that traditional methods cannot easily separate. PosteriFlow uses hierarchical neural networks to (a minimal sketch of the resulting loop follows this list):

  1. Prioritize signals - Determine which sources to extract first
  2. Estimate parameters - Rapidly infer masses, distances, spins using neural inference
  3. Subtract adaptively - Remove extracted signals while preserving fainter ones
  4. Quantify uncertainty - Provide calibrated confidence intervals for all estimates
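
In code, these four steps form a greedy extract-and-subtract loop. Below is a minimal sketch, assuming hypothetical priority_net, neural_pe, and subtract objects (illustrative stand-ins, not the package API):

# Illustrative sketch of the AHSD extract-and-subtract loop;
# `priority_net`, `neural_pe`, and `subtract` are hypothetical stand-ins.
def decompose(strain, priority_net, neural_pe, subtract, max_signals=8):
    """Greedy loop: rank candidates, estimate the top one, subtract, repeat."""
    residual, extracted = strain, []
    for _ in range(max_signals):
        candidates = priority_net.rank(residual)                  # 1. prioritize
        if not candidates:
            break
        posterior = neural_pe.estimate(residual, candidates[0])   # 2. estimate
        residual = subtract(residual, posterior)                  # 3. subtract adaptively
        extracted.append(posterior)                               # 4. keep calibrated posteriors
    return extracted, residual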

Why This Matters

  • Multi-messenger astronomy: Early warnings for neutron star mergers enable electromagnetic follow-up
  • Population statistics: Extracting overlapping events improves population constraints on compact object formation
  • Real-time decision-making: LIGO alert system can trigger faster with overlapping signals disentangled
  • Scientific discovery: Overlaps may reveal unexpected binary characteristics (precession, eccentricity)

πŸ—οΈ Architecture Overview

Three-Phase Pipeline

┌──────────────────────────────────────────────────────────────┐
│         RAW GRAVITATIONAL-WAVE DATA (H1, L1, V1)             │
│     Detector noise + overlapping GW signals + glitches       │
└────────────────────────┬─────────────────────────────────────┘
                         │
                         ▼
          ┌──────────────────────────────────┐
          │  PHASE 1: NEURAL POSTERIOR       │
          │  ESTIMATION (Neural PE)          │
          │  ─────────────────────────────   │
          │  • Likelihood-free inference     │
          │  • Multi-detector coherence      │
          │  • Uncertainty quantification    │
          └──────────────┬───────────────────┘
                         │
          Parameter estimates + uncertainties
          (mass_1, mass_2, distance, sky position, spins)
                         │
                         ▼
          ┌──────────────────────────────────┐
          │  PHASE 2: PRIORITY NET           │
          │  Signal Ranking & Selection      │
          │  ─────────────────────────────   │
          │  • Temporal encoding (CNN+BiLSTM)│
          │  • Cross-signal feature analysis │
          │  • Uncertainty-aware ranking     │
          │  • Predicts extraction order     │
          └──────────────┬───────────────────┘
                         │
                 Ordered list of signals
                 (which to remove first)
                         │
                         ▼
          ┌──────────────────────────────────┐
          │  PHASE 3: ADAPTIVE SUBTRACTOR    │
          │  Iterative Signal Removal        │
          │  ─────────────────────────────   │
          │  • Uncertainty-weighted          │
          │    subtraction                   │
          │  • Cross-detector coherence      │
          │  • Bias correction               │
          │  • Residual quality monitoring   │
          └──────────────┬───────────────────┘
                         │
                         ▼
       ┌──────────────────────────────────────────┐
       │  EXTRACTED SIGNALS & RESIDUAL NOISE      │
       │  • Individual source parameters          │
       │  • Parameter uncertainties               │
       │  • Signal-to-noise metrics               │
       │  • Residual quality assessment           │
       └──────────────────────────────────────────┘

Key Neural Components

Neural PE (Parameter Estimation)

  • Likelihood-free inference using normalizing flows
  • Simultaneous estimation of ~15 binary parameters
  • Fast inference: <100ms for 4-second segment
  • Uncertainty quantification via posterior ensemble
  • Handles contamination via data augmentation

PriorityNet (Signal Prioritization)

  • Temporal CNN encoder: Multi-scale time-frequency features
  • BiLSTM encoder: Temporal dependencies in strain data
  • Cross-signal analyzer: Quantifies signal overlap and interaction
  • Output: Ranking of signals + confidence in order
  • Enables optimal extraction strategy

Adaptive Subtractor

  • Uses Neural PE uncertainties to weight residuals
  • Subtracts strongest signal first (per PriorityNet)
  • Bias correction: Accounts for parameter estimation errors
  • Iterative: Updates estimates after each subtraction
  • Quality monitoring: Validates residual Gaussianity

💾 Data Pipeline

Synthetic Dataset Generation

PosteriFlow generates realistic synthetic gravitational-wave data for training:

REAL LIGO/VIRGO CHARACTERISTICS
├─ Detector network (H1, L1, V1)
├─ Realistic PSDs from O4 sensitivity
├─ Real glitches & contamination
├─ Physics-accurate waveforms (IMRPhenomXAS)
└─ Realistic source populations

                    ▼

PARAMETERS SAMPLED (Physics-Constrained)
├─ Masses (BBH: 5-100 M☉, BNS: 1-2.5 M☉)
├─ Spins (aligned & precessing)
├─ Distance (~log-uniform, Malmquist bias)
├─ Sky position (uniform on sphere)
└─ Binary merger epoch

                    ▼

SIGNAL GENERATION
├─ GW waveform synthesis (PyCBC)
├─ Detector response (antenna patterns)
├─ SNR-dependent distance scaling
└─ Parameter-distance correlation (physics-validated)

                    ▼

CONTAMINATION INJECTION
├─ Real LIGO noise (GWOSC, 10-25× speedup via caching)
├─ Neural synthetic noise (10,000× faster than GWOSC)
├─ Line glitches (60 Hz, harmonics)
├─ Transient glitches (blips, scattered light)
├─ PSD drift (multiple epochs)
└─ Detector dropout scenarios

                    ▼

OVERLAP CREATION (45% realistic rate)
├─ 2-signal overlaps (direct mergers)
├─ Multi-signal overlaps (up to 8 signals)
├─ Partial overlaps (different durations)
└─ Subtle ranking (important for prioritization)

                    ▼

EDGE CASE SAMPLING (8% of dataset)
├─ Physical extremes (high mass-ratio, spins)
├─ Observational extremes (strong glitches)
├─ Statistical extremes (multimodal posteriors)
└─ Overlapping extremes (subtle ranking)

                    ▼

FINAL DATASET (25,000+ samples)
├─ Detector strain (H1, L1, V1) + preprocessing
├─ Ground-truth parameters
├─ Network SNR & quality metrics
├─ Metadata for analysis
└─ Train/val/test splits (80/10/10)
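
As a concrete instance of the SIGNAL GENERATION step above, here is a short sketch using PyCBC's public API (the parameter values are arbitrary examples):

# Synthesize an IMRPhenomXAS waveform and project it onto H1's antenna pattern.
from pycbc.waveform import get_fd_waveform
from pycbc.detector import Detector

hp, hc = get_fd_waveform(
    approximant="IMRPhenomXAS",
    mass1=36.0, mass2=29.0,           # detector-frame masses (M☉)
    distance=800.0,                   # Mpc
    delta_f=1.0 / 4.0, f_lower=20.0,  # 4 s segment, 20 Hz low-frequency cutoff
)

det = Detector("H1")
fp, fc = det.antenna_pattern(
    right_ascension=1.5, declination=-0.3,
    polarization=0.0, t_gps=1187008882,
)
strain = fp * hp + fc * hc            # detector response to the source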

Data Statistics

SIGNAL TYPE DISTRIBUTION:
├─ Binary Black Hole (BBH):    46% → Loudest, most common
├─ Binary Neutron Star (BNS):  32% → Rare, long duration, crucial for early warning
├─ NS-BH (NSBH):               17% → Intermediate
└─ Noise only:                  5% → Background characterization

OVERLAP STATISTICS:
├─ Single signals:      55% of samples
├─ Overlapping:         45% of samples
│  ├─ 2-3 signals:      35%
│  ├─ 4-5 signals:       8%
│  └─ 6+ signals:        2%
└─ Average: 2.25 signals per sample

SNR DISTRIBUTION (O4 REALISTIC):
├─ Weak (10-15):        5%
├─ Low (15-25):        35%  ← Most detections
├─ Medium (25-40):     45%
├─ High (40-60):       12%
└─ Loud (60-80):        3%

PARAMETER RANGES:
├─ Masses:    3-200 M☉  (detector frame)
├─ Distances: 10-18,000 Mpc
├─ Spins:     0-0.99
└─ SNR:       3-100
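
For illustration, a minimal sketch of drawing one BBH parameter set consistent with these ranges; the priors here are simple stand-ins for those in parameter_sampler.py:

import numpy as np

rng = np.random.default_rng(42)

m1 = rng.uniform(5.0, 100.0)                         # primary mass (M☉, detector frame)
m2 = rng.uniform(5.0, m1)                            # enforce m2 <= m1
spin = rng.uniform(0.0, 0.99)                        # dimensionless spin magnitude
distance = 10 ** rng.uniform(1.0, np.log10(18_000))  # ~log-uniform in Mpc
ra = rng.uniform(0.0, 2 * np.pi)                     # uniform on the sphere
dec = np.arcsin(rng.uniform(-1.0, 1.0))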

Advanced Features

Real Noise Integration (10-25× speedup)

  • Pre-downloaded GWOSC segments (133 cached files)
  • Three-level fallback: cache → on-demand → synthetic
  • 10% real noise mixing for enhanced realism

Neural Noise Generation (10,000× speedup)

  • FMPE pre-trained models (Gaussian_network.pickle)
  • Colored Gaussian & non-Gaussian variants
  • Falls back gracefully if models unavailable

TransformerStrainEncoder Enhancement

  • State-of-the-art strain encoding
  • Attention-based temporal modeling
  • Outperforms CNN+BiLSTM baselines

🚀 Quick Start

1. Environment Setup

# Clone repository
git clone https://github.com/bibinthomas123/PosteriFlow.git
cd PosteriFlow

# Initialize conda (first time only)
conda init

# Activate environment
conda activate ahsd

# Install package in development mode
pip install -e . --no-deps

Important: these commands assume a conda environment named ahsd that already contains all project dependencies. If it does not exist yet, create and populate it before installing the package.

2. Generate Training Data

# Generate 25,000 samples (default, ~1.5-2 hours)
python src/ahsd/data/scripts/generate_dataset.py \
    --config configs/data_config.yaml \
    --num-samples 25000

# Custom parameters
python src/ahsd/data/scripts/generate_dataset.py \
    --config configs/data_config.yaml \
    --num-samples 50000 \
    --output-dir data/dataset_custom

3. Train Phase 1: Neural PE

# Train neural parameter estimation network
python experiments/phase3a_neural_pe.py \
    --config configs/enhanced_training.yaml \
    --batch-size 32 \
    --epochs 100

# Monitor training
tensorboard --logdir outputs/

4. Train Phase 2: PriorityNet

# Train signal prioritization network
python experiments/train_priority_net.py \
    --config configs/priority_net.yaml \
    --create-overlaps \
    --batch-size 16

# Resume from checkpoint
python experiments/train_priority_net.py \
    --resume outputs/prioritynet_checkpoint.pth \
    --create-overlaps

5. Evaluate & Validate

# Full validation suite
python experiments/phase3c_validation.py \
    --phase3a_output outputs/phase3a_output_X/ \
    --phase3b_output outputs/phase3b_production/ \
    --n_samples 2000 \
    --seeds 5

# Expected output:
# ✅ System Success Rate: 82.1%
# ✅ Neural PE Accuracy: 0.582 ± 0.087
# ✅ Subtraction Efficiency: 81.1%

📊 Performance Results

System-Level Metrics

| Metric | Value | Notes |
|---|---|---|
| System Success Rate | 82.1% | End-to-end detection of all signals |
| Average Efficiency (η) | 81.1% | Residual energy reduction |
| Latency per 4 s segment | 156 ms | Dual-channel (H1, L1) |
| Throughput | 25.6 seg/s | Real-time capable |
| Memory (8 GB VRAM) | Fits | Batch inference supported |

Phase 1: Neural PE Accuracy

| Dataset | APE (mean) | APE (std) | Comments |
|---|---|---|---|
| Clean (training) | 0.802 | 0.012 | Physics-perfect data |
| Contaminated (validation) | 0.582 | 0.087 | Realistic noise |
| After subtraction | 0.645 | 0.074 | Improved residuals |

Phase 2: PriorityNet Ranking

| Metric | Value | Target |
|---|---|---|
| Top-K Precision@1 | 96.6% | >95% |
| Ranking Correlation | 0.605 | >0.50 |
| Priority Accuracy | 94.6% | >90% |
| Calibration Error | <0.05 | <0.10 |

Phase 3: Multi-Seed Verification

METRIC STABILITY ACROSS 5 SEEDS (200 samples each):
├─ Neural PE Accuracy:  0.582 ± 0.004  (variation: 0.1%)
├─ Subtraction η:       0.811 ± 0.001  (variation: <0.1%)
├─ System Success:      0.821 ± 0.008  (variation: 1.0%)
└─ Statistical significance: Cohen's d > 2.0

πŸ“ Project Structure

PosteriFlow/
├── 📁 src/ahsd/                    # Main package
│   ├── 📁 core/                    # Core algorithms
│   │   ├── priority_net.py         # Signal prioritization (PriorityNet)
│   │   ├── adaptive_subtractor.py  # Adaptive subtraction + NeuralPE
│   │   ├── ahsd_pipeline.py        # Full end-to-end pipeline
│   │   └── bias_corrector.py       # Parameter bias correction
│   ├── 📁 data/                    # Data generation & preprocessing
│   │   ├── dataset_generator.py    # Main dataset generator
│   │   ├── waveform_generator.py   # GW waveform synthesis (PyCBC)
│   │   ├── noise_generator.py      # Synthetic noise + glitches
│   │   ├── neural_noise_generator.py # FMPE neural noise (10k× speedup)
│   │   ├── parameter_sampler.py    # Physics-constrained sampling
│   │   ├── psd_manager.py          # Power spectral density management
│   │   ├── gwtc_loader.py          # Real GWOSC data loading
│   │   ├── injection.py            # Signal injection into noise
│   │   ├── preprocessing.py        # Whitening, normalization
│   │   └── config.py               # Config loading & validation
│   ├── 📁 models/                  # Neural network architectures
│   │   ├── neural_pe.py            # Neural PE normalizing flow
│   │   ├── overlap_neuralpe.py     # Multi-signal PE variant
│   │   ├── transformer_encoder.py  # TransformerStrainEncoder
│   │   ├── flows.py                # Flow architectures
│   │   └── rl_controller.py        # RL-based control (future)
│   ├── 📁 evaluation/              # Metrics & analysis
│   │   └── metrics.py              # APE, efficiency, ranking metrics
│   └── 📁 utils/                   # Utilities
│       ├── config.py               # Configuration classes
│       ├── logging.py              # Logging setup
│       └── data_format.py          # Data standardization
├── 📁 experiments/                 # Training & evaluation scripts
│   ├── phase3a_neural_pe.py        # Neural PE training
│   ├── train_priority_net.py       # PriorityNet training
│   ├── data_generation.py          # Dataset generation wrapper
│   └── phase3c_validation.py       # Multi-seed validation
├── 📁 configs/                     # Configuration files (YAML)
│   ├── data_config.yaml            # Data generation parameters
│   ├── enhanced_training.yaml      # Training hyperparameters
│   ├── priority_net.yaml           # PriorityNet config
│   └── inference.yaml              # Inference settings
├── 📁 tests/                       # Unit & integration tests
│   ├── test_dataset_generation.py
│   ├── test_neural_pe.py
│   ├── test_priority_net.py
│   └── test_integration.py
├── 📁 models/                      # Trained model checkpoints
│   ├── neural_pe_best.pth
│   └── prioritynet_checkpoint.pth
├── 📁 data/                        # Generated datasets
│   ├── dataset/
│   │   ├── train.pkl
│   │   ├── val.pkl
│   │   └── test.pkl
│   └── Gaussian_network.pickle     # FMPE model (neural noise)
├── 📁 outputs/                     # Experiment results
│   ├── phase3a_output_X/
│   ├── phase3b_production/
│   └── logs/
├── 📁 gw_segments/                 # Pre-cached GWOSC segments
│   └── [133 real noise segments]
├── 📁 notebooks/                   # Analysis & visualization
├── 📁 docs/                        # Additional documentation
├── pyproject.toml                  # Package metadata & dependencies
├── setup.py                        # Package setup
├── AGENTS.md                       # Development guidelines
└── README.md                       # This file

🔧 Configuration System

All parameters are controlled via YAML configuration files in configs/:

data_config.yaml - Dataset Generation

# Core parameters
n_samples: 25000              # Number of samples to generate
sample_rate: 4096             # Hz (LIGO standard)
duration: 4.0                 # seconds
detectors: [H1, L1, V1]      # Detector network

# Signal characteristics
overlap_fraction: 0.45        # Realistic O4 rate
edge_case_fraction: 0.08      # Physical/statistical extremes
create_overlaps: true         # Enable multi-signal generation

# Contamination
add_glitches: true
neural_noise_enabled: true    # 10,000× speedup
neural_noise_prob: 0.5        # 50% neural, 50% synthetic
use_real_noise_prob: 0.1      # 10% real GWOSC (cached)

# Event distribution (realistic O4)
event_type_distribution:
  BBH: 0.46                   # Most common
  BNS: 0.32                   # Rare but important
  NSBH: 0.17                  # Intermediate
  noise: 0.05                 # Background

enhanced_training.yaml - Neural PE Training

# Hyperparameters
learning_rate: 0.0005
batch_size: 32
epochs: 100
weight_decay: 1e-5

# Loss weights
loss_weights:
  mse: 0.35                   # Parameter estimation
  ranking: 0.50               # Ranking loss
  uncertainty: 0.15           # Calibration

# Data augmentation
augment_contamination: true
noise_augmentation_k: 1.0
preprocess: true

priority_net.yaml - Signal Prioritization

# Architecture
temporal_encoder_dim: 128
hidden_dim: 256
num_heads: 8                  # Multi-head attention

# Training
learning_rate: 0.0002
batch_size: 16
epochs: 80
create_overlaps: true         # Enable multi-signal training
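
Because everything is plain YAML, configs can also be loaded and overridden programmatically; a minimal sketch, assuming PyYAML is installed:

import yaml

# Load the data config and shrink it for a quick smoke-test run.
with open("configs/data_config.yaml") as f:
    cfg = yaml.safe_load(f)

cfg["n_samples"] = 1000
print(cfg["detectors"], cfg["overlap_fraction"])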

🧪 Testing

Run the comprehensive test suite:

# All tests
pytest

# Specific test
pytest tests/test_priority_net.py::TestPriorityNet::test_forward_pass -v

# With coverage
pytest --cov=ahsd --cov-report=html

# Verbose with print statements
pytest -v -s

# Specific test file
pytest tests/test_neural_pe.py

Key Test Suites

| Test | Purpose | Location |
|---|---|---|
| Neural PE | Forward pass, loss computation | tests/test_neural_pe.py |
| PriorityNet | Signal ranking, feature extraction | tests/test_priority_net.py |
| Dataset | Data generation, splits, validation | tests/test_dataset_generation.py |
| Integration | End-to-end pipeline | tests/test_integration.py |

💡 How to Use PosteriFlow

Use Case 1: Train on Custom Data

  1. Prepare real GW data in HDF5 format
  2. Implement a data reader in src/ahsd/data/gwtc_loader.py (see the reader sketch after this list)
  3. Update data_config.yaml with real data paths
  4. Run training pipeline
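
A hedged sketch of step 2, assuming GWOSC-style HDF5 files whose strain lives under strain/Strain (your own files may use a different layout, and the file names here are hypothetical):

import h5py
import numpy as np

def load_strain(path: str) -> np.ndarray:
    """Read a strain time series from a GWOSC-style HDF5 file."""
    with h5py.File(path, "r") as f:
        return f["strain/Strain"][:]

# hypothetical file names, one per detector
strain_data = {det: load_strain(f"{det}_event.hdf5") for det in ("H1", "L1", "V1")}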

Use Case 2: Parameter Estimation on New Events

from ahsd.core.adaptive_subtractor import NeuralPE
import numpy as np

# Load strain data
strain_data = {
    'H1': np.load('H1_data.npy'),
    'L1': np.load('L1_data.npy'),
    'V1': np.load('V1_data.npy'),
}

# Quick estimation
pe = NeuralPE()
result = pe.quick_estimate(strain_data)

print(f"Mass 1: {result['mass_1_mean']:.1f} Mβ˜‰")
print(f"Distance: {result['luminosity_distance_mean']:.0f} Mpc")
print(f"SNR: {result['network_snr']:.1f}")

Use Case 3: Signal Decomposition Pipeline

from ahsd.core.ahsd_pipeline import AHSDPipeline

# Initialize pipeline
pipeline = AHSDPipeline(
    neural_pe_model='models/neural_pe_best.pth',
    priority_net_model='models/prioritynet_best.pth',
    subtractor_model='models/subtractor_best.pth',
)

# Process 4-second segment
result = pipeline.run(strain_data={
    'H1': h1_strain,
    'L1': l1_strain,
    'V1': v1_strain,
})

# Extracted signals
for i, signal in enumerate(result['extracted_signals']):
    print(f"\nSignal {i+1}:")
    print(f"  Mass 1: {signal['mass_1']:.1f} Mβ˜‰")
    print(f"  SNR: {signal['snr']:.1f}")
    print(f"  Confidence: {signal['priority_score']:.2f}")

🔬 Scientific Details

Neural Posterior Estimation (Phase 1)

Approach: Likelihood-free inference using normalizing flows

  • Input: Multi-detector strain (whitened, windowed)
  • Output: Posterior samples of ~15 astrophysical parameters
  • Speed: <100ms per 4s segment
  • Training: On clean synthetic waveforms + augmented contamination

Key Features:

  • Amortized inference: Single network for all parameters
  • Uncertainty quantification: Full posterior ensemble
  • Multi-detector coherence: Combines H1, L1, V1 optimally
  • Robust to PSD variation: Data augmentation during training

Signal Prioritization (Phase 2: PriorityNet)

Approach: Deep learning on temporal strain features

  • Architecture: CNN (multi-scale) + BiLSTM (temporal) + Attention (context)
  • Input: Whitened strain for multiple signals
  • Output: Ranking order (which signal to subtract first)
  • Training: On overlapping synthetic signals

Why Prioritization Matters:

  • Extracting loud signal first reduces noise floor
  • Removes contamination bias on faint signals
  • Improves overall parameter estimation accuracy
  • Handles multimodal posteriors better

Adaptive Subtraction (Phase 3)

Approach: Iterative removal with uncertainty weighting

  • Step 1: Identify signal with highest priority
  • Step 2: Subtract using Neural PE parameters + uncertainties
  • Step 3: Bias correction: Account for parameter errors
  • Step 4: Validate residual Gaussianity
  • Step 5: Repeat for remaining signals

Uncertainty Weighting:

  • Larger uncertainties β†’ weaker subtraction (preserve signal)
  • Calibrated uncertainties β†’ correct bias
  • Cross-detector coherence check

📚 References

Key Papers

  1. PyCBC Waveforms: arXiv:1508.01844

    • GW waveform generation and detection
  2. LIGO Data Conditioning: arXiv:2002.01606

    • Real gravitational-wave detector noise
  3. Normalizing Flows: arXiv:1810.01367

    • Flexible density estimation (used in Neural PE)
  4. DINGO: arXiv:2105.12151

    • Deep inference for GW observations (basis for neural noise models)

🤝 Contributing

Development Workflow

  1. Create feature branch: git checkout -b feature/description
  2. Code style: Follow AGENTS.md guidelines
  3. Test: Run pytest before committing
  4. Format: black . && isort . && flake8 .
  5. Commit message: Descriptive, explain "why"
  6. Push & PR: Create pull request with summary

Code Standards

  • Type hints: Always (required for all functions)
  • Docstrings: NumPy format for classes and methods
  • Line length: 100 characters (black formatter)
  • Testing: Unit tests for new modules
  • Coverage: Aim for >80% for new code

📞 Support & Resources

Documentation

  • Docs - Use this folder to understand the core functionality and how to run the code

Commands

# Data generation
ahsd-generate --config configs/data_config.yaml

# Validation
ahsd-validate --dataset data/dataset/train.pkl

# Analysis
ahsd-analyze --input-data data.hdf5 --output results.pkl

# Model training
python experiments/phase3a_neural_pe.py --config configs/enhanced_training.yaml

# Validation
python experiments/phase3c_validation.py --phase3a_output outputs/phase3a_output_X/ \
    --phase3b_output outputs/phase3b_production/ --n_samples 2000 --seeds 5

πŸ“ License

MIT License - see LICENSE for details


👤 Author & Citation

Author: Bibin Thomas
Email: bibinthomas951@gmail.com
Repository: https://github.com/bibinthomas123/PosteriFlow

Citation

If you use PosteriFlow in your research, please cite:

@software{thomas2025posteriflow,
  title={PosteriFlow: Adaptive Hierarchical Signal Decomposition 
         for Overlapping Gravitational Waves},
  author={Thomas, Bibin},
  year={2025},
  url={https://github.com/bibinthomas123/PosteriFlow}
}

🌟 Acknowledgments

PosteriFlow builds on foundational work from:

  • LIGO-Virgo Collaboration for detector design and data access
  • PyCBC for waveform generation
  • Bilby for Bayesian inference tools
  • GWpy for detector data handling
  • DINGO for neural density estimation techniques

Built for the next generation of gravitational-wave astronomy 🌌

Last Updated: November 12, 2025
