
CLARE: Classification-based Regression for Electron Temperature Prediction

We present CLARE, the first machine learning model tailored to predict electron temperatures in Earth’s plasmasphere, covering altitudes from 1,000 to 8,000 km, by integrating in-situ geospatial data with temporal solar activity indices.

CLARE is an 84-million-parameter neural network that uses a classification-based architecture. This approach discretizes the continuous output space into bins to enhance prediction accuracy while naturally embedding uncertainty estimates.
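The binned approach can be illustrated with a small sketch (a toy example for intuition only; the bin edges, bin count, and probability distribution here are invented and are not CLARE's actual configuration):

```python
import numpy as np

# Discretize a continuous Te range into bins (edges are illustrative).
bin_edges = np.linspace(0.0, 10000.0, 101)            # 100 bins over 0-10,000 K
bin_centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])

def probs_to_prediction(probs):
    """Turn per-bin class probabilities into a point estimate and a spread."""
    te_hat = float(np.dot(probs, bin_centers))         # expected value over bins
    te_std = float(np.sqrt(np.dot(probs, (bin_centers - te_hat) ** 2)))
    return te_hat, te_std

# Example: a distribution peaked near 3,000 K, as a classifier head might emit.
logits = -0.5 * ((bin_centers - 3000.0) / 500.0) ** 2
probs = np.exp(logits) / np.exp(logits).sum()
te_hat, te_std = probs_to_prediction(probs)
```

The spread of the predicted distribution is what provides the naturally embedded uncertainty estimate: a sharply peaked distribution means a confident prediction, while a flat one signals uncertainty.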

Key Features:

  • Predicts electron temperature (Te) in the plasmasphere (1,000 - 8,000 km altitude).
  • Utilizes in-situ geospatial data from the Akebono satellite and solar activity indices (Kp, AL, SYM-H, F10.7) as inputs.
  • Employs a novel binned classification approach for improved accuracy and uncertainty quantification.
  • Achieves predictions within 10% absolute deviation for 69.82% of observations under typical solar conditions.
  • Achieves 21.39% of observations within 10% absolute deviation on a solar-storm test set.
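The within-10%-deviation metric above can be computed for any model's predictions with a few lines; this sketch uses made-up temperatures purely to show the computation:

```python
import numpy as np

def fraction_within(y_true, y_pred, tol=0.10):
    """Fraction of predictions whose relative deviation from truth is <= tol."""
    rel_dev = np.abs(y_pred - y_true) / np.abs(y_true)
    return float(np.mean(rel_dev <= tol))

# Illustrative electron temperatures in K (invented values).
y_true = np.array([2000.0, 3000.0, 4000.0, 5000.0])
y_pred = np.array([2100.0, 3500.0, 4100.0, 5200.0])
score = fraction_within(y_true, y_pred)  # 3 of 4 predictions within 10% -> 0.75
```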

Paper: For a detailed description of the model architecture, dataset, and results, please refer to our paper: CLARE: Classification-based Regression for Electron Temperature Prediction (arXiv link)


Prerequisites

Before you begin, set up the following:

  1. Python: Install version 3.9 or above; it is required to use this repository.
  2. (Optional) Weights & Biases Account: Used for experiment tracking and model management.

Installation

Follow these steps to set up the project environment:

  1. Clone the Repository: Open your terminal or command prompt and navigate to the directory where you want to store the project. Clone the repository using either SSH or HTTPS:

    • Using SSH (Recommended):
      git clone git@github.com:blakedehaas/clare.git

    • Using HTTPS:
      git clone https://github.com/blakedehaas/clare.git

    Navigate into the cloned directory:

    cd clare

    Set up Git Large File Storage (install the git-lfs package first if it is not already on your system):

    git lfs install

    Download the model checkpoint using Git Large File Storage:

    git lfs pull
  2. Install Dependencies: Install all the required Python packages using pip and the requirements.txt file:

    pip install -r requirements.txt

Data Preparation

Prepare the necessary datasets for training and evaluation:

  1. Download Input Data: Download the following data files from the data repository:

    • Akebono_combined.tsv: Akebono (EXOS-D) satellite dataset (this file is currently restricted; contact the paper authors for access)
    • omni_kp_index.lst: Kp index values from NASA OMNI dataset
    • omni_al_index_symh.zip: AL and SYM-H index values from NASA OMNI dataset
    • omni_f107.zip: F10.7 index values from NASA OMNI dataset
  2. Place Data Files: Move the downloaded files into the clare/dataset/input_dataset/ directory.

  3. Unzip Archives: Navigate to the clare/dataset/input_dataset/ directory and unzip the .zip files:

    cd dataset/input_dataset/
    unzip omni_al_index_symh.zip
    unzip omni_f107.zip

    After successful extraction, delete the original .zip files:

    rm omni_al_index_symh.zip omni_f107.zip
  4. Run Dataset Creation Script: Navigate back to the clare/dataset/ directory and run the script to process the raw data and create the final datasets:

    cd ..  # Move up from clare/dataset/input_dataset to clare/dataset
    python create_dataset.py

    This script processes the raw data and generates the processed data files used for training and evaluation, saving them to the clare/dataset/processed_dataset directory.
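For intuition, combining in-situ measurements with solar-index time series typically amounts to a nearest-timestamp join. The pandas sketch below shows the idea; the column names and values are invented for illustration and are not taken from create_dataset.py:

```python
import pandas as pd

# Hypothetical satellite measurements and hourly Kp index values.
obs = pd.DataFrame({
    "time": pd.to_datetime(["1991-05-16 00:10", "1991-05-16 01:40"]),
    "altitude_km": [1500.0, 4200.0],
})
kp = pd.DataFrame({
    "time": pd.to_datetime(["1991-05-16 00:00", "1991-05-16 01:00"]),
    "kp": [3.0, 5.0],
})

# Attach the most recent Kp value at or before each observation time.
merged = pd.merge_asof(obs.sort_values("time"), kp.sort_values("time"),
                       on="time", direction="backward")
```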


Training

Train the CLARE model using the prepared datasets:

  1. Configure Experiment Name:

    • Open the clare/train.py file in your editor.
    • Manually update the model_name variable (line 24) to a unique identifier for your training run (e.g., model_name = "clare_experiment1"). This name will be used for saving checkpoints and normalization statistics.
  2. Run Training Script:

    • Navigate to the top-level project directory (clare/).
    • Execute the training script:
      python train.py
    • Training progress will be logged to your terminal and, if configured, to Weights & Biases under the project clare.
    • The trained model checkpoint (.pth file) will be saved to clare/checkpoints/ using the specified model_name.
    • Normalization statistics (e.g., mean, std) used during training will be saved to clare/checkpoints/ using the specified model_name.
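Saving the normalization statistics alongside the checkpoint matters because evaluation must scale inputs exactly as training did. A minimal z-score sketch (feature values invented for illustration):

```python
import numpy as np

# Fit normalization statistics on training features (illustrative values:
# e.g. altitude in km and a solar index).
train_features = np.array([[1200.0, 3.0], [4800.0, 5.0], [7500.0, 2.0]])
mean = train_features.mean(axis=0)
std = train_features.std(axis=0)

def normalize(x):
    """Apply the saved training-set statistics to new inputs."""
    return (x - mean) / std

# The same mean/std must be reloaded at evaluation time so that inputs
# are scaled identically to training.
z = normalize(train_features)
```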

Evaluation

Evaluate the performance of a trained model checkpoint:

  1. Configure Model Name for Evaluation:

    • Open the clare/evaluate.py file in your editor.
    • Manually update the model_name variable (line 18) to match the exact model_name of the trained checkpoint you want to evaluate (the one you set in train.py).
  2. Select Test Dataset for Evaluation:

    • Manually update the dataset variable (line 19) to select between test-normal and test-storm. The test-normal dataset consists of 50,000 points randomly sampled from the entire dataset, while the test-storm dataset covers a continuous, known solar storm period from May 16–20, 1991.
  3. Run Evaluation Script:

    • Ensure you are in the top-level project directory (clare/).
    • Execute the evaluation script:
      python evaluate.py
    • The script will load the specified checkpoint and normalization statistics, run predictions on the test set, and print evaluation metrics. Results may also be logged to Weights & Biases if configured within the script.
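Loading a saved checkpoint for evaluation follows the standard PyTorch state-dict pattern; the sketch below uses a placeholder model and file name (CLARE's actual architecture and checkpoint layout live in the repository):

```python
import torch
import torch.nn as nn

# Placeholder network; the real 84M-parameter model is defined in the repo.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 8))

# Save and reload a state dict, as train.py / evaluate.py would with
# a checkpoints/<model_name>.pth file.
torch.save(model.state_dict(), "demo_checkpoint.pth")
state = torch.load("demo_checkpoint.pth")
model.load_state_dict(state)
model.eval()  # disable dropout/batch-norm updates during evaluation

with torch.no_grad():
    logits = model(torch.zeros(1, 4))
```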

License

This project is licensed under the MIT License - see the LICENSE file for details.
