We present CLARE, the first machine learning model tailored to predict electron temperatures in Earth's plasmasphere, covering altitudes from 1,000 to 8,000 km, by integrating in-situ geospatial measurements with temporal solar activity indices.
CLARE is an 84-million-parameter neural network that uses a classification-based architecture. This approach discretizes the continuous output space into bins to enhance prediction accuracy while naturally embedding uncertainty estimates.
Key Features:
- Predicts electron temperature (Te) in the plasmasphere (1,000 - 8,000 km altitude).
- Utilizes in-situ geospatial data from the Akebono satellite and solar activity indices (Kp, AL, SYM-H, F10.7) as inputs.
- Employs a novel binned classification approach for improved accuracy and uncertainty quantification.
- Achieves predictions within 10% absolute deviation for 69.82% of observations under typical solar conditions.
- Demonstrates an accuracy of 21.39% on a solar storm test set.
Paper: For a detailed description of the model architecture, dataset, and results, please refer to our paper: *CLARE: Classification-based Regression for Electron Temperature Prediction* (arXiv link)
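The binned-classification idea described above can be sketched as follows. This is a minimal illustration, not CLARE's actual implementation: the temperature range, bin count, and the toy probability vector are invented for the example.

```python
import numpy as np

def make_bins(lo, hi, k):
    """Evenly spaced bin edges over [lo, hi] and their centers."""
    edges = np.linspace(lo, hi, k + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return edges, centers

def target_to_bin(y, edges):
    """Map a continuous target to its bin index (the classification label)."""
    return np.clip(np.searchsorted(edges, y, side="right") - 1, 0, len(edges) - 2)

def expected_value_and_std(probs, centers):
    """Point estimate = probability-weighted mean of bin centers;
    the spread of the predicted distribution serves as an uncertainty estimate."""
    mean = float(np.sum(probs * centers))
    var = float(np.sum(probs * (centers - mean) ** 2))
    return mean, var ** 0.5

# Toy setup: 100 bins over a hypothetical 0-10,000 K temperature range.
edges, centers = make_bins(0.0, 10000.0, 100)
label = target_to_bin(3456.0, edges)   # training label for Te = 3456 K
probs = np.zeros(100)                  # stand-in for a softmax output
probs[label] = 1.0                     # fully confident toy prediction
te_hat, te_std = expected_value_and_std(probs, centers)
```

Converting the predicted bin distribution back to a scalar via its expectation is one common way to recover a regression estimate from a classifier; the distribution's spread gives the uncertainty "for free".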
Before you begin, set up the following:
- Python: version 3.9 or above is required.
- Weights & Biases account (optional): for experiment tracking and model management.
Follow these steps to set up the project environment:
1. **Clone the Repository:** Open your terminal or command prompt and navigate to the directory where you want to store the project. Clone the repository using SSH (Recommended):

   ```bash
   git clone git@github.com:blakedehaas/clare.git
   ```

   Navigate into the cloned directory:

   ```bash
   cd clare
   ```

   Install Git Large File Storage:

   ```bash
   git lfs install
   ```

   Download the model checkpoint using Git Large File Storage:

   ```bash
   git lfs pull
   ```

2. **Install Dependencies:** Install all the required Python packages using pip and the `requirements.txt` file:

   ```bash
   pip install -r requirements.txt
   ```
Prepare the necessary datasets for training and evaluation:
1. **Download Input Data:** Download the following data files from the data repository:

   - `Akebono_combined.tsv`: Akebono dataset from the EXOS-D satellite (this file is currently restricted; reach out to the paper authors for access)
   - `omni_kp_index.lst`: Kp index values from the NASA OMNI dataset
   - `omni_al_index_symh.zip`: AL and SYM-H index values from the NASA OMNI dataset
   - `omni_f107.zip`: F10.7 index values from the NASA OMNI dataset

2. **Place Data Files:** Move the downloaded files into the `clare/dataset/input_dataset/` directory.

3. **Unzip Archives:** Navigate to the `clare/dataset/input_dataset/` directory and unzip the `.zip` files:

   ```bash
   cd dataset/input_dataset/
   unzip omni_al_index_symh.zip
   unzip omni_f107.zip
   ```

   After successful extraction, delete the original `.zip` files:

   ```bash
   rm omni_al_index_symh.zip omni_f107.zip
   ```

4. **Run Dataset Creation Script:** Navigate back to the `clare/dataset/` directory and run the script to process the raw data and create the final datasets:

   ```bash
   cd ..  # Move up from clare/dataset/input_dataset to clare/dataset
   python create_dataset.py
   ```

   This script will generate the processed data files used for training and evaluation, saving them to the `clare/dataset/processed_dataset/` directory.
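As a rough sketch of the kind of time alignment a dataset-creation step like this performs, the snippet below joins each satellite measurement with the most recent solar-index value. The column names, cadences, and values are hypothetical illustrations, not the actual schema used by `create_dataset.py`.

```python
import pandas as pd

# Toy satellite measurements (timestamps and one feature).
measurements = pd.DataFrame({
    "time": pd.to_datetime(["1991-05-16 00:10", "1991-05-16 01:45"]),
    "altitude_km": [2100.0, 3500.0],
})

# Toy hourly Kp index values.
kp = pd.DataFrame({
    "time": pd.to_datetime(["1991-05-16 00:00", "1991-05-16 01:00"]),
    "kp": [3.0, 4.0],
})

# merge_asof matches each measurement with the latest index value
# at or before its timestamp (backward direction, the default).
merged = pd.merge_asof(measurements.sort_values("time"),
                       kp.sort_values("time"), on="time")
```

This backward as-of join avoids leaking future index values into a sample's features, which matters when the model is evaluated on a continuous storm period.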
Train the CLARE model using the prepared datasets:
1. **Configure Experiment Name:**

   - Open the `clare/train.py` file in your editor.
   - Manually update the `model_name` variable (line 24) to a unique identifier for your training run (e.g., `model_name = "clare_experiment1"`). This name will be used for saving checkpoints and normalization statistics.

2. **Run Training Script:**

   - Navigate to the top-level project directory (`clare/`).
   - Execute the training script:

     ```bash
     python train.py
     ```

   - Training progress will be logged to your terminal and to Weights & Biases under the project `clare`.
   - The trained model checkpoint (`.pth` file) will be saved to `clare/checkpoints/` using the specified `model_name`.
   - Normalization statistics (e.g., mean, std) used during training will also be saved to `clare/checkpoints/` using the specified `model_name`.
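Persisting normalization statistics next to the checkpoint, as described above, might look like the following sketch. The file name, JSON format, and toy feature matrix are assumptions for illustration, not the repo's actual layout.

```python
import json
import os
import tempfile

import numpy as np

# Hypothetical experiment name and toy training features.
model_name = "clare_experiment1"
X_train = np.array([[1.0, 10.0],
                    [3.0, 30.0]])

# Per-feature mean and std computed on the training split only.
stats = {
    "mean": X_train.mean(axis=0).tolist(),
    "std": X_train.std(axis=0).tolist(),
}

# Write the statistics under the experiment's name so evaluation
# can normalize inputs exactly as training did.
ckpt_dir = os.path.join(tempfile.mkdtemp(), "checkpoints")
os.makedirs(ckpt_dir, exist_ok=True)
stats_path = os.path.join(ckpt_dir, f"{model_name}_norm_stats.json")
with open(stats_path, "w") as f:
    json.dump(stats, f)

# At evaluation time, reload and apply the same statistics.
with open(stats_path) as f:
    loaded = json.load(f)
X_norm = (X_train - np.array(loaded["mean"])) / np.array(loaded["std"])
```

Reusing the training-time statistics at evaluation (rather than recomputing them on the test set) keeps the two pipelines consistent and avoids test-set leakage.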
Evaluate the performance of a trained model checkpoint:
1. **Configure Model Name for Evaluation:**

   - Open the `clare/evaluate.py` file in your editor.
   - Manually update the `model_name` variable (line 18) to match the exact `model_name` of the trained checkpoint you want to evaluate (the one you set in `train.py`).

2. **Select Test Dataset for Evaluation:**

   - Manually update the `dataset` variable (line 19), choosing between `test-normal` and `test-storm`. The `test-normal` dataset consists of 50,000 points randomly selected from the entire dataset; the `test-storm` dataset covers a continuous, known solar storm period from May 16-20, 1991.

3. **Run Evaluation Script:**

   - Ensure you are in the top-level project directory (`clare/`).
   - Execute the evaluation script:

     ```bash
     python evaluate.py
     ```

   - The script will load the specified checkpoint and normalization statistics, run predictions on the test set, and print evaluation metrics. Results may also be logged to Weights & Biases if configured within the script.
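The headline metric reported in the key features, the fraction of predictions within 10% absolute deviation of the observed value, can be computed as in this sketch (toy values, not model output):

```python
import numpy as np

def within_10_percent(y_true, y_pred):
    """Fraction of predictions whose relative absolute deviation
    from the observed value is at most 10%."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rel_dev = np.abs(y_pred - y_true) / np.abs(y_true)
    return float(np.mean(rel_dev <= 0.10))

# Toy electron temperatures in kelvin: three of four predictions
# fall within 10% of the observation.
y_true = [3000.0, 4000.0, 5000.0, 6000.0]
y_pred = [3100.0, 4600.0, 5050.0, 6500.0]
score = within_10_percent(y_true, y_pred)
```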
This project is licensed under the MIT License - see the LICENSE file for details.