AIPAL Validator

AIPAL Validator is a tool that streamlines the validation process for AIPAL. The instructions below cover setting up and running the validator both locally and with Docker.

Local Setup

Prerequisites

R Installation: Ensure R is installed on your system. If it is not, on Debian or Ubuntu you can install it with:
```
sudo apt-get install r-base
```
Also install the following packages within R: 'dplyr', 'tidyr', 'yaml', 'caret', 'xgboost'.
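
One way to install only the R packages that are still missing is to hand a short expression to Rscript. The sketch below only builds that command. The helper name `r_install_command` and the choice of CRAN mirror are assumptions for illustration, not part of this project.

```python
import shlex

# Package list taken from the prerequisites above.
R_PACKAGES = ["dplyr", "tidyr", "yaml", "caret", "xgboost"]


def r_install_command(packages=R_PACKAGES, repos="https://cloud.r-project.org"):
    """Return an Rscript argv list that installs the packages not yet present.

    Hypothetical helper; the mirror URL is an assumption and can be changed.
    """
    quoted = ", ".join(f'"{name}"' for name in packages)
    expr = (
        f"pkgs <- c({quoted}); "
        # requireNamespace() returns FALSE for packages that are not installed.
        "todo <- pkgs[!sapply(pkgs, requireNamespace, quietly = TRUE)]; "
        f'if (length(todo)) install.packages(todo, repos = "{repos}")'
    )
    return ["Rscript", "-e", expr]


# Print the command so it can be reviewed before it is run.
print(shlex.join(r_install_command()))
```

The returned list can be passed to `subprocess.run(..., check=True)` once the printed command looks right.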

Install the necessary Python dependencies with Poetry:
```
poetry install
```
Run the validation process, replacing the bracketed list with one step (all, data, sampling, or test):
```
poetry run aipal_validation --task aipal --step [all,data,sampling,test]
```
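
If you call the validator from a script, a small wrapper can reject a mistyped step before anything starts. The following is a minimal sketch: `build_command` and `KNOWN_STEPS` are hypothetical names, and the task/step pairs are only those mentioned in this README, so the CLI itself may accept more.

```python
import shlex

# Task/step pairs taken from this README; the CLI may accept others.
KNOWN_STEPS = {
    "aipal": ("all", "data", "sampling", "test"),
    "outlier": ("detect",),
    "retrain": ("all",),
}


def build_command(task, step, use_poetry=True):
    """Return the argv list for one validator run (hypothetical helper)."""
    if task not in KNOWN_STEPS:
        raise ValueError(f"unknown task: {task!r}")
    if step not in KNOWN_STEPS[task]:
        raise ValueError(f"task {task!r} has no step {step!r}")
    # Local setup goes through Poetry; inside Docker the module is run directly.
    if use_poetry:
        prefix = ["poetry", "run", "aipal_validation"]
    else:
        prefix = ["python", "-m", "aipal_validation"]
    return prefix + ["--task", task, "--step", step]


# Show the resulting command line for the test step.
print(shlex.join(build_command("aipal", "test")))
```

The list can then be passed to `subprocess.run(cmd, check=True)`; set `use_poetry=False` when running inside the Docker container.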

Docker Setup

Start the Docker container and open a shell in it:
```
docker compose run aipal bash
```

Inside the Docker container, execute the validation script, replacing the bracketed list with one step:

```
python -m aipal_validation --task aipal --step [all,data,sampling,test]
```

Project Structure

The project has the following structure:

aipal_validation/: Main package containing all functionality
- r/: Contains all R scripts for prediction and model training (moved from root directory)
- config/: Configuration files
- data_preprocessing/: Data preprocessing modules
- eval/: Evaluation modules
- fhir/: FHIR-related modules
- ml/: Machine learning modules
- outlier/: Outlier detection modules
- helper/: Utility functions

Importing Data from Excel Without a Firemetrics Server

If you don't have a Firemetrics server running and want to import data from an Excel sheet, follow these steps:

Steps to Import Data

1. Set the run_id:
- Update the run_id to match your cohort name.
2. Prepare Your Directory:
- In your root_dir, create a folder named after your cohort.
- Inside this folder, create another folder named aipal.
- Place your Excel sheet in the aipal folder.
3. Generate Custom Samples:
- Run the following command:
```
python -m aipal_validation --task aipal --step sampling
```
- This command runs the code in generate_custom_samples.py.
- Ensure the column names in your Excel file exactly match the names the script expects.
- Alternatively, add the necessary data transformations to the script.
4. Run the Validation Pipeline:
- Once the samples.csv file has been created, run the validation pipeline:
```
python -m aipal_validation --task aipal --step test
```
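
The directory preparation and the column-name check from the steps above can be scripted. This is a sketch under stated assumptions: `prepare_cohort_dir` and `missing_columns` are hypothetical helpers, and the expected column names have to be taken from generate_custom_samples.py, since this README does not list them.

```python
import shutil
from pathlib import Path


def prepare_cohort_dir(root_dir, cohort, excel_file):
    """Create <root_dir>/<cohort>/aipal and copy the Excel sheet into it.

    Returns the path of the copied file. Hypothetical helper.
    """
    target_dir = Path(root_dir) / cohort / "aipal"
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(excel_file, target_dir))


def missing_columns(header, expected):
    """Return the expected column names that are absent from the sheet header.

    The comparison is exact (case- and whitespace-sensitive), because the
    names must match the script's expectations exactly.
    """
    present = set(header)
    return [name for name in expected if name not in present]
```

For example, `missing_columns(header, expected)` returns an empty list when every expected name is present; a trailing space or different capitalization in the sheet counts as a mismatch.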

Outlier Detection

To run outlier detection on your dataset and identify potential anomalies:

Local Setup:

```
poetry run aipal_validation --task outlier --step detect
```

Docker Setup:

```
docker compose run aipal bash
python -m aipal_validation --task outlier --step detect
```

The outlier detection uses isolation forest and local outlier factor (LOF) algorithms to identify samples that deviate significantly from the expected patterns in each class.
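
To illustrate what scoring samples within each class means, here is a plain-Python sketch of the local outlier factor applied per class. It is not the project's implementation: the function names, the neighbour count `k`, and the threshold are assumptions, the Isolation Forest part is omitted, and duplicate points are not handled.

```python
import math
from collections import defaultdict


def lof_scores(points, k=2):
    """Return the local outlier factor of every point (assumes distinct points)."""
    n = len(points)
    if n <= k:
        raise ValueError("need more than k points")
    # k nearest neighbours of each point as (distance, index), nearest first.
    neighbours = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbours.append(dists[:k])
    k_distance = [nb[-1][0] for nb in neighbours]
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], d) for d, j in neighbours[i]]
        lrd.append(k / sum(reach))
    # LOF: mean density of the neighbours relative to the point's own density.
    return [sum(lrd[j] for _, j in neighbours[i]) / (k * lrd[i]) for i in range(n)]


def flag_outliers_per_class(samples, labels, k=2, threshold=1.5):
    """Return indices of samples whose LOF within their own class exceeds threshold."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    flagged = []
    for indices in by_class.values():
        scores = lof_scores([samples[i] for i in indices], k=k)
        flagged.extend(i for i, s in zip(indices, scores) if s > threshold)
    return sorted(flagged)
```

Scores close to 1 mean a sample is about as densely surrounded as its neighbours; values well above 1 mark samples that sit apart from the rest of their class.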

Model Retraining (on pediatric subset)

To retrain the AIPAL model with your dataset:

Local Setup:

```
poetry run aipal_validation --task retrain --step all
```

Docker Setup:

```
docker compose run aipal bash
python -m aipal_validation --task retrain --step all
```

The retraining process will:

- Split your data into training and testing sets
- Train an XGBoost model on the pediatric subset (age < 18)
- Save the retrained model and prediction outputs to the aipal_validation/r/ directory
- Perform evaluation on the test set
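
The filtering and splitting steps can be pictured with a short sketch. It only illustrates selecting the pediatric subset (age < 18) and holding out a test set; the model training itself is done by the project's R scripts and is not shown. The function name, the `age` field name, the 20% test fraction, and the seed are all assumptions, and the sketch filters before splitting, which may differ from the order the project uses.

```python
import random


def pediatric_split(records, test_fraction=0.2, seed=0, age_key="age"):
    """Keep records with age < 18 and split them into (train, test) lists.

    Hypothetical helper; the split ratio and field name are assumptions.
    """
    pediatric = [r for r in records if r[age_key] < 18]
    # Shuffle with a fixed seed so the split is reproducible.
    rng = random.Random(seed)
    rng.shuffle(pediatric)
    n_test = round(len(pediatric) * test_fraction)
    return pediatric[n_test:], pediatric[:n_test]
```

Fixing the seed keeps the held-out set stable between runs, which makes evaluation results comparable after retraining.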
