AI-Aquatica is a comprehensive open-source Python library designed to analyze water quality data using advanced AI and statistical tools.
It facilitates preprocessing, modeling, visualization, and reporting of hydrochemical datasets with minimal effort – empowering researchers and professionals in hydrology, ecology, and environmental monitoring.
- ✅ Data Import: Load datasets from CSV, Excel, JSON, SQL, NoSQL, and APIs.
- 🧼 Data Cleaning: Remove duplicates and handle missing values via multiple strategies.
- 📏 Data Standardization: Normalize and standardize data (Z-score, MinMax, log, sqrt, Box-Cox).
- 🧠 Missing Data Imputation: Fill gaps with:
  - Mean, Median, Mode
  - KNN Imputer
  - Regression Imputer
  - Autoencoder Neural Network
- ⚖️ Ion Balance: Detect chemical inconsistencies and auto-correct based on ionic ratios.
- 📊 Statistical Analysis: Compute descriptive statistics, correlation matrices, ANOVA, and time series decomposition.
- 🤖 AI/ML Modeling:
  - Regression & Classification (Logistic, SVM, Tree, RF)
  - Clustering (KMeans, DBSCAN)
  - Anomaly Detection (LOF, Isolation Forest)
  - Synthetic Data (GAN-based generation)
- 📈 Visualization:
  - Basic: Line, Bar, Pie, Scatter, Heatmaps
  - Advanced: PCA, t-SNE, Interactive Bubble Charts
- 📝 Report Generation:
  - Automatic HTML reports (statistics, ML evaluation, recommendations)
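
To make the ion balance feature more concrete, here is a minimal sketch of the standard charge-balance error (CBE) check that such a feature typically builds on. It is not AI-Aquatica's own API: the function names, the ion keys, and the ±5 % tolerance are assumptions chosen for illustration. Concentrations in mg/L are converted to meq/L by dividing by the equivalent weight (molar mass over absolute charge), and the error is 100 × (Σcations − Σanions) / (Σcations + Σanions).

```python
# Equivalent weights in mg per meq: molar mass divided by absolute ionic charge.
EQUIVALENT_WEIGHT = {
    "Ca": 40.078 / 2,
    "Mg": 24.305 / 2,
    "Na": 22.990,
    "K": 39.098,
    "Cl": 35.453,
    "SO4": 96.06 / 2,
    "HCO3": 61.016,
    "NO3": 62.004,
}
CATIONS = ("Ca", "Mg", "Na", "K")
ANIONS = ("Cl", "SO4", "HCO3", "NO3")


def to_meq(sample_mg_per_l, ions):
    """Sum the given ions in meq/L; ions missing from the sample count as 0."""
    return sum(sample_mg_per_l.get(ion, 0.0) / EQUIVALENT_WEIGHT[ion] for ion in ions)


def charge_balance_error(sample_mg_per_l):
    """Return the charge-balance error in percent for one sample (mg/L values)."""
    cations = to_meq(sample_mg_per_l, CATIONS)
    anions = to_meq(sample_mg_per_l, ANIONS)
    total = cations + anions
    if total == 0:
        raise ValueError("sample contains no major ions")
    return 100.0 * (cations - anions) / total


def is_balanced(sample_mg_per_l, tolerance_pct=5.0):
    """Flag a sample as consistent when |CBE| is within the assumed tolerance."""
    return abs(charge_balance_error(sample_mg_per_l)) <= tolerance_pct


if __name__ == "__main__":
    # Hypothetical sample, values in mg/L.
    sample = {"Ca": 60.0, "Mg": 12.0, "Na": 25.0, "K": 3.0,
              "Cl": 40.0, "SO4": 50.0, "HCO3": 180.0, "NO3": 5.0}
    print(f"CBE = {charge_balance_error(sample):.2f} %, balanced: {is_balanced(sample)}")
```

A positive CBE indicates an excess of cations, a negative one an excess of anions. Samples outside the tolerance are candidates for re-analysis or correction.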
pip install ai-aquatica
Optional extras provide deep-learning and interactive visualization support:
# Install TensorFlow-powered utilities
pip install "ai-aquatica[deep_learning]"
# Install Plotly-based interactive charts
pip install "ai-aquatica[interactive]"
# Or grab everything
pip install "ai-aquatica[all]"
Or from GitHub:
git clone https://github.com/TyMill/AI-Aquatica.git
cd AI-Aquatica
pip install -e ".[all]"
Full guide: installation.md
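
After installing, it can be useful to confirm which optional extras are actually usable in the current environment. The sketch below relies only on the standard library. The mapping from extra name to import name (`tensorflow`, `plotly`) is an assumption inferred from the install comments above, and `available_extras` is a hypothetical helper, not part of AI-Aquatica.

```python
from importlib.util import find_spec

# Assumed import name behind each optional extra (inferred, not confirmed by the package).
EXTRA_MODULES = {
    "deep_learning": "tensorflow",
    "interactive": "plotly",
}


def available_extras(extra_modules=EXTRA_MODULES):
    """Return {extra_name: bool}, True when the backing module can be located."""
    return {
        extra: find_spec(module) is not None
        for extra, module in extra_modules.items()
    }


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        status = "available" if ok else "missing"
        print(f"{extra}: {status}")
```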
Read the full documentation on GitHub Pages:
👉 https://tymill.github.io/AI-Aquatica/
Explore individual usage examples:
- usage_data_cleaning.md
- usage_data_loading.md
- usage_missing_data.md
- usage_statistical_analysis.md
- ... and more!
from ai_aquatica.ml_analysis import train_classification_model
import pandas as pd
import numpy as np
# Create mock dataset
df = pd.DataFrame({
'NO3': np.random.rand(100),
'pH': np.random.rand(100),
'target': np.random.randint(0, 2, 100)
})
X = df[['NO3', 'pH']]
y = df['target']
# Train a random forest classification model using AI-Aquatica's helper
model = train_classification_model(X, y, model_type='random_forest')
print("Model trained successfully.")
AI-Aquatica ships with ready-to-use Jinja2 templates stored in
ai_aquatica/templates. By default the report utilities render these files to create:
- statistical_report.html (plus an accompanying heatmap.png chart),
- interpretation_report.html,
- further_analysis_report.html.
The templates are available immediately after installation, but if you installed only the
minimal dependencies make sure jinja2 is present:
pip install jinja2
You can point the report functions to your own template directory by passing the template_dir
parameter. The directory should contain files named like the bundled templates so the engine can
find them.
from ai_aquatica.report_generation import generate_statistical_report
custom_templates = "/path/to/my/templates" # folder with statistical_report_template.html, etc.
generate_statistical_report(
data=df,
report_path="reports/wq_stat_report.html",
template_dir=custom_templates,
)
Need more control? Copy the files from
ai_aquatica/templates, modify them, and point template_dir to the folder containing your customized versions.
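
Because a custom directory has to contain files named like the bundled templates, a quick pre-flight check can catch a missing or misnamed file before a report is generated. The helper below is a hypothetical sketch using only the standard library. `missing_templates` is not part of AI-Aquatica, and `reference_dir` is assumed to be wherever your copy of the bundled `ai_aquatica/templates` folder lives.

```python
from pathlib import Path


def missing_templates(custom_dir, reference_dir):
    """Return sorted file names present in reference_dir but absent from custom_dir."""
    reference = {p.name for p in Path(reference_dir).iterdir() if p.is_file()}
    custom = {p.name for p in Path(custom_dir).iterdir() if p.is_file()}
    return sorted(reference - custom)
```

For example, `missing_templates("/path/to/my/templates", "/path/to/ai_aquatica/templates")` returns an empty list when every bundled template has a counterpart in the custom folder. The check compares file names only, not template contents.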
Contributions are welcome! Please feel free to fork the repo and submit a pull request.
We especially welcome:
- New preprocessing or AI models
- Example notebooks / visual dashboards
- Dataset integrations
This project is licensed under the MIT License.
Special thanks to:
- Open-source contributors
- Environmental data science community
- University of Szczecin & BNP Paribas for ongoing support
📫 Questions? Suggestions? Open an issue or email the maintainer.