An intelligent tool to analyze PostgreSQL query workloads and recommend optimal indexes using a combination of:
- EXPLAIN plan analysis
- Heuristic pattern detection
- Machine learning insights
- Extracts top slow-running queries from
pg_stat_statements.
- Uses
EXPLAIN (FORMAT JSON)plans to suggest indexes. - Fallbacks to heuristic parsing (FROM + WHERE) if
EXPLAINfails. - Detects whether an index already exists before suggesting a new one.
- Outputs SQL
CREATE INDEXstatements when needed.
- Identifies indexes that have zero scans using PostgreSQL system catalogs.
Automatically generates:
| Chart Type | Description |
|---|---|
| 📊 Bar Chart | Count of index suggestions per table |
| 🥧 Pie Chart | % queries that need an index vs. already covered |
| File Name | Description |
|---|---|
index_suggestions.csv |
Recommended indexes per query |
top_queries.json |
JSON file of top slow queries |
unused_indexes.json |
Unused indexes in the current database |
index_suggestions_per_table.png |
Bar chart of suggestions per table |
index_suggestion_distribution.png |
Pie chart: index suggested or not |
- Clone the repo:
git clone https://github.com/dhmodh/ML-DBA.git
cd ML-DBA- Install dependencies:
pip install -r requirements.txtDependencies
psycopg2pandassqlparsematplotlibseaborn
- Update DB credentials in
run_advisor.py:
DB_CONFIG = {
"host": "your-db-host",
"port": 5432,
"dbname": "your-database",
"user": "your-username",
"password": "your-password"
}- Run the tool:
python run_advisor.pyML-DBA/
│
├── run_advisor.py # 🔁 Main execution script
├── models/
│ └── index_recommender.pkl # (optional) ML model for learning-based suggestions
├── output/
│ ├── index_suggestions.csv
│ ├── unused_indexes.json
│ ├── top_queries.json
│ ├── index_suggestions_per_table.png
│ └── index_suggestion_distribution.png
│
├── src/
│ ├── extractor_postgres.py # ⛏️ Extracts queries and EXPLAIN plans
│ ├── index_suggester.py # 🧠 Suggests indexes from EXPLAIN/heuristics
│ ├── index_reporter.py # 📋 Unused index detector
│ ├── sql_parser.py # 🔍 Extracts tables/columns from queries
│ └── visualizer.py # 📊 Generates visual plots from suggestions
│
└── README.md # 📖 You are here!
| queryid | table_name | suggested_columns | index_suggested | create_index_sql |
|---|---|---|---|---|
| 123456 | orders | user_id, created_at | 1 | CREATE INDEX ON orders(user_id, created_at); |
- Integrate with pg_stat_user_indexes to track usage patterns
- Web UI for visualization
- Support for MySQL, Cassandra, and other DB engines
We welcome contributions! Submit a pull request or open an issue 🚀
- Dishant Modh
- Inspired by AI + DBA best practices