Welcome to my GitHub!
I am an Applied Data Scientist and Analyst with a strong foundation in statistical modeling, data analysis, forecasting, and reproducible analytical workflows.
I combine PhD-level research experience with practical Python & SQL skills to build clear, interpretable, and well-structured data solutions that support decision-making.
I work extensively with Python (pandas, scikit-learn, OOP), SQL, Power BI, and modern analytical tools.
My interests lie at the intersection of Data Science, Applied Machine Learning, and Business Analytics.
Languages & Analytics:
Python • R • SQL • Econometrics • Time Series • Forecasting • Regression & Classification
Python Stack:
pandas • numpy • scikit-learn • matplotlib • OOP • ETL Pipelines • APIs
R Stack:
dplyr • tidyr • purrr • ggplot2 • lubridate • renv • panel data workflows • economic indicator construction
Data Ops & Workflow:
Git • GitHub • Reproducible Analysis • Data Cleaning • Data Preparation
BI & Visualization:
Power BI • Tableau
Statistical Tools:
Stata • GLM • Model Evaluation • Feature Engineering • EDA
A reproducible R-based workflow for integrating multi-source firm-level datasets, performing harmonization and validation, and generating sectoral & regional economic indicators.
Tech: R, dplyr, ggplot2, data cleaning, statistical integration, renv
A finance analytics solution for banking and financial institutions that integrates data engineering, profitability modeling, and customer churn prediction.
Tech: Python, SQL, data engineering, Power BI
Supervised learning project building a transparent ML workflow to classify ESG indicators using feature engineering, model evaluation, and reproducible pipelines.
Tech: Python, pandas, scikit-learn, feature engineering, classification models
Modular, object-oriented pipeline for extracting ESG-related data using APIs, structured processing steps, and reusable components.
Tech: Python, OOP, APIs, ETL, data cleaning
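To give a rough idea of what "structured processing steps and reusable components" can look like, here is a minimal sketch. It is not the project's actual code: the class names (`Step`, `DropMissing`, `RenameField`, `Pipeline`), the field names, and the sample records are all hypothetical, and the API extraction stage is replaced by an in-memory list so the example runs on its own.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class Step(ABC):
    """One reusable processing step: takes a list of records, returns a new list."""

    @abstractmethod
    def run(self, records: list[dict]) -> list[dict]:
        ...


class DropMissing(Step):
    """Drop records whose required field is absent or None."""

    def __init__(self, field: str) -> None:
        self.field = field

    def run(self, records: list[dict]) -> list[dict]:
        return [r for r in records if r.get(self.field) is not None]


class RenameField(Step):
    """Rename one key in every record, leaving other keys untouched."""

    def __init__(self, old: str, new: str) -> None:
        self.old = old
        self.new = new

    def run(self, records: list[dict]) -> list[dict]:
        return [
            {(self.new if key == self.old else key): value for key, value in r.items()}
            for r in records
        ]


class Pipeline:
    """Apply a sequence of steps in order."""

    def __init__(self, steps: list[Step]) -> None:
        self.steps = list(steps)

    def run(self, records: list[dict]) -> list[dict]:
        for step in self.steps:
            records = step.run(records)
        return records


# Hypothetical records standing in for an API response (made-up values).
raw = [
    {"company": "A", "co2": 12.5},
    {"company": "B", "co2": None},
]

pipeline = Pipeline([DropMissing("co2"), RenameField("co2", "co2_emissions")])
clean = pipeline.run(raw)
print(clean)
```

Because each step only depends on the `run` interface, steps can be reordered, reused across pipelines, or tested in isolation.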
End-to-end actuarial risk modeling using GLMs to predict frequency, severity, and pure premium.
Tech: Python, GLM, Negative Binomial, model diagnostics, visualization
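As a small illustration of how the three targets relate, pure premium is the product of claim frequency and claim severity. The sketch below computes the empirical version of that decomposition only. It does not fit a GLM, the function name `pure_premium` is hypothetical, and the portfolio figures are made up for the example.

```python
def pure_premium(claim_count: int, exposure: float, total_loss: float) -> tuple[float, float, float]:
    """Return (frequency, severity, pure premium) for one portfolio segment.

    frequency    = claims per unit of exposure
    severity     = average loss per claim
    pure premium = frequency * severity = expected loss per unit of exposure
    """
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    frequency = claim_count / exposure
    # With no claims there is no observed severity; treat it as zero here.
    severity = total_loss / claim_count if claim_count else 0.0
    return frequency, severity, frequency * severity


# Made-up segment: 50 claims over 1,000 policy-years with 100,000 in total losses.
freq, sev, pp = pure_premium(claim_count=50, exposure=1000.0, total_loss=100_000.0)
print(f"frequency={freq:.3f}, severity={sev:.1f}, pure premium={pp:.1f}")
```

In a model-based workflow, the frequency and severity inputs would come from fitted models (for example a count model such as the Negative Binomial for frequency) rather than from raw ratios.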
Time series forecasting using SARIMA and feature engineering to evaluate tourism dynamics in European hotel markets.
Tech: Python, SARIMA, time series modeling, feature engineering
I am currently developing a Retail Data Intelligence project, combining:
- Python OOP pipelines
- DuckDB SQL modeling
- API-based data extraction
- Forecasting & clustering
- Power BI dashboarding
- Business-driven retail KPIs
The repository will be published soon.
🔗 LinkedIn: https://www.linkedin.com/in/golib-sanaev
📧 Email: gsanaev80@gmail.com
Feel free to reach out. I am open to opportunities in Data Science, Applied ML, Data Analytics, and Business Analytics roles.