gsanaev/README.md

👋 Hi, I'm Golib Sanaev

Applied Data Scientist & Analyst | Python • SQL • ML • Analytics • Econometrics

Welcome to my GitHub!
I am an Applied Data Scientist and Analyst with a strong foundation in statistical modeling, data analysis, forecasting, and reproducible analytical workflows.
I combine PhD-level research experience with practical Python & SQL skills to build clear, interpretable, and well-structured data solutions that support decision-making.

I work extensively with Python (pandas, scikit-learn, OOP), SQL, Power BI, and modern analytical tools.
My interests lie at the intersection of Data Science, Applied Machine Learning, and Business Analytics.


🔧 Tools & Technologies

Languages & Analytics:
Python • R • SQL • Econometrics • Time Series • Forecasting • Regression & Classification

Python Stack:
pandas • numpy • scikit-learn • matplotlib • OOP • ETL Pipelines • APIs

R Stack:
dplyr • tidyr • purrr • ggplot2 • lubridate • renv • panel data workflows • economic indicator construction

Data Ops & Workflow:
Git • GitHub • Reproducible Analysis • Data Cleaning • Data Preparation

BI & Visualization:
Power BI • Tableau

Statistical Tools:
Stata • GLM • Model Evaluation • Feature Engineering • EDA


📁 Selected Projects

A reproducible R-based workflow for integrating multi-source firm-level datasets, performing harmonization and validation, and generating sectoral & regional economic indicators.
Tech: R, dplyr, ggplot2, data cleaning, statistical integration, renv


A modern finance analytics solution integrating data engineering, profitability modeling, and customer churn prediction — designed for banking and financial institutions.
Tech: Python, SQL, data engineering, Power BI


Supervised learning project building a transparent ML workflow to classify ESG indicators using feature engineering, model evaluation, and reproducible pipelines.
Tech: Python, pandas, scikit-learn, feature engineering, classification models


Modular, object-oriented pipeline for extracting ESG-related data using APIs, structured processing steps, and reusable components.
Tech: Python, OOP, APIs, ETL, data cleaning


End-to-end actuarial risk modeling using GLMs to predict frequency, severity, and pure premium.
Tech: Python, GLM, Negative Binomial, model diagnostics, visualization


Time series forecasting using SARIMA and feature engineering to evaluate tourism dynamics in European hotel markets.
Tech: Python, SARIMA, time series modeling, feature engineering


📌 Current Work

I am currently developing a Retail Data Intelligence project, combining:

  • Python OOP pipelines
  • DuckDB SQL modeling
  • API-based data extraction
  • Forecasting & clustering
  • Power BI dashboarding
  • Business-driven retail KPIs

The repository will be published soon.


📫 Contact

🔗 LinkedIn: https://www.linkedin.com/in/golib-sanaev
📧 Email: gsanaev80@gmail.com

Feel free to reach out — I am open to opportunities in Data Science, Applied ML, Data Analytics, and Business Analytics roles.

Pinned

  1. business-data-integration (Public)

    A reproducible R pipeline for business data integration, quality checks, and economic indicator computation using synthetic firm-level datasets.

    R

  2. enterprise-financial-kpi-platform (Public)

    End-to-end financial analytics platform integrating synthetic data generation, DuckDB warehouse, profitability modeling, churn prediction, and Power BI executive dashboards.

    HTML

  3. esg-classification (Public)

    A complete end-to-end framework for classifying corporate ESG performance using machine learning, SHAP, and interactive dashboards.

    Jupyter Notebook

  4. esg-llm-platform (Public)

    Hybrid ESG KPI extraction pipeline (regex + NLP + table parsing + optional LLM). Fully reproducible, schema-based, and tested on synthetic sustainability reports.

    Jupyter Notebook

  5. insurance-risk-modeling (Public)

    Actuarial data science with Python — GLMs, ML, and portfolio risk simulation

    Jupyter Notebook

  6. forecasting-explaining-hotel-demand-in-eu (Public)

    Forecasting and explaining hotel demand across EU countries (2015–2025) using econometrics and machine learning models, feature engineering, and data visualization in Python.

    Jupyter Notebook