Curriculum Vitae in English (PDF)
Curriculum Vitae in Russian (PDF)
Data Analyst / Data Scientist with experience in development, research, and analytics in both industry and academia. Strong knowledge of SQL, Python, and data analysis. Honest, responsible, and driven, with fluent Russian and English honed by years of teaching and research.
- Analyzed parallel A/B tests: estimated sample sizes, empirical error rates, and confidence intervals, and corrected for multiple comparisons using Holm’s method (Jupyter Notebook 1, Jupyter Notebook 2)
- Analyzed how CUPED reduces variance in data and how it affects the p-value (Jupyter Notebook)
- Analyzed how removing different percentages of outliers affects statistical power (Jupyter Notebook)
- Analyzed how removing different percentages of outliers affects sensitivity (Jupyter Notebook)
- Analyzed different methods of calculating confidence intervals (Jupyter Notebook 1, Jupyter Notebook 2)
- Compared methods of introducing effects in synthetic A/B tests (Jupyter Notebook 1, Jupyter Notebook 2)
- Calculated experimental group sizes and MDE for A/B tests (Jupyter Notebook 1, Jupyter Notebook 2)
- Analyzed a novel newsfeed recommendation algorithm designed to improve the key metric (CTR)
- Performed A/B testing to demonstrate CTR deterioration with a new recommendation algorithm using: transformations of the initial data (Laplace smoothing, Poisson bootstrap, bucket transformation), normality tests (Shapiro-Wilk, D'Agostino), two-sample tests (Student's t-test, Mann-Whitney U test), SQL, ClickHouse, Python, pandas, matplotlib (Jupyter Notebook)
- Demonstrated an increase in the sensitivity of a key metric using the linearization method (Jupyter Notebook)
- Ran A/A tests to check CTR consistency across different datasets (Jupyter Notebook)
- ETL pipelines for sending reports to ClickHouse and Telegram using Apache Airflow, Python, SQL (Airflow Graphs)
- Pipeline for monitoring metrics and sending a report when an anomaly is detected (Graph in Python)
- Pipeline for sending a Telegram report on key metrics of two products across different time slices (Graph in Python)
- Pipeline for sending a Telegram report on basic product metrics (DAU, views, likes, CTR) (Graph in Python)
- Pipeline for sending a report to ClickHouse about the basic product metrics in different slices (Graph in Python)
- Dashboards for visualization and analysis of key metrics using Apache Superset, ClickHouse, SQL (Dashboards)
- Dashboard for analyzing the abnormal drop in the active audience of the newsfeed (Dashboard)
- Dashboard for analyzing differences in the behavior of organic and advertising users (Dashboard)
- Dashboard for analyzing basic product metrics of the newsfeed (likes, views, CTR, etc.) (Dashboard)
- Dashboard for analyzing audience metrics of several products (DAU, MAU, WAU, etc.) (Dashboard)
- Transferring Pareto Frontiers across Heterogeneous Hardware Environments
- Designed and published a machine learning approach for approximating and transferring Pareto frontiers of systems' properties across different cloud environments, using the Python ecosystem (pandas, scikit-learn, matplotlib)
- Repository
- Paper
- Video
- Slides
- Transferring Performance Prediction Models Across Different Hardware Platforms
- Designed and published a machine learning approach for generalizing performance prediction models of configurable systems across different hardware platforms, making extensive use of R (tidyr, dplyr, reshape2, ggplot2, etc.)
- Repository
- Paper
- Empirical Comparison of Regression Methods for Variability-Aware Performance Prediction
- Designed and published a machine learning study comparing various performance prediction methods, making extensive use of the R ecosystem (tidyr, dplyr, reshape2, ggplot2, etc.)
- Repository
- Paper