I'm a Data Analyst with 3 years of experience, specializing in transforming complex data into actionable business intelligence. I build end-to-end data solutions, from engineering scalable ETL pipelines to developing interactive dashboards that empower data-driven decisions.
My passion lies in leveraging the modern data stack to solve challenging problems and create value from data.
Here are a few projects I'm proud of. Feel free to explore!
| Project Name | Description | Tech Stack | Links |
|---|---|---|---|
| ETL Churn Analytics PowerBI | An end-to-end analytics project identifying key drivers of customer churn. Features a full ETL process, a predictive ML model, and a comprehensive BI dashboard. | SQL, Python, Scikit-learn, Power BI | GitHub, Live Dashboard |
| Unified Investment Portfolio Dashboard | A fully automated data pipeline and Power BI dashboard to track a diversified investment portfolio (stocks, mutual funds, crypto). Aggregates data via APIs for daily updates. | Python, REST APIs, Pandas, Power BI, Docker | GitHub, Live Dashboard |
| Netflix Data Pipeline using DBT, Snowflake, and AWS S3 | An end-to-end, cloud-based data engineering and analytics project that demonstrates a modern ELT (Extract-Load-Transform) pipeline: Amazon S3 for data storage, Snowflake as the data warehouse, and dbt (data build tool) for data transformation, testing, documentation, and orchestration. | Python, SQL, AWS S3, dbt, Snowflake, Jinja | GitHub |
| NYC Taxi Data Pipeline using DBT, Databricks, and GCS | The design and implementation of a modern, end-to-end data platform on the cloud. It ingests raw NYC Taxi trip data, transforms it with an analytics engineering workflow, models it according to the Medallion Architecture, and orchestrates the entire pipeline for production, making the data ready for business intelligence and analysis. | Python, SQL, GCS, dbt, Databricks, Jinja | GitHub |
My earlier analytics work (mostly machine learning projects and case studies) is in the repositories listed below.
| Project Name | Description | Tech Stack | Links |
|---|---|---|---|
| Network Intrusion Detection System | A binary and multi-class classification problem solved using multiple machine learning algorithms. | Python, NumPy, Pandas, Matplotlib, Scikit-learn, Keras | GitHub |
| Emotion Detection OpenCV Pytorch ResNet9 | A ResNet9-based model trained on a dataset of 28,000+ images spread across 7 emotion classes, then used in a real-time application built with Tkinter. | Python, NumPy, Pandas, Seaborn, Scikit-learn, PyTorch | GitHub |
| Health Care Case Study | Solving a series of business problems in the health care domain using descriptive and predictive analysis. | Python, NumPy, Pandas, Matplotlib, Scikit-learn, XGBoost | GitHub |
| NLP Analyzing Online Job Postings | Providing solutions to various business problems using supervised and unsupervised learning on text data. | Python, Matplotlib, Scikit-learn, imblearn, NLTK, Keras | GitHub |
| Thyroid Disease Detection | A predictive model that estimates a patient's risk of thyroid dysfunction (hyperthyroidism or hypothyroidism) from clinical and laboratory features to support early detection. | Python, Matplotlib, Scikit-learn, Keras, Flask, CSS, HTML | GitHub |
| Text Mining Bank Reviews Complaints Analysis | Analyzing bank reviews using supervised and unsupervised learning with natural language processing. | Python, NumPy, Pandas, Matplotlib, Scikit-learn, NLTK | GitHub |
| Segmentation of Credit Card Customers | Performing unsupervised learning on credit card data to segment customers into clusters using KMeans clustering. | Python, NumPy, Pandas, Matplotlib, Scikit-learn | GitHub |
| Walmart Store Sales Forecasting | A regression problem solved with machine learning algorithms. | Python, NumPy, Pandas, Matplotlib, Scikit-learn, Keras | GitHub |
| Predicting Credit Card Spend Identifying Key Drivers | A regression problem solved via statistical modelling. | Python, Matplotlib, Scikit-learn, SciPy, Statsmodels | GitHub |
| | | Python, NumPy, Pandas, Matplotlib, Scikit-learn, Keras | GitHub |
I believe in continuous learning and am always excited to explore new technologies. My current focus is on:
- Orchestrating production-grade data workflows with Apache Airflow.
- Developing modular, testable data models with dbt.
- Processing large-scale datasets with Apache Spark on Azure Databricks.
- Containerizing applications with Docker for reproducible and scalable deployments.
Connect with me on social media or check out my portfolio. I'm always open to new opportunities and collaborations.
